Jonathan Wakely [Thu, 1 May 2025 21:41:40 +0000 (22:41 +0100)]
libstdc++: Make __gnu_test::default_init_allocator usable in constexpr
If we make this test allocator usable in constant expressions then we'll
get an error if the 'state' data member isn't initialized. This makes it
a more reliable check that allocators are correctly value-initialized
when they're required to be.
libstdc++-v3/ChangeLog:
* testsuite/23_containers/vector/allocator/default_init.cc:
Add a check using constant evaluation.
* testsuite/23_containers/vector/bool/allocator/default_init.cc:
Likewise.
* testsuite/util/testsuite_allocator.h (default_init_allocator):
Make all member functions and equality ops constexpr.
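A minimal sketch of the idea behind this change, using a hypothetical
stand-in type (tracking_allocator is made up for illustration and is not the
testsuite's default_init_allocator). Value-initialization zeroes the 'state'
member, so it can be read during constant evaluation; default-initialization
would leave it indeterminate, and the compiler would reject a constant
expression that reads it.

```cpp
// Hypothetical stand-in for a test allocator; not the real
// __gnu_test::default_init_allocator.
template <typename T>
struct tracking_allocator {
    using value_type = T;

    tracking_allocator() = default;  // trivial: does not touch 'state'
    constexpr explicit tracking_allocator(int s) : state(s) {}

    friend constexpr bool operator==(const tracking_allocator& a,
                                     const tracking_allocator& b) {
        return a.state == b.state;
    }

    int state;  // deliberately has no default member initializer
};

// Value-initialization ({}) zero-initializes 'state', so reading it in a
// constant expression is fine.
constexpr bool value_init_zeroes_state() {
    tracking_allocator<int> a{};
    return a.state == 0;
}

// With default-initialization ("tracking_allocator<int> a;") 'state' would be
// indeterminate, and a constant expression reading it would be rejected.
static_assert(value_init_zeroes_state(),
              "state must be zero after value-initialization");
static_assert(tracking_allocator<int>{} == tracking_allocator<int>(0),
              "value-initialized allocator compares equal to state 0");
```

The check happens entirely at compile time, which is what makes it more
reliable than a runtime test that might happen to read a zero from
uninitialized memory.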
Jonathan Wakely [Thu, 10 Apr 2025 11:56:43 +0000 (12:56 +0100)]
libstdc++: Add some more makefile dependencies
Add more prerequisites for wchar and dual-abi targets in the src/c++11
directory, and simplify the existing ones (we don't need to add the main
xxx.cc source file as a prerequisite of xxx.o because that's implicit,
we only need to add the ones that Make can't determine on its own).
Also add similar prerequisites for the dual-abi targets in the src/c++17
directory.
libstdc++-v3/ChangeLog:
* src/c++11/Makefile.am: Simplify existing prerequisites for wchar and
dual-abi targets that are built from other sources. Add similar
prerequisites for more wchar and dual-abi files.
* src/c++11/Makefile.in: Regenerate.
* src/c++17/Makefile.am [ENABLE_DUAL_ABI]: Add prerequisites for
dual-abi targets that are built from other sources.
* src/c++17/Makefile.in: Regenerate.
Filip Kastl [Thu, 1 May 2025 13:32:36 +0000 (15:32 +0200)]
gimple: Switch bit-test lowering testcases for the more powerful alg
This patch adds 2 testcases. One tests that GCC is able to create
bit-test clusters of size 64. The other one contains two switches which
GCC wouldn't completely cover with bit-test clusters before the changes
from this patch set.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/switch-5.c: New test.
* gcc.dg/tree-ssa/switch-6.c: New test.
Filip Kastl [Thu, 1 May 2025 13:31:30 +0000 (15:31 +0200)]
gimple: Make bit-test switch lowering more powerful
A reasonable goal for bit-test lowering is to produce the fewest
clusters for a given switch (a cluster is basically a group of cases
that can be handled by a constant number of operations).
The current algorithm doesn't always give optimal solutions in that
sense. This patch should fix this. The important thing is basically
just to ask if a cluster is_beneficial() more proactively.
The patch also has a fix for a mistake which made bit-test lowering only
create BITS_IN_WORD - 1 big clusters. There are also some new comments
that go into more detail on the dynamic programming algorithm.
gcc/ChangeLog:
* tree-switch-conversion.cc (bit_test_cluster::find_bit_tests):
Modify the dynamic programming algorithm to take is_beneficial()
into account earlier. To do this efficiently, copy some logic
from is_beneficial() here. Add detailed comments about how the
DP algorithm works.
(bit_test_cluster::can_be_handled): Check that the cluster range
is >, not >= BITS_IN_WORD. Remove the
"vec<cluster *> &, unsigned, unsigned" overloaded variant since
we no longer need it.
(bit_test_cluster::is_beneficial): Add a comment that this
function is closely tied to m_max_case_bit_tests. Remove the
"vec<cluster *> &, unsigned, unsigned" overloaded variant since
we no longer need it.
* tree-switch-conversion.h: Remove the vec overloaded variants
of bit_test_cluster::is_beneficial and
bit_test_cluster::can_be_handled.
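To make "cluster" concrete, here is a hand-written illustration (not GCC
output) of what bit-test lowering conceptually does. It assumes a 64-bit
word, so the case values 0 through 63 span exactly BITS_IN_WORD positions,
which is the cluster size the fix above makes possible. The function names
are made up for this sketch.

```cpp
#include <cstdint>

// A switch whose "true" cases all lie within a 64-value range.
constexpr bool is_selected_switch(unsigned x) {
    switch (x) {
    case 0: case 5: case 9: case 33: case 63:
        return true;
    default:
        return false;
    }
}

// Hand-written equivalent in bit-test form: one range check plus one
// mask test, no matter how many case labels the cluster contains.
constexpr bool is_selected_bittest(unsigned x) {
    constexpr std::uint64_t mask =
        (std::uint64_t{1} << 0) | (std::uint64_t{1} << 5) |
        (std::uint64_t{1} << 9) | (std::uint64_t{1} << 33) |
        (std::uint64_t{1} << 63);
    return x < 64 && ((mask >> x) & 1u) != 0;
}

// Compare both forms for every input below 'limit'.
constexpr bool forms_agree(unsigned limit) {
    for (unsigned x = 0; x < limit; ++x)
        if (is_selected_switch(x) != is_selected_bittest(x))
            return false;
    return true;
}

static_assert(forms_agree(256), "bit-test form must match the switch");
```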
Filip Kastl [Thu, 1 May 2025 13:30:52 +0000 (15:30 +0200)]
gimple: Merge slow and fast bit-test switch lowering [PR117091]
PR117091 showed that bit-test switch lowering can take a lot of time.
The algorithm was O(n^2). We therefore came up with a faster algorithm
(O(n * BITS_IN_WORD)) and made GCC choose between the slow and the fast
algorithm based on how big the switch is.
Here I combine the algorithms so that we get the results of the slower
algorithm in the faster asymptotic time.
PR middle-end/117091
gcc/ChangeLog:
* tree-switch-conversion.cc (bit_test_cluster::find_bit_tests_fast):
Remove function.
(bit_test_cluster::find_bit_tests_slow): Remove function.
(bit_test_cluster::find_bit_tests): We don't need to decide
between slow and fast so just put the modified (no longer) slow
algorithm here.
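A rough sketch of why a bound of O(n * BITS_IN_WORD) is plausible. This is
not the code from tree-switch-conversion.cc: it models only the range
constraint (a cluster may span at most BITS_IN_WORD values) and ignores
is_beneficial() and any limit on distinct targets. The name min_clusters and
the 64-bit word size are assumptions made for the example. Because the case
values are sorted and distinct, the inner loop can look back at most
BITS_IN_WORD entries before the range becomes too wide.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::uint64_t kBitsInWord = 64;  // assumed word size

// 'values' must be sorted ascending and distinct.  Returns the minimum number
// of clusters such that each cluster spans at most kBitsInWord values.
template <std::size_t N>
constexpr std::size_t min_clusters(const std::uint64_t (&values)[N]) {
    // best[i] = fewest clusters needed to cover the first i case values.
    std::size_t best[N + 1] = {};
    for (std::size_t i = 1; i <= N; ++i) {
        best[i] = best[i - 1] + 1;  // values[i-1] alone in a cluster
        // Try clusters values[j..i-1] with earlier start points j.
        for (std::size_t j = i - 1; j-- > 0;) {
            if (values[i - 1] - values[j] >= kBitsInWord)
                break;  // range too wide; a smaller j only makes it wider
            if (best[j] + 1 < best[i])
                best[i] = best[j] + 1;
        }
    }
    return best[N];
}

constexpr std::uint64_t one_word[] = {0, 10, 63};       // range 64: fits
constexpr std::uint64_t two_words[] = {0, 10, 63, 64};  // range 65: must split
static_assert(min_clusters(one_word) == 1, "one cluster covers 0..63");
static_assert(min_clusters(two_words) == 2, "0..64 needs two clusters");
```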
Jennifer Schmitz [Wed, 12 Mar 2025 07:37:42 +0000 (00:37 -0700)]
aarch64: Optimize SVE extract last for VLS.
For the test case
int32_t foo (svint32_t x)
{
svbool_t pg = svpfalse ();
return svlastb_s32 (pg, x);
}
compiled with -O3 -mcpu=grace -msve-vector-bits=128, GCC produced:
foo:
pfalse p3.b
lastb w0, p3, z0.s
ret
when it could use a Neon lane extract instead:
foo:
umov w0, v0.s[3]
ret
Similar optimizations can be made for VLS with other vector widths.
We implemented this optimization by guarding the emission of
pfalse+lastb in the pattern vec_extract<mode><Vel> by
!val.is_constant ().
Thus, for last-extract operations with VLS, the patterns
*vec_extract<mode><Vel>_v128, *vec_extract<mode><Vel>_dup, or
*vec_extract<mode><Vel>_ext are used instead.
We added tests for 128-bit VLS and adjusted the tests for the other vector
widths.
The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
OK for mainline?
Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
* config/aarch64/aarch64-sve.md (vec_extract<mode><Vel>):
Prevent the emission of pfalse+lastb for VLS.
Jakub Jelinek [Fri, 2 May 2025 07:16:27 +0000 (09:16 +0200)]
c++: Small build_vec_init improvement [PR117827]
As discussed in the
https://gcc.gnu.org/pipermail/gcc-patches/2025-January/674492.html
thread, the following patch attempts to improve build_vec_init generated
code. E.g. on g++.dg/eh/aggregate1.C test the patch has differences like:
D.2988 = &D.2950->e1;
D.2989 = D.2988;
D.2990 = 1;
try
{
goto <D.2996>;
<D.2997>:
A::A (D.2989);
D.2990 = D.2990 + -1;
D.2989 = D.2989 + 1;
<D.2996>:
if (D.2990 >= 0) goto <D.2997>; else goto <D.2995>;
<D.2995>:
retval.4 = D.2988;
_13 = &D.2950->e2;
A::A (_13);
- D.2990 = 1;
+ D.2988 = 0B;
D.2951 = D.2951 + -1;
}
catch
{
{
struct A * D.2991;
if (D.2988 != 0B) goto <D.3028>; else goto <D.3029>;
<D.3028>:
_11 = 1 - D.2990;
_12 = (sizetype) _11;
D.2991 = D.2988 + _12;
<D.3030>:
if (D.2991 == D.2988) goto <D.3031>; else goto <D.3032>;
<D.3032>:
D.2991 = D.2991 + 18446744073709551615;
A::~A (D.2991);
goto <D.3030>;
<D.3031>:
goto <D.3033>;
<D.3029>:
<D.3033>:
}
}
in 3 spots. As you can see, both setting D.2990 (i.e. iterator) to
maxindex and setting D.2988 (i.e. rval) to nullptr have the same effect of
not actually destructing anything anymore in the cleanup. The
advantage of clearing rval is that setting something to zero is often less
expensive than setting it to a potentially huge maxindex, and that the
cleanup tests that value first.
2025-05-02 Jakub Jelinek <jakub@redhat.com>
PR c++/117827
* init.cc (build_vec_init): Push to *cleanup_flags clearing of rval
instead of setting of iterator to maxindex.
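The cleanup region in the dump above is what destroys already-constructed
array elements when a later initialization throws. The following standalone
sketch (hypothetical names, unrelated to g++.dg/eh/aggregate1.C) shows the
observable behavior that the generated code has to preserve: if the third
element's constructor throws, the first two elements are destroyed before
the exception reaches the handler.

```cpp
#include <cassert>
#include <stdexcept>

struct Tracked {
    static int alive;        // objects currently constructed
    static int constructed;  // constructors that completed

    Tracked() {
        if (constructed == 2)
            throw std::runtime_error("third element fails");
        ++constructed;
        ++alive;
    }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;
int Tracked::constructed = 0;

struct Holder {
    Tracked items[4];  // array member initialized element by element
};

// Tries to build a Holder whose third array element throws, and returns how
// many Tracked objects are still alive afterwards.
int alive_after_failed_init() {
    Tracked::alive = 0;
    Tracked::constructed = 0;
    try {
        Holder h;
        (void)h;
    } catch (const std::runtime_error&) {
        // The compiler-generated cleanup has already destroyed the two
        // elements that were fully constructed before the throw.
        assert(Tracked::constructed == 2);
    }
    return Tracked::alive;
}
```

Calling alive_after_failed_init() is expected to return 0, since every
element that was constructed has also been destroyed.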
Andrew Pinski [Thu, 1 May 2025 16:05:47 +0000 (09:05 -0700)]
vect: Use internal storage for converts for call into supportable_indirect_convert_operation [PR118617]
While looking into PR 118616, I noticed that
supportable_indirect_convert_operation only pushes up to 2 elements into its vec.
The 2 places which call supportable_indirect_convert_operation
use an auto_vec without internal storage. In this case, internal
storage of 2 elements would save memory and slightly improve compile time.
Bootstrapped and tested on x86_64-linux-gnu.
PR tree-optimization/118617
gcc/ChangeLog:
* tree-vect-generic.cc (expand_vector_conversion): Have 2 elements
as internal storage for converts.
* tree-vect-stmts.cc (vectorizable_conversion): Likewise.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Andrew Pinski [Thu, 1 May 2025 07:14:27 +0000 (00:14 -0700)]
get_known_nonzero_bits_1 should use wi::bit_and_not [PR118659]
While looking into bitwise optimizations, I noticed that
get_known_nonzero_bits_1 does `bm.value () & ~bm.mask ()` which
is ok except it creates a temporary wide_int. Instead if we
use wi::bit_and_not, we can avoid the temporary and on some
targets use the andn/bic instruction.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
PR tree-optimization/118659
* tree-ssanames.cc (get_known_nonzero_bits_1): Use
wi::bit_and_not instead of `a & ~b`.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
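A small illustration of the identity involved, using plain 64-bit integers
rather than wide_int. The helper bit_and_not below is a hypothetical
fixed-width stand-in, not wi::bit_and_not itself, and the reading of the
mask (a set bit marks an unknown position) is an assumption of this sketch.

```cpp
#include <cstdint>

// Hypothetical fixed-width stand-in for wi::bit_and_not: computes a & ~b.
constexpr std::uint64_t bit_and_not(std::uint64_t a, std::uint64_t b) {
    return a & ~b;
}

// Assumed convention for this sketch: a set bit in 'mask' marks that bit of
// 'value' as unknown, so the known-nonzero bits are value & ~mask.
constexpr std::uint64_t known_nonzero_bits(std::uint64_t value,
                                           std::uint64_t mask) {
    return bit_and_not(value, mask);
}

static_assert(known_nonzero_bits(0xD, 0x4) == 0x9, "1101 & ~0100 == 1001");
```

Expressing the operation as a single call is what lets an implementation
skip the intermediate ~b value.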
Andrew Pinski [Thu, 1 May 2025 15:31:18 +0000 (08:31 -0700)]
expand: Remove unsignedp argument from get_compare_parts [PR118090]
While helping Eikansh with a patch to ccmp, it was noticed that the
result stored in the up pointer that gets passed to get_compare_parts
was unused on all call sites.
It has been unused ever since get_compare_parts was added in
r8-1717-gf580a969d7fbab. It looks like nobody noticed that it became unused
when rcode was set via get_compare_parts; in RTL, the signedness is
part of the comparison.
PR middle-end/118090
gcc/ChangeLog:
* ccmp.cc (get_compare_parts): Remove the up argument.
(expand_ccmp_next): Update call to get_compare_parts.
(expand_ccmp_expr_1): Likewise.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Named loops (C2y) could not previously be compiled with
-O1 and -ggdb2 or higher because the label preceding
a loop (or switch) could not be found when using such
command lines.
This could be observed by compiling
gcc/gcc/testsuite/gcc.dg/c2y-named-loops-1.c with
the provoking command line (or any minimal example such
as that cited in the bug report).
The fix was simply to ignore the tree nodes inserted
for debugging information.
* c-decl.cc (c_get_loop_names): Do not prematurely
end the search for a label that names a loop or
switch statement upon encountering a DEBUG_BEGIN_STMT.
Instead, ignore any instances of DEBUG_BEGIN_STMT.
Tobias Burnus [Thu, 1 May 2025 15:39:42 +0000 (15:39 +0000)]
OpenMP: Restore lost Fortran testcase for 'omp allocate'
This testcase, which is present on the OG13 and OG14 branches, was
overlooked when the Fortran support for 'omp allocate' was added to
mainline (commit d4b6d147920b93297e621124a99ed01e7e310d92 from
December 2023).
libgomp/ChangeLog
* testsuite/libgomp.fortran/allocate-8a.f90: New test.
Patrick Palka [Thu, 1 May 2025 15:40:44 +0000 (11:40 -0400)]
c++: poor diag w/ non-constexpr dtor called from constexpr ctor
When diagnosing a non-constexpr constructor call during constexpr
evaluation, explain_invalid_constexpr_fn was passing the genericized
body to require_potential_constant_expression rather than the saved
non-genericized one.
This meant for the below testcase (reduced from PR libstdc++/119282)
in which B::B() is deemed non-constexpr due to the local variable having
a non-constexpr destructor we would then issue the cryptic diagnostic:
constexpr-nonlit19.C:17:16: error: non-constant condition for static assertion
17 | static_assert(f());
| ~^~
constexpr-nonlit19.C:17:16: in ‘constexpr’ expansion of ‘f()’
constexpr-nonlit19.C:13:5: error: ‘constexpr B::B()’ called in a constant expression
13 | B b;
| ^
constexpr-nonlit19.C:6:13: note: ‘constexpr B::B()’ is not usable as a ‘constexpr’ function because:
6 | constexpr B() {
| ^
constexpr-nonlit19.C:8:5: error: ‘goto’ is not a constant expression
8 | for (int i = 0; i < 10; i++) { }
| ^~~
This patch makes us pass the non-genericized body to
require_potential_constant_expression, and so we now emit:
...
constexpr-nonlit19.C:6:13: note: ‘constexpr B::B()’ is not usable as a ‘constexpr’ function because:
6 | constexpr B() {
| ^
constexpr-nonlit19.C:9:3: error: call to non-‘constexpr’ function ‘A::~A()’
9 | }
| ^
constexpr-nonlit19.C:3:12: note: ‘A::~A()’ declared here
3 | struct A { ~A() { } };
| ^
gcc/cp/ChangeLog:
* constexpr.cc (explain_invalid_constexpr_fn): In the
DECL_CONSTRUCTOR_P branch pass the non-genericized body to
require_potential_constant_expression.
Andrew Pinski [Wed, 30 Apr 2025 19:56:13 +0000 (12:56 -0700)]
phiopt: Remove special case for a sequence after match and simplify for early phiopt
r16-189-g99aa410f5e0a72 fixed the case where match-and-simplify produced an extra
assignment inside the returned sequence. phiopt_early_allow had code to
work around that issue, which can now be removed; the check simplifies down to only
allowing a sequence with a single MIN/MAX if the outer code is also MIN/MAX.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* tree-ssa-phiopt.cc (phiopt_early_allow): Only allow a sequence
with one statement for MIN/MAX and the op was MIN/MAX.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Patrick Palka [Thu, 1 May 2025 14:58:50 +0000 (10:58 -0400)]
c++: more overeager use of deleted function before ADL [PR119034]
The PR68942 fix used the tf_conv flag to disable mark_used when
substituting a FUNCTION_DECL callee of an ADL-enabled call. In this
slightly more elaborate testcase, we end up prematurely calling
mark_used anyway on the FUNCTION_DECL directly from the CALL_EXPR case
of tsubst_expr during partial instantiation, leading to a bogus "use of
deleted function" error.
This patch fixes the general problem in a more robust way by ensuring
the callee of an ADL-enabled call is wrapped in an OVERLOAD, so that
tsubst_expr leaves it alone.
PR c++/119034
PR c++/68942
gcc/cp/ChangeLog:
* pt.cc (tsubst_expr) <case CALL_EXPR>: Revert PR68942 fix.
* semantics.cc (finish_call_expr): Ensure the callee of an
ADL-enabled call is wrapped in an OVERLOAD.
Paul Thomas [Thu, 1 May 2025 14:22:54 +0000 (15:22 +0100)]
Fortran: Source allocation of pure function result rejected [PR119948]
2025-05-01 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/119948
* resolve.cc (gfc_impure_variable): The result of a module
procedure with an interface declaration is not impure even if
the current namespace is not the same as the symbol's.
gcc/testsuite/
PR fortran/119948
* gfortran.dg/pr119948.f90: New test.
Ayan Shafqat [Thu, 1 May 2025 13:17:30 +0000 (06:17 -0700)]
Aarch64: Add __sqrt and __sqrtf intrinsics and corresponding tests
This patch introduces two new inline functions, __sqrt and __sqrtf, in
arm_acle.h for Aarch64 targets. These functions wrap the new builtins
__builtin_aarch64_sqrtdf and __builtin_aarch64_sqrtsf, respectively,
providing direct access to hardware instructions without relying on the
standard math library or optimization levels.
This patch also introduces acle_sqrt.c in the AArch64 testsuite,
verifying that the new __sqrt and __sqrtf intrinsics emit the expected
fsqrt instructions for double and float arguments.
Coverage for new intrinsics ensures that __sqrt and __sqrtf are
correctly expanded to hardware instructions and do not fall back to
library calls, regardless of optimization levels.
gcc/ChangeLog:
* config/aarch64/arm_acle.h (__sqrt, __sqrtf): New function.
Ayan Shafqat [Thu, 1 May 2025 13:14:44 +0000 (06:14 -0700)]
Aarch64: Use BUILTIN_VHSDF_HSDF for vector and scalar sqrt builtins
This patch changes the `sqrt` builtin definition from `BUILTIN_VHSDF_DF`
to `BUILTIN_VHSDF_HSDF` in `aarch64-simd-builtins.def`, ensuring the
builtin covers half, single, and double precision variants. The redundant
`VAR1 (UNOP, sqrt, 2, FP, hf)` lines are removed, as they are no longer
needed now that `BUILTIN_VHSDF_HSDF` handles those cases.
gcc/ChangeLog:
* config/aarch64/aarch64-simd-builtins.def: Change
BUILTIN_VHSDF_DF to BUILTIN_VHSDF_HSDF.
Signed-off-by: Ayan Shafqat <ayan.x.shafqat@gmail.com>
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Jason Merrill [Tue, 29 Apr 2025 12:32:44 +0000 (08:32 -0400)]
c++: avoid weird #line paths in std-name-hint.h
etags was getting confused by the #line pathnames in std-name-hint.h that
don't match my directory layout; let's avoid encoding information about
a particular developer's $(srcdir) in the generated file.
gcc/cp/ChangeLog:
* Make-lang.in: Don't pass the full path to gperf.
* std-name-hint.h: Regenerate.
Jason Merrill [Tue, 11 Mar 2025 15:17:46 +0000 (11:17 -0400)]
c++: remove TREE_STATIC from constexpr heap vars [PR119162]
While working on PR119162 it occurred to me that it would be simpler to
detect the problem of a value referring to a heap allocation if we stopped
setting TREE_STATIC on them so they naturally are not considered to have a
constant address. With that change we no longer need to specifically avoid
caching a value that refers to a deleted pointer.
But with this change maybe_nonzero_address is not sure whether the variable
could have address zero. I don't understand why it returns 1 only for
variables in the current function, rather than all non-symtab decls; an auto
variable from some other function also won't have address zero. Maybe this
made more sense when it was in tree_single_nonzero_warnv_p before r7-5868?
But assuming there is some reason for the current behavior, this patch only
changes the handling of non-symtab decls when folding_cxx_constexpr.
PR c++/119162
gcc/cp/ChangeLog:
* constexpr.cc (find_deleted_heap_var): Remove.
(cxx_eval_call_expression): Don't call it. Don't set TREE_STATIC on
heap vars.
(cxx_eval_outermost_constant_expr): Don't mess with varpool.
gcc/ChangeLog:
* fold-const.cc (maybe_nonzero_address): Return 1 for non-symtab
vars if folding_cxx_constexpr.
Richard Biener [Wed, 30 Apr 2025 12:57:03 +0000 (14:57 +0200)]
Remove non-SLP path from vectorizable_conversion
This removes the non-SLP paths from vectorizable_conversion and
in the process eliminates uses of 'ncopies' and 'STMT_VINFO_VECTYPE'
from the function.
Jakub Jelinek [Thu, 1 May 2025 06:29:03 +0000 (08:29 +0200)]
combine: Special case set_noop_p in two spots
Here is the incremental patch I was talking about.
For noop sets, we don't need to test much, they can go to i2
unless that would violate i3 JUMP condition.
With this the try_combine on the pr119291.c testcase doesn't fail,
but succeeds and we get
(insn 22 21 23 4 (set (pc)
(pc)) "pr119291.c":27:15 2147483647 {NOOP_MOVE}
(nil))
(insn 23 22 24 4 (set (reg/v:SI 117 [ e ])
(reg/v:SI 116 [ e ])) 96 {*movsi_internal}
(expr_list:REG_DEAD (reg/v:SI 116 [ e ])
(nil)))
(note 24 23 25 4 NOTE_INSN_DELETED)
(insn 25 24 26 4 (set (reg/v:SI 116 [ e ])
(const_int 0 [0])) "pr119291.c":28:13 96 {*movsi_internal}
(nil))
(note 26 25 27 4 NOTE_INSN_DELETED)
(insn 27 26 28 4 (set (reg:DI 128 [ _9 ])
(const_int 0 [0])) "pr119291.c":28:13 95 {*movdi_internal}
(nil))
after it.
2025-05-01 Jakub Jelinek <jakub@redhat.com>
* combine.cc (try_combine): Sets which satisfy set_noop_p can go
to i2 unless i3 is a jump and the other set is not.
c++/modules: Ensure deduction guides for imported types are reachable [PR120023]
In the linked PR, because the deduction guides depend on an imported
type, we never walk the type and so never call add_deduction_guides.
This patch ensures that we make bindings for deduction guides if we saw
any deduction guide at all.
PR c++/120023
gcc/cp/ChangeLog:
* module.cc (depset::hash::find_dependencies): Also call
add_deduction_guides when walking one.
gcc/testsuite/ChangeLog:
* g++.dg/modules/dguide-7_a.C: New test.
* g++.dg/modules/dguide-7_b.C: New test.
* g++.dg/modules/dguide-7_c.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
c++/modules: Fix imported CNTTPs being considered non-constant [PR119938]
When importing a CNTTP object, since r15-3031-g0b7904e274fbd6 we
shortcut the processing of the generated NTTP so that we don't attempt
to recursively load pendings. However, due to an oversight we do not
properly set TREE_CONSTANT or DECL_INITIALIZED_BY_CONSTANT_EXPRESSION_P
on the decl, which confuses later processing. This patch ensures that
this happens correctly.
PR c++/119938
gcc/cp/ChangeLog:
* pt.cc (get_template_parm_object): When !check_init, add assert
that expr really is constant and mark decl as such.
gcc/testsuite/ChangeLog:
* g++.dg/modules/tpl-nttp-2_a.H: New test.
* g++.dg/modules/tpl-nttp-2_b.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
c++/modules: Catch exposures of TU-local values through inline references [PR119996]
In r15-9136-g0210bedf481a9f we started erroring for inline variables
that exposed TU-local entities in their definition, as such variables
would need to have their definitions emitted in importers but would not
know about the TU-local entities they referenced.
A case we missed was potentially-constant references, which disable
streaming of their definitions in make_dependency so as to comply with
[expr.const] p9.2. This meant that we didn't see the definition
referencing a TU-local entity, leading to nonsensical results.
PR c++/119551
PR c++/119996
gcc/cp/ChangeLog:
* module.cc (depset::hash::make_dependency): Also mark inline
variables referencing TU-local values as exposures here.
(depset::hash::finalize_dependencies): Add error message for
inline variables.
gcc/testsuite/ChangeLog:
* g++.dg/modules/internal-13.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Andrew Pinski [Wed, 30 Apr 2025 22:10:29 +0000 (15:10 -0700)]
vectorizer: Fix riscv build [PR120042]
r15-9859-ga6cfde60d8c added a call to dominated_by_p to tree-vectorizer.h
but dominance.h is not always included, so you get a build failure on riscv when building
riscv-vector-costs.cc.
Let's add the include of dominance.h to tree-vectorizer.h.
Pushed as obvious after builds for riscv and x86_64.
gcc/ChangeLog:
PR target/120042
* tree-vectorizer.h: Include dominance.h.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
The SARIF 2.1.0 spec says that although a "SARIF log file SHALL contain
a serialization of the SARIF object model into the JSON format ... in the
future, other serializations might be defined." (§3.1)
I've been experimenting with alternative serializations of SARIF (CBOR
and JSON5 for now). To help with these experiments, this patch adds a
new param "serialization" to -fdiagnostics-add-output='s "sarif" scheme.
For now this must have value "json", but will be helpful for any
followup patches.
gcc/ChangeLog:
* diagnostic-format-sarif.cc
(sarif_serialization_format_json::write_to_file): New.
(sarif_builder::m_formatted): Replace field with...
(sarif_builder::m_serialization_format): ...this.
(sarif_builder::sarif_builder): Update for field change.
(sarif_builder::flush_to_file): Call m_serialization_format's
write_to_file vfunc.
(sarif_output_format::sarif_output_format): Replace param
"formatted" with "serialization_format".
(sarif_stream_output_format::sarif_output_format): Likewise.
(sarif_file_output_format::sarif_file_output_format): Likewise.
(diagnostic_output_format_init_sarif_stderr): Make a
sarif_serialization_format_json and pass it to
diagnostic_output_format_init_sarif.
(diagnostic_output_format_open_sarif_file): Split out into...
(diagnostic_output_file::try_to_open): ...this, adding
"serialization_kind" param.
(diagnostic_output_format_init_sarif_file): Update for new param
to diagnostic_output_format_open_sarif_file. Make a
sarif_serialization_format_json and pass it to
diagnostic_output_format_init_sarif.
(diagnostic_output_format_init_sarif_stream): Make a
sarif_serialization_format_json and pass it to
diagnostic_output_format_init_sarif.
(make_sarif_sink): Replace param "formatted" with "serialization".
(selftest::test_make_location_object): Update for changes to
sarif_builder ctor.
* diagnostic-format-sarif.h (enum class sarif_serialization): New.
(diagnostic_output_format_open_sarif_file): Add param
"serialization_kind".
(class sarif_serialization_format): New.
(class sarif_serialization_format_json): New.
(make_sarif_sink): Replace param "formatted" with
"serialization_format".
* diagnostic-output-file.h (diagnostic_output_file::try_to_open):
New decl.
* diagnostic.h (enum diagnostics_output_format): Tweak comments.
* doc/invoke.texi (-fdiagnostics-add-output): Add "serialization"
param to sarif scheme.
* libgdiagnostics.cc (sarif_sink::sarif_sink): Update for change
to make_sarif_sink.
* opts-diagnostic.cc (sarif_scheme_handler::make_sink): Add
"serialization" param and pass it on to make_sarif_sink.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Wed, 30 Apr 2025 20:50:16 +0000 (16:50 -0400)]
analyzer: add more test coverage for sprintf
gcc/testsuite/ChangeLog:
PR analyzer/107017
* c-c++-common/analyzer/sprintf-3.c: New test, covering use of
sprintf with specific format strings. Doesn't yet find problems
as the analyzer doesn't yet understand the format strings.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Wed, 30 Apr 2025 20:50:15 +0000 (16:50 -0400)]
analyzer: avoid saying "'0' is NULL"
gcc/analyzer/ChangeLog:
* sm-malloc.cc (malloc_diagnostic::describe_state_change): Tweak
the "EXPR is NULL" message for the case where EXPR is a null
pointer.
In r15-123 and r14-11434 we unconditionally set processing_template_decl
when substituting the context of an UNBOUND_CLASS_TEMPLATE, in order to
handle instantiation of the dependently scoped friend declaration
template<int N>
template<class T>
friend class A<N>::B;
where the scope A<N> remains dependent after instantiation. But this
turns out to misbehave for the UNBOUND_CLASS_TEMPLATE in the below
testcase representing
g<[]{}>::template fn
since with the flag set substituting the args of test3 into the lambda
causes us to defer the substitution and yield a lambda that still looks
dependent, which in turn makes g<[]{}> still dependent and not suitable
for qualified name lookup.
This patch restricts setting processing_template_decl during
UNBOUND_CLASS_TEMPLATE substitution to the case where there are multiple
levels of introduced template parameters, as in the friend declaration.
(This means we need to substitute the template parameter list(s) first,
which makes sense since they lexically appear first.)
PR c++/119981
PR c++/119378
gcc/cp/ChangeLog:
* pt.cc (tsubst) <case UNBOUND_CLASS_TEMPLATE>: Substitute
into template parameter list first. When substituting the
context, only set processing_template_decl if there's more
than one level of introduced template parameters.
Richard Biener [Tue, 29 Apr 2025 13:08:52 +0000 (15:08 +0200)]
tree-optimization/119960 - add validity checking to SLP scheduling
The following adds checks that when we search for a vector stmt
insert location we arrive at one where all required operand defs
are dominating the insert location. At the moment any such
failure only blows up during SSA verification.
There's the long-standing issue that we do not verify there
exists a valid schedule of the SLP graph from BB vectorization
into the existing CFG. We do not have the ability to insert
vector stmts on the dominance frontier "end", nor to insert
LC PHIs that would be eventually required.
This should be done all differently, computing the schedule
during analysis and failing if we can't schedule.
The following addresses a too conservative sanity check of SLP nodes
we want to promote external. The issue lies in code generation
for such external which relies on get_later_stmt to figure an
insert location. But get_later_stmt relies on the ability to
totally order stmts, specifically implementation-wise that they
are all from the same BB, which is what is verified at the moment.
The patch changes this to require stmts to be orderable by
dominance queries. For simplicity and seemingly enough for the
testcase in PR119960, this handles the case of two distinct BBs.
PR tree-optimization/119960
* tree-vect-slp.cc (vect_slp_can_convert_to_external):
Handle cases where defs from multiple BBs are ordered
by their dominance relation.
Richard Biener [Wed, 30 Apr 2025 08:01:47 +0000 (10:01 +0200)]
ipa/120006 - wrong code with IPA PTA
When PTA gets support for special-handling more builtins in
find_func_aliases the corresponding code in find_func_clobbers
needs updating as well since for unhandled cases it assumes
the former will populate ESCAPED accordingly. The following
fixes a few omissions; the testcase runs into the missing strdup
handling. I believe the more advanced handling using modref
results and fnspecs opened a larger gap. The proper fix is to
merge both functions, gating the clobber/use part on a parameter
to avoid diverging.
Richard Biener [Wed, 30 Apr 2025 09:52:17 +0000 (11:52 +0200)]
tree-optimization/120003 - missed jump threading
The following allows the entry and exit block of a jump thread path
to be equal, which can easily happen when there isn't a forwarder
on the interesting edge for an FSM thread conditional. We just
don't want to enlarge the path from such a block.
PR tree-optimization/120003
* tree-ssa-threadbackward.cc (back_threader::find_paths_to_names):
Allow block re-use but do not enlarge the path beyond such a
re-use.
* gcc.dg/tree-ssa/ssa-thread-23.c: New testcase.
* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust.
Eric Botcazou [Wed, 30 Apr 2025 10:41:36 +0000 (12:41 +0200)]
Fix GNAT build failure for x86/FreeBSD
gcc/ada/
PR ada/112958
* Makefile.rtl (LIBGNAT_TARGET_PAIRS) [x86 FreeBSD]: Add specific
version of s-dorepr.adb.
* libgnat/s-dorepr__freebsd.adb: New file.
AVR: target/119989 - Add missing clobbers to xload_<mode>_libgcc.
libgcc's __xload_1...4 is clobbering Z (and also R21 in some cases),
but avr.md had clobbers of respective GPRs only up to reload.
Outcome was that code reading from the same __memx address twice
could be wrong. This patch adds respective clobbers.
Forward-port from 2025-04-30 r14-11703
PR target/119989
gcc/
* config/avr/avr.md (xload_<mode>_libgcc): Clobber R21, Z.
gcc/testsuite/
* gcc.target/avr/torture/pr119989.h: New file.
* gcc.target/avr/torture/pr119989-memx-1.c: New test.
* gcc.target/avr/torture/pr119989-memx-2.c: New test.
* gcc.target/avr/torture/pr119989-memx-3.c: New test.
* gcc.target/avr/torture/pr119989-memx-4.c: New test.
* gcc.target/avr/torture/pr119989-flashx-1.c: New test.
* gcc.target/avr/torture/pr119989-flashx-2.c: New test.
* gcc.target/avr/torture/pr119989-flashx-3.c: New test.
* gcc.target/avr/torture/pr119989-flashx-4.c: New test.
Kito Cheng [Tue, 29 Apr 2025 03:35:00 +0000 (11:35 +0800)]
RISC-V: Allow different dynamic floating point mode to be merged [PR119832]
Although we already try to set the mode needed to FRM_DYN after a function call,
there are still some corner cases where both FRM_DYN and FRM_DYN_CALL may appear
on incoming edges.
Therefore, we use TARGET_MODE_CONFLUENCE to tell GCC that FRM_DYN, FRM_DYN_CALL,
and FRM_DYN_EXIT modes are compatible.
The Zve32x extension depends on the Zicsr extension.
Currently, enabling Zve32x alone does not automatically imply Zicsr in GCC.
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Make Zve32x depend on Zicsr.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/predef-19.c: set the march to rv64im_zve32x
instead of rv64gc_zve32x to avoid Zicsr implied by g. Extra m is
added to avoid current 'V' extension requires 'M' extension
Signed-off-by: Jerry Zhang Jian <jerry.zhangjian@sifive.com>
Jennifer Schmitz [Thu, 13 Feb 2025 12:34:30 +0000 (04:34 -0800)]
AArch64: Fold LD1/ST1 with ptrue to LDR/STR for 128-bit VLS
If -msve-vector-bits=128, SVE loads and stores (LD1 and ST1) with a
ptrue predicate can be replaced by Neon instructions (LDR and STR),
thus avoiding the predicate altogether. This also enables formation of
LDP/STP pairs.
were previously compiled to
(with -O2 -march=armv8.2-a+sve -msve-vector-bits=128):
ptrue_load:
ptrue p3.b, vl16
ld1d z0.d, p3/z, [x0]
ret
ptrue_store:
ptrue p3.b, vl16
st1d z0.d, p3, [x0]
ret
Now they are compiled to:
ptrue_load:
ldr q0, [x0]
ret
ptrue_store:
str q0, [x0]
ret
The implementation includes the if-statement
if (known_eq (GET_MODE_SIZE (mode), 16)
&& aarch64_classify_vector_mode (mode) == VEC_SVE_DATA)
which checks for 128-bit VLS and excludes partial modes with a
mode size < 128 (e.g. VNx2QI).
The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
OK for mainline?
Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
* config/aarch64/aarch64.cc (aarch64_emit_sve_pred_move):
Fold LD1/ST1 with ptrue to LDR/STR for 128-bit VLS.
* gcc.target/riscv/rvv/xsfvector/sf_vc_f.c: New test.
* gcc.target/riscv/rvv/xsfvector/sf_vc_i.c: New test.
* gcc.target/riscv/rvv/xsfvector/sf_vc_v.c: New test.
* gcc.target/riscv/rvv/xsfvector/sf_vc_x.c: New test.
Adapt testsuite v3_target_compile to strip version namespace from compiler
output so that dg-error and dg-warning directives do not need to consider it.
Avoid an aligned_storage check, as the behavior has been fixed only when using
gnu-versioned-namespace, since it is an ABI-breaking change.
libstdc++-v3/ChangeLog:
* testsuite/lib/libstdc++.exp (v3_target_compile): Strip version namespace
from compiler output.
* testsuite/20_util/aligned_storage/value.cc [_GLIBCXX_INLINE_VERSION]:
Avoid align_msa check.
* testsuite/20_util/function/cons/70692.cc: Remove now useless __8 namespace
pattern.
* testsuite/23_containers/map/48101_neg.cc: Likewise.
* testsuite/23_containers/multimap/48101_neg.cc: Likewise.
Co-authored-by: Jonathan Wakely <jwakely@redhat.com>
Tomasz Kamiński [Fri, 25 Apr 2025 18:10:52 +0000 (20:10 +0200)]
libstdc++: Fix _Padding_sink in case when predicted width is between padwidth and maxwidth [PR109162]
The _Padding_sink was behaving incorrectly when the predicted width (based on
code units count) was higher than _M_maxwidth, but lower than _M_padwidth.
In this case _M_update() returned without calling _M_force_update() and computing
field width for Unicode encoding, because _M_buffering() returned 'true'.
As a consequence we switched to _M_ignoring() mode, while storing a sequence
with more code units but smaller field width than _M_maxwidth.
We now call _M_force_update() if the predicted width is greater than or equal
to either _M_padwidth or _M_maxwidth.
This happened for an existing test case on 32-bit architectures.
PR libstdc++/109162
libstdc++-v3/ChangeLog:
* include/std/format (_Padding_sink::_M_update): Fixed condition for
calling _M_force_update.
* testsuite/std/format/debug.cc: Add test that reproduces this issue
on 64bit architecture.
* testsuite/std/format/ranges/sequence.cc: Another edge value test.
As noticed by Martin Jambor, I introduced a bug while simplifying
cs_interesting_for_ipcp_p and reversed the condition for
flag_profile_partial_training. I also noticed that we probably want to
consider calls with uninitialized counts for cloning, so that the pass does
something with -fno-guess-branch-probability, even though this is probably not
very useful in practice.
gcc/ChangeLog:
* ipa-cp.cc (cs_interesting_for_ipcp_p): Fix handling of uninitialized
counts and 0 IPA cost wrt flag_profile_partial_training.
d: Use __builtin_clear_padding for zeroing alignment holes after set
In an earlier change, a wrapper function was added to set
CONSTRUCTOR_ZERO_PADDING_BITS on all CONSTRUCTOR nodes. This removes all
the old generated calls to built-in memset and memcpy as zero padding is
now taken care of by the middle-end.
The remaining constructors that weren't getting zero padded were
ARRAY_TYPEs, so now `__builtin_clear_padding' is used to fill in all
alignment holes in constructed array literals where required.
PR d/103044
gcc/d/ChangeLog:
* d-tree.h (build_clear_padding_call): New prototype.
* d-codegen.cc (build_clear_padding_call): New function.
(build_memset_call): Remove generated call to __builtin_memcpy.
(build_address): Replace generated call to __builtin_memset with
__builtin_clear_padding.
(build_array_from_exprs): Likewise.
* expr.cc (ExprVisitor::visit (AssignExp *)): Remove generated call to
__builtin_memset.
(ExprVisitor::visit (ArrayLiteralExp *)): Likewise. Insert call to
__builtin_clear_padding after copying array into GC memory.
(ExprVisitor::visit (StructLiteralExp *)): Remove generated call to
__builtin_memset.
* toir.cc (IRVisitor::visit (ReturnStatement *)): Likewise.
Jonathan Wakely [Wed, 18 Dec 2024 18:31:16 +0000 (18:31 +0000)]
libstdc++: Use constexpr-if to slightly simplify <regex>
This will hardly make a dent in the very slow compile times for <regex>
but it seems worth doing anyway.
libstdc++-v3/ChangeLog:
* include/bits/regex_compiler.h: Replace _GLIBCXX17_CONSTEXPR
with constexpr and disable diagnostics with pragmas.
(_AnyMatcher::operator()): Use constexpr-if instead of tag
dispatching. Postpone calls to _M_translate until after checking
result of earlier calls.
(_AnyMatcher::_M_apply): Remove both overloads.
(_BracketMatcher::operator(), _BracketMatcher::_M_ready):
Replace tag dispatching with 'if constexpr'.
(_BracketMatcher::_M_apply(_CharT, true_type)): Remove.
(_BracketMatcher::_M_apply(_CharT, false_type)): Remove second
parameter.
(_BracketMatcher::_M_make_cache): Remove both overloads.
* include/bits/regex_compiler.tcc (_BracketMatcher::_M_apply):
Remove second parameter.
* include/bits/regex_executor.tcc: Replace _GLIBCXX17_CONSTEXPR
with constexpr and disable diagnostics with pragmas.
(_Executor::_M_handle_backref): Replace __glibcxx_assert with
static_assert.
(_Executor::_M_handle_accept): Mark _S_opcode_backref case as
unreachable for non-DFS mode and do not instantiate
_M_handle_backref for that case.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Pengfei Li [Tue, 29 Apr 2025 18:14:42 +0000 (19:14 +0100)]
simplify-rtx: Combine bitwise operations in more cases
This patch transforms RTL expressions of the form (subreg (not X)) into
(not (subreg X)) if the subreg is an operand of another binary logical
operation. This transformation can expose opportunities to combine more
logical operations.
For example, it improves the codegen of the following AArch64 NEON
intrinsics:
vandq_s64(vreinterpretq_s64_s32(vmvnq_s32(a)),
vreinterpretq_s64_s32(b));
from:
not v0.16b, v0.16b
and v0.16b, v0.16b, v1.16b
to:
bic v0.16b, v1.16b, v0.16b
Regression tested on x86_64-linux-gnu, arm-linux-gnueabihf and
aarch64-linux-gnu.
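As a rough illustration on plain integers (an analogy only: truncation to
32 bits stands in for a lowpart subreg, and the function names below are
hypothetical, not part of the patch), inverting before or after narrowing
gives the same value, which is what allows the NOT to be moved next to the
AND:

```cpp
#include <cassert>
#include <cstdint>

// Analogue of (and (subreg (not X)) Y): invert in the wide type, then narrow.
inline std::uint32_t not_then_narrow_and(std::uint64_t x, std::uint32_t y)
{
  return static_cast<std::uint32_t>(~x) & y;
}

// Analogue of (and (not (subreg X)) Y): narrow first, then invert.
// This is the shape that a single and-not ("bic") instruction can match.
inline std::uint32_t narrow_then_not_and(std::uint64_t x, std::uint32_t y)
{
  return ~static_cast<std::uint32_t>(x) & y;
}
```

Both functions should agree for every input, since bitwise NOT acts on each
bit independently of the bits that narrowing discards.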
gcc/ChangeLog:
* simplify-rtx.cc (non_paradoxical_subreg_not_p): New function
for pattern match of (subreg (not X)).
(simplify_with_subreg_not): New function for simplification.
i386: Disable string insn from non-default AS for Pmode != word_mode [PR111657]
0x67 prefix is applied before segment register. That is in
rep movsq %gs:(%esi), (%edi)
the address is %gs + %esi. In case Pmode != word_mode (x32 with a default
-maddress-mode=short) instructions should not allow segment override prefixes.
Also, remove explicit addr32 prefix from asm templates because address
mode can be determined from explicit instruction operands. Also note that
Pmode != word_mode only with TARGET_64BIT, so the check in ix86_print_operand
is not needed.
PR target/111657
gcc/ChangeLog:
* config/i386/i386-expand.cc (alg_usable_p): For Pmode != word_mode
reject rep_prefix_{1,4,8}_byte algorithms with src_as in the
non-default address space.
* config/i386/i386-protos.h (ix86_check_movs): New prototype.
* config/i386/i386.cc (ix86_check_movs): New function.
(ix86_print_operand) [case '^']: Remove excess check for TARGET_64BIT.
* config/i386/i386.md (strmov): For Pmode != word_mode expand with
gen_strmov_single only when operands[3] (source) is in the default
address space.
(*strmovdi_rex_1): Use ix86_check_movs. Remove %^ from asm template.
(*strmovsi_1): Ditto.
(*strmovhi_1): Ditto.
(*strmovqi_1): Ditto.
(*rep_movdi_rex64): Ditto.
(*rep_movsi): Ditto.
(*rep_movqi): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr111657-1.c: Check that segment override is not
generated for "rep movsq" for x32 target.
Barnabás Pőcze [Mon, 11 Mar 2024 23:35:50 +0000 (23:35 +0000)]
libstdc++: Optimize removal from unique assoc containers [PR112934]
Previously, calling erase(key) on both std::map and std::set
would execute the same code that std::multi{map,set} would.
However, doing that is unnecessary because std::{map,set}
guarantee that all elements are unique.
It is reasonable to expect that erase(key) is equivalent to
or better than:
auto it = m.find(key);
if (it != m.end())
m.erase(it);
However, this was not the case. Fix that by adding a new
function _Rb_tree<>::_M_erase_unique() that is essentially
equivalent to the above snippet, and use this from both
std::map and std::set.
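The find-then-erase idiom from the snippet above can be written as a small
user-level helper. This is only a sketch: erase_unique is a hypothetical
name, and it is not the _Rb_tree<>::_M_erase_unique() implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical helper: erase by key from a container with unique keys.
// Returns the number of elements removed (0 or 1), like erase(key).
template<typename Map, typename Key>
std::size_t erase_unique(Map& m, const Key& key)
{
  auto it = m.find(key);   // at most one element can match
  if (it == m.end())
    return 0;
  m.erase(it);             // erase by iterator, no range lookup needed
  return 1;
}
```

For a std::map or std::set the helper should return the same count as
m.erase(key), because at most one element can compare equivalent to the key.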
RISC-V: Fix register move cost for SIBCALL_REGS/JALR_REGS
SIBCALL_REGS/JALR_REGS are also subsets of GR_REGS and need to
be taken into account in riscv_register_move_cost, otherwise it
will get an incorrect cost.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_register_move_cost): Use
reg_class_subset_p to check the reg class.
Richard Biener [Tue, 29 Apr 2025 09:06:36 +0000 (11:06 +0200)]
tree-optimization/119997 - &ptr->field no longer subject to PRE
The following makes PRE handle &ptr->field the same as VN by
treating it as a POINTER_PLUS_EXPR when possible and thus as
'nary'. To facilitate this the patch splits out vn_pp_nary_for_addr
and adds const overloads for vec::last. The patch also avoids
handling an effective zero offset as POINTER_PLUS_EXPR.
PR tree-optimization/119997
* vec.h (vec<T, A, vl_embed>::last): Provide const overload.
(vec<T, va_heap, vl_ptr>::last): Likewise.
* tree-ssa-sccvn.h (vn_pp_nary_for_addr): Declare.
* tree-ssa-sccvn.cc (vn_pp_nary_for_addr): Split out from ...
(vn_reference_lookup): ... here.
(vn_reference_insert): ... and duplicate here. Do not handle
zero offset as POINTER_PLUS_EXPR.
* tree-ssa-pre.cc (compute_avail): Implement
ADDR_EXPR-as-POINTER_PLUS_EXPR special casing.
Jonathan Wakely [Mon, 28 Apr 2025 16:34:58 +0000 (17:34 +0100)]
libstdc++: Use constexpr-if for C++11 and C++14
Replace remaining uses of _GLIBCXX17_CONSTEXPR for constexpr-if, so that
we always use constexpr-if in C++11 and C++14. Diagnostic pragmas are
used to suppress diagnostics.
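A minimal sketch of the pattern in user code, assuming a compiler that
accepts 'if constexpr' as an extension before C++17 and recognizes the
-Wc++17-extensions warning name (recent GCC and Clang do). The magnitude
function is a hypothetical example, not libstdc++ code:

```cpp
#include <cassert>
#include <string>
#include <type_traits>

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wc++17-extensions" // 'if constexpr' before C++17

// Hypothetical example: one function body replaces two tag-dispatched overloads.
template<typename T>
long magnitude(const T& t)
{
  if constexpr (std::is_arithmetic<T>::value)
    return static_cast<long>(t);
  else
    return static_cast<long>(t.size()); // only instantiated for non-arithmetic T
}

#pragma GCC diagnostic pop
```

The discarded branch is not instantiated, so magnitude(3) compiles even
though int has no size() member.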
Jonathan Wakely [Mon, 28 Apr 2025 13:51:57 +0000 (14:51 +0100)]
libstdc++: Use constexpr-if in std::function for C++11 and C++14
This allows removing the _Target_handler class template, because it's no
longer needed to prevent instantiating invalid specializations of
_Function_handler.
libstdc++-v3/ChangeLog:
* include/bits/std_function.h (_Target_handler): Remove.
(function::target): Use constexpr-if for C++11 and
C++14, with diagnostic pragmas to suppress warnings.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jonathan Wakely [Tue, 10 Dec 2024 20:40:42 +0000 (20:40 +0000)]
libstdc++: Use constexpr-if to simplify std::vector relocation
Simplify std::vector's use of std::__relocate_a by using 'if constexpr'
even in C++11 and C++14, with diagnostic pragmas to disable warnings.
This allows us to call std::__relocate_a directly, instead of via
_S_relocate and tag dispatching.
Preserve _S_relocate so that explicit instantiations still get it, but
make it a no-op when _S_use_relocate() is false, so that we don't
instantiate __relocate_a if it isn't needed.
libstdc++-v3/ChangeLog:
* include/bits/stl_vector.h (_S_do_relocate): Remove.
(_S_relocate): Remove tag dispatching path.
* include/bits/vector.tcc (reserve, _M_realloc_insert)
(_M_realloc_append, _M_default_append): Add diagnostic pragmas
and use 'if constexpr' in C++11 and C++14. Call
std::__relocate_a directly instead of _S_relocate.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jonathan Wakely [Mon, 28 Apr 2025 13:31:04 +0000 (14:31 +0100)]
libstdc++: Fix allocator propagation for rvalue+rvalue string concatenation
I made a last-minute change to Nina's r10-200-gf4e678ef74b272
implementation of P1165R1 (consistent allocator propagation for
operator+ on strings), so that the rvalue+rvalue case assumes that COW
strings do not support stateful allocators. I don't think that was true
when the change went in, and isn't true now. COW strings don't support
allocator propagation on assignment and swap, but they do support
non-equal stateful allocators, which are correctly propagated on move
construction.
This removes the preprocessor conditional in the rvalue+rvalue overload
so that COW strings are handled equivalently. Also use constexpr-if
unconditionally, disabling diagnostics with pragmas.
libstdc++-v3/ChangeLog:
* include/bits/basic_string.h (operator+(string&&, string&&)):
Do not assume that COW strings have equal allocators. Use
constexpr-if unconditionally.
* testsuite/21_strings/basic_string/allocator/char/operator_plus.cc:
Remove cxx11_abi effective-target check.
* testsuite/21_strings/basic_string/allocator/wchar_t/operator_plus.cc:
Likewise.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
to:
movl $m, %esi
movl $30, %ecx
rep movsq %gs:(%rsi), (%rdi)
ret
PR target/111657
gcc/ChangeLog:
* config/i386/i386-expand.cc (alg_usable_p): Remove have_as bool
argument and add dst_as and src_as address space arguments. Reject
libcall algorithm with dst_as and src_as in the non-default address
spaces. Reject rep_prefix_{1,4,8}_byte algorithms with dst_as in
the non-default address space.
(decide_alg): Remove have_as bool argument and add dst_as and src_as
address space arguments. Update calls to alg_usable_p.
(ix86_expand_set_or_cpymem): Update call to decide_alg.
* config/i386/i386.md (strmov): Do not fail if operand[3] (source)
is in the non-default address space. Expand with gen_strmov_singleop
only when operand[1] (destination) is in the default address space.
(*strmovdi_rex_1): Determine memory operands from insn pattern.
Allow only when destination is in the default address space.
Rewrite asm template to use explicit operands.
(*strmovsi_1): Ditto.
(*strmovhi_1): Ditto.
(*strmovqi_1): Ditto.
(*rep_movdi_rex64): Ditto.
(*rep_movsi): Ditto.
(*rep_movqi): Ditto.
(*strsetdi_rex_1): Determine memory operands from insn pattern.
Allow only when destination is in the default address space.
(*strsetsi_1): Ditto.
(*strsethi_1): Ditto.
(*strsetqi_1): Ditto.
(*rep_stosdi_rex64): Ditto.
(*rep_stossi): Ditto.
(*rep_stosqi): Ditto.
hongtao.liu [Wed, 22 Jan 2025 06:44:01 +0000 (07:44 +0100)]
Annotate empty bb with all debug_stmt with location of phi in the single_succ.
An empty BB containing only debug stmts is ignored by
afdo_set_bb_count, but its count can be set from the PHIs in its
single successor whose incoming edge comes from that BB.
gcc/ChangeLog:
PR gcov-profile/118581
* auto-profile.cc (autofdo_source_profile::get_count_info):
Overload the function with parameter gimple location instead
of stmt.
(afdo_set_bb_count): For !has_annotated BB, Check single
successors PHIs corresponding to the block and use those
count.
Richard Biener [Mon, 28 Apr 2025 11:31:16 +0000 (13:31 +0200)]
debug/78685 - reword -Og documentation
The following rewords the documentation for -Og which over-promises
the ability to debug the generated code. While -Og enables
var-tracking and thus improves debugging in some areas the experience
is usually worse than -O0 for standard C code.
H.J. Lu [Fri, 29 Nov 2024 10:22:14 +0000 (18:22 +0800)]
x86: Add a pass to remove redundant all 0s/1s vector load
For all different modes of all 0s/1s vectors, we can use the single widest
all 0s/1s vector register for all 0s/1s vector uses in the whole function.
Add a pass to generate a single widest all 0s/1s vector set instruction at
entry of the nearest common dominator for basic blocks with all 0s/1s
vector uses. On Linux/x86-64, in cc1plus, this patch reduces the number
of vector xor instructions from 4803 to 4714 and pcmpeq instructions from
144 to 142.
NB: PR target/92080 and PR target/117839 aren't the same. PR target/117839
is for vectors of all 0s and all 1s with different sizes and different
components. PR target/92080 is for broadcast of the same component to
different vector sizes. This patch covers only all 0s and all 1s cases
of PR target/92080.
gcc/
PR target/92080
PR target/117839
* config/i386/i386-features.cc (ix86_place_single_vector_set):
New function.
(remove_partial_avx_dependency): Use it.
(ix86_get_vector_load_mode): New function.
(replace_vector_const): Likewise.
(remove_redundant_vector_load): Likewise.
(pass_data_remove_redundant_vector_load): Likewise.
(pass_remove_redundant_vector_load): Likewise.
(make_pass_remove_redundant_vector_load): Likewise.
* config/i386/i386-passes.def: Add
pass_remove_redundant_vector_load after
pass_remove_partial_avx_dependency.
* config/i386/i386-protos.h
(make_pass_remove_redundant_vector_load): New.
* config/i386/i386.cc (ix86_modes_tieable_p): Return true for
narrower non-scalar-integer modes in SSE registers.
Drop targetm.promote_prototypes from C, C++ and Ada frontends
expand_normal now gets
<integer_cst 0x7fffe9824018 type <integer_type 0x7fffe9822348 unsigned char > constant 255>
and returns
(const_int -1 [0xffffffffffffffff])
which doesn't work with the predicates nor the instruction templates which
expect the unsigned expanded value. Extract the unsigned char and short
integer constants to return
(const_int 255 [0xff])
so that the expanded value is always unsigned, without the C frontend
promotion.
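The difference between the two expanded values can be illustrated with
ordinary integer conversions. This is only an analogy with hypothetical
helper names; it does not model expand_normal or RTL constants:

```cpp
#include <cassert>
#include <cstdint>

// Sign-extending view: the 8-bit pattern 0xff read as a signed value.
// (The conversion to int8_t is implementation-defined before C++20,
// but yields -1 on two's complement targets.)
inline long long as_sign_extended(std::uint8_t v)
{
  return static_cast<std::int8_t>(v);
}

// Zero-extending view: keep only the low 8 bits, giving an unsigned value.
inline long long as_zero_extended(long long v)
{
  return v & 0xff;
}
```

Here as_sign_extended(255) corresponds to the (const_int -1) form, and
masking it with as_zero_extended recovers the (const_int 255) form.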
PR target/117547
* config/i386/i386-expand.cc (ix86_expand_unsigned_small_int_cst_argument):
New function.
(ix86_expand_args_builtin): Call
ix86_expand_unsigned_small_int_cst_argument to expand the argument
before calling fixup_modeless_constant.
(ix86_expand_round_builtin): Likewise.
(ix86_expand_special_args_builtin): Likewise.
(ix86_expand_builtin): Likewise.
libstdc++: centralize and improve testing of shared_ptr/weak_ptr conversions
Since the conversions are under the same constraints, centralize the
test in one file instead of two, testing both smart pointer classes, to
ease future maintenance. This is used right away: more tests are added.
Amends r15-8048-gdf0e6509bf7442.
libstdc++-v3/ChangeLog:
* testsuite/20_util/shared_ptr/requirements/1.cc: Test both
shared_ptr and weak_ptr.
Add more tests.
* testsuite/20_util/weak_ptr/requirements/1.cc: Removed as
superseded by the other test.
Signed-off-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
David Malcolm [Mon, 28 Apr 2025 22:21:25 +0000 (18:21 -0400)]
analyzer: handle NRVO and DECL_BY_REFERENCE [PR111536]
The analyzer was issuing false warnings about uninitialized variables
in C++ in places where NRVO was marking DECL_RESULT with
DECL_BY_REFERENCE.
Fixed thusly.
gcc/analyzer/ChangeLog:
PR analyzer/111536
* engine.cc (maybe_update_for_edge): Update for new call_stmt
param to region_model::push_frame.
* program-state.cc (program_state::push_frame): Likewise.
* region-model.cc (region_model::update_for_gcall): Likewise.
(region_model::push_frame): Add "call_stmt" param.
Handle DECL_RESULT with DECL_BY_REFERENCE set on it by stashing
the region of the lhs of the call_stmt in the caller frame,
and writing a reference to it within the "result" in the callee
frame.
(region_model::pop_frame): Don't write back to the LHS for
DECL_BY_REFERENCE results.
(selftest::test_stack_frames): Update for new call_stmt param to
region_model::push_frame.
(selftest::test_get_representative_path_var): Likewise.
(selftest::test_state_merging): Likewise.
(selftest::test_alloca): Likewise.
* region-model.h (region_model::push_frame): Add "call_stmt"
param.
* region.cc: Include "tree-ssa.h".
(region::can_have_initial_svalue_p): Use ssa_defined_default_def_p
for ssa names, rather than special-casing it for just parameters.
This should now also cover DECL_RESULT with DECL_BY_REFERENCE and
hard registers.
* sm-signal.cc (update_model_for_signal_handler): Update for new
call_stmt param to region_model::push_frame.
* state-purge.cc (state_purge_per_decl::process_worklists):
Likewise.
gcc/testsuite/ChangeLog:
PR analyzer/111536
* c-c++-common/analyzer/hard-reg-1.c: New test.
* g++.dg/analyzer/nrvo-1.C: New test.
* g++.dg/analyzer/nrvo-2.C: New test.
* g++.dg/analyzer/nrvo-pr111536-1.C: New test.
* g++.dg/analyzer/nrvo-pr111536-1b.C: New test.
* g++.dg/analyzer/nrvo-pr111536-2.C: New test.
* g++.dg/analyzer/nrvo-pr111536-2b.C: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Mon, 28 Apr 2025 22:21:24 +0000 (18:21 -0400)]
analyzer: fix null deref false negative on std::unique_ptr [PR109366]
gcc/analyzer/ChangeLog:
PR analyzer/109366
* region-model-manager.cc
(region_model_manager::maybe_fold_sub_svalue): Sub-values of zero
constants are zero.
gcc/testsuite/ChangeLog:
PR analyzer/109366
* g++.dg/analyzer/unique_ptr-1.C: New test.
* g++.dg/analyzer/unique_ptr-2.C: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Mon, 28 Apr 2025 22:21:24 +0000 (18:21 -0400)]
analyzer: initial implementation of exception handling [PR97111]
This patch adds initial support for exception-handling to -fanalyzer,
handling eh_dispatch for regions of type ERT_TRY and
ERT_ALLOWED_EXCEPTIONS. I haven't yet seen eh_dispatch for
regions of type ERT_CLEANUP and ERT_MUST_NOT_THROW in the analyzer; with
this patch it will ICE if it sees those.
Additionally, this patch only checks for exact matches of exception
types, rather than supporting subclasses and references. I'm deferring
fixing this for now whilst figuring out how best to interact with the C++
type system; I'm tracking it as PR analyzer/119697.
The patch adds event classes for throwing and catching exceptions, and
seems to generate readable warnings for the kinds of leak that might
occur due to trying to manage resources manually and forgetting about
exceptions; for example:
exception-leak-1.C: In function ‘int test()’:
exception-leak-1.C:7:9: warning: leak of ‘ptr’ [CWE-401] [-Wanalyzer-malloc-leak]
7 | throw 42;
| ^~
‘int test()’: events 1-3
5 | void *ptr = __builtin_malloc (1024);
| ~~~~~~~~~~~~~~~~~^~~~~~
| |
| (1) allocated here
6 |
7 | throw 42;
| ~~
| |
| (2) throwing exception of type ‘int’ here...
| (3) ⚠️ ‘ptr’ leaks here; was allocated at (1)
Although dynamic exception specifications are only available in C++14
and earlier, the need to support them meant it seemed relatively easy to
add a warning to check them, hence the patch adds a new warning
for code paths that throw an exception that doesn't match a dynamic
exception specification: -Wanalyzer-throw-of-unexpected-type.
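The kind of code behind the leak diagnostic shown above can be sketched as
follows. The function names and the free_deleter type are hypothetical, and
std::malloc is used here in place of __builtin_malloc; the RAII variant is
one possible fix, not something the patch prescribes:

```cpp
#include <cassert>
#include <cstdlib>
#include <memory>

// Leaky shape, modelled on the diagnostic: the throw skips any cleanup.
int leaky()
{
  void *ptr = std::malloc(1024);
  (void) ptr;
  throw 42;            // 'ptr' is never freed on this path
}

// Deleter that releases memory obtained from std::malloc.
struct free_deleter
{
  void operator()(void *p) const { std::free(p); }
};

// RAII shape: stack unwinding runs the deleter, so nothing leaks.
int safe()
{
  std::unique_ptr<void, free_deleter> ptr(std::malloc(1024));
  throw 42;            // 'ptr' is freed during unwinding
}
```

Calling safe() inside a try block and catching int should leave no
allocation outstanding, whereas leaky() loses its buffer on every call.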
gcc/analyzer/ChangeLog:
PR analyzer/97111
* analyzer.cc (is_cxa_throw_p): New.
(is_cxa_rethrow_p): New.
* analyzer.opt (Wanalyzer-throw-of-unexpected-type): New.
* analyzer.opt.urls: Regenerate.
* call-info.cc (custom_edge_info::create_enode): New.
* call-info.h (call_info::print): Drop "final".
(call_info::add_events_to_path): Likewise.
* checker-event.cc (event_kind_to_string): Add cases for
event_kind::catch_, event_kind::throw_, and event_kind::unwind.
(explicit_throw_event::print_desc): New.
(throw_from_call_to_external_fn_event::print_desc): New.
(unwind_event::print_desc): New.
* checker-event.h (enum class event_kind): Add catch_, throw_,
and unwind.
(class catch_cfg_edge_event): New.
(class throw_event): New.
(class explicit_throw_event): New.
(class throw_from_call_to_external_fn_event): New.
(class unwind_event): New.
* common.h (class eh_dispatch_cfg_superedge): New forward decl.
(class eh_dispatch_try_cfg_superedge): New forward decl.
(class eh_dispatch_allowed_cfg_superedge): New forward decl.
(custom_edge_info::create_enode): New vfunc decl.
(is_cxa_throw_p): New decl.
(is_cxa_rethrow_p): New decl.
* diagnostic-manager.cc
(diagnostic_manager::add_events_for_superedge): Special-case edges
for eh_dispatch_try.
(diagnostic_manager::prune_path): Call consolidate_unwind_events.
(diagnostic_manager::prune_for_sm_diagnostic): Don't filter the new
event_kinds.
(diagnostic_manager::consolidate_unwind_events): New.
* diagnostic-manager.h
(diagnostic_manager::consolidate_unwind_events): New decl.
* engine.cc (exploded_node::on_stmt_pre): Handle "__cxa_throw",
"__cxa_rethrow", and resx statements.
(class throw_custom_edge): New.
(class unwind_custom_edge): New.
(get_eh_outedge): New.
(exploded_graph::unwind_from_exception): New.
(exploded_node::on_throw): New.
(exploded_node::on_resx): New.
(exploded_graph::get_or_create_node): Add "add_to_worklist" param
and use it.
(exploded_graph::process_node): Use edge_info's create_enode vfunc
to create enodes, rather than calling get_or_create_node directly.
Ignore CFG edges in the sgraph flagged with EH whilst we're
exploring the egraph.
(exploded_graph_annotator::print_enode): Handle case
exploded_node::status::special.
* exploded-graph.h (exploded_node::status): Add value "special".
(exploded_node::on_throw): New decl.
(exploded_node::on_resx): New decl.
(exploded_graph::get_or_create_node): Add optional
"add_to_worklist" param.
(exploded_graph::unwind_from_exception): New decl.
* kf-lang-cp.cc (class kf_cxa_allocate_exception): New.
(class kf_cxa_begin_catch): New.
(class kf_cxa_end_catch): New.
(class throw_of_unexpected_type): New.
(class kf_cxa_call_unexpected): New.
(register_known_functions_lang_cp): Register known functions
"__cxa_allocate_exception", "__cxa_begin_catch",
"__cxa_end_catch", and "__cxa_call_unexpected".
* kf.cc (class kf_eh_pointer): New.
(register_known_functions): Register it for BUILT_IN_EH_POINTER.
* region-model.cc: Include "analyzer/function-set.h".
(exception_node::operator==): New.
(exception_node::dump_to_pp): New.
(exception_node::dump): New.
(exception_node::to_json): New.
(exception_node::make_dump_widget): New.
(exception_node::maybe_get_type): New.
(exception_node::add_to_reachable_regions): New.
(region_model::region_model): Initialize
m_thrown_exceptions_stack and m_caught_exceptions_stack.
(region_model::operator=): Likewise.
(region_model::operator==): Compare them.
(region_model::dump_to_pp): Dump exception stacks.
(region_model::to_json): Add exception stacks.
(region_model::make_dump_widget): Likewise.
(class exception_thrown_from_unrecognized_call): New.
(get_fns_assumed_not_to_throw): New.
(can_throw_p): New.
(region_model::check_for_throw_inside_call): New.
(region_model::on_call_pre): Call check_for_throw_inside_call
on unknown fns or those we don't have a body for.
(region_model::maybe_update_for_edge): Handle eh_dispatch_stmt
statements. Drop old code that called
apply_constraints_for_exception on EDGE_EH edges.
(class rejected_eh_dispatch): New.
(exception_matches_type_p): New.
(matches_any_exception_type_p): New.
(region_model::apply_constraints_for_eh_dispatch): New.
(region_model::apply_constraints_for_eh_dispatch_try): New.
(region_model::apply_constraints_for_eh_dispatch_allowed): New.
(region_model::apply_constraints_for_exception): Delete.
(region_model::can_merge_with_p): Don't merge models with
non-equal exception stacks.
(region_model::get_referenced_base_regions): Add regions from
exception stacks.
* region-model.h (struct exception_node): New.
(region_model::push_thrown_exception): New.
(region_model::get_current_thrown_exception): New.
(region_model::pop_thrown_exception): New.
(region_model::push_caught_exception): New.
(region_model::get_current_caught_exception): New.
(region_model::pop_caught_exception): New.
(region_model::apply_constraints_for_eh_dispatch_try): New decl.
(region_model::apply_constraints_for_eh_dispatch_allowed): New decl.
(region_model::apply_constraints_for_exception): Delete.
(region_model::apply_constraints_for_eh_dispatch): New decl.
(region_model::check_for_throw_inside_call): New decl.
(region_model::m_thrown_exceptions_stack): New field.
(region_model::m_caught_exceptions_stack): New field.
* supergraph.cc: Include "except.h" and "analyzer/region-model.h".
(supergraph::add_cfg_edge): Special-case eh_dispatch edges.
(superedge::get_description): Use default_tree_printer.
(get_catch): New.
(eh_dispatch_cfg_superedge::make): New.
(eh_dispatch_cfg_superedge::eh_dispatch_cfg_superedge): New.
(eh_dispatch_cfg_superedge::get_eh_status): New.
(eh_dispatch_try_cfg_superedge::dump_label_to_pp): New.
(eh_dispatch_try_cfg_superedge::apply_constraints): New.
(eh_dispatch_allowed_cfg_superedge::eh_dispatch_allowed_cfg_superedge):
New.
(eh_dispatch_allowed_cfg_superedge::dump_label_to_pp): New.
(eh_dispatch_allowed_cfg_superedge::apply_constraints): New.
* supergraph.h: Include "except.h".
(superedge::dyn_cast_eh_dispatch_cfg_superedge): New vfunc.
(superedge::dyn_cast_eh_dispatch_try_cfg_superedge): New vfunc.
(superedge::dyn_cast_eh_dispatch_allowed_cfg_superedge): New
vfunc.
(class eh_dispatch_cfg_superedge): New.
(is_a_helper <const eh_dispatch_cfg_superedge *>::test): New.
(class eh_dispatch_try_cfg_superedge): New.
(is_a_helper <const eh_dispatch_try_cfg_superedge *>::test): New.
(class eh_dispatch_allowed_cfg_superedge): New.
(is_a_helper <const eh_dispatch_allowed_cfg_superedge *>::test):
New.
* svalue.cc (svalue::maybe_get_type_from_typeinfo): New.
* svalue.h (svalue::maybe_get_type_from_typeinfo): New decl.
David Malcolm [Mon, 28 Apr 2025 22:21:23 +0000 (18:21 -0400)]
analyzer,c++: add placeholder implementation of ana::translation_unit for C++
Implement ana::translation_unit for the C++ frontend with a
no-op placeholder implementation, for now.
No functional change intended; a follow-up may implement
things further.
gcc/cp/ChangeLog:
* parser.cc: Include "analyzer/analyzer-language.h".
(ana::cp_translation_unit): New class.
(cp_parser_translation_unit): Add call to
ana::on_finish_translation_unit.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
gcc/c-family/ChangeLog:
* c-pretty-print.cc: Drop include of "make-unique.h".
Replace uses of ::make_unique with std::make_unique.
gcc/c/ChangeLog:
* c-decl.cc: Drop include of "make-unique.h".
Replace uses of ::make_unique with std::make_unique.
* c-objc-common.cc: Likewise.
* c-parser.cc: Likewise.
gcc/cp/ChangeLog:
* cxx-pretty-print.cc: Drop include of "make-unique.h".
Replace uses of ::make_unique with std::make_unique.
* error.cc: Likewise.
* name-lookup.cc: Likewise.
* parser.cc: Likewise.
gcc/ChangeLog:
* diagnostic-format-json.cc: Drop include of "make-unique.h".
Replace uses of ::make_unique with std::make_unique.
* diagnostic-format-sarif.cc: Likewise.
* diagnostic-format-text.cc: Likewise.
* diagnostic.cc: Likewise.
* dumpfile.cc: Likewise.
* gcc-attribute-urlifier.cc: Likewise.
* gcc-urlifier.cc: Likewise.
* json-parsing.cc: Likewise.
* json.cc: Likewise.
* lazy-diagnostic-path.cc: Likewise.
* libgdiagnostics.cc: Likewise.
* libsarifreplay.cc: Likewise.
* lto-wrapper.cc: Likewise.
* make-unique.h: Delete.
* opts-diagnostic.cc: Drop include of "make-unique.h".
Replace uses of ::make_unique with std::make_unique.
* pretty-print.cc: Likewise.
* text-art/style.cc: Likewise.
* text-art/styled-string.cc: Likewise.
* text-art/table.cc: Likewise.
* text-art/tree-widget.cc: Likewise.
* text-art/widget.cc: Likewise.
* timevar.cc: Likewise.
* toplev.cc: Likewise.
* tree-diagnostic-client-data-hooks.cc: Likewise.
gcc/jit/ChangeLog:
* dummy-frontend.cc: Drop include of "make-unique.h".
Replace uses of ::make_unique with std::make_unique.
gcc/testsuite/ChangeLog:
* gcc.dg/plugin/analyzer_cpython_plugin.cc: Drop include of
"make-unique.h". Replace uses of ::make_unique with
std::make_unique.
* gcc.dg/plugin/analyzer_gil_plugin.cc: Likewise.
* gcc.dg/plugin/analyzer_kernel_plugin.cc: Likewise.
* gcc.dg/plugin/analyzer_known_fns_plugin.cc: Likewise.
* gcc.dg/plugin/diagnostic_group_plugin.cc: Likewise.
* gcc.dg/plugin/diagnostic_plugin_xhtml_format.cc: Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
David Malcolm [Mon, 28 Apr 2025 22:21:22 +0000 (18:21 -0400)]
analyzer: use unique_ptr for state_machine instances
gcc/analyzer/ChangeLog:
* engine.cc (class plugin_analyzer_init_impl): Convert
"m_checkers" to use std::vector of std::unique_ptr. Convert
"m_known_fn_mgr" to a reference.
(impl_run_checkers): Convert "checkers" to use std::vector of
std::unique_ptr and move it into the extrinsic_state.
* program-state.cc (extrinsic_state::dump_to_pp): Update for
changes to m_checkers.
(extrinsic_state::to_json): Likewise.
(extrinsic_state::get_sm_idx_by_name): Likewise.
(selftest::test_sm_state_map): Update to use std::unique_ptr
for state machines.
(selftest::test_program_state_1): Likewise.
(selftest::test_program_state_2): Likewise.
(selftest::test_program_state_merging): Likewise.
(selftest::test_program_state_merging_2): Likewise.
* program-state.h (class extrinsic_state): Convert "m_checkers" to
use std::vector of std::unique_ptr and to be owned by this object,
rather than a reference. Add ctor for use in selftests.
* sm-fd.cc (make_fd_state_machine): Update to use std::unique_ptr.
* sm-file.cc (make_fileptr_state_machine): Likewise.
* sm-malloc.cc (make_malloc_state_machine): Likewise.
* sm-pattern-test.cc (make_pattern_test_state_machine): Likewise.
* sm-sensitive.cc (make_sensitive_state_machine): Likewise.
* sm-signal.cc (make_signal_state_machine): Likewise.
* sm-taint.cc (make_taint_state_machine): Likewise.
* sm.cc: Define INCLUDE_LIST.
(make_checkers): Return the vector directly, rather than pass it
in by reference. Update to use std::unique_ptr throughout. Use
an intermediate list, and use that to filter with
flag_analyzer_checker, fixing memory leak for this case.
* sm.h (make_checkers): Return the vector directly, rather than
pass it in by reference, and use std::vector of std::unique_ptr.
(make_malloc_state_machine): Convert return type to use std::unique_ptr.
(make_fileptr_state_machine): Likewise.
(make_taint_state_machine): Likewise.
(make_sensitive_state_machine): Likewise.
(make_signal_state_machine): Likewise.
(make_pattern_test_state_machine): Likewise.
(make_va_list_state_machine): Likewise.
(make_fd_state_machine): Likewise.
* varargs.cc (make_va_list_state_machine): Update to use
std::unique_ptr.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
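The ownership pattern described for make_checkers can be sketched roughly
as below. The checker struct, the checker names, and the string filter are
hypothetical stand-ins; the real code uses the analyzer's state_machine
class and flag_analyzer_checker:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a checker; not the analyzer's state_machine.
struct checker
{
  explicit checker(std::string n) : name(std::move(n)) {}
  std::string name;
};

// Return the owning vector by value; an empty 'only' means "keep everything".
std::vector<std::unique_ptr<checker>> make_checkers(const std::string& only)
{
  std::vector<std::unique_ptr<checker>> all;
  all.push_back(std::make_unique<checker>("malloc"));
  all.push_back(std::make_unique<checker>("file"));
  all.push_back(std::make_unique<checker>("taint"));

  if (only.empty())
    return all;

  std::vector<std::unique_ptr<checker>> filtered;
  for (auto& c : all)
    if (c->name == only)
      filtered.push_back(std::move(c));
  return filtered;  // checkers still owned by 'all' are destroyed here
}
```

Because every checker is owned by a std::unique_ptr from creation onwards,
the ones that are filtered out are released automatically instead of
leaking.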
David Malcolm [Mon, 28 Apr 2025 22:21:21 +0000 (18:21 -0400)]
analyzer: use analyzer/common.h as a common header
Our headers are a major pain to work with: many require certain other
headers to be included in a particular (undocumented) order in order
to be includable.
Simplify includes in the analyzer by renaming analyzer/analyzer.h to
analyzer/common.h and have it include all the common headers needed
throughout the analyzer, thus encapsulating the rules about e.g. being
able to include "gimple.h" in one place in the analyzer subdirectory.
Doing so also makes it easier to e.g. define INCLUDE_SET in one place,
rather than in many source files.