Richard Biener [Mon, 20 Nov 2023 10:12:43 +0000 (11:12 +0100)]
tree-optimization/112618 - unused .MASK_CALL
We have to make sure to remove unused .MASK_CALL internal function
calls after vectorization.
PR tree-optimization/112618
* tree-vect-loop.cc (vect_transform_loop_stmt): For not
relevant and unused .MASK_CALL make sure we remove the
scalar stmt.
Richard Biener [Mon, 20 Nov 2023 12:39:52 +0000 (13:39 +0100)]
tree-optimization/112281 - loop distribution and zero dependence distances
The following fixes an omission in dependence testing for loop
distribution. When the overall dependence distance is not zero but
the dependence direction in the innermost common loop is '=', there is
a conflict between the partitions and we have to merge them.
PR tree-optimization/112281
* tree-loop-distribution.cc
(loop_distribution::pg_add_dependence_edges): For = in the
innermost common loop record a partition conflict.
* gcc.dg/torture/pr112281-1.c: New testcase.
* gcc.dg/torture/pr112281-2.c: Likewise.
Richard Biener [Mon, 20 Nov 2023 10:29:59 +0000 (11:29 +0100)]
middle-end/112622 - convert and vector-to-float
The following avoids ICEing when trying to convert a vector to
a scalar float.
PR middle-end/112622
* convert.cc (convert_to_real_1): Use element_precision
where a vector type might appear. Provide specific
diagnostic for unexpected vector argument.
The root cause is that we don't enable V4SImode; instead, we already have RVVMF2SI, which is exactly the same as V4SI
on -march=rv32gcv_zvl256 + --param=riscv-autovec-preference=fixed-vlmax.
The VDEMODE attribute mapping to V4SI is incorrect, so remove the attributes and use get_vector_mode to get
the right mode.
Robin Dapp [Tue, 14 Nov 2023 13:11:09 +0000 (14:11 +0100)]
RISC-V: Disallow 64-bit indexed loads and stores for rv32gcv.
We currently allow 64-bit indices/offsets for vector indexed loads and
stores even on rv32 but we should not.
This patch adjusts the iterators as well as the insn conditions to
reflect the RVV spec.
It also fixes an oversight in the VLS modes of the demote iterator that
was found while testing the patch.
gcc/ChangeLog:
* config/riscv/riscv-v.cc (gather_scatter_valid_offset_mode_p):
Add check for XLEN == 32.
* config/riscv/vector-iterators.md: Change VLS part of the
demote iterator to 2x-element modes.
* config/riscv/vector.md: Adjust iterators and insn conditions.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-1.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-10.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-11.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-12.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-2.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-3.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-4.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-5.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-5.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-6.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-6.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-7.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-7.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-8.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-8.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-9.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_32-9.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-1.c:
Adjust include.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-9.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-1.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-1.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-10.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-10.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-11.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-2.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-2.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-3.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-3.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-4.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-4.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-5.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-5.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-6.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-6.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-7.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-7.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-8.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-8.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-9.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_32-9.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-1.c:
Adjust include.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-1.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-1.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-10.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-10.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-2.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-2.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-3.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-3.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-4.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-4.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-5.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-5.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-6.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-6.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-7.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-7.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-8.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-8.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-9.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_32-9.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-1.c:
Adjust include.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-9.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-1.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_32-1.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-10.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_32-10.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-3.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_32-2.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-4.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_32-4.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-5.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_32-5.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-6.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_32-6.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-7.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_32-7.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-8.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_32-8.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-9.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_32-9.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-2.c: Moved to...
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-2.c: ...here.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-1.c:
Adjust include.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-10.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-8.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-9.c: Ditto.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-1.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-10.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-11.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-12.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-2.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-3.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-4.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-5.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-6.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-7.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-8.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_64-9.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-1.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-10.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-11.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-2.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-3.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-4.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-5.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-6.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-7.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-8.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_64-9.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-1.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-10.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-2.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-3.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-4.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-5.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-6.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-7.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-8.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_64-9.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-1.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-10.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-3.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-4.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-5.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-6.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-7.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-8.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_64-9.c: New test.
Christophe Lyon [Wed, 15 Nov 2023 08:12:35 +0000 (08:12 +0000)]
arm: [MVE intrinsics] Add support for contiguous loads and stores
This patch adds base support for load/store intrinsics to the
framework, starting with loads and stores for contiguous memory
elements, without extension or truncation.
Compared to the aarch64/SVE implementation, there's no support for
gather/scatter loads/stores yet. This will be added later as needed.
Christophe Lyon [Wed, 15 Nov 2023 07:50:57 +0000 (07:50 +0000)]
arm: Fix arm_simd_types and MVE scalar_types
So far we define arm_simd_types and scalar_types using type
definitions like intSI_type_node, etc...
This is causing problems with later patches which re-implement
load/store MVE intrinsics, leading to error messages such as:
error: passing argument 1 of 'vst1q_s32' from incompatible pointer type
note: expected 'int *' but argument is of type 'int32_t *' {aka 'long int *'}
This patch uses get_typenode_from_name (INT32_TYPE) instead, which
defines the types as appropriate for the target/C library.
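As a hedged illustration (not from the patch itself), the failing pattern at the user level looks roughly like this, assuming an MVE-enabled arm target:
  #include <arm_mve.h>
  #include <stdint.h>
  void store_vec (int32_t *p, int32x4_t v)
  {
    vst1q_s32 (p, v);  /* previously rejected when int32_t was not plain 'int' */
  }
With the types defined via get_typenode_from_name, the intrinsic prototype matches the C library's int32_t.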
Juzhe-Zhong [Mon, 20 Nov 2023 10:39:10 +0000 (18:39 +0800)]
RISC-V Regression: Remove scalable compile option
Since we already enable scalable vectorization by default, this flag is redundant.
Also, we are starting full-coverage testing with different compile options,
e.g. --param=riscv-autovec-preference=fixed-vlmax.
To avoid compile-option confusion, remove it.
Jakub Jelinek [Mon, 20 Nov 2023 09:37:59 +0000 (10:37 +0100)]
c, c++: Add new value for vector types for __builtin_classify_type
While filing a clang request to return 18 for _BitInts from
__builtin_classify_type instead of the -1 they currently return, I've
noticed that we return -1 for vector types. Initially I wanted to change
the behavior just for the __builtin_classify_type (type) form, as that is new in
GCC 14 and we've returned -1 for 20+ years for __builtin_classify_type
on vector expressions, but I was convinced otherwise, so this changes
the behavior even for that case and now returns 19.
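A hedged example of the new behaviour (the vector typedef is just for illustration):
  typedef int v4si __attribute__ ((vector_size (16)));
  int expr_form (v4si v) { return __builtin_classify_type (v); }    /* now 19, was -1 */
  int type_form (void)   { return __builtin_classify_type (v4si); } /* type form, also 19 */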
Robin Dapp [Fri, 17 Nov 2023 09:34:35 +0000 (10:34 +0100)]
vect: Add bool pattern handling for COND_OPs.
In order to handle masks properly for conditional operations this patch
teaches vect_recog_mask_conversion_pattern to also handle conditional
operations. Now we convert e.g.
Jakub Jelinek [Mon, 20 Nov 2023 09:03:20 +0000 (10:03 +0100)]
tree-ssa-math-opts: popcount (X) == 1 to (X ^ (X - 1)) > (X - 1) optimization for direct optab [PR90693]
On Fri, Nov 17, 2023 at 03:01:04PM +0100, Jakub Jelinek wrote:
> As a follow-up, I'm considering changing in this routine the popcount
> call to IFN_POPCOUNT with 2 arguments and during expansion test costs.
Here is the follow-up which does the rtx costs testing.
2023-11-20 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/90693
* tree-ssa-math-opts.cc (match_single_bit_test): Mark POPCOUNT with
result only used in equality comparison against 1 with direct optab
support as .POPCOUNT call with 2 arguments.
* internal-fn.h (expand_POPCOUNT): Declare.
* internal-fn.def (DEF_INTERNAL_INT_EXT_FN): New macro, document it,
undefine at the end.
(POPCOUNT): Use it instead of DEF_INTERNAL_INT_FN.
* internal-fn.cc (DEF_INTERNAL_INT_EXT_FN): Define to nothing before
inclusion to define expanders.
(expand_POPCOUNT): New function.
Per the earlier discussions on this PR, the following patch folds
popcount (x) == 1 (and != 1) into (x ^ (x - 1)) > (x - 1) (or <=)
if the corresponding popcount optab isn't implemented (I think any
double-word popcount or library call will necessarily be slower than the
above cheap 3-operation check, and even for -Os it is larger or the same size).
I've noticed e.g. C++ aligned new starts with std::has_single_bit
which does popcount (x) == 1.
As a follow-up, I'm considering changing in this routine the popcount
call to IFN_POPCOUNT with 2 arguments and during expansion test costs.
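For illustration, the folding relies on this identity for unsigned x (a hedged sketch, not the compiler code):
  int has_single_bit (unsigned long long x)
  {
    /* Equivalent to __builtin_popcountll (x) == 1, without a popcount.  */
    return (x ^ (x - 1)) > (x - 1);
  }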
2023-11-20 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/90693
* tree-ssa-math-opts.cc (match_single_bit_test): New function.
(math_opts_dom_walker::after_dom_children): Call it for EQ_EXPR
and NE_EXPR assignments and GIMPLE_CONDs.
Jakub Jelinek [Mon, 20 Nov 2023 08:57:34 +0000 (09:57 +0100)]
internal-fn: Always undefine DEF_INTERNAL* macros at the end of internal-fn.def
I have noticed we are inconsistent, some DEF_INTERNAL*
macros (most of them) were undefined at the end of internal-fn.def (but in
some cases uselessly undefined again after inclusion), while others were not
(and sometimes undefined after the inclusion). I've changed it to always
undefine at the end of internal-fn.def.
2023-11-20 Jakub Jelinek <jakub@redhat.com>
* internal-fn.def: Document missing DEF_INTERNAL* macros and make sure
they are all undefined at the end.
* internal-fn.cc (lookup_hilo_internal_fn, lookup_evenodd_internal_fn,
widening_fn_p, get_len_internal_fn): Don't undef DEF_INTERNAL_*FN
macros after inclusion of internal-fn.def.
Alexandre Oliva [Sun, 19 Nov 2023 05:41:48 +0000 (02:41 -0300)]
testsuite: arm: fix arm_movt cut&pasto
I got spurious fails of tests that required arm_thumb1_movt_ok on a
target cpu that did not support movt. Looking into it, I found the
arm_movt property to have been cut&pasted into other procs that
checked for different properties. They shouldn't share the same test
results cache entry, so I'm changing their prop names. Or rather its
prop name, because the other occurrence was already fixed recently.
for gcc/testsuite/ChangeLog
* lib/target-supports.exp
(check_effective_target_arm_thumb1_cbz_ok): Fix prop name
cut&pasto.
Alexandre Oliva [Mon, 20 Nov 2023 08:14:31 +0000 (05:14 -0300)]
testsuite: analyzer: expect alignment warning with -fshort-enums
On targets that have -fshort-enums enabled by default, the type casts
in the pr108251 analyzer tests warn that the byte-aligned enums may
not be sufficiently aligned to be a struct connection *. The function
can't know better, the warning is reasonable, and the code doesn't
expect enums to be shorter and less aligned than the struct.
Rather than use -fno-short-enums, I decided to embrace the warning on
targets that have short_enums enabled by default.
However, C++ doesn't issue the warning, because even with
-fshort-enums, enumeration types are not TYPE_PACKED, and the
expression is not sufficiently simplified by the C++ front-end for
check_and_warn_address_or_pointer_of_packed_member to identify the
insufficiently aligned pointer. So don't expect the warning there.
for gcc/testsuite/ChangeLog
* c-c++-common/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early-O2.c:
Expect "unaligned pointer value" warning on short_enums
targets, but not in c++.
* c-c++-common/analyzer/null-deref-pr108251-smp_fetch_ssl_fc_has_early.c:
Likewise.
Alexandre Oliva [Mon, 20 Nov 2023 08:14:25 +0000 (05:14 -0300)]
testsuite: scev: expect fail on ilp32
I've recently patched scev-3.c and scev-5.c because they only passed by
accident on ia32. They also fail on some (but not all) arm-eabi
variants. It seems hard to characterize the conditions in which the
optimization is supposed to pass, but expecting them to fail on ilp32
targets, though probably a little excessive and possibly noisy, is not
quite as alarming as getting a fail in test reports, so I propose
changing the xfail marker from ia32 to ilp32.
I'm also proposing to add a similar marker to scev-4.c. Though it
doesn't appear to be failing for me, I've got reports that suggest it
still does for others, and it certainly did for us as well.
for gcc/testsuite/ChangeLog
* gcc.dg/tree-ssa/scev-3.c: xfail on all ilp32 targets,
though some of these do pass.
* gcc.dg/tree-ssa/scev-4.c: Likewise.
* gcc.dg/tree-ssa/scev-5.c: Likewise.
Jason Merrill [Fri, 17 Nov 2023 22:17:32 +0000 (17:17 -0500)]
c++: compare one level of template parms
There should never be a reason to compare more than one level of template
parameters; additional levels are for the enclosing context, which is either
irrelevant (for a template template parameter) or already compared (for a
member template).
Also, the comp_template_parms handling of type parameters was wrongly
checking for TEMPLATE_TYPE_PARM when a type parameter appears here as a
TYPE_DECL.
gcc/cp/ChangeLog:
* pt.cc (comp_template_parms): Just one level.
(template_parameter_lists_equivalent_p): Likewise.
auto: Current behavior; use scalar or vector instructions.
libcall: Always use a library call.
scalar: Only use scalar instructions.
vector: Only use vector instructions.
PR target/112537
gcc/ChangeLog:
* config/riscv/riscv-opts.h (enum riscv_stringop_strategy_enum): Strategy enum.
* config/riscv/riscv-string.cc (riscv_expand_block_move): Disable based on options.
(expand_block_move): Ditto.
* config/riscv/riscv.opt: Add -mmemcpy-strategy=.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/cpymem-strategy-1.c: New test.
* gcc.target/riscv/rvv/base/cpymem-strategy-2.c: New test.
* gcc.target/riscv/rvv/base/cpymem-strategy-3.c: New test.
* gcc.target/riscv/rvv/base/cpymem-strategy-4.c: New test.
* gcc.target/riscv/rvv/base/cpymem-strategy-5.c: New test.
* gcc.target/riscv/rvv/base/cpymem-strategy.h: New test.
Lulu Cheng [Sat, 18 Nov 2023 03:04:42 +0000 (11:04 +0800)]
LoongArch: Modify MUSL_DYNAMIC_LINKER.
Use no suffix at all in the musl dynamic linker name for hard
float ABI. Use -sf and -sp suffixes in musl dynamic linker name
for soft float and single precision ABIs. The following table
outlines the musl interpreter names for the LoongArch64 ABI names.
Nathaniel Shead [Thu, 16 Nov 2023 21:39:53 +0000 (08:39 +1100)]
c++: Set DECL_CONTEXT for __cxa_thread_atexit [PR99187]
Modules streaming requires DECL_CONTEXT to be set on declarations that
are streamed. This ensures that __cxa_thread_atexit is given translation
unit context much like is already done with many other support
functions.
Philipp Tomsich [Sun, 19 Nov 2023 21:11:45 +0000 (14:11 -0700)]
[committed] RISC-V: Infrastructure for instruction fusion
I've been meaning to extract this and upstream it for a long time. The work is
primarily Philipp's from VRULL, with one case added by Raphael and light bugfixing
on my part.
Essentially there are 10 distinct fusions supported, and they can be selected
individually by building a suitable mask in the uarch tuning structure.
Additional cases can be added -- the bulk of the effort is in recognizing the
two fusible instructions.
The cases supported in this patch are all from the Veyron V1 processor, though
the hope is they will be useful elsewhere. I would encourage those familiar
with other uarch implementations to enable fusion cases for those uarchs and
extend the set of supported cases if any are missing.
gcc/
* config/riscv/riscv-protos.h (extract_base_offset_in_addr): Prototype.
* config/riscv/riscv.cc (riscv_fusion_pairs): New enum.
(riscv_tune_param): Add fusible_ops field.
(riscv_tune_param_rocket_tune_info): Initialize new field.
(riscv_tune_param_sifive_7_tune_info): Likewise.
(thead_c906_tune_info): Likewise.
(generic_oo_tune_info): Likewise.
(optimize_size_tune_info): Likewise.
(riscv_macro_fusion_p): New function.
(riscv_fusion_enabled_p): Likewise.
(riscv_macro_fusion_pair_p): Likewise.
(TARGET_SCHED_MACRO_FUSION_P): Define.
(TARGET_SCHED_MACRO_FUSION_PAIR_P): Likewise.
(extract_base_offset_in_addr): Moved into riscv.cc from...
* config/riscv/thead.cc: Here.
Co-authored-by: Raphael Zinsly <rzinsly@ventanamicro.com> Co-authored-by: Jeff Law <jlaw@ventanamicro.com>
Jeff Law [Sun, 19 Nov 2023 18:56:57 +0000 (11:56 -0700)]
[committed] Fix missing mode on a few unspec/unspec_volatile operands
This is a fix for a minor problem Jivan and I found while testing the ext-dce work originally from Joern.
The ext-dce pass will transform zero/sign extensions into subreg accesses when
the upper bits are actually unused. So it's more likely with the ext-dce work
to get a sequence like this prior to combine:
When we try to combine insn 10->11 we'll ultimately call simplify_subreg with
something like
(subreg:DI (unspec_volatile [...]) 0)
Note the lack of a mode on the unspec_volatile. That in turn will cause
simplify_subreg to trigger an assertion.
The modeless unspec is generated by the RISC-V backend and the more I've
pondered this issue over the last few days the more I'm convinced it's a
backend bug. Basically if the LHS of the set has a mode, then the RHS of the
set should have a mode as well.
I've audited the various backends and only found a few problems which are fixed
by this patch. I've tested the relevant ports in my tester: c6x, sh, mips and
s390[x].
There are other patterns that are potentially problematical in various ports.
They have a REG destination and an UNSPEC source, but the REG has no mode in
the pattern. Since it wasn't clear what mode to give the UNSPEC, I left those
alone.
Lewis Hyatt [Thu, 16 Nov 2023 16:18:37 +0000 (11:18 -0500)]
Makefile.tpl: Avoid race condition in generating site.exp from the top level
A command like "make -j 2 check-gcc-c check-gcc-c++" run in the top level of
a fresh build directory does not work reliably. That will spawn two
independent make processes inside the "gcc" directory, and each of those
will attempt to create site.exp if it doesn't exist and will interfere with
each other, often producing a corrupted or empty site.exp. Resolve that by
making these targets depend on a new phony target which makes sure site.exp
is created first before starting the recursive makes.
ChangeLog:
* Makefile.in: Regenerate.
* Makefile.tpl: Add dependency on site.exp to check-gcc-* targets
David Malcolm [Sun, 19 Nov 2023 01:35:59 +0000 (20:35 -0500)]
analyzer: new warning: -Wanalyzer-undefined-behavior-strtok [PR107573]
This patch:
- adds support to the analyzer for tracking API-private state
for which we don't have a decl (such as strtok's internal state),
- uses it to implement a new -Wanalyzer-undefined-behavior-strtok which
warns when strtok (NULL, delim) is called as the first call to
strtok after main.
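A hedged example of the pattern the new warning targets, assuming the analyzer can see that no earlier strtok call happened:
  #include <string.h>
  char *first_token (char *delim)
  {
    return strtok (NULL, delim);  /* undefined: no prior call established a string */
  }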
gcc/testsuite/ChangeLog:
PR analyzer/107573
* c-c++-common/analyzer/strtok-1.c: New test.
* c-c++-common/analyzer/strtok-2.c: New test.
* c-c++-common/analyzer/strtok-3.c: New test.
* c-c++-common/analyzer/strtok-4.c: New test.
* c-c++-common/analyzer/strtok-cppreference.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Jonathan Wakely [Tue, 15 Aug 2023 21:43:41 +0000 (22:43 +0100)]
libstdc++: Add fast path for std::format("{}", x) [PR110801]
This optimizes the simple case of formatting a single string, integer
or bool, with no format-specifier (so no padding, alignment, alternate
form etc.)
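A hedged illustration of the case covered by the fast path:
  #include <format>
  #include <string>
  std::string s = std::format ("{}", 42);  // single argument, no format-spec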
libstdc++-v3/ChangeLog:
PR libstdc++/110801
* include/std/format (_Sink_iter::_M_reserve): New member
function.
(_Sink::_Reservation): New nested class.
(_Sink::_M_reserve, _Sink::_M_bump): New virtual functions.
(_Seq_sink::_M_reserve, _Seq_sink::_M_bump): New virtual
overrides.
(_Iter_sink<O, ContigIter>::_M_reserve): Likewise.
(__do_vformat_to): Use new functions to optimize "{}" case.
Xi Ruoyao [Sat, 18 Nov 2023 17:41:12 +0000 (01:41 +0800)]
LoongArch: Fix "-mexplicit-relocs=none -mcmodel=medium" producing %call36 when the assembler does not support it
Even if !HAVE_AS_SUPPORT_CALL36, const_call_insn_operand should still
return false for -mexplicit-relocs=none -mcmodel=medium so that
loongarch_legitimize_call_address emits la.local or la.global.
gcc/ChangeLog:
* config/loongarch/predicates.md (const_call_insn_operand):
Remove buggy "HAVE_AS_SUPPORT_CALL36" conditions. Change "1" to
"true" to make the coding style consistent.
Xi Ruoyao [Thu, 16 Nov 2023 01:30:14 +0000 (09:30 +0800)]
LoongArch: Don't emit dbar 0x700 if -mld-seq-sa
This option (CPUCFG word 0x3 bit 23) means "the hardware guarantee that
two loads on the same address won't be reordered with each other". Thus
we can omit the "load-load" barrier dbar 0x700.
This is only a micro-optimization because dbar 0x700 is already treated
as nop if the hardware supports LD_SEQ_SA.
Xi Ruoyao [Thu, 16 Nov 2023 01:21:47 +0000 (09:21 +0800)]
LoongArch: Take the advantage of -mdiv32 if it's enabled
With -mdiv32, we can assume div.w[u] and mod.w[u] work on the low 32 bits
of a 64-bit GPR even if it is not sign-extended.
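As a hedged sketch of the effect (illustrative code, not from the patch): with -mdiv32 a plain 32-bit division no longer needs its operands re-sign-extended into the full GPR first.
  int quotient (int a, int b)
  {
    return a / b;  /* div.w can be emitted directly on the low 32 bits */
  }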
gcc/ChangeLog:
* config/loongarch/loongarch.md (DIV): New mode iterator.
(<optab:ANY_DIV><mode:GPR>3): Don't expand if TARGET_DIV32.
(<optab:ANY_DIV>di3_fake): Disable if TARGET_DIV32.
(*<optab:ANY_DIV><mode:GPR>3): Allow SImode if TARGET_DIV32.
(<optab:ANY_DIV>si3_extended): New insn if TARGET_DIV32.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/div-div32.c: New test.
* gcc.target/loongarch/div-no-div32.c: New test.
Xi Ruoyao [Fri, 17 Nov 2023 19:19:07 +0000 (03:19 +0800)]
LoongArch: Add evolution features of base ISA revisions
* config/loongarch/loongarch-def.h:
(loongarch_isa_base_features): Declare. Define it in ...
* config/loongarch/loongarch-cpu.cc
(loongarch_isa_base_features): ... here.
(fill_native_cpu_config): If we know the base ISA of the CPU
model from PRID, use it instead of la64 (v1.0). Check if all
expected features of this base ISA are available, and emit a warning
if not.
* config/loongarch/loongarch-opts.cc (config_target_isa): Enable
the features implied by the base ISA if not -march=native.
Xi Ruoyao [Thu, 16 Nov 2023 00:56:58 +0000 (08:56 +0800)]
LoongArch: genopts: Add infrastructure to generate code for new features in ISA evolution
LoongArch v1.10 introduced the concept of ISA evolution. During ISA
evolution, many independent features can be added and enumerated via
CPUCFG.
Add a data file into genopts storing the CPUCFG word, bit, the name
of the command line option controlling if this feature should be used
for compilation, and the text description. Make genstr.sh process this
info and add the command line options into loongarch.opt and
loongarch-str.h, and generate a new file loongarch-cpucfg-map.h for
mapping CPUCFG output to the corresponding option. When handling
-march=native, use the information in loongarch-cpucfg-map.h to generate
the corresponding option mask. Enable the features implied by the -march
setting unless the user has explicitly disabled the feature.
The added options (-mdiv32 and -mld-seq-sa) are not really handled yet.
They'll be used in the following patches.
gcc/ChangeLog:
* config/loongarch/genopts/isa-evolution.in: New data file.
* config/loongarch/genopts/genstr.sh: Translate info in
isa-evolution.in when generating loongarch-str.h, loongarch.opt,
and loongarch-cpucfg-map.h.
* config/loongarch/genopts/loongarch.opt.in (isa_evolution):
New variable.
* config/loongarch/t-loongarch: (loongarch-cpucfg-map.h): New
rule.
(loongarch-str.h): Depend on isa-evolution.in.
(loongarch.opt): Depend on isa-evolution.in.
(loongarch-cpu.o): Depend on loongarch-cpucfg-map.h.
* config/loongarch/loongarch-str.h: Regenerate.
* config/loongarch/loongarch-def.h (loongarch_isa): Add field
for evolution features. Add helper function to enable features
in this field.
* config/loongarch/loongarch-cpu.cc (fill_native_cpu_config):
Probe native CPU capability and save the corresponding options
into preset.
(cache_cpucfg): Simplify with C++11-style for loop.
(cpucfg_useful_idx, N_CPUCFG_WORDS): Move to ...
* config/loongarch/loongarch.cc
(loongarch_option_override_internal): Enable the ISA evolution
feature options implied by -march and not explicitly disabled.
(loongarch_asm_code_end): New function, print ISA information as
comments in the assembly if -fverbose-asm. It makes it easier to
debug things like -march=native.
(TARGET_ASM_CODE_END): Define.
* config/loongarch/loongarch.opt: Regenerate.
* config/loongarch/loongarch-cpucfg-map.h: Generate.
(cpucfg_useful_idx, N_CPUCFG_WORDS) ... here.
Xi Ruoyao [Fri, 17 Nov 2023 12:44:17 +0000 (20:44 +0800)]
LoongArch: Fix internal error running "gcc -march=native" on LA664
On LA664, the PRID preset is ISA_BASE_LA64V110 but the base architecture
is guessed as ISA_BASE_LA64V100. This causes a warning to be output:
cc1: warning: base architecture 'la64' differs from PRID preset '?'
But we've not set the "?" above in loongarch_isa_base_strings, thus it's
a nullptr and then an ICE is triggered.
Add ISA_BASE_LA64V110 to genopts and initialize
loongarch_isa_base_strings[ISA_BASE_LA64V110] correctly to fix the ICE.
The warning itself will be fixed later.
Sebastian Huber [Tue, 14 Nov 2023 20:36:51 +0000 (21:36 +0100)]
gcov: Improve -fprofile-update=atomic
The code coverage support uses counters to determine which edges in the control
flow graph were executed. If a counter overflows, then the code coverage
information is invalid. Therefore the counter type should be a 64-bit integer.
In multi-threaded applications, it is important that the counter increments are
atomic. This is not the case by default. The user can enable atomic counter
increments through the -fprofile-update=atomic and
-fprofile-update=prefer-atomic options.
If the target supports 64-bit atomic operations, then everything is fine. If
not and -fprofile-update=prefer-atomic was chosen by the user, then non-atomic
counter increments will be used. However, if the target does not support the
required atomic operations and -fprofile-update=atomic was chosen by the user,
then a warning was issued and a forced fallback to non-atomic operations was
done. This is probably not what a user wants. There is still hardware on the
market which does not have atomic operations and is used for multi-threaded
applications. A user which selects -fprofile-update=atomic wants consistent
code coverage data and not random data.
This patch removes the fallback to non-atomic operations for
-fprofile-update=atomic if the target platform supports libatomic. To
mitigate potential performance issues an optimization for systems which
only support 32-bit atomic operations is provided. Here, the edge
counter increments are done like this:
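A hedged C sketch of the split update described above, assuming the 64-bit counter is addressed as two 32-bit halves (the names are illustrative, not the generated GIMPLE):
  void bump_counter (unsigned int *lo, unsigned int *hi)
  {
    unsigned int v = __atomic_add_fetch (lo, 1, __ATOMIC_RELAXED);
    if (v == 0)  /* the low word wrapped around */
      __atomic_fetch_add (hi, 1, __ATOMIC_RELAXED);
  }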
In gimple_gen_time_profiler() this split operation cannot be used, since the
updated counter value is also required. Here, a library call is emitted. This
is not a performance issue since the update is only done if counters[0] == 0.
gcc/c-family/ChangeLog:
* c-cppbuiltin.cc (c_cpp_builtins): Define
__LIBGCC_HAVE_LIBATOMIC for libgcov.
gcc/ChangeLog:
* doc/invoke.texi (-fprofile-update): Clarify default method. Document
the atomic method behaviour.
* tree-profile.cc (enum counter_update_method): New.
(counter_update): Likewise.
(gen_counter_update): Use counter_update_method. Split the
atomic counter update in two 32-bit atomic operations if
necessary.
(tree_profiling): Select counter_update_method.
libgcc/ChangeLog:
* libgcov.h (GCOV_SUPPORTS_ATOMIC): Always define it.
Set it also to 1, if __LIBGCC_HAVE_LIBATOMIC is defined.
Sebastian Huber [Sat, 21 Oct 2023 13:52:15 +0000 (15:52 +0200)]
gcov: Add gen_counter_update()
Move the counter update to the new gen_counter_update() helper function. Use
it in gimple_gen_edge_profiler() and gimple_gen_time_profiler(). The resulting
gimple instructions should be identical with the exception of the removed
unshare_expr() call. The unshare_expr() call was used in
gimple_gen_edge_profiler().
Kito Cheng [Sat, 18 Nov 2023 10:37:11 +0000 (18:37 +0800)]
RISC-V: Fix mismatched new delete for unique_ptr
gcc/ChangeLog:
* config/riscv/riscv-target-attr.cc
(riscv_target_attr_parser::parse_arch): Use char[] for
std::unique_ptr to prevent mismatched new delete issue.
(riscv_process_one_target_attr): Ditto.
(riscv_process_target_attr): Ditto.
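A hedged illustration of the idea behind the fix: pair the array form of std::unique_ptr with new[] so that delete[] is used on destruction (the buffer length is illustrative):
  #include <cstddef>
  #include <memory>
  std::unique_ptr<char[]> make_buf (std::size_t n)
  {
    return std::unique_ptr<char[]> (new char[n]);  // freed with delete[], not delete
  }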
Lulu Cheng [Fri, 17 Nov 2023 08:04:45 +0000 (16:04 +0800)]
LoongArch: atomic_load and atomic_store are implemented using dbar grading.
Because the LA464 memory model allows loads from the same address to complete
out of order, in the following test example the load at line 23 may be executed
before the load at line 21, resulting in an error.
So when the memory model is MEMMODEL_RELAXED, the load instruction will be followed by
"dbar 0x700" when implementing __atomic_load.
1 void *
2 gomp_ptrlock_get_slow (gomp_ptrlock_t *ptrlock)
3 {
4 int *intptr;
5 uintptr_t oldval = 1;
6
7 __atomic_compare_exchange_n (ptrlock, &oldval, 2, false,
8 MEMMODEL_RELAXED, MEMMODEL_RELAXED);
9
10 /* futex works on ints, not pointers.
11 But a valid work share pointer will be at least
12 8 byte aligned, so it is safe to assume the low
13 32-bits of the pointer won't contain values 1 or 2. */
14 __asm volatile ("" : "=r" (intptr) : "0" (ptrlock));
15 #if __BYTE_ORDER == __BIG_ENDIAN
16 if (sizeof (*ptrlock) > sizeof (int))
17 intptr += (sizeof (*ptrlock) / sizeof (int)) - 1;
18 #endif
19 do
20 do_wait (intptr, 2);
21 while (__atomic_load_n (intptr, MEMMODEL_RELAXED) == 2);
22 __asm volatile ("" : : : "memory");
23 return (void *) __atomic_load_n (ptrlock, MEMMODEL_ACQUIRE);
24 }
gcc/ChangeLog:
* config/loongarch/sync.md (atomic_load<mode>): New template.
Lulu Cheng [Fri, 17 Nov 2023 07:42:53 +0000 (15:42 +0800)]
LoongArch: Implement atomic operations using LoongArch1.1 instructions.
1. short and char type calls to __atomic_add_fetch and __atomic_fetch_add are
implemented using amadd{_db}.{b/h}.
2. Use amcas{_db}.{b/h/w/d} to implement __atomic_compare_exchange_n and __atomic_compare_exchange.
3. The short and char types of the functions __atomic_exchange and __atomic_exchange_n are
implemented using amswap{_db}.{b/h}.
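Hedged examples of the builtins listed above that can now map to the new instructions (the variable names are illustrative):
  char c;
  char fetch_add_b (char v)
  {
    return __atomic_fetch_add (&c, v, __ATOMIC_SEQ_CST);  /* amadd_db.b */
  }
  char exchange_b (char v)
  {
    return __atomic_exchange_n (&c, v, __ATOMIC_SEQ_CST);  /* amswap_db.b */
  }
  int cas_b (char *p, char *expected, char desired)
  {
    return __atomic_compare_exchange_n (p, expected, desired, 0,
                                        __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);  /* amcas_db.b */
  }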
Lulu Cheng [Thu, 16 Nov 2023 12:43:53 +0000 (20:43 +0800)]
LoongArch: Add LA664 support.
Define ISA_BASE_LA64V110, which represents the base instruction set defined in LoongArch1.1.
Support the configure setting --with-arch=la664, and support -march=la664 and -mtune=la664.
gcc/ChangeLog:
* config.gcc: Support LA664.
* config/loongarch/genopts/loongarch-strings: Likewise.
* config/loongarch/genopts/loongarch.opt.in: Likewise.
* config/loongarch/loongarch-cpu.cc (fill_native_cpu_config): Likewise.
* config/loongarch/loongarch-def.c: Likewise.
* config/loongarch/loongarch-def.h (N_ISA_BASE_TYPES): Likewise.
(ISA_BASE_LA64V110): Define macro.
(N_ARCH_TYPES): Update value.
(N_TUNE_TYPES): Update value.
(CPU_LA664): New macro.
* config/loongarch/loongarch-opts.cc (isa_default_abi): Likewise.
(isa_base_compat_p): Likewise.
* config/loongarch/loongarch-opts.h (TARGET_64BIT): This parameter is enabled
when la_target.isa.base is equal to ISA_BASE_LA64V100 or ISA_BASE_LA64V110.
(TARGET_uARCH_LA664): Define macro.
* config/loongarch/loongarch-str.h (STR_CPU_LA664): Likewise.
* config/loongarch/loongarch.cc (loongarch_cpu_sched_reassociation_width):
Add LA664 support.
* config/loongarch/loongarch.opt: Regenerate.
Lulu Cheng [Thu, 16 Nov 2023 07:06:11 +0000 (15:06 +0800)]
LoongArch: Add code generation support for call36 function calls.
When compiling with '-mcmodel=medium', the function call is made through
'pcaddu18i+jirl' if binutils supports call36, otherwise the
native implementation 'pcalau12i+jirl' is used.
gcc/ChangeLog:
* config.in: Regenerate.
* config/loongarch/loongarch-opts.h (HAVE_AS_SUPPORT_CALL36): Define macro.
* config/loongarch/loongarch.cc (loongarch_legitimize_call_address):
If binutils supports call36, the function call is not split over expand.
* config/loongarch/loongarch.md: Add call36 generation code.
* config/loongarch/predicates.md: Likewise.
* configure: Regenerate.
* configure.ac: Check whether binutils supports call36.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/func-call-medium-5.c: Skip the test if the assembler
supports call36.
* gcc.target/loongarch/func-call-medium-6.c: Likewise.
* gcc.target/loongarch/func-call-medium-7.c: Likewise.
* gcc.target/loongarch/func-call-medium-8.c: Likewise.
* lib/target-supports.exp: Add a function to check whether the assembler
supports the call36 relocation.
* gcc.target/loongarch/func-call-medium-call36-1.c: New test.
* gcc.target/loongarch/func-call-medium-call36.c: New test.
David Malcolm [Sat, 18 Nov 2023 00:55:25 +0000 (19:55 -0500)]
analyzer: new warning: -Wanalyzer-infinite-loop [PR106147]
This patch implements a new analyzer warning: -Wanalyzer-infinite-loop.
It works by examining the exploded graph once the latter has been
fully built. It attempts to detect cycles in the exploded graph in
which:
- no externally visible work occurs
- no escape is possible from the cycle once it has been entered
- the program state is "sufficiently concrete" at each step:
- no unknown activity could be occurring
- the worklist was fully drained for each enode in the cycle
i.e. every enode in the cycle is processed
For example, it correctly complains about this bogus "for" loop:
int sum = 0;
for (struct node *iter = n; iter; iter->next)
sum += n->val;
return sum;
like this:
infinite-loop-linked-list.c: In function ‘for_loop_noop_next’:
infinite-loop-linked-list.c:110:31: warning: infinite loop [CWE-835] [-Wanalyzer-infinite-loop]
110 | for (struct node *iter = n; iter; iter->next)
| ^~~~
‘for_loop_noop_next’: events 1-5
|
| 110 | for (struct node *iter = n; iter; iter->next)
| | ^~~~
| | |
| | (1) infinite loop here
| | (2) when ‘iter’ is non-NULL: always following ‘true’ branch...
| | (5) ...to here
| 111 | sum += n->val;
| | ~~~~~~~~~~~~~
| | | |
| | | (3) ...to here
| | (4) looping back...
|
gcc/ChangeLog:
PR analyzer/106147
* Makefile.in (ANALYZER_OBJS): Add analyzer/infinite-loop.o.
* doc/invoke.texi: Add -fdump-analyzer-infinite-loop and
-Wanalyzer-infinite-loop. Add missing CWE link for
-Wanalyzer-infinite-recursion.
* timevar.def (TV_ANALYZER_INFINITE_LOOPS): New.
gcc/analyzer/ChangeLog:
PR analyzer/106147
* analyzer.opt (Wanalyzer-infinite-loop): New option.
(fdump-analyzer-infinite-loop): New option.
* checker-event.h (start_cfg_edge_event::get_desc): Drop "final".
(start_cfg_edge_event::maybe_describe_condition): Convert from
private to protected.
* checker-path.h (checker_path::get_logger): New.
* diagnostic-manager.cc (process_worklist_item): Update for
new context param of maybe_update_for_edge.
* engine.cc
(impl_region_model_context::impl_region_model_context): Add
out_could_have_done_work param to both ctors and use it to
initialize mm_out_could_have_done_work.
(impl_region_model_context::maybe_did_work): New vfunc
implementation.
(exploded_node::on_stmt): Add out_could_have_done_work param and
pass to ctxt ctor.
(exploded_node::on_stmt_pre): Treat setjmp and longjmp as "doing
work".
(exploded_node::on_longjmp): Likewise.
(exploded_edge::exploded_edge): Add "could_do_work" param and use
it to initialize m_could_do_work_p.
(exploded_edge::dump_dot_label): Add result of could_do_work_p.
(exploded_graph::add_function_entry): Mark edge as doing no work.
(exploded_graph::add_edge): Add "could_do_work" param and pass to
exploded_edge ctor.
(add_tainted_args_callback): Treat as doing no work.
(exploded_graph::process_worklist): Likewise when merging nodes.
(maybe_process_run_of_before_supernode_enodes::item): Likewise.
(exploded_graph::maybe_create_dynamic_call): Likewise.
(exploded_graph::process_node): Likewise for phi nodes.
Pass in a "could_have_done_work" bool when handling stmts and use
when creating edges. Assume work is done at bifurcation.
(exploded_path::feasible_p): Update for new context param of
maybe_update_for_edge.
(feasibility_state::feasibility_state): New ctor.
(feasibility_state::operator=): New.
(feasibility_state::maybe_update_for_edge): Add ctxt param and use
it. Fix missing newline when logging state.
(impl_run_checkers): Call exploded_graph::detect_infinite_loops.
* exploded-graph.h
(impl_region_model_context::impl_region_model_context): Add
out_could_have_done_work param to both ctors.
(impl_region_model_context::maybe_did_work): New decl.
(impl_region_model_context::checking_for_infinite_loop_p): New.
(impl_region_model_context::on_unusable_in_infinite_loop): New.
(impl_region_model_context::m_out_could_have_done_work): New
field.
(exploded_node::on_stmt): Add "out_could_have_done_work" param.
(exploded_edge::exploded_edge): Add "could_do_work" param.
(exploded_edge::could_do_work_p): New accessor.
(exploded_edge::m_could_do_work_p): New field.
(exploded_graph::add_edge): Add "could_do_work" param.
(exploded_graph::detect_infinite_loops): New decl.
(feasibility_state::feasibility_state): New ctor.
(feasibility_state::operator=): New decl.
(feasibility_state::maybe_update_for_edge): Add ctxt param.
* infinite-loop.cc: New file.
* program-state.cc (program_state::on_edge): Log the rejected
constraint when region_model::maybe_update_for_edge fails.
* region-model.cc (region_model::on_assignment): Treat any writes
other than to the stack as "doing work".
(region_model::on_stmt_pre): Treat all asm stmts as "doing work".
(region_model::on_call_post): Likewise for all calls to functions
with unknown side effects.
(region_model::handle_phi): Add svals_changing_meaning param.
Mark widening svalue in phi nodes as changing meaning.
(unusable_in_infinite_loop_constraint_p): New.
(region_model::add_constraint): If we're checking for an infinite
loop, bail out on unusable svalues, or if we don't have a definite
true/false for the constraint.
(region_model::update_for_phis): Gather all svalues changing
meaning in phi nodes, and purge constraints involving them.
(region_model::replay_call_summary): Treat all call summaries as
doing work.
(region_model::can_merge_with_p): Purge constraints involving
svalues that change meaning.
(model_merger::on_widening_reuse): New.
(test_iteration_1): Likewise.
(selftest::test_iteration_1): Remove assertion that model6 "knows"
that i < 157.
* region-model.h (region_model::handle_phi): Add
svals_changing_meaning param
(region_model_context::maybe_did_work): New pure virtual func.
(region_model_context::checking_for_infinite_loop_p): Likewise.
(region_model_context::on_unusable_in_infinite_loop): Likewise.
(noop_region_model_context::maybe_did_work): Implement.
(noop_region_model_context::checking_for_infinite_loop_p):
Likewise.
(noop_region_model_context::on_unusable_in_infinite_loop):
Likewise.
(region_model_context_decorator::maybe_did_work): Implement.
(region_model_context_decorator::checking_for_infinite_loop_p):
Likewise.
(region_model_context_decorator::on_unusable_in_infinite_loop):
Likewise.
(model_merger::on_widening_reuse): New decl.
(model_merger::m_svals_changing_meaning): New field.
* sm-signal.cc (register_signal_handler::impl_transition): Assume
the edge "does work".
* supergraph.cc (supernode::get_start_location): Use CFG edge's
goto_locus if available.
(supernode::get_end_location): Likewise.
(cfg_superedge::dump_label_to_pp): Dump edges with a "goto_locus"
* supergraph.h (cfg_superedge::get_goto_locus): New.
* svalue.cc (svalue::can_merge_p): Call on_widening_reuse for
widening values.
(involvement_visitor::visit_widening_svalue): New.
(svalue::involves_p): Update assertion to allow widening svalues.
gcc/testsuite/ChangeLog:
PR analyzer/106147
* c-c++-common/analyzer/gzio-2.c: Add dg-warning for infinite
loop, marked as xfail.
* c-c++-common/analyzer/infinite-loop-2.c: New test.
* c-c++-common/analyzer/infinite-loop-4.c: New test.
* c-c++-common/analyzer/infinite-loop-crc32c.c: New test.
* c-c++-common/analyzer/infinite-loop-doom-d_main-IdentifyVersion.c:
New test.
* c-c++-common/analyzer/infinite-loop-doom-v_video.c: New test.
* c-c++-common/analyzer/infinite-loop-g_error.c: New test.
* c-c++-common/analyzer/infinite-loop-linked-list.c: New test.
* c-c++-common/analyzer/infinite-recursion-inlining.c: Add
dg-warning directives for infinite loop.
* c-c++-common/analyzer/inlining-4-multiline.c: Update expected
paths for event 5 having a location.
* gcc.dg/analyzer/boxed-malloc-1.c: Add dg-warning for infinite
loop.
* gcc.dg/analyzer/data-model-20.c: Likewise. Add comment about
suspect code, and create...
* gcc.dg/analyzer/data-model-20a.c: ...this new test by cleaning
it up.
* gcc.dg/analyzer/edges-1.c: Add a placeholder statement to avoid
the "...to here" from the if stmt occurring at the "while", and
thus being treated as a bogus event.
* gcc.dg/analyzer/explode-2a.c: Add dg-warning for infinite loop.
* gcc.dg/analyzer/infinite-loop-1.c: New test.
* gcc.dg/analyzer/malloc-1.c: Add dg-warning for infinite loop.
* gcc.dg/analyzer/out-of-bounds-coreutils.c: Add TODO.
* gcc.dg/analyzer/paths-4.c: Add dg-warning for infinite loop.
* gcc.dg/analyzer/pr103892.c: Likewise.
* gcc.dg/analyzer/pr93546.c: Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Robin Dapp [Thu, 16 Nov 2023 19:42:10 +0000 (20:42 +0100)]
vect: Pass truth type to vect_get_vec_defs.
For conditional operations the mask is loop invariant and cannot be
stored explicitly. By default, for reductions, we deduce the vectype
from the statement or the loop but this does not work for conditional
operations. Therefore this patch passes the truth type of the reduction
input vectype for the mask operand instead. This will override the
other choices and make sure we have the proper mask vectype.
gcc/ChangeLog:
PR middle-end/112406
PR middle-end/112552
* tree-vect-loop.cc (vect_transform_reduction): Pass truth
vectype for mask operand.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/pr112406.c: New test.
* gcc.target/riscv/rvv/autovec/pr112552.c: New test.
This was approved for C++26 last week at the WG21 meeting in Kona.
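A hedged usage sketch of the new saturating helpers covered by the ChangeLog below (C++26 <numeric>):
  #include <numeric>
  static_assert (std::add_sat<unsigned char> (250, 10) == 255);  // clamps at the maximum
  static_assert (std::sub_sat<unsigned char> (5, 10) == 0);      // clamps at the minimum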
libstdc++-v3/ChangeLog:
* include/Makefile.am: Add new header.
* include/Makefile.in: Regenerate.
* include/bits/version.def (saturation_arithmetic): Define.
* include/bits/version.h: Regenerate.
* include/std/numeric: Include new header.
* include/bits/sat_arith.h: New file.
* testsuite/26_numerics/saturation/add.cc: New test.
* testsuite/26_numerics/saturation/cast.cc: New test.
* testsuite/26_numerics/saturation/div.cc: New test.
* testsuite/26_numerics/saturation/mul.cc: New test.
* testsuite/26_numerics/saturation/sub.cc: New test.
* testsuite/26_numerics/saturation/version.cc: New test.
Jakub Jelinek [Fri, 17 Nov 2023 14:43:31 +0000 (15:43 +0100)]
c++: Implement C++ DR 2406 - [[fallthrough]] attribute and iteration statements
The following patch implements
CWG 2406 - [[fallthrough]] attribute and iteration statements
The genericization of some loops leaves nothing at all or just a label
after a body of a loop, so if the loop is later followed by
case or default label in a switch, the fallthrough statement isn't
diagnosed.
The following patch implements it by marking the IFN_FALLTHROUGH call
in such a case, such that during gimplification it can be pedantically
diagnosed even if it is followed by case or default label or some normal
labels followed by case/default labels.
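A hedged sketch of the shape of code affected (not the actual testcase): the [[fallthrough]] at the end of the loop body is now pedantically diagnosed even though the loop is followed by a case label.
  void f (int n, int m)
  {
    switch (n)
      {
      case 1:
        while (m--)
          {
            [[fallthrough]];  // next executed statement is the loop condition,
          }                   // not a case label
      case 2:
        break;
      }
  }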
While looking into this, I've discovered other problems.
expand_FALLTHROUGH_r is removing the IFN_FALLTHROUGH calls from the IL,
but wasn't telling that to walk_gimple_stmt/walk_gimple_seq_mod, so
the callers would then skip the next statement after it, and it would
return non-NULL if the removed stmt was last in the sequence. This could
lead to wi->callback_result being set even if it didn't appear at the very
end of switch sequence.
The patch makes use of wi->removed_stmt such that the callers properly
know what happened, and use different way to handle the end of switch
sequence case.
That change discovered a bug in the gimple-walk handling of
wi->removed_stmt. If that flag is set, the callback is telling the callers
that the current statement has been removed and so the innermost
walk_gimple_seq_mod shouldn't gsi_next. The problem is that
wi->removed_stmt is only reset at the start of a walk_gimple_stmt, but that
can be too late for some cases. If we have two nested gimple sequences,
say GIMPLE_BIND as the last stmt of some gimple seq, we remove the last
statement inside of that GIMPLE_BIND, set wi->removed_stmt there, don't
do gsi_next correctly because already gsi_remove moved us to the next stmt,
there is no next stmt, so we return back to the caller, but wi->removed_stmt
is still set and so we don't do gsi_next even in the outer sequence, despite
the GIMPLE_BIND (etc.) not being removed. That means we walk the
GIMPLE_BIND with its whole sequence again.
The patch fixes that by resetting wi->removed_stmt after we've used that
flag in walk_gimple_seq_mod. Nothing really uses that flag after the
outermost walk_gimple_seq_mod, it is just a private notification that
the stmt callback has removed a stmt.
2023-11-17 Jakub Jelinek <jakub@redhat.com>
PR c++/107571
gcc/
* gimplify.cc (expand_FALLTHROUGH_r): Use wi->removed_stmt after
gsi_remove, change the way of passing fallthrough stmt at the end
of sequence to expand_FALLTHROUGH. Diagnose IFN_FALLTHROUGH
with GF_CALL_NOTHROW flag.
(expand_FALLTHROUGH): Change loc into array of 2 location_t elts,
don't test wi.callback_result, instead check whether first
elt is not UNKNOWN_LOCATION and in that case pedwarn with the
second location.
* gimple-walk.cc (walk_gimple_seq_mod): Clear wi->removed_stmt
after the flag has been used.
* internal-fn.def (FALLTHROUGH): Mention in comment the special
meaning of the TREE_NOTHROW/GF_CALL_NOTHROW flag on the calls.
gcc/c-family/
* c-gimplify.cc (genericize_c_loop): For C++ mark IFN_FALLTHROUGH
call at the end of loop body as TREE_NOTHROW.
gcc/testsuite/
* g++.dg/DRs/dr2406.C: New test.
Jonathan Wakely [Fri, 17 Nov 2023 12:52:45 +0000 (12:52 +0000)]
libstdc++: Add more Doxygen comments and another test for std::out_ptr
Improve Doxygen comments for std::out_ptr etc. and add a test for the
feature test macro. Also remove a redundant preprocessor condition.
Ideally the docs for std::out_ptr and std::inout_ptr would show examples
of how to use them and what they do, but that would take some effort.
I'll aim to do that before GCC 14 is released.
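For reference, a rough sketch of the intended usage (a hypothetical C
API, not taken from the new docs; requires the C++23 <memory> facility):
    #include <memory>

    struct widget;                              // hypothetical C library
    extern "C" int  make_widget (widget **out);
    extern "C" void free_widget (widget *);

    bool
    get ()
    {
      std::unique_ptr<widget, decltype(&free_widget)> p (nullptr, &free_widget);
      // std::out_ptr (p) converts to widget** for the call; when the
      // temporary is destroyed at the end of the statement, it resets p
      // with the pointer the C API stored.
      int rc = make_widget (std::out_ptr (p));
      return rc == 0 && p != nullptr;
    }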
libstdc++-v3/ChangeLog:
* include/bits/out_ptr.h: Add Doxygen comments. Remove a
redundant preprocessor condition.
* testsuite/20_util/smartptr.adapt/version.cc: New test.
Jakub Jelinek [Fri, 17 Nov 2023 14:10:51 +0000 (15:10 +0100)]
match.pd: Optimize ctz/popcount/parity/ffs on extended argument [PR112566]
ctz(ext(X)) is the same as ctz(X) in the UB-on-zero case (it could also
be the same in the 2-argument case on large BITINT_TYPE by preserving the
argument, but that is not implemented in this patch),
popcount(zext(X)) is the same as popcount(X),
parity(zext(X)) is the same as parity(X),
parity(sext(X)) is the same as parity(X) provided the bit difference between
the extended and unextended types is even,
ffs(ext(X)) is the same as ffs(X).
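For example (a sketch, not one of the new testcases), the zero-extension
below adds only zero bits, so the popcount of the extended value equals
the popcount of the narrow one and the call can be performed in the
narrower mode when that is cheaper:
    int
    f (unsigned short x)
    {
      // x is zero-extended for the call; the new match.pd rule can narrow
      // the popcount (and likewise parity/ffs) back to the 16-bit value
      return __builtin_popcount (x);
    }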
The following patch optimizes those in match.pd when it is beneficial:
always in the large BITINT_TYPE case, or if the narrower type has an
optab and the wider one doesn't, or if the wider type is larger than a
word and the narrower one is one of the standard argument sizes (tested
just int and long long, as long has on most targets the same bitsize as
one of those two).
Joseph in the PR mentioned that ctz(narrow(X)) is the same as ctz(X)
if UB on 0, but that can be handled incrementally (and would need different
decisions when it is profitable).
And clz(zext(X)) is clz(X) + bit_difference, but not sure we want to change
that in match.pd at all, perhaps during insn selection?
2023-11-17 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/112566
PR tree-optimization/83171
* match.pd (ctz(ext(X)) -> ctz(X), popcount(zext(X)) -> popcount(X),
parity(ext(X)) -> parity(X), ffs(ext(X)) -> ffs(X)): New
simplifications.
( __builtin_ffs (X) == 0 -> X == 0): Use FFS rather than
BUILT_IN_FFS BUILT_IN_FFSL BUILT_IN_FFSLL BUILT_IN_FFSIMAX.
* gcc.dg/pr112566-1.c: New test.
* gcc.dg/pr112566-2.c: New test.
* gcc.target/i386/pr78057.c (foo): Pass another long long argument
and use it in __builtin_ia32_*zcnt_u64 instead of the int one.
Jakub Jelinek [Fri, 17 Nov 2023 14:09:44 +0000 (15:09 +0100)]
vect: Fix check_reduction_path [PR112374]
As mentioned in the PR, the intent of the r14-5076 changes was to not
count one of the uses on use_stmt, but what actually got implemented
does this processing on any op_use_stmt, even if it is not the use_stmt
statement.  That means it can increase the count even on debug stmts
(-fcompare-debug failures), or, if there were some other use stmt with
2+ uses, it could count that as a single use.  Though, because it fails
whenever cnt != 1 and I believe use_stmt must be one of the uses, it
would probably fail in the latter case anyway.
The following patch fixes that by doing this extra processing only when
op_use_stmt is use_stmt, and using the normal processing otherwise (so
ignoring debug stmts and increasing the count for any uses on the stmt).
2023-11-17 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/112374
* tree-vect-loop.cc (check_reduction_path): Perform the cond_fn_p
special case only if op_use_stmt == use_stmt, use as_a rather than
dyn_cast in that case.
* gcc.dg/pr112374-1.c: New test.
* gcc.dg/pr112374-2.c: New test.
* g++.dg/opt/pr112374.C: New test.
Tobias Burnus [Fri, 17 Nov 2023 12:34:55 +0000 (13:34 +0100)]
Fortran: Accept -std=f2023, update line-length for Fortran 2023
This patch accepts -std=f2023, uses it by default, and, for the free
source form, bumps the line length to 10,000 and the statement length
(i.e. the number of continuation lines) to unlimited.
gcc/fortran/ChangeLog:
* gfortran.texi (_gfortran_set_options): Document GFC_STD_F2023.
* invoke.texi (std,pedantic,Wampersand,Wtabs): Add -std=2023.
* lang.opt (std=f2023): Add.
* libgfortran.h (GFC_STD_F2023, GFC_STD_OPT_F23): Add.
* options.cc (set_default_std_flags): Add GFC_STD_F2023.
(gfc_init_options): Set max_continue_free to 1,000,000.
(gfc_post_options): Set flag_free_line_length if unset.
(gfc_handle_option): Add OPT_std_f2023, set max_continue_free = 255
for -std=f2003, f2008 and f2018.
Georg-Johann Lay [Fri, 17 Nov 2023 11:51:16 +0000 (12:51 +0100)]
PR target/53372: Don't ignore section attribute with address-space.
gcc/
PR target/53372
* config/avr/avr.cc (avr_asm_named_section) [AVR_SECTION_PROGMEM]:
Only return some .progmem*.data section if the user did not
specify a section attribute.
(avr_section_type_flags) [avr_progmem_p]: Unset SECTION_NOTYPE
in returned section flags.
gcc/testsuite/
PR target/53372
* gcc.target/avr/pr53372-1.c: New test.
* gcc.target/avr/pr53372-2.c: New test.
* config/loongarch/lsx.md (copysign<mode>3): Allow operand[2] to
be a reg_or_vector_same_val_operand.  If it's a const vector
with the same negative elements, expand the copysign with a bitset
instruction.  Otherwise, force it into a register.
* config/loongarch/lasx.md (copysign<mode>3): Likewise.
gcc/testsuite/ChangeLog:
* g++.target/loongarch/vect-copysign-negconst.C: New test.
* g++.target/loongarch/vect-copysign-negconst-run.C: New test.
Haochen Gui [Fri, 17 Nov 2023 09:17:59 +0000 (17:17 +0800)]
rs6000: Fix regression cases caused by 16-byte by-pieces move
The previous patch enables 16-byte by-pieces moves.  Originally a
16-byte move was implemented via a pattern, and expand_block_move did an
optimization on P8 LE to leverage the V2DI reversed load/store for
memory-to-memory moves.  Now a 16-byte move is implemented via a
by-pieces move and finally split into two DI load/stores.  This patch
creates an insn_and_split pattern to regain that optimization.
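A sketch of the kind of 16-byte copy affected (illustrative only):
    void
    copy16 (void *dst, const void *src)
    {
      // now expanded by pieces as two DImode load/store pairs; the new
      // insn_and_split lets P8 LE recover the V2DI reversed load/store form
      __builtin_memcpy (dst, src, 16);
    }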
Haochen Gui [Fri, 17 Nov 2023 09:12:32 +0000 (17:12 +0800)]
rs6000: Enable vector mode for by pieces equality compare
This patch adds a new expand pattern - cbranchv16qi4 - to enable
vector-mode by-pieces equality compares on rs6000.  The macro
MOVE_MAX_PIECES (COMPARE_MAX_PIECES) is set to 16 bytes when
EFFICIENT_UNALIGNED_VSX is enabled, and otherwise stays unchanged.  The
macro STORE_MAX_PIECES is set to the same value as MOVE_MAX_PIECES by
default, so it is now explicitly defined and kept unchanged.
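As a rough illustration (not one of the testcases), an equality-only
16-byte compare whose result feeds a branch is what the new cbranchv16qi4
expander handles:
    int
    equal16 (const void *a, const void *b)
    {
      // with EFFICIENT_UNALIGNED_VSX this can be expanded inline with a
      // vector compare instead of a memcmp libcall
      return __builtin_memcmp (a, b, 16) == 0;
    }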
Li Wei [Fri, 17 Nov 2023 02:38:02 +0000 (10:38 +0800)]
LoongArch: Implement C[LT]Z_DEFINED_VALUE_AT_ZERO
LoongArch defines ctz and clz in the backend, but if we want GCC to do
the CTZ transformation optimization in the forwprop2 pass, GCC needs to
know the value of c[lt]z at zero, which may be beneficial for some test
cases (like SPEC2017 deepsjeng_r).
After implementing the macro, we test dynamic instruction count on
deepsjeng_r:
- before 1688423249186
- after 1660311215745 (1.66% reduction)
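A sketch of the kind of source the forwprop2 transformation targets
(hypothetical; it assumes the hardware's ctz result for a zero input
is 32):
    int
    trailing_zeros (unsigned int x)
    {
      // once CTZ_DEFINED_VALUE_AT_ZERO reports a matching value, the branch
      // can be folded away and a single ctz instruction emitted
      return x ? __builtin_ctz (x) : 32;
    }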
Richard Biener [Mon, 30 Oct 2023 12:17:11 +0000 (13:17 +0100)]
Assert we don't create recursive DW_AT_{abstract_origin,specification}
We have a support case that shows GCC 7 sometimes creates a
DW_TAG_label referring to itself via a DW_AT_abstract_origin
when using LTO.  This for example triggers the sanity check
added below during LTO bootstrap.
Making this check cover more than just DW_AT_abstract_origin
breaks bootstrap on trunk for
/* GNU extension: Record what type our vtable lives in. */
if (TYPE_VFIELD (type))
{
tree vtype = DECL_FCONTEXT (TYPE_VFIELD (type));
Jiahao Xu [Thu, 16 Nov 2023 08:44:36 +0000 (16:44 +0800)]
LoongArch: Increase cost of vector aligned store/load.
Based on SPEC2017 performance evaluation results, it's better to make them equal
to the cost of unaligned store/load so as to avoid odd alignment peeling.
Andrew Pinski [Mon, 13 Nov 2023 20:18:34 +0000 (20:18 +0000)]
Only allow (copysign x, NEG_CONST) -> (fneg (fabs x)) simplification for constant folding [PR112483]
On targets with native copysign instructions, (copysign x, -1) is
usually more efficient than (fneg (fabs x)). Since r14-5284, in the
middle end we always optimize (fneg (fabs x)) to (copysign x, -1), not
vice versa. If the target does not support native fcopysign,
expand_COPYSIGN will expand it as (fneg (fabs x)) anyway.
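A small sketch of the two forms (not the PR testcase):
    double f1 (double x) { return __builtin_copysign (x, -1.0); } // canonical since r14-5284
    double f2 (double x) { return -__builtin_fabs (x); }          // now folded to the form above
With a native copysign both compile to a single instruction; without
one, expand_COPYSIGN turns the copysign back into neg/abs at expansion
time.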
gcc/ChangeLog:
PR rtl-optimization/112483
* simplify-rtx.cc (simplify_binary_operation_1) <case COPYSIGN>:
Call simplify_unary_operation for NEG instead of
simplify_gen_unary.
Uros Bizjak [Thu, 16 Nov 2023 17:07:36 +0000 (18:07 +0100)]
i386: Optimize QImode insn with high input registers
Sometimes the compiler emits the following code with <insn>qi_ext<mode>_0:
shrl $8, %eax
addb %bh, %al
The patch introduces new low-part QImode insn patterns with both of
their input arguments extracted from a high register.  This invalid
insn is split after reload into a move from the high register and an
<insn>qi_ext<mode>_0 instruction.  The combine pass is able to convert
the shift to a zero/sign-extract sub-RTX, which we split to the
optimal:
movzbl %bh, %edx
addb %ah, %dl
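A hypothetical source fragment that can produce such code, with both
addends living in the high byte of their registers:
    unsigned char
    add_high_bytes (unsigned short a, unsigned short b)
    {
      // with the new patterns the shift is replaced by a direct use of
      // the high byte registers (%ah/%bh)
      return (a >> 8) + (b >> 8);
    }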
PR target/78904
gcc/ChangeLog:
* config/i386/i386.md (*addqi_ext2<mode>_0):
New define_insn_and_split pattern.
(*subqi_ext2<mode>_0): Ditto.
(*<code>qi_ext2<mode>_0): Ditto.
gcc/testsuite/ChangeLog:
* gcc.target/i386/pr78904-10.c: New test.
* gcc.target/i386/pr78904-10a.c: New test.
* gcc.target/i386/pr78904-10b.c: New test.
hppa: Revise REG+D address support to allow long displacements before reload
In analyzing PR rtl-optimization/112415, I realized that restricting
REG+D offsets to 5 bits before reload results in very poor code and
complexities in optimizing these instructions after reload.  The
general problem is that long displacements are not allowed for
floating-point accesses when generating PA 1.1 code.  Even with PA 2.0,
there is an ELF linker bug that prevents using long displacements for
floating-point loads and stores.
In the past, enabling long displacements before reload caused issues
in reload. However, there have been fixes in the handling of reloads
for floating-point accesses. This change allows long displacements
before reload and corrects a couple of issues in the constraint
handling for integer and floating-point accesses.
2023-11-16 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
PR rtl-optimization/112415
* config/pa/pa.cc (pa_legitimate_address_p): Allow 14-bit
displacements before reload. Simplify logic flow. Revise
comments.
* config/pa/pa.h (TARGET_ELF64): New define.
(INT14_OK_STRICT): Update define and comment.
* config/pa/pa64-linux.h (TARGET_ELF64): Define.
* config/pa/predicates.md (base14_operand): Don't check
alignment of short displacements.
(integer_store_memory_operand): Don't return true when
reload_in_progress is true. Remove INT_5_BITS check.
(floating_point_store_memory_operand): Don't return true when
reload_in_progress is true. Use INT14_OK_STRICT to check
whether long displacements are always okay.
Eric Botcazou [Thu, 16 Nov 2023 17:36:44 +0000 (18:36 +0100)]
Fix internal error on function returning dynamically-sized type
This is a tree sharing issue for the internal return type synthesized for
a function returning a dynamically-sized type and taking an Out or In/Out
parameter passed by copy.
gcc/ada/
* gcc-interface/decl.cc (gnat_to_gnu_subprog_type): Also create a
TYPE_DECL for the return type built for the CI/CO mechanism.
gcc/testsuite/
* gnat.dg/varsize4.ads, gnat.dg/varsize4.adb: New test.
* gnat.dg/varsize4_pkg.ads: New helper.
Jonathan Wakely [Thu, 16 Nov 2023 16:11:18 +0000 (16:11 +0000)]
libstdc++: Fix aligned formatting of stacktrace_entry and thread::id [PR112564]
The formatter for std::thread::id should default to right-align, and the
formatter for std::stacktrace_entry should not just ignore the
fill-and-align and width from the format-spec!
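A short sketch of the affected behaviour (illustrative; requires the
C++23 formatters):
    #include <format>
    #include <thread>

    auto s1 = std::format ("[{:>12}]", std::this_thread::get_id ()); // explicit right alignment
    auto s2 = std::format ("[{:12}]",  std::this_thread::get_id ()); // now right-aligned by default
A std::stacktrace_entry formatted with a fill-and-align and width in the
format-spec likewise honours them now.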
libstdc++-v3/ChangeLog:
PR libstdc++/112564
* include/std/stacktrace (formatter::format): Format according
to format-spec.
* include/std/thread (formatter::format): Use _Align_right as
default.
* testsuite/19_diagnostics/stacktrace/output.cc: Check
fill-and-align handling. Change compile test to run.
* testsuite/30_threads/thread/id/output.cc: Check fill-and-align
handling.
Jakub Jelinek [Thu, 16 Nov 2023 16:42:22 +0000 (17:42 +0100)]
c++: Fix error recovery ICE [PR112365]
check_field_decls for DECL_C_BIT_FIELD FIELD_DECLs with error_mark_node
TREE_TYPE continues early and doesn't call check_bitfield_decl, which
would either set DECL_BIT_FIELD or clear DECL_C_BIT_FIELD.  So the
following testcase ICEs after emitting tons of errors, because
SET_DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD asserts DECL_BIT_FIELD.
The patch skips that for FIELD_DECLs with error_mark_node type; another
option would be to check DECL_BIT_FIELD in addition to DECL_C_BIT_FIELD.
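A hypothetical reduced sketch of the kind of invalid input involved
(intentionally erroneous; not the PR testcase):
    struct S
    {
      undeclared_type a : 0;  // error: unknown type; the FIELD_DECL keeps
                              // DECL_C_BIT_FIELD but its type is error_mark_node
    };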
2023-11-16 Jakub Jelinek <jakub@redhat.com>
PR c++/112365
* class.cc (layout_class_type): Don't
SET_DECL_FIELD_CXX_ZERO_WIDTH_BIT_FIELD on FIELD_DECLs with
error_mark_node type.
Patrick Palka [Thu, 16 Nov 2023 14:32:07 +0000 (09:32 -0500)]
c++: constantness of call to function pointer [PR111703]
potential_constant_expression for CALL_EXPR tests FUNCTION_POINTER_TYPE_P
on the callee rather than on the type of the callee, which means we
always pass want_rval=any when recursing and so may fail to identify a
non-constant function pointer callee as such. Fixing this turns out to
further work around PR111703.
David Malcolm [Thu, 16 Nov 2023 13:29:19 +0000 (08:29 -0500)]
diagnostics: make m_lang_mask private
No functional change intended.
gcc/ChangeLog:
* diagnostic.cc (diagnostic_context::set_option_hooks): Add
"lang_mask" param.
* diagnostic.h (diagnostic_context::option_enabled_p): Update for
move of m_lang_mask.
(diagnostic_context::set_option_hooks): Add "lang_mask" param.
(diagnostic_context::get_lang_mask): New.
(diagnostic_context::m_lang_mask): Move into m_option_callbacks,
thus making private.
* lto-wrapper.cc (main): Update for new lang_mask param of
set_option_hooks.
* toplev.cc (init_asm_output): Use get_lang_mask.
(general_init): Move initialization of global_dc's lang_mask to
new lang_mask param of set_option_hooks.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Tamar Christina [Thu, 16 Nov 2023 12:11:22 +0000 (12:11 +0000)]
middle-end: skip checking loop exits if loop malformed [PR111878]
Before my refactoring, if the loop->latch was incorrect then
find_loop_location skipped checking the edges and would eventually
return a dummy location.
It turns out that a loop can satisfy
loops_state_satisfies_p (LOOPS_HAVE_RECORDED_EXITS) but still not have a
latch, in which case get_loop_exit_edges traps.