git.ipfire.org Git - thirdparty/gcc.git/log

]> git.ipfire.org Git - thirdparty/gcc.git/log

projects / thirdparty / gcc.git / log

commit | commitdiff | tree

Richard Biener [Fri, 11 Jul 2025 06:33:59 +0000 (06:33 +0000)]

Update ChangeLog and version files for release

commit | commitdiff | tree

GCC Administrator [Fri, 11 Jul 2025 00:21:31 +0000 (00:21 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Thu, 10 Jul 2025 00:22:49 +0000 (00:22 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Wed, 9 Jul 2025 00:23:21 +0000 (00:23 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Tue, 8 Jul 2025 00:22:14 +0000 (00:22 +0000)]

Daily bump.

commit | commitdiff | tree

Iain Sandoe [Sun, 19 Jun 2022 19:47:43 +0000 (20:47 +0100)]

testsuite, Darwin: Remove an unnecessary flags addition.

The addition of the multiply_defined suppress flag has been handled for some
considerable time now in the Darwin specs; remove it from the testsuite libs.
Avoid duplicates in the specs.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/ChangeLog:

* config/darwin.h: Avoid duplicate multiply_defined specs on
earlier Darwin versions with shared libgcc.

libstdc++-v3/ChangeLog:

* testsuite/lib/libstdc++.exp: Remove additional flag handled
by Darwin specs.

gcc/testsuite/ChangeLog:

* lib/g++.exp: Remove additional flag handled by Darwin specs.
* lib/obj-c++.exp: Likewise.

(cherry picked from commit 3c776fdf1a825818ad7248d442e846f532574ff7)

commit | commitdiff | tree

Jakub Jelinek [Mon, 7 Jul 2025 07:51:51 +0000 (09:51 +0200)]

Revert "c++: Fix up cp_build_array_ref COND_EXPR handling [PR120471]"

This reverts commit b5f0faa4eb71650a9dde3938c3a98eda710534de.

commit | commitdiff | tree

Jakub Jelinek [Mon, 7 Jul 2025 07:51:38 +0000 (09:51 +0200)]

Revert "c++: Fix a pasto in the PR120471 fix [PR120940]"

This reverts commit ed950a9ed384389ff07ac793b7065abe31bcae3f.

commit | commitdiff | tree

GCC Administrator [Mon, 7 Jul 2025 00:21:34 +0000 (00:21 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sun, 6 Jul 2025 00:21:58 +0000 (00:21 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sat, 5 Jul 2025 00:21:49 +0000 (00:21 +0000)]

Daily bump.

commit | commitdiff | tree

Iain Sandoe [Tue, 15 Apr 2025 13:02:21 +0000 (14:02 +0100)]

configure, Darwin: Recognise new naming for Xcode ld.

The latest editions of XCode have altered the identify reported by 'ld -v'
(again). This means that GCC configure no longer detects the version.

Fixed by adding the new name to the set checked.

gcc/ChangeLog:

* configure: Regenerate.
* configure.ac: Recognise PROJECT:ld-mmmm.nn.aa as an identifier
for Darwin's static linker.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit 7f56a8e8ad1c33d358e9e09fcbaf263c2caba1b9)

commit | commitdiff | tree

Iain Sandoe [Sun, 9 Mar 2025 09:24:34 +0000 (09:24 +0000)]

Darwin: Pass -macos_version_min to the linker [PR119172].

For binaries to be notarised, the SDK version must be available.
Since we do not, at present, parse this information we have been
passing "0.0" to ld64. This now results in a warning and a fail
to notarise. As a quick-fix, we can fall back to letting ld64
figure out the SDK version (which it does for -macos_version_min).

TODO: Parse the SDKSetting.plist at some point.

cherry-picked from 952e17223d3a9 and fc728cfd569e291a5

PR target/119172

gcc/ChangeLog:

* config.in: Regenerate.
* config/darwin.h (DARWIN_PLATFORM_ID): Add the option to
use -macos_version_min where available.
* configure: Regenerate.
* configure.ac: Check for ld64 support of -macos_version_min.

Co-authored-by: Andrew Pinski <quic_apinski@quicinc.com>
Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

commit | commitdiff | tree

Francois-Xavier Coudert [Thu, 27 Jun 2024 16:55:22 +0000 (18:55 +0200)]

fixincludes: adjust stdio fix for macOS 15 headers

fixincludes/ChangeLog:

* fixincl.x: Regenerate.
* inclhack.def (apple_local_stdio_fn_deprecation): Also apply to
_stdio.h.

(cherry picked from commit 1dc143181550573c9c902fb7a3b495e9b409d0b0)

commit | commitdiff | tree

Mark Mentovai [Tue, 24 Sep 2024 20:11:14 +0000 (16:11 -0400)]

libgcc, Darwin: Drop the legacy library build for macOS >= 10.12 [PR116809].

From macOSX15 SDK, the unwinder no longer exports some of the symbols used
in that library which (a) causes bootstrap fail and (b) means that the
legacy library is no longer useful.

No open branch of GCC emits references to this library - and any already
-built code that depends on the symbols would need rework anyway.

We have been asked to extend this back to the earliest OS vesion supported
by the SDK (10.12).

PR target/116809

libgcc/ChangeLog:

* config.host: Build legacy libgcc_s.1 on hosts before macOS 10.12.
* config/i386/t-darwin: Remove reference to legacy libgcc_s.1
* config/rs6000/t-darwin: Likewise.
* config/t-darwin-libgccs1: New file.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
(cherry picked from commit d9cafa0c4f0a81304d9b95a78ccc8e9003c6d7a3)

commit | commitdiff | tree

GCC Administrator [Fri, 4 Jul 2025 00:21:01 +0000 (00:21 +0000)]

Daily bump.

commit | commitdiff | tree

Jakub Jelinek [Thu, 3 Jul 2025 20:39:39 +0000 (22:39 +0200)]

c++: Fix a pasto in the PR120471 fix [PR120940]

No idea how this slipped in, I'm terribly sorry.
Strangely nothing in the testsuite has caught this, so I've added
a new test for that.

2025-07-03 Jakub Jelinek <jakub@redhat.com>

PR c++/120940
* typeck.cc (cp_build_array_ref): Fix a pasto.

* g++.dg/parse/pr120940.C: New test.
* g++.dg/warn/Wduplicated-branches9.C: New test.

(cherry picked from commit dc90649466a54ab61926d88500a05f59a55cb055)

commit | commitdiff | tree

Jakub Jelinek [Tue, 1 Jul 2025 13:28:10 +0000 (15:28 +0200)]

c++: Fix up cp_build_array_ref COND_EXPR handling [PR120471]

The following testcase is miscompiled since the introduction of UBSan,
cp_build_array_ref COND_EXPR handling replaces
(cond ? a : b)[idx] with cond ? a[idx] : b[idx], but if there are
SAVE_EXPRs inside of idx, they will be evaluated just in one of the
branches and the other uses uninitialized temporaries.

Fixed by keeping doing what it did if idx doesn't have side effects
and is invariant.  Otherwise if op1/op2 are ARRAY_TYPE arrays with
invariant addresses or pointers with invariant values, use
SAVE_EXPR <op0>, SAVE_EXPR <idx>, SAVE_EXPR <op0> as a new condition
and SAVE_EXPR <idx> instead of idx for the recursive calls.
Otherwise punt, but if op1/op2 are ARRAY_TYPE, furthermore call
cp_default_conversion on array, so that COND_EXPR with ARRAY_TYPE doesn't
survive in the IL until expansion.

2025-07-01  Jakub Jelinek  <jakub@redhat.com>

PR c++/120471
gcc/cp/
* typeck.cc (cp_build_array_ref) <case COND_EXPR>: If idx is not
INTEGER_CST, don't optimize the case (but cp_default_conversion on
array early if it has ARRAY_TYPE) or use
SAVE_EXPR <op0>, SAVE_EXPR <idx>, SAVE_EXPR <op0> as new op0 depending
on flag_strong_eval_order and whether op1 and op2 are arrays with
invariant address or tree invariant pointers.  Formatting fixes.
gcc/testsuite/
* g++.dg/ubsan/pr120471.C: New test.
* g++.dg/parse/pr120471.C: New test.

(cherry picked from commit 988e87b66882875b14a6cab11c17516863c74a63)

commit | commitdiff | tree

GCC Administrator [Thu, 3 Jul 2025 00:19:04 +0000 (00:19 +0000)]

Daily bump.

commit | commitdiff | tree

Jan Hubicka [Mon, 29 Jul 2024 08:48:34 +0000 (10:48 +0200)]

Fix ICE with -fdump-tree-moref

gcc/ChangeLog:

PR ipa/116055
* ipa-modref.cc (analyze_function): Do not ICE when flags regress.

(cherry picked from commit 98baaa17561ca299eefc98f469f4326e551604c9)

commit | commitdiff | tree

Jan Hubicka [Fri, 24 Jun 2022 11:52:44 +0000 (13:52 +0200)]

Fix stmt_kills_ref_p WRT external throws

Add missing check to stmt_kills_ref_p for case that function
is terminated by EH before call return value kills the ref. In the PR
I tried to construct testcase but I don't know how to do that until I
annotate EH code with fnspec attributes which I will do in separate patch
and add a testcase.

PR ipa/106057
* tree-ssa-alias.cc (stmt_kills_ref_p): Check for external throw.

(cherry picked from commit 7fd34782b95bbe1b4dc9936b8923f86d4aaee379)

commit | commitdiff | tree

Richard Biener [Fri, 28 Feb 2025 08:58:36 +0000 (09:58 +0100)]

middle-end/66279 - gimplification clobbers shared asm constraints

When the C++ frontend clones a CTOR we do not copy ASM_EXPR constraints
fully as walk_tree does not recurse to TREE_PURPOSE of TREE_LIST nodes.
At this point doing that seems too dangerous so the following instead
avoids gimplification of ASM_EXPRs to clobber the shared constraints
and unshares it there, like it also unshares TREE_VALUE when it
re-writes a "+" output constraint to separate "=" output and matching
input constraint.

PR middle-end/66279
* gimplify.cc (gimplify_asm_expr): Copy TREE_PURPOSE before
rewriting it for "+" processing.

* g++.dg/pr66279.C: New testcase.

(cherry picked from commit 95f5d6cc17e7d6b689674756c62b6b5e1284afd0)

commit | commitdiff | tree

GCC Administrator [Wed, 2 Jul 2025 00:20:17 +0000 (00:20 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Tue, 1 Jul 2025 00:20:33 +0000 (00:20 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Mon, 30 Jun 2025 00:21:33 +0000 (00:21 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sun, 29 Jun 2025 00:21:26 +0000 (00:21 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sat, 28 Jun 2025 00:23:42 +0000 (00:23 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Fri, 27 Jun 2025 00:23:08 +0000 (00:23 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Thu, 26 Jun 2025 00:22:20 +0000 (00:22 +0000)]

Daily bump.

commit | commitdiff | tree

Haochen Jiang [Wed, 25 Jun 2025 03:04:11 +0000 (11:04 +0800)]

i386: Remove CLDEMOTE for clients

CLDEMOTE is not enabled on clients according to SDM. SDM only mentioned
it will be enabled on Xeon and Atom servers, not clients. Remove them
since Alder Lake (where it is introduced).

gcc/ChangeLog:

* config/i386/i386.h (PTA_ALDERLAKE): Use PTA_GOLDMONT_PLUS
as base to remove PTA_CLDEMOTE.
* doc/invoke.texi: Update texi file.

commit | commitdiff | tree

GCC Administrator [Wed, 25 Jun 2025 00:22:09 +0000 (00:22 +0000)]

Daily bump.

commit | commitdiff | tree

Richard Biener [Thu, 10 Apr 2025 11:30:42 +0000 (13:30 +0200)]

middle-end/119706 - allow POLY_INT_CST as is_gimple_mem_ref_addr

We currently only INTEGER_CST, but not POLY_INT_CST, which leads
to the situation that when the POLY_INT_CST is only indrectly
present via a SSA def the IL is valid but when propagated it's not.
That's unsustainable.

PR middle-end/119706
* gimple-expr.cc (is_gimple_mem_ref_addr): Also allow
POLY_INT_CST.

(cherry picked from commit bf812c6ad83ec0b241bb3fecc7e68f883b6083df)

commit | commitdiff | tree

Richard Biener [Tue, 1 Apr 2025 12:13:03 +0000 (14:13 +0200)]

tree-optimization/119534 - reject bogus emulated vectorized gather

The following makes sure to reject the attempts to emulate a vector
gather when the discovered index vector type is a vector mask.

PR tree-optimization/119534
* tree-vect-stmts.cc (get_load_store_type): Reject
VECTOR_BOOLEAN_TYPE_P offset vector type for emulated gathers.

* gcc.dg/vect/pr119534.c: New testcase.

(cherry picked from commit d0cc14c62ad7403afcab3c2e38851d3ab179352f)

commit | commitdiff | tree

Richard Biener [Thu, 6 Mar 2025 08:08:07 +0000 (09:08 +0100)]

middle-end/119119 - re-gimplification of empty CTOR assignments

The following testcase runs into a re-gimplification issue during
inlining when processing

  MEM[(struct e *)this_2(D)].a = {};

where re-gimplification does not handle assignments in the same
way than the gimplifier but instead relies on rhs_predicate_for
and gimplifying the RHS standalone.  This fails to handle
special-casing of CTORs.  The is_gimple_mem_rhs_or_call predicate
already handles clobbers but not empty CTORs so we end up in
the fallback code trying to force the CTOR into a separate stmt
using a temporary - but as we have a non-copyable type here that ICEs.

The following generalizes empty CTORs in is_gimple_mem_rhs_or_call
since those need no additional re-gimplification.

PR middle-end/119119
* gimplify.cc (is_gimple_mem_rhs_or_call): All empty CTORs
are OK when not a register type.

* g++.dg/torture/pr11911.C: New testcase.

(cherry picked from commit 3bd61c1dfaa2d7153eb4be82f423533ea937d0f9)

commit | commitdiff | tree

Richard Biener [Mon, 3 Mar 2025 12:21:53 +0000 (13:21 +0100)]

tree-optimization/119057 - bogus double reduction detection

We are detecting a cycle as double reduction where the inner loop
cycle has extra out-of-loop uses.  This clashes at least with
assumptions from the SLP discovery code which says the cycle
isn't reachable from another SLP instance.  It also was not intended
to support this case, in fact with GCC 14 we seem to generate wrong
code here.

PR tree-optimization/119057
* tree-vect-loop.cc (check_reduction_path): Add argument
specifying whether we're analyzing the inner loop of a
double reduction.  Do not allow extra uses outside of the
double reduction cycle in this case.
(vect_is_simple_reduction): Adjust.

* gcc.dg/vect/pr119057.c: New testcase.

(cherry picked from commit 758de6263dfc7ba8701965fa468691ac23cb7eb5)

commit | commitdiff | tree

Richard Biener [Tue, 28 Jan 2025 11:28:14 +0000 (12:28 +0100)]

tree-optimization/117424 - invalid LIM of trapping ref

The following addresses a bug in tree_could_trap_p leading to
hoisting of a possibly trapping, because of out-of-bound, access.
We only ensured the first accessed byte is within a decl there,
the patch makes sure the whole base of the reference is within it.
This is pessimistic if a handled component would then subset to
a sub-object within the decl but upcasting of a decl to larger
types should be uncommon, questionable, and wrong without
-fno-strict-aliasing.

The testcase is a bit fragile, but I could not devise a (portable)
way to ensure an out-of-bound access to a decl would fault.

PR tree-optimization/117424
* tree-eh.cc (tree_could_trap_p): Verify the base is
fully contained within a decl.

* gcc.dg/tree-ssa/ssa-lim-25.c: New testcase.

(cherry picked from commit f1e776ce58ae4a6ae67886adb4ae806598e2c7ef)

commit | commitdiff | tree

Richard Biener [Mon, 3 Feb 2025 14:12:52 +0000 (15:12 +0100)]

tree-optimization/117113 - ICE with unroll-and-jam

When there's an inner loop without virtual header PHI but the outer
loop has one the fusion process cannot handle the need to create
an inner loop virtual header PHI. Punt in this case.

PR tree-optimization/117113
* gimple-loop-jam.cc (unroll_jam_possible_p): Detect when
we cannot handle virtual SSA update.

* gcc.dg/torture/pr117113.c: New testcase.

(cherry picked from commit 0675eb17480bada678bf2769d39732027abcd6d0)

commit | commitdiff | tree

Richard Biener [Thu, 6 Mar 2025 12:48:16 +0000 (13:48 +0100)]

lto/114501 - missed free-lang-data for CONSTRUCTOR index

The following makes sure to also walk CONSTRUCTOR element indexes
which can be FIELD_DECLs, referencing otherwise unused types we
need to clean. walk_tree only walks CONSTRUCTOR element data.

PR lto/114501
* ipa-free-lang-data.cc (find_decls_types_r): Explicitly
handle CONSTRUCTORs as walk_tree handling of those is
incomplete.

* g++.dg/pr114501_0.C: New testcase.

(cherry picked from commit fdd95e1cf29137a19baed25f8c817d320dfe63e3)

commit | commitdiff | tree

Richard Biener [Mon, 3 Feb 2025 13:27:01 +0000 (14:27 +0100)]

lto/113207 - fix free_lang_data_in_type

When we process function types we strip volatile and const qualifiers
after building a simplified type variant (which preserves those).
The qualified type handling of both isn't really compatible, so avoid
bad interaction by swapping this, first dropping const/volatile
qualifiers and then building the simplified type thereof.

PR lto/113207
* ipa-free-lang-data.cc (free_lang_data_in_type): First drop
const/volatile qualifiers from function argument types,
then build a simplified type.

* gcc.dg/pr113207.c: New testcase.

(cherry picked from commit a55e14b239181381204c615335929b3316d75370)

commit | commitdiff | tree

Richard Biener [Tue, 28 Jan 2025 14:01:25 +0000 (15:01 +0100)]

tree-optimization/112859 - add comment

This adds a comment before the workaround, indicating flaky
dependence analysis.

PR tree-optimization/112859
* tree-loop-distribution.cc
(loop_distribution::pg_add_dependence_edges): Add comment.

(cherry picked from commit 3ccbc8c9d182c380e396631b2b5a683de4fddba9)

commit | commitdiff | tree

Richard Biener [Thu, 23 Jan 2025 12:10:17 +0000 (13:10 +0100)]

tree-optimization/112859 - bogus loop distribution

When we get a zero distance vector we still have to check for the
situation of a common inner loop with zero distance. But we can
still allow a zero distance for the loop we distribute
(gcc.dg/tree-ssa/ldist-33.c is such a case). This is because
zero distances in non-outermost loops are a misrepresentation
of dependence by dependence analysis.

Note that test coverage of loop distribution of loop nests is
very low.

PR tree-optimization/112859
PR tree-optimization/115347
* tree-loop-distribution.cc
(loop_distribution::pg_add_dependence_edges): For a zero
distance vector still make sure to not have an inner
loop with zero distance.

* gcc.dg/torture/pr112859.c: New testcase.
* gcc.dg/torture/pr115347.c: Likewise.
* gcc.dg/tree-ssa/ldist-36.c: Adjust.

(cherry picked from commit 04ba1300407f106a6dd10d346f58a51d87e6d43e)

commit | commitdiff | tree

Richard Biener [Fri, 28 Feb 2025 10:44:26 +0000 (11:44 +0100)]

ipa/111245 - bogus modref analysis for store in call that might throw

We currently record a kill for

*x_4(D) = always_throws ();

because we consider the store always executing since the appropriate
check for whether the stmt could throw is guarded by
!cfun->can_throw_non_call_exceptions.

PR ipa/111245
* ipa-modref.cc (modref_access_analysis::analyze_store): Do
not guard the check of whether the stmt could throw by
cfun->can_throw_non_call_exceptions.

* g++.dg/torture/pr111245.C: New testcase.

(cherry picked from commit e6037af6d5e5a43c437257580d75bc8b35a6dcfd)

commit | commitdiff | tree

Sam James [Mon, 21 Oct 2024 11:11:42 +0000 (12:11 +0100)]

testsuite: add testcase for fixed PR107467

PR107467 ended up being fixed by the fix for PR115110, but let's
add the testcase on top.

gcc/testsuite/ChangeLog:
PR tree-optimization/107467
PR middle-end/115110

* g++.dg/lto/pr107467_0.C: New test.

(cherry picked from commit 4e09ae37dbe0a10f48490214f50ff733cc92280a)

commit | commitdiff | tree

Richard Biener [Fri, 17 May 2024 09:02:29 +0000 (11:02 +0200)]

middle-end/115110 - Fix view_converted_memref_p

view_converted_memref_p was checking the reference type against the
pointer type of the offset operand rather than its pointed-to type
which leads to all refs being subject to view-convert treatment
in get_alias_set causing numerous testsuite fails but with its
new uses from r15-512-g9b7cad5884f21c is also a wrong-code issue.

PR middle-end/115110
* tree-ssa-alias.cc (view_converted_memref_p): Fix.

(cherry picked from commit a5b3721c06646bf5b9b50a22964e8e2bd4d03f5f)

commit | commitdiff | tree

Richard Biener [Wed, 31 Jul 2024 08:07:45 +0000 (10:07 +0200)]

middle-end/101478 - ICE with degenerate address during gimplification

When we gimplify &MEM[0B + 4] we are re-folding the address in case
types are not canonical which ends up with a constant address that
recompute_tree_invariant_for_addr_expr ICEs on. Properly guard
that call.

PR middle-end/101478
* gimplify.cc (gimplify_addr_expr): Check we still have an
ADDR_EXPR before calling recompute_tree_invariant_for_addr_expr.

* gcc.dg/pr101478.c: New testcase.

(cherry picked from commit 33ead6400ad59d4b38fa0527a9a7b53a28114ab7)

commit | commitdiff | tree

Richard Biener [Fri, 28 Feb 2025 13:09:29 +0000 (14:09 +0100)]

lto/91299 - weak definition inlined with LTO

The following fixes a thinko in the handling of interposed weak
definitions which confused the interposition check in
get_availability by setting DECL_EXTERNAL too early.

PR lto/91299
gcc/lto/
* lto-symtab.cc (lto_symtab_merge_symbols): Set DECL_EXTERNAL
only after calling get_availability.

gcc/testsuite/
* gcc.dg/lto/pr91299_0.c: New testcase.
* gcc.dg/lto/pr91299_1.c: Likewise.

(cherry picked from commit bc34db5b12e008f6ec4fdf4ebd22263c8617e5e3)

commit | commitdiff | tree

Richard Biener [Fri, 28 Feb 2025 09:36:11 +0000 (10:36 +0100)]

tree-optimization/87984 - hard register assignments not preserved

The following disables redundant store elimination to hard register
variables which isn't valid.

PR tree-optimization/87984
* tree-ssa-dom.cc (dom_opt_dom_walker::optimize_stmt): Do
not perform redundant store elimination to hard register
variables.
* tree-ssa-sccvn.cc (eliminate_dom_walker::eliminate_stmt):
Likewise.

* gcc.target/i386/pr87984.c: New testcase.

(cherry picked from commit 535115caaf97f5201fb528f67f15b4c52be5619d)

commit | commitdiff | tree

Richard Biener [Mon, 3 Feb 2025 10:27:20 +0000 (11:27 +0100)]

c++/79786 - bougs invocation of DATA_ABI_ALIGNMENT macro

The first argument is supposed to be a type, not a decl.

PR c++/79786
gcc/cp/
* rtti.cc (emit_tinfo_decl): Fix DATA_ABI_ALIGNMENT invocation.

(cherry picked from commit 6ec19825b4e72611cdbd4749feed67b61392aa81)

commit | commitdiff | tree

Richard Biener [Thu, 24 Aug 2023 09:10:43 +0000 (11:10 +0200)]

tree-optimization/111125 - avoid BB vectorization in novector loops

When a loop is marked with

  #pragma GCC novector

the following makes sure to also skip BB vectorization for contained
blocks.  That avoids gcc.dg/vect/bb-slp-29.c failing on aarch64
because of extra BB vectorization therein.  I'm not specifically
dealing with sub-loops of novector loops, the desired semantics
isn't documented.

PR tree-optimization/111125
* tree-vect-slp.cc (vect_slp_function): Split at novector
loop entry, do not push blocks in novector loops.

(cherry picked from commit 43da77a4f1636280c4259402c9c2c543e6ec6c0b)

commit | commitdiff | tree

GCC Administrator [Tue, 24 Jun 2025 00:21:10 +0000 (00:21 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Mon, 23 Jun 2025 00:20:56 +0000 (00:20 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sun, 22 Jun 2025 00:21:06 +0000 (00:21 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sat, 21 Jun 2025 00:20:52 +0000 (00:20 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Fri, 20 Jun 2025 00:23:38 +0000 (00:23 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Thu, 19 Jun 2025 00:21:43 +0000 (00:21 +0000)]

Daily bump.

commit | commitdiff | tree

Patrick Palka [Thu, 9 Jan 2025 15:49:45 +0000 (10:49 -0500)]

c++: template-id dependence wrt local static arg [PR117792]

Here we end up ICEing at instantiation time for the call to
f<local_static> ultimately because we wrongly consider the call to be
non-dependent, and so we specialize f ahead of time and then get
confused when fully substituting this specialization.

The call is dependent due to [temp.dep.temp]/3 and we miss that because
function template-id arguments aren't coerced until overload resolution,
and so the local static template argument lacks an implicit cast to
reference type that value_dependent_expression_p looks for before
considering dependence of the address. Other kinds of template-ids aren't
affected since they're coerced ahead of time.

So when considering dependence of a function template-id, we need to
conservatively consider dependence of the address of each argument (if
applicable).

PR c++/117792

gcc/cp/ChangeLog:

* pt.cc (type_dependent_expression_p): Consider the dependence
of the address of each template argument of a function
template-id.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/nontype7.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 40f0f6ab75a391906bed40cbdc098b0df3a91af7)

commit | commitdiff | tree

Patrick Palka [Thu, 15 Aug 2024 14:23:54 +0000 (10:23 -0400)]

c++: c->B::m access resolved through current inst [PR116320]

Here when checking the access of (the injected-class-name) B in c->B::m
at parse time, we notice its context B (now the type) is a base of the
object type C<T>, so we proceed to use C<T> as the effective qualifying
type. But this C<T> is the dependent specialization not the primary
template type, so it has empty TYPE_BINFO, which leads to a segfault later
from perform_or_defer_access_check.

The reason the DERIVED_FROM_P (B, C<T>) test guarding this code path works
despite C<T> having empty TYPE_BINFO is because of its currently_open_class
logic (added in r9-713-gd9338471b91bbe) which replaces a dependent
specialization with the primary template type if we're inside it. So the
safest fix seems to be to call currently_open_class in the caller as well.

PR c++/116320

gcc/cp/ChangeLog:

* semantics.cc (check_accessibility_of_qualified_id): Try
currently_open_class when using the object type as the
effective qualifying type.

gcc/testsuite/ChangeLog:

* g++.dg/template/access42.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 484f139ccd3b631a777802e810a632678b42ffab)

commit | commitdiff | tree

Patrick Palka [Thu, 11 Apr 2024 14:16:41 +0000 (10:16 -0400)]

c++: build_extra_args recapturing local specs [PR114303]

r13-6452-g341e6cd8d603a3 made build_extra_args walk evaluated contexts
first so that we prefer processing a local specialization in an evaluated
context even if its first use is in an unevaluated context.  But this
means we need to avoid walking a tree that already has extra args/specs
saved because the list of saved specs appears to be an evaluated
context which we'll now walk first.  It seems then that we should be
calculating the saved specs from scratch each time, rather than
potentially walking the saved specs list from an earlier partial
instantiation when calling build_extra_args a second time around.

PR c++/114303

gcc/cp/ChangeLog:

* constraint.cc (tsubst_requires_expr): Clear
REQUIRES_EXPR_EXTRA_ARGS before calling build_extra_args.
* pt.cc (tree_extra_args): Define.
(extract_locals_r): Assert *_EXTRA_ARGS is empty.
(tsubst_stmt) <case IF_STMT>: Clear IF_SCOPE on the new
IF_STMT.  Call build_extra_args on the new IF_STMT instead
of t which might already have IF_STMT_EXTRA_ARGS.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/constexpr-if-lambda6.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit b262b17636e47ae969a74f16e86ccb00678d5e88)

commit | commitdiff | tree

GCC Administrator [Wed, 18 Jun 2025 00:23:04 +0000 (00:23 +0000)]

Daily bump.

commit | commitdiff | tree

Patrick Palka [Thu, 12 Sep 2024 16:45:03 +0000 (12:45 -0400)]

c++: decltype(auto) deduction of statement-expression [PR116418]

r8-7538 for PR84968 made strip_typedefs_expr diagnose STATEMENT_LIST
so that we reject statement-expressions in noexcept-specifiers to
match our behavior in template arguments (which the parser diagnoses
directly).

Later r11-7452 made decltype(auto) deduction canonicalize the expression
(as an implementation detail) which in turn calls strip_typedefs_expr,
and so ever since we inadvertently reject decltype(auto) deduction of a
statement-expression.

This patch just removes the diagnostic in strip_typedefs_expr and instead
treats statement-expressions similar to lambda-expressions. The function
doesn't seem like the right place for such a diagnostic and so it seems
easier to just accept rather than try to reject them in a suitable place.

PR c++/116418

gcc/cp/ChangeLog:

* tree.cc (strip_typedefs_expr) <case STATEMENT_LIST>: Replace
this error path with ...
<case STMT_EXPR>: ... this, returning the original tree.

gcc/testsuite/ChangeLog:

* g++.dg/eh/pr84968.C: No longer expect an ahead of time diagnostic
for the statement-expresssion. Instantiate the template and expect
an incomplete type error instead.
* g++.dg/ext/stmtexpr26.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 12bdcc3d7970860b9d66ed4dea203bde8fd68d4d)

commit | commitdiff | tree

Georg-Johann Lay [Thu, 12 Jun 2025 08:07:37 +0000 (10:07 +0200)]

Fix test case for PR117811 which failed for int < 32 bit.

PR middle-end/117811
PR testsuite/52641
gcc/testsuite/
* gcc.dg/torture/pr117811.c: Fix for int < 32 bit.

(cherry picked from commit 07f229c2d7ee6b604e5a86092e675d5d36c1ba4e)

commit | commitdiff | tree

Richard Earnshaw [Thu, 20 Mar 2025 14:42:59 +0000 (14:42 +0000)]

opcodes: fix wrong code in expand_binop_directly [PR117811]

If expand_binop_directly fails to add a REG_EQUAL note it tries to
unwind and restart.  But it can unwind too far if expand_binop changed
some of the operands before calling it.  We don't need to unwind that
far anyway since we should end up taking exactly the same route next
time, just without a target rtx.

To fix this we remove LAST from the argument list and let the callers
(all in expand_binop) do their own unwinding if the call fails.
Instead we unwind just as far as the entry to expand_binop_directly
and recurse within this function instead of all the way back up.

gcc/ChangeLog:

PR middle-end/117811
* optabs.cc (expand_binop_directly): Remove LAST as an argument,
instead record the last insn on entry.  Only delete insns if
we need to restart and restart by calling ourself, not expand_binop.
(expand_binop): Update callers to expand_binop_directly.  If it
fails to expand the operation, delete back to LAST.

gcc/testsuite:

PR middle-end/117811
* gcc.dg/torture/pr117811.c: New test.

(cherry picked from commit 7679b826840c58343d72d05922355b646db4bdcc)

commit | commitdiff | tree

GCC Administrator [Tue, 17 Jun 2025 00:20:42 +0000 (00:20 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Mon, 16 Jun 2025 00:20:59 +0000 (00:20 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sun, 15 Jun 2025 00:20:50 +0000 (00:20 +0000)]

Daily bump.

commit | commitdiff | tree

GCC Administrator [Sat, 14 Jun 2025 00:21:59 +0000 (00:21 +0000)]

Daily bump.

commit | commitdiff | tree

Jakub Jelinek [Fri, 30 May 2025 12:35:12 +0000 (14:35 +0200)]

testsuite: Add testcase for GCC 13 branch s390 bug [PR120480]

This got broken with r13-9727 and fixed with either of
r13-9729 or r13-9728.

2025-05-30 Jakub Jelinek <jakub@redhat.com>

PR target/120480
* gcc.dg/pr120480.c: New test.

(cherry picked from commit c13d5b939fee565047394475952878dc5394fb74)

commit | commitdiff | tree

Jakub Jelinek [Thu, 17 Apr 2025 08:57:18 +0000 (10:57 +0200)]

s390: Use match_scratch instead of scratch in define_split [PR119834]

The following testcase ICEs since r15-1579 (addition of late combiner),
because *clrmem_short can't be split.
The problem is that the define_insn uses
   (use (match_operand 1 "nonmemory_operand" "n,a,a,a"))
   (use (match_operand 2 "immediate_operand" "X,R,X,X"))
   (clobber (match_scratch:P 3 "=X,X,X,&a"))
and define_split assumed that if operands[1] is const_int_operand,
match_scratch will be always scratch, and it will be reg only if
it was the last alternative where operands[1] is a reg.
The pattern doesn't guarantee it though, of course RA will not try to
uselessly assign a reg there if it is not needed, but during RA
on the testcase below we match the last alternative, but then comes
late combiner and propagates const_int 3 into operands[1].  And that
matches fine, match_scratch matches either scratch or reg and the constraint
in that case is X for the first variant, so still just fine.  But we won't
split that because the splitters only expect scratch.

The following patch fixes it by using match_scratch instead of scratch,
so that it accepts either.

2025-04-17  Jakub Jelinek  <jakub@redhat.com>

PR target/119834
* config/s390/s390.md (define_split after *cpymem_short): Use
(clobber (match_scratch N)) instead of (clobber (scratch)).  Use
(match_dup 4) and operands[4] instead of (match_dup 3) and operands[3]
in the last of those.
(define_split after *clrmem_short): Use (clobber (match_scratch N))
instead of (clobber (scratch)).
(define_split after *cmpmem_short): Likewise.

* g++.target/s390/pr119834.C: New test.

(cherry picked from commit 22fe83d6fc9f59311241c981bcad58b61e2056d4)

commit | commitdiff | tree

Jakub Jelinek [Wed, 28 Feb 2024 22:20:13 +0000 (23:20 +0100)]

c++: Fix explicit instantiation of const variable templates after earlier implicit instantation [PR113976]

Already previously instantiated const variable templates had
cp_apply_type_quals_to_decl called when they were instantiated,
but if they need runtime initialization, their TREE_READONLY flag
has been subsequently cleared.
Explicit variable template instantiation calls grokdeclarator which
calls cp_apply_type_quals_to_decl on them again, setting TREE_READONLY
flag again, but nothing clears it afterwards, so we emit such
instantiations into rodata sections and segfault when the dynamic
initialization attempts to initialize them.

The following patch fixes that by not calling cp_apply_type_quals_to_decl
on already instantiated variable declarations.

2024-02-28 Jakub Jelinek <jakub@redhat.com>
Patrick Palka <ppalka@redhat.com>

PR c++/113976
* decl.cc (grokdeclarator): Don't call cp_apply_type_quals_to_decl
on DECL_TEMPLATE_INSTANTIATED VAR_DECLs.

* g++.dg/cpp1y/var-templ87.C: New test.

(cherry picked from commit 29ac92436aa5c702e9e02c206e7590ebd806398e)

commit | commitdiff | tree

Jakub Jelinek [Mon, 10 Feb 2025 09:40:22 +0000 (10:40 +0100)]

i386: Change RTL representation of bt[lq] [PR118623]

The following testcase is miscompiled because of RTL represententation
of bt{l,q} insn followed by e.g. j{c,nc} being misleading to what it
actually does.
Let's look e.g. at
(define_insn_and_split "*jcc_bt<mode>"
  [(set (pc)
        (if_then_else (match_operator 0 "bt_comparison_operator"
                        [(zero_extract:SWI48
                           (match_operand:SWI48 1 "nonimmediate_operand")
                           (const_int 1)
                           (match_operand:QI 2 "nonmemory_operand"))
                         (const_int 0)])
                      (label_ref (match_operand 3))
                      (pc)))
   (clobber (reg:CC FLAGS_REG))]
  "(TARGET_USE_BT || optimize_function_for_size_p (cfun))
   && (CONST_INT_P (operands[2])
       ? (INTVAL (operands[2]) < GET_MODE_BITSIZE (<MODE>mode)
          && INTVAL (operands[2])
               >= (optimize_function_for_size_p (cfun) ? 8 : 32))
       : !memory_operand (operands[1], <MODE>mode))
   && ix86_pre_reload_split ()"
  "#"
  "&& 1"
  [(set (reg:CCC FLAGS_REG)
        (compare:CCC
          (zero_extract:SWI48
            (match_dup 1)
            (const_int 1)
            (match_dup 2))
          (const_int 0)))
   (set (pc)
        (if_then_else (match_op_dup 0 [(reg:CCC FLAGS_REG) (const_int 0)])
                      (label_ref (match_dup 3))
                      (pc)))]
{
  operands[0] = shallow_copy_rtx (operands[0]);
  PUT_CODE (operands[0], reverse_condition (GET_CODE (operands[0])));
})
The define_insn part in RTL describes exactly what it does,
jumps to op3 if bit op2 in op1 is set (for op0 NE) or not set (for op0 EQ).
The problem is with what it splits into.
put_condition_code %C1 for CCCmode comparisons emits c for EQ and LTU,
nc for NE and GEU and ICEs otherwise.
CCCmode is used mainly for carry out of add/adc, borrow out of sub/sbb,
in those cases e.g. for add we have
(set (reg:CCC flags) (compare:CCC (plus:M x y) x))
and use (ltu (reg:CCC flags) (const_int 0)) for carry set and
(geu (reg:CCC flags) (const_int 0)) for carry not set.  These cases
model in RTL what is actually happening, compare in infinite precision
x from the result of finite precision addition in M mode and if it is
less than unsigned (i.e. overflow happened), carry is set.
Another use of CCCmode is in UNSPEC_* patterns, those are used with
(eq (reg:CCC flags) (const_int 0)) for carry set and ne for unset,
given the UNSPEC no big deal, the middle-end doesn't know what means
set or unset.
But for the bt{l,q}; j{c,nc} case the above splits it into
(set (reg:CCC flags) (compare:CCC (zero_extract) (const_int 0)))
for bt and
(set (pc) (if_then_else (eq (reg:CCC flags) (const_int 0)) (label_ref) (pc)))
for the bit set case (so that the jump expands to jc) and ne for
the bit not set case (so that the jump expands to jnc).
Similarly for the different splitters for cmov and set{c,nc} etc.
The problem is that when the middle-end reads this RTL, it feels
the exact opposite to it.  If zero_extract is 1, flags is set
to comparison of 1 and 0 and that would mean using ne ne in the
if_then_else, and vice versa.

So, in order to better describe in RTL what is actually happening,
one possibility would be to swap the behavior of put_condition_code
and use NE + LTU -> c and EQ + GEU -> nc rather than the current
EQ + LTU -> c and NE + GEU -> nc; and adjust everything.  The
following patch uses a more limited approach, instead of representing
bt{l,q}; j{c,nc} case as written above it uses
(set (reg:CCC flags) (compare:CCC (const_int 0) (zero_extract)))
and
(set (pc) (if_then_else (ltu (reg:CCC flags) (const_int 0)) (label_ref) (pc)))
which uses the existing put_condition_code but describes what the
insns actually do in RTL clearly.  If zero_extract is 1,
then flags are LTU, 0U < 1U, if zero_extract is 0, then flags are GEU,
0U >= 0U.  The patch adjusts the *bt<mode> define_insn and all the
splitters to it and its comparisons/conditional moves/setXX.

2025-02-10  Jakub Jelinek  <jakub@redhat.com>

PR target/118623
* config/i386/i386.md (*bt<mode>): Represent bt as
compare:CCC of const0_rtx and zero_extract rather than
zero_extract and const0_rtx.
(*jcc_bt<mode>): Likewise.  Use LTU and GEU as flags test
instead of EQ and NE.
(*jcc_bt<mode>_1): Likewise.
(*jcc_bt<mode>_mask): Likewise.
(Help combine recognize bt followed by cmov splitter): Likewise.
(*bt<mode>_setcqi): Likewise.
(*bt<mode>_setncqi): Likewise.
(*bt<mode>_setnc<mode>): Likewise.

* gcc.c-torture/execute/pr118623.c: New test.

(cherry picked from commit 92142019b6cd0cf1fe483203cf3ec451a9848a42)

commit | commitdiff | tree

Jakub Jelinek [Thu, 5 Dec 2024 12:01:21 +0000 (13:01 +0100)]

doloop: Fix up doloop df use [PR116799]

The following testcases are miscompiled on s390x-linux, because the
doloop_optimize
  /* Ensure that the new sequence doesn't clobber a register that
     is live at the end of the block.  */
  {
    bitmap modified = BITMAP_ALLOC (NULL);

    for (rtx_insn *i = doloop_seq; i != NULL; i = NEXT_INSN (i))
      note_stores (i, record_reg_sets, modified);

    basic_block loop_end = desc->out_edge->src;
    bool fail = bitmap_intersect_p (df_get_live_out (loop_end), modified);
check doesn't work as intended.
The problem is that it uses df, but the df analysis was only done using
  iv_analysis_loop_init (loop);
->
  df_analyze_loop (loop);
which computes df inside on the bbs of the loop.
While loop_end bb is inside of the loop, df_get_live_out computed that
way includes registers set in the loop and used at the start of the next
iteration, but doesn't include registers set in the loop (or before the
loop) and used after the loop.

The following patch fixes that by doing whole function df_analyze first,
changes the loop iteration mode from 0 to LI_ONLY_INNERMOST (on many
targets which use can_use_doloop_if_innermost target hook a so are known
to only handle innermost loops) or LI_FROM_INNERMOST (I think only bfin
actually allows non-innermost loops) and checking not just
df_get_live_out (loop_end) (that is needed for something used by the
next iteration), but also df_get_live_in (desc->out_edge->dest),
i.e. what will be used after the loop.  df of such a bb shouldn't
be affected by the df_analyze_loop and so should be from df_analyze
of the whole function.

2024-12-05  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/113994
PR rtl-optimization/116799
* loop-doloop.cc: Include targhooks.h.
(doloop_optimize): Also punt on intersection of modified
with df_get_live_in (desc->out_edge->dest).
(doloop_optimize_loops): Call df_analyze.  Use
LI_ONLY_INNERMOST or LI_FROM_INNERMOST instead of 0 as
second loops_list argument.

* gcc.c-torture/execute/pr116799.c: New test.
* g++.dg/torture/pr113994.C: New test.

(cherry picked from commit 0eed81612ad6eac2bec60286348a103d4dc02a5a)

commit | commitdiff | tree

Jakub Jelinek [Thu, 12 Sep 2024 16:22:21 +0000 (18:22 +0200)]

c++: Disable deprecated/unavailable diagnostics when creating thunks for methods with such attributes [PR116636]

On the following testcase, we emit false positive warnings/errors about using
the deprecated or unavailable methods when creating thunks for them, even
when nothing (in the testcase so far) actually used those.

The following patch temporarily disables that diagnostics when creating
the thunks.

2024-09-12 Jakub Jelinek <jakub@redhat.com>

PR c++/116636
* method.cc: Include decl.h.
(use_thunk): Temporarily change deprecated_state to
UNAVAILABLE_DEPRECATED_SUPPRESS.

* g++.dg/warn/deprecated-19.C: New test.

(cherry picked from commit 4026d89d623e322920b052f7ac0d940ef267dc0f)

commit | commitdiff | tree

Marek Polacek [Thu, 5 Sep 2024 17:01:59 +0000 (13:01 -0400)]

c++: vtable referring to "unavailable" virtual fn [PR116606]

mark_vtable_entries already has

   /* It's OK for the vtable to refer to deprecated virtual functions.  */
   warning_sentinel w(warn_deprecated_decl);

but that doesn't cover __attribute__((unavailable)).  We can use the
following override to cover both.

PR c++/116606

gcc/cp/ChangeLog:

* decl2.cc (mark_vtable_entries): Temporarily override deprecated_state to
UNAVAILABLE_DEPRECATED_SUPPRESS.  Remove a warning_sentinel.

gcc/testsuite/ChangeLog:

* g++.dg/ext/attr-unavailable-13.C: New test.

(cherry picked from commit d9d34f9a91371dea4bab0b54b2d7f762a6cc23e0)

commit | commitdiff | tree

Jakub Jelinek [Tue, 6 Feb 2024 12:00:04 +0000 (13:00 +0100)]

asan: Don't fold some strlens with -fsanitize=address [PR110676]

The UB on the following testcase isn't diagnosed by -fsanitize=address,
because we see that the array has a single element and optimize the
strlen to 0. I think it is fine to assume e.g. for range purposes the
lower bound for the strlen as long as we don't try to optimize
strlen (str)
where we know that it returns [26, 42] to
26 + strlen (str + 26), but for the upper bound we really want to punt
on optimizing that for -fsanitize=address to read all the bytes of the
string and diagnose if we run to object end etc.

2024-02-06 Jakub Jelinek <jakub@redhat.com>

PR sanitizer/110676
* gimple-fold.cc (gimple_fold_builtin_strlen): For -fsanitize=address
reset maxlen to sizetype maximum.

* gcc.dg/asan/pr110676.c: New test.

(cherry picked from commit d3eac7d96de790df51859f63c13838f153b416de)

commit | commitdiff | tree

Jakub Jelinek [Tue, 13 May 2025 12:20:22 +0000 (14:20 +0200)]

libfortran: Fix up _gfortran_{,m,s}findloc2_s{1,4} [PR120196]

As mentioned in the PR, _gfortran_{,m,s}findloc2_s{1,4} iterate too many
times in the back case if nothing is found.
For !back, the loops are for (i = 1; i <= extent; i++) so i is in the
body [1, extent] if nothing is found, but for back it is
for (i = extent; i >= 0; i--) so i is in the body [0, extent] and compares
one element before the start of the array.
Note, findloc1_s{1,4} uses
          for (n = len; n > 0; n--, src -= delta * len_array)
for the back loop and
          for (n = 1; n <= len; n++, src += delta * len_array)
for !back.  This patch fixes that.
The testcase fails under valgrind without the libgfortran changes and
succeeds with those.

2025-05-13  Jakub Jelinek  <jakub@redhat.com>

PR libfortran/120196
* m4/ifindloc2.m4 (header1, header2): For back use i > 0 rather than
i >= 0 as for condition.
* generated/findloc2_s1.c: Regenerate.
* generated/findloc2_s4.c: Regenerate.

* gfortran.dg/pr120196.f90: New test.

(cherry picked from commit 748a7bc4624e7b000f6fdb93a8cf7da73ff193bb)

commit | commitdiff | tree

Jakub Jelinek [Tue, 13 May 2025 12:19:25 +0000 (14:19 +0200)]

libfortran: Fix up _gfortran_s{max,min}loc1_{4,8,16}_s{1,4} [PR120191]

There is a bug in _gfortran_s{max,min}loc1_{4,8,16}_s{1,4} which the
following testcase shows.
The functions return but then crash in the caller.
Seems that is because buffer overflows, I believe those functions for
if (mask == NULL || *mask) condition being false are supposed to fill in
the result array with all zeros (or allocate it and fill it with zeros).
My understanding is the result array in that case is integer(kind={4,8,16})
and should have the extents the character input array has.
The problem is that it uses * string_len in the extent multiplication:
      extent[n] = GFC_DESCRIPTOR_EXTENT(array,n) * string_len;
and
      extent[n] =
        GFC_DESCRIPTOR_EXTENT(array,n + 1) * string_len;
which is I guess fine and desirable for the extents of the character array,
but not for the extents of the destination array.  Yet the code uses
that extent array for that purpose (and no other purposes).
Here it uses it to set the dimensions for the case where it needs to
allocate (as well as size):
      for (n = 0; n < rank; n++)
        {
          if (n == 0)
            str = 1;
          else
            str = GFC_DESCRIPTOR_STRIDE(retarray,n-1) * extent[n-1];
          GFC_DIMENSION_SET(retarray->dim[n], 0, extent[n] - 1, str);
        }
Here it uses it for bounds checking of the destination:
      if (unlikely (compile_options.bounds_check))
        {
          for (n=0; n < rank; n++)
            {
              index_type ret_extent;

              ret_extent = GFC_DESCRIPTOR_EXTENT(retarray,n);
              if (extent[n] != ret_extent)
                runtime_error ("Incorrect extent in return value of"
                               " MAXLOC intrinsic in dimension %ld:"
                               " is %ld, should be %ld", (long int) n + 1,
                               (long int) ret_extent, (long int) extent[n]);
            }
        }
and here to find out how many retarray elements to actually fill in each
dimension:
  while(1)
    {
      *dest = 0;
      count[0]++;
      dest += dstride[0];
      n = 0;
      while (count[n] == extent[n])
        {
          /* When we get to the end of a dimension, reset it and increment
             the next dimension.  */
          count[n] = 0;
          /* We could precalculate these products, but this is a less
             frequently used path so probably not worth it.  */
          dest -= dstride[n] * extent[n];
Seems maxloc1s.m4 and minloc1s.m4 are the only users of ifunction-s.m4,
so we can change SCALAR_ARRAY_FUNCTION in there without breaking anything
else.

2025-05-13  Jakub Jelinek  <jakub@redhat.com>

PR fortran/120191
* m4/ifunction-s.m4 (SCALAR_ARRAY_FUNCTION): Don't multiply
GFC_DESCRIPTOR_EXTENT(array,) by string_len.
* generated/maxloc1_4_s1.c: Regenerate.
* generated/maxloc1_4_s4.c: Regenerate.
* generated/maxloc1_8_s1.c: Regenerate.
* generated/maxloc1_8_s4.c: Regenerate.
* generated/maxloc1_16_s1.c: Regenerate.
* generated/maxloc1_16_s4.c: Regenerate.
* generated/minloc1_4_s1.c: Regenerate.
* generated/minloc1_4_s4.c: Regenerate.
* generated/minloc1_8_s1.c: Regenerate.
* generated/minloc1_8_s4.c: Regenerate.
* generated/minloc1_16_s1.c: Regenerate.
* generated/minloc1_16_s4.c: Regenerate.

* gfortran.dg/pr120191_3.f90: New test.

(cherry picked from commit 781cfc454b8dc24952fe7f4c5c409296dca505e1)

commit | commitdiff | tree

Jakub Jelinek [Tue, 13 May 2025 12:18:10 +0000 (14:18 +0200)]

libfortran: Fix up _gfortran_s{max,min}loc2_{4,8,16}_s{1,4} [PR120191]

I've tried to write a testcase for the BT_CHARACTER maxloc/minloc with named
or unnamed arguments and indeed the just posted patch fixed the arguments
in there in multiple cases to match what the library expects.
But the testcase still fails, due to library problems.

One dealt with in this patch are _gfortran_s{max,min}loc2_{4,8,16}_s{1,4}
functions.  Those are trivial wrappers around
_gfortrani_{max,min}loc2_{4,8,16}_s{1,4} which should call those functions
if the scalar mask is true and just return 0 otherwise.
The two bugs I see there is that the back, len arguments are swapped,
which means that it always acts as back=.true. and for len will use
character length of 1 or 0 instead of the desired one.
The _gfortrani_{max,min}loc2_{4,8,16}_s{1,4} functions have prototypes like
GFC_INTEGER_4
maxloc2_4_s1 (gfc_array_s1 * const restrict array, GFC_LOGICAL_4 back, gfc_charlen_type len)
so back comes before len, ditto for the
GFC_INTEGER_4
smaxloc2_4_s1 (gfc_array_s1 * const restrict array,
               GFC_LOGICAL_4 *mask, GFC_LOGICAL_4 back, gfc_charlen_type len)
The other problem is that it was just testing if (mask).  In my limited
Fortran understanding that means that the optional argument mask was
supplied but nothing about its actual value.  Other scalar mask generated
routines use if (mask == NULL || *mask) as the condition when to call the
non-masked function, i.e. when mask is not supplied (then it should act like
.true. mask) or when it is supplied and evaluates to .true.).

2025-05-13  Jakub Jelinek  <jakub@redhat.com>

PR fortran/120191
* m4/maxloc2s.m4: For smaxloc2 call maxloc2 if mask is NULL or *mask.
Swap back and len arguments.
* m4/minloc2s.m4: Likewise.
* generated/maxloc2_4_s1.c: Regenerate.
* generated/maxloc2_4_s4.c: Regenerate.
* generated/maxloc2_8_s1.c: Regenerate.
* generated/maxloc2_8_s4.c: Regenerate.
* generated/maxloc2_16_s1.c: Regenerate.
* generated/maxloc2_16_s4.c: Regenerate.
* generated/minloc2_4_s1.c: Regenerate.
* generated/minloc2_4_s4.c: Regenerate.
* generated/minloc2_8_s1.c: Regenerate.
* generated/minloc2_8_s4.c: Regenerate.
* generated/minloc2_16_s1.c: Regenerate.
* generated/minloc2_16_s4.c: Regenerate.

* gfortran.dg/pr120191_2.f90: New test.

(cherry picked from commit 482f2192d4ef6af55acae2dc3e0df00b8487cc7d)

commit | commitdiff | tree

Jakub Jelinek [Tue, 13 May 2025 12:14:55 +0000 (14:14 +0200)]

fortran: Fix up minloc/maxloc lowering [PR120191]

We need to drop the kind argument from what is passed to the
library, but need to do it not only when one uses the argument name
for it (so kind=4 etc.) but also when one passes all the arguments
to the intrinsics.

The following patch uses what gfc_conv_intrinsic_findloc uses,
which looks more efficient and cleaner, we already set automatic
vars to point to the kind and back actual arguments, so we can just
free/clear expr on the former and set name to "%VAL" on the latter.

And similarly clears dim argument for the BT_CHARACTER case when using
maxloc2/minloc2, again regardless of whether it was named or not.

2025-05-13  Jakub Jelinek  <jakub@redhat.com>
    Daniil Kochergin  <daniil2472s@gmail.com>
    Tobias Burnus  <tburnus@baylibre.com>

PR fortran/120191
* trans-intrinsic.cc (strip_kind_from_actual): Remove.
(gfc_conv_intrinsic_minmaxloc): Don't call strip_kind_from_actual.
Free and clear kind_arg->expr if non-NULL.  Set back_arg->name to
"%VAL" instead of a loop looking for last argument.  Remove actual
variable, use array_arg instead.  Free and clear dim_arg->expr if
non-NULL for BT_CHARACTER cases instead of using a loop.

* gfortran.dg/pr120191_1.f90: New test.

(cherry picked from commit ec249be3c287c6f1dfb328712ac9c39e6fa95eca)

commit | commitdiff | tree

Jakub Jelinek [Tue, 22 Apr 2025 19:27:28 +0000 (21:27 +0200)]

rs6000: Ignore OPTION_MASK_SAVE_TOC_INDIRECT differences in inlining decisions [PR119327]

The following testcase FAILs because the always_inline function can't
be inlined.
The rs6000 backend has similarly to other targets a hook which rejects
inlining which would bring in new ISAs which aren't there in the caller.
And this hook rejects this because of OPTION_MASK_SAVE_TOC_INDIRECT
differences.
This flag is set if explicitly requested or by default depending on
whether the current function looks hot (or at least not cold):
  if ((rs6000_isa_flags_explicit & OPTION_MASK_SAVE_TOC_INDIRECT) == 0
      && flag_shrink_wrap_separate
      && optimize_function_for_speed_p (cfun))
    rs6000_isa_flags |= OPTION_MASK_SAVE_TOC_INDIRECT;
The target nodes that are being compared here are actually the default
target node (which was created when cfun was NULL) vs. one that was
created for the always_inline function when it wasn't NULL, so one
doesn't have it, the other does.
In any case, this flag feels like a tuning decision rather than hard
ISA requirement and I see no problems why we couldn't inline
even explicit -msave-toc-indirect function into -mno-save-toc-indirect
or vice versa.
We already ignore OPTION_MASK_P{8,10}_FUSION which are also more
like tuning flags.

2025-04-22  Jakub Jelinek  <jakub@redhat.com>

PR target/119327
* config/rs6000/rs6000.cc (rs6000_can_inline_p): Ignore also
OPTION_MASK_SAVE_TOC_INDIRECT differences.

* g++.dg/opt/pr119327.C: New test.

(cherry picked from commit 4b62cf555b5446cb02fc471519cf1afa09e1a108)

commit | commitdiff | tree

Jakub Jelinek [Wed, 16 Apr 2025 15:22:49 +0000 (17:22 +0200)]

libatomic: Fix up libat_{,un}lock_n for mingw [PR119796]

Here is just a port of the previously posted patch to mingw which
clearly has the same problems.

2025-04-16 Jakub Jelinek <jakub@redhat.com>

PR libgcc/101075
PR libgcc/119796
* config/mingw/lock.c (libat_lock_n, libat_unlock_n): Start with
computing how many locks will be needed and take into account
((uintptr_t)ptr % WATCH_SIZE). If some locks from the end of the
locks array and others from the start of it will be needed, first
lock the ones from the start followed by ones from the end.

(cherry picked from commit 34fe8e90007afbc87941df9b01ffcf8747c11497)

commit | commitdiff | tree

Jakub Jelinek [Wed, 16 Apr 2025 15:21:39 +0000 (17:21 +0200)]

libatomic: Fix up libat_{,un}lock_n [PR119796]

As mentioned in the PR (and I think in PR101075 too), we can run into
deadlock with libat_lock_n calls with larger n.
As mentioned in PR66842, we use multiple locks (normally 64 mutexes
for each 64 byte cache line in 4KiB page) and currently can lock more
than one lock, in particular for n [0, 64] a single lock, for n [65, 128]
2 locks, for n [129, 192] 3 locks etc.
There are two problems with this:
1) we can deadlock if there is some wrap-around, because the locks are
   acquired always in the order from addr_hash (ptr) up to
   locks[NLOCKS-1].mutex and then if needed from locks[0].mutex onwards;
   so if e.g. 2 threads perform libat_lock_n with n = 2048+64, in one
   case at pointer starting at page boundary and in another case at
   page boundary + 2048 bytes, the first thread can lock the first
   32 mutexes, the second thread can lock the last 32 mutexes and
   then first thread wait for the lock 32 held by second thread and
   second thread wait for the lock 0 held by the first thread;
   fixed below by always locking the locks in order of increasing
   index, if there is a wrap-around, by locking in 2 loops, first
   locking some locks at the start of the array and second at the
   end of it
2) the number of locks seems to be determined solely depending on the
   n value, I think that is wrong, we don't know the structure alignment
   on the libatomic side, it could very well be 1 byte aligned struct,
   and so how many cachelines are actually (partly or fully) occupied
   by the atomic access depends not just on the size, but also on
   ptr % WATCH_SIZE, e.g. 2 byte structure at address page_boundary+63
   should IMHO lock 2 locks because it occupies the first and second
   cacheline

Note, before this patch it locked exactly one lock for n = 0, while
with this patch it could lock either no locks at all (if it is at cacheline
boundary) or 1 (otherwise).
Dunno of libatomic APIs can be called for zero sizes and whether
we actually care that much how many mutexes are locked in that case,
because one can't actually read/write anything into zero sized memory.
If you think it is important, I could add else if (nlocks == 0) nlocks = 1;
in both spots.

2025-04-16  Jakub Jelinek  <jakub@redhat.com>

PR libgcc/101075
PR libgcc/119796
* config/posix/lock.c (libat_lock_n, libat_unlock_n): Start with
computing how many locks will be needed and take into account
((uintptr_t)ptr % WATCH_SIZE).  If some locks from the end of the
locks array and others from the start of it will be needed, first
lock the ones from the start followed by ones from the end.

(cherry picked from commit 61dfb0747afcece3b7a690807b83b366ff34f329)

commit | commitdiff | tree

Jakub Jelinek [Mon, 14 Apr 2025 17:34:22 +0000 (19:34 +0200)]

expmed: Always use QImode for init_expmed set_zero_cost [PR119785]

This is a regression on some targets introduced I believe by r6-2055
which added mode argument to set_src_cost.

The problem here is that in the first iteration, mode is always QImode
and we get as -Os zero cost set_src_cost (const0_rtx, QImode, false).
But then we use the mode variable for iterating over int, partial int
and vector int modes, so for the second iteration we call set_src_cost
with mode which is at that time (machine_mode) (MAX_MODE_VECTOR_INT + 1).

In the x86 case that happens to be V2HFmode and we don't crash (and
compute the same 0 cost as we would for QImode).
But e.g. in the SPARC case (machine_mode) (MAX_MODE_VECTOR_INT + 1) is
MAX_MACHINE_MODE and that does all kinds of weird things especially
when doing ubsan bootstrap.

Fixed by always using QImode.

2025-04-14 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/119785
* expmed.cc (init_expmed): Always pass QImode rather than mode to
set_src_cost passed to set_zero_cost.

(cherry picked from commit f96a54350afcf7f3c90d0ecb51d7683d826acc00)

commit | commitdiff | tree

Jakub Jelinek [Mon, 14 Apr 2025 08:18:13 +0000 (10:18 +0200)]

driver: On linux hosts disable ASLR during -freport-bug [PR119727]

Andi had a useful comment that even with the PR119727 workaround to
ignore differences in libbacktrace printed addresses, it is still better
to turn off ASLR when easily possible, e.g. in case some address leaks
in somewhere in the ICE message elsewhere, or to verify the ICE doesn't
depend on a particular library/binary load addresses.

The following patch adds a configure check and uses personality syscall
to turn off randomization for further -freport-bug subprocesses.

2025-04-14 Jakub Jelinek <jakub@redhat.com>

PR driver/119727
* configure.ac (HOST_HAS_PERSONALITY_ADDR_NO_RANDOMIZE): New check.
* gcc.cc: Include sys/personality.h if
HOST_HAS_PERSONALITY_ADDR_NO_RANDOMIZE is defined.
(try_generate_repro): Call
personality (personality (0xffffffffU) | ADDR_NO_RANDOMIZE)
if HOST_HAS_PERSONALITY_ADDR_NO_RANDOMIZE is defined.
* config.in: Regenerate.
* configure: Regenerate.

(cherry picked from commit 5a32e85810d33dc46b1b5fe2803ee787d40709d5)

commit | commitdiff | tree

Jakub Jelinek [Sat, 12 Apr 2025 11:15:13 +0000 (13:15 +0200)]

driver: Fix up -freport-bug for ASLR [PR119727]

With --enable-host-pie -freport-bug almost never prepares preprocessed
source and instead emits
The bug is not reproducible, so it is likely a hardware or OS problem.
message even for bogus which are 100% reproducible.
The way -freport-bug works is that it reruns it 3 times, capturing stdout
and stderr from each and then tries to compare the outputs in between
different runs.
The libbacktrace emitted hexadecimal addresses at the start of the lines
can differ between runs due to ASLR, either of the PIE executable, or
even if not PIE if there is some frame with e.g. libc function (say
crash in strlen/memcpy etc.).

The following patch fixes it by ignoring such differences at the start of
the lines.

2025-04-12 Jakub Jelinek <jakub@redhat.com>

PR driver/119727
* gcc.cc (files_equal_p): Rewritten using fopen/fgets/fclose instead
of open/fstat/read/close. At the start of lines, ignore lowercase
hexadecimal addresses followed by space.

(cherry picked from commit 8b2ceb421f045ee8b39d7941f39f1e9a67217583)

commit | commitdiff | tree

Jakub Jelinek [Wed, 9 Apr 2025 20:01:30 +0000 (22:01 +0200)]

libquadmath: Fix up THREEp96 constant in expq

Here is a cherry-pick from glibc [BZ #32411] fix.

As mentioned by the reporter in a pull request against gcc-mirror,
the THREEp96 constant in e_expl.c is incorrect, it is actually 0x3.p+94f128
rather than 0x3.p+96f128.

The algorithm uses that to compute the t2 integer (tval2), by whose
delta it adjusts the x+xl pair and then in the result uses the precomputed
exp value for that entry.
Using 0x3.p+94f128 rather than 0x3.p+96f128 results in tval2 sometimes
being one smaller, sometimes one larger than the desired value, thus can mean
the x+xl pair after adjustment will be larger in absolute value than it
should be.

DesWursters created a test program for this
https://github.com/DesWurstes/comparefloats
and his results were
total: 1135000000 not_equal: 4322 earlier_score: 674 later_score: 3648
I've modified this so with
https://sourceware.org/bugzilla/show_bug.cgi?id=32411#c3
so that it actually tests pseudo-random _Float128 values with range
(-16384.,16384) with strong bias on values larger than 0.0002 in absolute
value (so that tval1/tval2 aren't zero most of the time) and that gave
total: 10000000000 not_equal: 29861 earlier_score: 4606 later_score: 25255
So, in both cases, in most cases the change doesn't result in any differences,
and in those rare cases where does, about 85% have smaller ulp than without
the patch.
Additionally I've tried
https://sourceware.org/bugzilla/show_bug.cgi?id=32411#c4
and in 2 billion iterations it didn't find any case where x+xl after the
adjustments without this change would be smaller in absolute value compared
to x+xl after the adjustments with this change.

2025-04-09 Jakub Jelinek <jakub@redhat.com>

* math/expq.c (C): Fix up THREEp96 constant.

(cherry picked from commit e081ced345c45581a4891361c08e50e07720239e)

commit | commitdiff | tree

Jakub Jelinek [Fri, 4 Apr 2025 18:57:09 +0000 (20:57 +0200)]

lto: lto-opts fixes [PR119625]

I can reproduce a really weird error in our distro i686 trunk gcc
(but haven't managed to reproduce it with vanilla trunk yet).
echo 'void foo (void) {}' > a.c; gcc -O2 -flto=auto -m32 -march=i686 -ffat-lto-objects -fhardened -o a.o -c a.c; gcc -O2 -flto=auto -m32 -march=i686 -r -o a.lo a.o
lto1: fatal error: open  failed: No such file or directory
compilation terminated.
lto-wrapper: fatal error: gcc returned 1 exit status
The error is because
cat ./a.lo.lto.o-args.0
""
a.o
My suspicion is that this "" in there is caused by weird .gnu.lto_.opts
section content during
gcc -O2 -flto=auto -m32 -march=i686 -ffat-lto-objects -fhardened -S -o a.s -c a.c
compilation (and I can reproduce that one with vanilla trunk).
The above results in
        .section        .gnu.lto_.opts,"e",@progbits
        .string "'-fno-openmp' '-fno-openacc' '-fPIC' '' '-m32' '-march=i686' '-O2' '-flto=auto' '-ffat-lto-objects'"
There are two weird things, one (IMHO the cause of the "" later on) is
the '' part, I think it comes from lto_write_options doing
append_to_collect_gcc_options (&temporary_obstack, &first_p, "");
IMHO it shouldn't call append_to_collect_gcc_options at all for that case.

The -fhardened option causes global_options.x_flag_cf_protection
to be set to CF_FULL and later on the backend option processing
sets it to CF_FULL | CF_SET (i.e. 7, a value not handled in
lto_write_options).

The following patch fixes it by not emitting anything there if
flag_cf_protection is one of the unhandled values.

Perhaps it could incrementally use
switch (global_options.x_flag_cf_protection & ~CF_SET)
instead, dunno.

And the other problem is that the -fPIC in there is really weird.
Our distro compiler or vanilla configured trunk certainly doesn't
default to -fPIC and -fhardened uses -fPIE when
-fPIC/-fpic/-fno-pie/-fno-pic is not specified, so I was expecting
-fPIE in there.
The thing is that the -fpie option causes setting of both
global_options.x_flag_pi{c,e} to 1, -fPIE both to 2:
      /* If -fPIE or -fpie is used, turn on PIC.  */
      if (opts->x_flag_pie)
        opts->x_flag_pic = opts->x_flag_pie;
      else if (opts->x_flag_pic == -1)
        opts->x_flag_pic = 0;
      if (opts->x_flag_pic && !opts->x_flag_pie)
        opts->x_flag_shlib = 1;
so checking first for flag_pic == 2 and then flag_pic == 1
and only afterwards for flag_pie means we never print
-fPIE/-fpie.

Or do you want something further (like
switch (global_options.x_flag_cf_protection & ~CF_SET)
)?

2025-04-04  Jakub Jelinek  <jakub@redhat.com>

PR lto/119625
* lto-opts.cc (lto_write_options): If neither flag_pic nor
flag_pie are set, check first for flag_pie and only later
for flag_pic rather than the other way around, use a temporary
variable.  If flag_cf_protection is not set, don't append anything
if flag_cf_protection is none of CF_{NONE,FULL,BRANCH,RETURN} and
use a temporary variable.

(cherry picked from commit d25728c98682c058bfda79333c94b0a8cf2a3f49)

commit | commitdiff | tree

Jakub Jelinek [Wed, 2 Apr 2025 17:28:20 +0000 (19:28 +0200)]

c: Fix ICEs with -fsanitize=pointer-{subtract,compare} [PR119582]

The following testcase ICEs because c_fully_fold isn't performed on the
arguments of __sanitizer_ptr_{sub,cmp} builtins and so e.g.
C_MAYBE_CONST_EXPR can leak into the gimplifier where it ICEs.

2025-04-02 Jakub Jelinek <jakub@redhat.com>

PR c/119582
* c-typeck.cc (pointer_diff, build_binary_op): Call c_fully_fold on
__sanitizer_ptr_sub or __sanitizer_ptr_cmp arguments.

* gcc.dg/asan/pr119582.c: New test.

(cherry picked from commit 29bc904cb827615ed9f36bc3742ccc4ac77515ec)

commit | commitdiff | tree

Jakub Jelinek [Tue, 1 Apr 2025 14:40:55 +0000 (16:40 +0200)]

combine: Use reg_used_between_p rather than modified_between_p in two spots [PR119291]

The following testcase is miscompiled on x86_64-linux at -O2 by the combiner.
We have from earlier combinations
(insn 22 21 23 4 (set (reg:SI 104 [ _7 ])
        (const_int 0 [0])) "pr119291.c":25:15 96 {*movsi_internal}
     (nil))
(insn 23 22 24 4 (set (reg/v:SI 117 [ e ])
        (reg/v:SI 116 [ e ])) 96 {*movsi_internal}
     (expr_list:REG_DEAD (reg/v:SI 116 [ e ])
        (nil)))
(note 24 23 25 4 NOTE_INSN_DELETED)
(insn 25 24 26 4 (parallel [
            (set (reg:CCZ 17 flags)
                (compare:CCZ (neg:SI (reg:SI 104 [ _7 ]))
                    (const_int 0 [0])))
            (set (reg/v:SI 116 [ e ])
                (neg:SI (reg:SI 104 [ _7 ])))
        ]) "pr119291.c":26:13 977 {*negsi_2}
     (expr_list:REG_DEAD (reg:SI 104 [ _7 ])
        (nil)))
(note 26 25 27 4 NOTE_INSN_DELETED)
(insn 27 26 28 4 (set (reg:DI 128 [ _9 ])
        (ne:DI (reg:CCZ 17 flags)
            (const_int 0 [0]))) "pr119291.c":26:13 1447 {*setcc_di_1}
     (expr_list:REG_DEAD (reg:CCZ 17 flags)
        (nil)))
and try_combine is called on i3 25 and i2 22 (second time)
and reach the hunk being patched with simplified i3
(insn 25 24 26 4 (parallel [
            (set (pc)
                (pc))
            (set (reg/v:SI 116 [ e ])
                (const_int 0 [0]))
        ]) "pr119291.c":28:13 977 {*negsi_2}
     (expr_list:REG_DEAD (reg:SI 104 [ _7 ])
        (nil)))
and
(insn 22 21 23 4 (set (reg:SI 104 [ _7 ])
        (const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal}
     (nil))
Now, the try_combine code there attempts to split two independent
sets in newpat by moving one of them to i2.
And among other tests it checks
!modified_between_p (SET_DEST (set1), i2, i3)
which is certainly needed, if there would be say
(set (reg/v:SI 116 [ e ]) (const_int 42 [0x2a]))
in between i2 and i3, we couldn't do that, as that set would overwrite
the value set by set1 we want to move to the i2 position.
But in this case pseudo 116 isn't set in between i2 and i3, but used
(and additionally there is a REG_DEAD note for it).

This is equally bad for the move, because while the i3 insn
and later will see the pseudo value that we set, the insn in between
which uses the value will see a different value from the one that
it should see.

As we don't check for that, in the end try_combine succeeds and
changes the IL to:
(insn 22 21 23 4 (set (reg/v:SI 116 [ e ])
        (const_int 0 [0])) "pr119291.c":27:15 96 {*movsi_internal}
     (nil))
(insn 23 22 24 4 (set (reg/v:SI 117 [ e ])
        (reg/v:SI 116 [ e ])) 96 {*movsi_internal}
     (expr_list:REG_DEAD (reg/v:SI 116 [ e ])
        (nil)))
(note 24 23 25 4 NOTE_INSN_DELETED)
(insn 25 24 26 4 (set (pc)
        (pc)) "pr119291.c":28:13 2147483647 {NOOP_MOVE}
     (nil))
(note 26 25 27 4 NOTE_INSN_DELETED)
(insn 27 26 28 4 (set (reg:DI 128 [ _9 ])
        (const_int 0 [0])) "pr119291.c":28:13 95 {*movdi_internal}
     (nil))
(note, the i3 got turned into a nop and try_combine also modified insn 27).

The following patch replaces the modified_between_p
tests with reg_used_between_p, my understanding is that
modified_between_p is a subset of reg_used_between_p, so one
doesn't need both.

Looking at this some more today, I think we should special case
set_noop_p because that can be put into i2 (except for the JUMP_P
violations), currently both modified_between_p (pc_rtx, i2, i3)
and reg_used_between_p (pc_rtx, i2, i3) returns false.
I'll post a patch incrementally for that (but that feels like
new optimization, so probably not something that should be backported).

On Tue, Apr 01, 2025 at 11:27:25AM +0200, Richard Biener wrote:
> Can we constrain SET_DEST (set1/set0) to a REG_P in combine?  Why
> does the comment talk about memory?

I was worried about making too risky changes this late in stage4
(and especially also for backports).  Most of this code is 1992-ish.
I think many of the functions are just misnamed, the reg_ in there doesn't
match what those functions do (bet they initially supported just REGs
and later on support for other kinds of expressions was added, but haven't
done git archeology to prove that).

What we know for sure is:
           && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != ZERO_EXTRACT
           && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 0))) != STRICT_LOW_PART
           && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != ZERO_EXTRACT
           && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != STRICT_LOW_PART
that is checked earlier in the condition.
Then it calls
           && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)),
                                  XVECEXP (newpat, 0, 0))
           && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)),
                                  XVECEXP (newpat, 0, 1))
While it has reg_* in it, that function mostly calls reg_overlap_mentioned_p
which is also misnamed, that function handles just fine all of
REG, MEM, SUBREG of REG, (SUBREG of MEM not, see below), ZERO_EXTRACT,
STRICT_LOW_PART, PC and even some further cases.
So, IMHO SET_DEST (set0) or SET_DEST (set0) can be certainly a REG, SUBREG
of REG, PC (at least the REG and PC cases are triggered on the testcase)
and quite possibly also MEM (SUBREG of MEM not, see below).

Now, the code uses !modified_between_p (SET_SRC (set{1,0}), i2, i3) where that
function for constants just returns false, for PC returns true, for REG
returns reg_set_between_p, for MEM recurses on the address, for
MEM_READONLY_P otherwise returns false, otherwise checks using alias.cc code
whether the memory could have been modified in between, for all other
rtxes recurses on the subrtxes.  This part didn't change in my patch.

I've only changed those
-         && !modified_between_p (SET_DEST (set{1,0}), i2, i3)
+         && !reg_used_between_p (SET_DEST (set{1,0}), i2, i3)
where the former has been described above and clearly handles all of
REG, SUBREG of REG, PC, MEM and SUBREG of MEM among other things.

The replacement reg_used_between_p calls reg_overlap_mentioned_p on each
instruction in between i2 and i3.  So, there is clearly a difference
in behavior if SET_DEST (set{1,0}) is pc_rtx, in that case modified_between_p
returns unconditionally true even if there are no instructions in between,
but reg_used_between_p if there are no non-debug insns in between returns
false.  Sorry for missing that, guess I should check for that (with the
exception of the noop moves which are often (set (pc) (pc)) and handled
by the incremental patch).  In fact not just that, reg_used_between_p
will only return true for PC if it is mentioned anywhere in the insns
in between.
Anyway, except for that, for REG it calls refers_to_regno_p
and so should find any occurrences of any of the REG or parts of it for hard
registers, for MEM returns true if it sees any MEMs in insns in between
(conservatively), for SUBREGs apparently it relies on it being SUBREG of REG
(so doesn't handle SUBREG of MEM) and handles SUBREG of REG like the
SUBREG_REG, PC I've already described.

Now, because reg_overlap_mentioned_p doesn't handle SUBREG of MEM, I think
already the initial
           && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 1)),
                                  XVECEXP (newpat, 0, 0))
           && ! reg_referenced_p (SET_DEST (XVECEXP (newpat, 0, 0)),
                                  XVECEXP (newpat, 0, 1))
calls would have failed --enable-checking=rtl or would have misbehaved, so
I think there is no need to check for it further.

To your question why I don't use reg_referenced_p, that is because
reg_referenced_p is something to call on one insn pattern, while
reg_used_between_p is pretty much that on all insns in between two
instructions (excluding the boundaries).

So, I think it would be safer to add && SET_DEST (set{1,0} != pc_rtx
checks to preserve former behavior, like in the following version.

2025-04-01  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/119291
* combine.cc (try_combine): For splitting of PARALLEL with
2 independent SETs into i2 and i3 sets check reg_used_between_p
of the SET_DESTs rather than just modified_between_p.

* gcc.c-torture/execute/pr119291.c: New test.

(cherry picked from commit 19ba913517b5e2a001fa9c0f060a1ac74430c027)

commit | commitdiff | tree

Jakub Jelinek [Sat, 22 Mar 2025 07:39:38 +0000 (08:39 +0100)]

Fix up some further cases of missing or extraneous spaces in diagnostics

Given the recent PR119406 I've tried to grep for concatenated string
literals without space at the end of one line and at the start of next line,
unless it was obviously intentional.
Furthermore, I've then looked through gcc.pot looking for 2 adjacent spaces
and looking back if that wasn't the case of "something "
" with spaces at both sides".

Here is the result from that.

I think just the c.opt change needs an explanation, the "" in the
description is simply eaten up somewhere during the option processing and
gcc -v --help before this patch was displaying
-Wdeprecated-literal-operator Warn about deprecated space between and suffix in a user-defined literal operator.

2025-03-22 Jakub Jelinek <jakub@redhat.com>

gcc/
* gimplify.cc (warn_switch_unreachable_and_auto_init_r): Add missing
space in the middle of diagnostics.

(cherry picked from commit 20360e4b6b5a63bc65d1855a7ecf22eb7148a452)

commit | commitdiff | tree

Jakub Jelinek [Tue, 11 Mar 2025 10:01:55 +0000 (11:01 +0100)]

tree: Improve skip_simple_arithmetic [PR119183]

The following testcase takes very long time to compile, because
skip_simple_arithmetic decides to first call tree_invariant_p on
the second argument (and indirectly recurse there).  I think before
canonicalization of operands for commutative binary expressions
(and for non-commutative ones always) it is pretty common that the
first operand is a constant, something which tree_invariant_p handles
immediately, so the following patch special cases that; I've added
there a tree_invariant_p call too after the checks, while it is not
really needed currently, tree_invariant_p has the same checks, I wanted
to be prepared in case tree_invariant_p changes.  But if you think
I should avoid it, I can drop it too.

This is just a partial fix, I think one can certainly construct a testcase
which will still have horrible compile time complexity (but I've tried and
haven't managed to do so), so perhaps we should just limit the recursion
depth through skip_simple_arithmetic/tree_invariant_p with some defaulted
argument.

2025-03-11  Jakub Jelinek  <jakub@redhat.com>

PR c/119183
* tree.cc (skip_simple_arithmetic): If first operand of binary
expr is TREE_CONSTANT or TREE_READONLY with no side-effects, call
tree_invariant_p on that operand first instead of on the second.

* gcc.dg/pr119183.c: New test.

(cherry picked from commit 20e5aa9cc1519f871cce25dbfdc149d9d60da779)

commit | commitdiff | tree

Jakub Jelinek [Thu, 6 Mar 2025 17:26:37 +0000 (18:26 +0100)]

c++: Update TYPE_FIELDS of variant types if cp_parser_late_parsing_default_args etc. modify it [PR98533]

The following testcases ICE during type verification, because TYPE_FIELDS
of e.g. S RECORD_TYPE in pr119123.C is different from TYPE_FIELDS of const S.
Various decls are added to S's TYPE_FIELDS first, then finish_struct
indirectly calls fixup_type_variants to sync the variant copies.
But later on cp_parser_class_specifier calls
cp_parser_late_parsing_default_args and that apparently adds a lambda
type (from default argument) to TYPE_FIELDS of S.
Dunno if that is right or not, assuming it is right, the following
patch fixes it by updating TYPE_FIELDS of variant types if there were
any changes in the various functions cp_parser_class_specifier defers and
calls on the outermost enclosing class.
There was quite a lot of code repetition already before, so the patch
uses a lambda to avoid the repetitions.
To my surprise, in some of the contract testcases (
g++.dg/contracts/contracts-friend1.C
g++.dg/contracts/contracts-nested-class1.C
g++.dg/contracts/contracts-nested-class2.C
g++.dg/contracts/contracts-redecl7.C
g++.dg/contracts/contracts-redecl8.C
) it is actually setting class_type and pushing TRANSLATION_UNIT_DECL
rather than some class types in some cases.

Or should the lambda pushing into the containing class be somehow avoided?

2025-03-06 Jakub Jelinek <jakub@redhat.com>

PR c++/98533
PR c++/119123
* parser.cc (cp_parser_class_specifier): Update TYPE_FIELDS of
variant types in case cp_parser_late_parsing_default_args etc. change
TYPE_FIELDS on the main variant. Add switch_to_class lambda and
use it to simplify repeated class switching code.

* g++.dg/cpp0x/pr98533.C: New test.
* g++.dg/cpp0x/pr119123.C: New test.

(cherry picked from commit 179e01085b0aed111ef1f7908c4b87c800f880e9)

commit | commitdiff | tree

Jakub Jelinek [Tue, 25 Feb 2025 08:33:21 +0000 (09:33 +0100)]

openmp: Mark OpenMP atomic write expression as read [PR119000]

The following testcase was emitting false positive warning that
the rhs of #pragma omp atomic write was stored but not read,
when the atomic actually does read it. The following patch
fixes that by calling default_function_array_read_conversion
on it, so that it is marked as read as well as converted from
lvalue to rvalue.
Furthermore, the code had
if (code == NOP_EXPR) ... else ... if (code == NOP_EXPR) ...
with none of ... parts changing code, so I've merged the two ifs.

2025-02-25 Jakub Jelinek <jakub@redhat.com>

PR c/119000
* c-parser.cc (c_parser_omp_atomic): For omp write call
default_function_array_read_conversion on the rhs expression.
Merge the two adjacent if (code == NOP_EXPR) blocks.

* c-c++-common/gomp/pr119000.c: New test.

(cherry picked from commit cdffc76393488a73671b70481cf8a4b7c289029d)

commit | commitdiff | tree

Jakub Jelinek [Mon, 24 Feb 2025 11:19:16 +0000 (12:19 +0100)]

reassoc: Fix up optimize_range_tests_to_bit_test [PR118915]

The following testcase is miscompiled due to a bug in
optimize_range_tests_to_bit_test.  It is trying to optimize
check for a in [-34,-34] or [-26,-26] or [-6,-6] or [-4,inf] ranges.
Another reassoc optimization folds the the test for the first
two ranges into (a + 34U) & ~8U in [0U,0U] range, and extract_bit_test_mask
actually has code to virtually undo it and treat that again as test
for a being -34 or -26.  The problem is that optimize_range_tests_to_bit_test
remembers in the type variable TREE_TYPE (ranges[i].exp); from the first
range.  If extract_bit_test_mask doesn't do that virtual undoing of the
BIT_AND_EXPR handling, that is just fine, the returned exp is ranges[i].exp.
But if the first range is BIT_AND_EXPR, the type could be different, the
BIT_AND_EXPR form has the optional cast to corresponding unsigned type
in order to avoid introducing UB.  Now, type was used to fill in the
max value if ranges[j].high was missing in subsequently tested range,
and so in this particular testcase the [-4,inf] range which was
signed int and so [-4,INT_MAX] was treated as [-4,UINT_MAX] instead.
And we were subtracting values of 2 different types and trying to make
sense out of that.

The following patch fixes this by using the type of the low bound
(which is always non-NULL) for the max value of the high bound instead.

2025-02-24  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/118915
* tree-ssa-reassoc.cc (optimize_range_tests_to_bit_test): For
highj == NULL_TREE use TYPE_MAX_VALUE (TREE_TYPE (lowj)) rather
than TYPE_MAX_VALUE (type).

* gcc.c-torture/execute/pr118915.c: New test.

(cherry picked from commit 5806279610783805286ebcd0af3b455602a3a8f9)

commit | commitdiff | tree

Jakub Jelinek [Sat, 8 Feb 2025 07:54:31 +0000 (08:54 +0100)]

i386: Fix ICE with conditional QI/HI vector maxmin [PR118776]

The following testcase ICEs starting with GCC 12 since r12-4526
although the bug has been introduced already in r12-2751.
The problem was in the addition of cond_<code><mode> define_expand
which uses nonimmediate_operand predicates for both maxmin operands
for all VI1248_AVX512VLBW modes.  It works fine with
VI48_AVX512VL modes because the <code><mode>3_mask VI48_AVX512VL
define_expand uses ix86_fixup_binary_operands_no_copy and the
*avx512f_<code><mode>3<mask_name> VI48_AVX512VL define_insn uses
% in constraint and !(MEM_P && MEM_P) check in condition (and
<code><mode>3 define_expand with VI124_256_AVX512F_AVX512BW iterator
does that too), but eventhough the 8-bit and 16-bit element maxmin
is commutative too, the <mask_codefor><code><mode>3<mask_name>
define_insn with VI12_AVX512VL iterator didn't use % in constraint
to make it commutative.  So, e.g. cond_umaxv32qi define_expand
allowed nonimmediate_operand for both umax operands, but used
gen_umaxv32qi_mask which wasn't commutative and only allowed
nonimmediate_operand for the second operand.

The following patch fixes it by keeping the <code><mode>3
VI124_256_AVX512F_AVX512BW define_expand as is (it does
ix86_fixup_binary_operands_no_copy) but extending the
<code><mode>3_mask define_expand from VI48_AVX512VL to
VI1248_AVX512VLBW which keeps the current modes with their
ISA conditions and adds the VI12_AVX512VL modes under additional
TARGET_AVX512BW condition, and turning the actual define_insn
into an * prefixed name (which it was before just for the non-masked
case) and having the same commutative operand handling as in other
define_insns.

2025-02-08  Jakub Jelinek  <jakub@redhat.com>

PR target/118776
* config/i386/sse.md (<code><mode>3_mask): Use VI1248_AVX512VLBW
iterator rather than VI48_AVX512VL.
(<mask_codefor><code><mode>3<mask_name>): Rename to ...
(*avx512bw_<code><mode>3<mask_name>): ... this.  Use
nonimmediate_operand rather than register_operand predicate and %v
rather than v constraint for operand 1 and adjust condition to reject
MEMs in both operand 1 and 2.

* gcc.target/i386/pr118776.c: New test.

(cherry picked from commit 64d8ea056a5c339700118a412dea1c44a57acf55)

commit | commitdiff | tree

Jakub Jelinek [Fri, 7 Feb 2025 13:30:11 +0000 (14:30 +0100)]

c++: Don't use CLEANUP_EH_ONLY for new expression cleanup [PR118763]

The following testcase is miscompiled since r12-6325 stopped
preevaluating the initializers for new expression.
If evaluating the initializers throws, there is a correct cleanup
for that, but it is marked CLEANUP_EH_ONLY.  While in standard
C++ that is just fine, if it has statement expressions, it can
return or goto out of the expression and we should delete the
pointer in that case too.

There is already a sentry variable initialized to true and
set to false after everything is initialized and used as a guard
for the cleanup, so just removing the CLEANUP_EH_ONLY flag does
everything we need.  And in the normal case of the initializer
not using statement expressions at least with -O2 we get the same code,
while the change changes one
try { sentry = true; ... sentry = false; } catch { if (sentry) delete ...; }
into
try { sentry = true; ... sentry = false; } finally { if (sentry) delete ...; }
optimizations will see that sentry is false when reaching the finally
other than through an exception.

Though, wonder what other CLEANUP_EH_ONLY cleanups might be an issue
with statement expressions.

2025-02-07  Jakub Jelinek  <jakub@redhat.com>

PR c++/118763
* init.cc (build_new_1): Don't set CLEANUP_EH_ONLY.

* g++.dg/asan/pr118763.C: New test.

(cherry picked from commit fcecc74cb38723457a0447924d9993b31252a8f9)

commit | commitdiff | tree

Jakub Jelinek [Fri, 7 Feb 2025 13:27:18 +0000 (14:27 +0100)]

c++: Allow constexpr reads from volatile std::nullptr_t objects [PR118661]

As mentioned in the PR, https://eel.is/c++draft/conv.lval#note-1
says that even volatile reads from std::nullptr_t typed objects actually
don't read anything and https://eel.is/c++draft/expr.const#10.9
says that even those are ok in constant expressions.

So, the following patch adjusts the r9-4793 changes to have an exception
for NULLPTR_TYPE.
As [conv.lval]/3 also talks about accessing to inactive member, I've added
testcase to cover that as well.

2025-02-07 Jakub Jelinek <jakub@redhat.com>

PR c++/118661
* constexpr.cc (potential_constant_expression_1): Don't diagnose
lvalue-to-rvalue conversion of volatile lvalue if it has NULLPTR_TYPE.
* decl2.cc (decl_maybe_constant_var_p): Return true for constexpr
decls with NULLPTR_TYPE even if they are volatile.

* g++.dg/cpp0x/constexpr-volatile4.C: New test.
* g++.dg/cpp0x/constexpr-union9.C: New test.

(cherry picked from commit 6c8e6d6febaed3c167ca9534935c2cb18045528e)

commit | commitdiff | tree

Jakub Jelinek [Fri, 31 Jan 2025 23:50:24 +0000 (00:50 +0100)]

icf: Compare call argument types in certain cases and asm operands [PR117432]

compare_operand uses operand_equal_p under the hood, which e.g. for
INTEGER_CSTs will just match the values rather regardless of their types.
Now, in many comparing the type is redundant, if we have
  x_2 = y_3 + 1;
we've already compared the type for the lhs and also for rhs1, there won't
be any surprises on rhs2.
As noted in the PR, there are cases where the type of the operand is the
sole place of information and we don't want to ICF merge functions if the
types differ.
One case is stdarg functions, arguments passed to ..., it is different
if we pass 1, 1L, 1LL.
Another case are the K&R unprototyped functions (sure, gone in C23).
And yet another case are inline asm operands, "r" (1) is different from "r"
(1L) from "r" (1LL).

So, the following patch determines based on lack of fntype (e.g. for
internal functions), or on !prototype_p, or on stdarg_p (in that case
using number of named arguments) which arguments need to have type checked
and does that, plus compares types on inline asm operands (maybe it would be
enough to do that just for input operands but we have just a routine to
handle both and I didn't feel we need to differentiate).

Furthermore, I've noticed fntype{1,2} isn't actually compared if it is a
direct call (gimple_call_fndecl is non-NULL).  That is wrong too, we could
have
  void (*fn) (int, long long) = (void (*) (int, long long)) foo;
  fn (1, 1LL);
in one case and
  void (*fn) (long long, int) = (void (*) (long long, int)) foo;
  fn (1LL, 1);
in another, both folded into a direct call of foo with different
gimple_call_fntype.  Sure, one of them would be UB at runtime (or both), but
what if we ICF merge it into something that into the one UB at runtime
and the program actually calls the correct one only?

2025-02-01  Jakub Jelinek  <jakub@redhat.com>

PR ipa/117432
* ipa-icf-gimple.cc (func_checker::compare_asm_inputs_outputs):
Also return_false if operands have incompatible types.
(func_checker::compare_gimple_call): Check fntype1 vs. fntype2
compatibility for all non-internal calls and assume fntype1 and
fntype2 are non-NULL for those.  For calls to non-prototyped
calls or for stdarg_p functions after the last named argument (if any)
check type compatibility of call arguments.

* gcc.c-torture/execute/pr117432.c: New test.
* gcc.target/i386/pr117432.c: New test.

(cherry picked from commit ebd111a2896816e4f5ddf5108f361b3d9d287fa0)

commit | commitdiff | tree

Arsen Arsenović [Wed, 29 Jan 2025 20:14:33 +0000 (21:14 +0100)]

d: give dependency files better filenames [PR118477]

Currently, the dependency files for root-file.o and common-file.o were
both d/.deps/file.Po, which would cause parallel builds to fail
sometimes with:

  make[3]: Leaving directory '/var/tmp/portage/sys-devel/gcc-14.1.1_p20240511/work/build/gcc'
  make[3]: Entering directory '/var/tmp/portage/sys-devel/gcc-14.1.1_p20240511/work/build/gcc'
  mv: cannot stat 'd/.deps/file.TPo': No such file or directory
  make[3]: *** [/var/tmp/portage/sys-devel/gcc-14.1.1_p20240511/work/gcc-14-20240511/gcc/d/Make-lang.in:421: d/root-file.o] Error 1 shuffle=131581365

Also, this means that dependencies of one of root-file or common-file
are missing when developing.  After this patch, those two files get
assigned dependency files d/.deps/root-file.Po and
d/.deps/common-file.Po respectively, so match the actual object
files in the d/ subdirectory.

There are other files with similar conflicts (mangle-package.o,
visitor-package.o for instance).

2025-01-29  Arsen Arsenović  <arsen@aarsen.me>
    Jakub Jelinek  <jakub@redhat.com>

PR d/118477
* Make-lang.in (DCOMPILE, DPOSTCOMPILE): Use $(basename $(@F))
instead of $(*F).

Co-Authored-By: Jakub Jelinek <jakub@redhat.com>
(cherry picked from commit d9ac0ad1e9a4ceec2d354ac0368da7462bea5675)

commit | commitdiff | tree

Jakub Jelinek [Sat, 25 Jan 2025 09:15:24 +0000 (10:15 +0100)]

c++: Only destruct elts of array for new expression if exception is thrown during the initialization [PR117827]

The following testcase r12-6328, because the elements of the array
are destructed twice, once when the callee encounters delete[] p;
and then second time when the exception is thrown.
The array elts should be only destructed if exception is thrown from
one of the constructors during the build_vec_init emitted code in case of
new expressions, but when the new expression completes, it is IMO
responsibility of user code to delete[] it when it is no longer needed.

So, the following patch uses the cleanup_flags argument to build_vec_init
to get notified of the flags that need to be changed when the expression
is complete and build_disable_temp_cleanup to do the changes.

2025-01-25 Jakub Jelinek <jakub@redhat.com>

PR c++/117827
* init.cc (build_new_1): Pass address of a make_tree_vector ()
initialized gc tree vector to build_vec_init and append
build_disable_temp_cleanup to init_expr from it.

* g++.dg/init/array66.C: New test.

(cherry picked from commit ce268ca2a923f8f35cc9dd5a7d0468a3980f129f)

commit | commitdiff | tree

Jakub Jelinek [Thu, 23 Jan 2025 10:11:23 +0000 (11:11 +0100)]

builtins: Store unspecified value to *exp for inf/nan [PR114877]

The fold_builtin_frexp folding for NaN/Inf just returned the first argument
with evaluating second arguments side-effects, rather than storing something
to what the second argument points to.

The PR argues that the C standard requires the function to store something
there but what exactly is stored is unspecified, so not storing there
anything can result in UB if the value isn't initialized and is read later.

glibc and newlib store there 0, musl apparently doesn't store anything.

The following patch stores there zero (or would you prefer storing there
some other value, 42, INT_MAX, INT_MIN, etc.?; zero is cheapest to form
in assembly though) and adjusts the test so that it
doesn't rely on not storing there anything but instead checks for
-Wmaybe-uninitialized warning to find out that something has been stored
there.
Unfortunately I had to disable the NaN tests for -O0, while we can fold
__builtin_isnan (__builtin_nan ("")) at compile time, we can't fold
__builtin_isnan ((i = 0, __builtin_nan (""))) at compile time.
fold_builtin_classify uses just tree_expr_nan_p and if that isn't true
(because expr is a COMPOUND_EXPR with tree_expr_nan_p on the second arg),
it does
      arg = builtin_save_expr (arg);
      return fold_build2_loc (loc, UNORDERED_EXPR, type, arg, arg);
and that isn't folded at -O0 further, as we wrap it into SAVE_EXPR and
nothing propagates the NAN to the comparison.
I think perhaps tree_expr_nan_p etc. could have case COMPOUND_EXPR:
added and recurse on the second argument, but that feels like stage1
material to me if we want to do that at all.

2025-01-23  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/114877
* builtins.cc (fold_builtin_frexp): Handle rvc_nan and rvc_inf cases
like rvc_zero, return passed in arg and set *exp = 0.

* gcc.dg/torture/builtin-frexp-1.c: Add -Wmaybe-uninitialized as
dg-additional-options.
(bar): New function.
(TESTIT_FREXP2): Rework the macro so that it doesn't test whether
nothing has been stored to what the second argument points to, but
instead that something has been stored there, whatever it is.
(main): Temporarily don't enable the nan tests for -O0.

(cherry picked from commit d19b0682f18f9f5217aee8002e3d04f8ded04ae8)

Mirror of https://gcc.gnu.org/git/gcc.git

RSS Atom