git.ipfire.org Git - thirdparty/gcc.git/log

builtins: Store unspecified value to *exp for inf/nan [PR114877]

The fold_builtin_frexp folding for NaN/Inf just returned the first argument
with evaluating second arguments side-effects, rather than storing something
to what the second argument points to.

The PR argues that the C standard requires the function to store something
there but what exactly is stored is unspecified, so not storing there
anything can result in UB if the value isn't initialized and is read later.

glibc and newlib store there 0, musl apparently doesn't store anything.

The following patch stores there zero (or would you prefer storing there
some other value, 42, INT_MAX, INT_MIN, etc.?; zero is cheapest to form
in assembly though) and adjusts the test so that it
doesn't rely on not storing there anything but instead checks for
-Wmaybe-uninitialized warning to find out that something has been stored
there.
Unfortunately I had to disable the NaN tests for -O0, while we can fold
__builtin_isnan (__builtin_nan ("")) at compile time, we can't fold
__builtin_isnan ((i = 0, __builtin_nan (""))) at compile time.
fold_builtin_classify uses just tree_expr_nan_p and if that isn't true
(because expr is a COMPOUND_EXPR with tree_expr_nan_p on the second arg),
it does
      arg = builtin_save_expr (arg);
      return fold_build2_loc (loc, UNORDERED_EXPR, type, arg, arg);
and that isn't folded at -O0 further, as we wrap it into SAVE_EXPR and
nothing propagates the NAN to the comparison.
I think perhaps tree_expr_nan_p etc. could have case COMPOUND_EXPR:
added and recurse on the second argument, but that feels like stage1
material to me if we want to do that at all.

2025-01-23  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/114877
* builtins.cc (fold_builtin_frexp): Handle rvc_nan and rvc_inf cases
like rvc_zero, return passed in arg and set *exp = 0.

* gcc.dg/torture/builtin-frexp-1.c: Add -Wmaybe-uninitialized as
dg-additional-options.
(bar): New function.
(TESTIT_FREXP2): Rework the macro so that it doesn't test whether
nothing has been stored to what the second argument points to, but
instead that something has been stored there, whatever it is.
(main): Temporarily don't enable the nan tests for -O0.

(cherry picked from commit d19b0682f18f9f5217aee8002e3d04f8ded04ae8)

match.pd: Fix (FTYPE) N CMP (FTYPE) M optimization for GENERIC [PR118522]

The last case of this optimization assumes that if 2 integral types
have same precision and TYPE_UNSIGNED, then they are uselessly convertible.
While that is very likely the case for GIMPLE, it is not the case for
GENERIC, so the following patch adds there a convert so that the
optimization produces also valid GENERIC. Without it we got
(int) p == b where b had _BitInt(32) type, so incompatible types.

2025-01-17 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/118522
* match.pd ((FTYPE) N CMP (FTYPE) M): Add convert, as in GENERIC
integral types with the same precision and sign might actually not
be compatible types.

* gcc.dg/bitint-120.c: New test.

(cherry picked from commit 3ab9eb6946f7b832834b3d808c5617935e0be727)

vec.h: Properly destruct elements in auto_vec auto storage [PR118400]

For T with non-trivial destructors, we were destructing objects in the
vector on release only when not using auto storage of auto_vec.

The following patch calls truncate (0) instead of m_vecpfx.m_num clearing,
and truncate takes care of that destruction:
  unsigned l = length ();
  gcc_checking_assert (l >= size);
  if (!std::is_trivially_destructible <T>::value)
    vec_destruct (address () + size, l - size);
  m_vecpfx.m_num = size;

2025-01-16  Jakub Jelinek  <jakub@redhat.com>

PR ipa/118400
* vec.h (vec<T, va_heap, vl_ptr>::release): Call m_vec->truncate (0)
instead of clearing m_vec->m_vecpfx.m_num.

(cherry picked from commit 43f4d44bebd63b354f8798fcef512d4d2b42c655)

Daily bump.

libgcc: On FreeBSD use GCC's crt objects for static linking

Add crtbeginT.o to extra_parts on FreeBSD. This ensures we use GCC's
crt objects for static linking. Otherwise it could mix crtbeginT.o
from the base system with libgcc's crtend.o, possibly leading to
segfaults.

libgcc:
PR target/118685
* config.host (*-*-freebsd*): Add crtbeginT.o to extra_parts.

Signed-off-by: Dimitry Andric <dimitry@andric.com>

Daily bump.

Fortran: FIx ICE in associate with elemental function [PR118750]

2025-02-06 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/118750
* resolve.cc (resolve_assoc_var): If the target expression has
a rank, do not use gfc_expression_rank, since it will return 0
if the function is elemental. Resolution will have produced the
correct rank.

gcc/testsuite/
PR fortran/118750
* gfortran.dg/associate_72.f90: New test.

(cherry picked from commit a03303b4d5b2ca58e5750a4d5bd735d85a091273)

Fortran: Fix error recovery for bad component arrayspecs [PR108434]

2025-01-11 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran/
PR fortran/108434
* class.cc (generate_finalization_wrapper): To avoid memory
leaks from callocs, return immediately if the derived type
error flag is set.
* decl.cc (build_struct): If the declaration of a derived type
or class component does not have a deferred arrayspec, correct,
set the error flag of the derived type and emit an immediate
error.

gcc/testsuite/
PR fortran/108434
* gfortran.dg/pr108434.f90 : Add tests from comment 1.

(cherry picked from commit d64ca15351029164bac30b49fb3c4f9723e755de)

Daily bump.

Fortran: host association issue with symbol in COMMON block [PR108454]

When resolving a flavorless symbol that is already registered with a COMMON
block, and which neither has the intrinsic, generic, or external attribute,
skip searching among interfaces to avoid false resolution to a derived type
of the same name.

PR fortran/108454

gcc/fortran/ChangeLog:

* resolve.cc (resolve_common_blocks): Initialize variable.
(resolve_symbol): If a symbol is already registered with a COMMON
block, do not search for an interface with the same name.

gcc/testsuite/ChangeLog:

* gfortran.dg/common_29.f90: New test.

(cherry picked from commit d6418fe22684f9335474d1fd405ade45954c069d)

LoongArch: Fix ICE caused by illegal calls to builtin functions [PR118561].

PR target/118561

gcc/ChangeLog:

* config/loongarch/loongarch-builtins.cc
(loongarch_expand_builtin_lsx_test_branch):
NULL_RTX will not be returned when an error is detected.
(loongarch_expand_builtin): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/pr118561.c: New test.

(cherry picked from commit 50d2bde68a097c2e9fb3bdd7e6664c8988889828)

Daily bump.

RTEMS: Add Cortex-M33 multilib

Enable use of Armv8-M instruction set.

Account for CVE-2021-35465 mitigation [PR102035].
The -mfix-cmse-cve-2021-35465 option is enabled by default,
if -mcpu=cortex-m33 is used.

gcc/

* config/arm/t-rtems: Add Cortex-M33 multilib.

(cherry picked from commit f2a8f3c364a0d196efb0687ea421190b46d041d5)

Daily bump.

Fortran: F2008 passing of internal procs to a proc pointer [PR117434]

2024-11-06 Paul Thomas <pault@gcc.gnu.org>

gcc/fortran
PR fortran/117434
* interface.cc (gfc_compare_actual_formal): Skip 'Expected a
procedure pointer error' if the formal argument typespec has an
interface and the type of the actual arg is BT_PROCEDURE.

gcc/testsuite/
PR fortran/117434
* gfortran.dg/proc_ptr_54.f90: New test. This is temporarily
compile-only until one one seven four five five is fixed.
* gfortran.dg/proc_ptr_55.f90: New test.
* gfortran.dg/proc_ptr_56.f90: New test.

(cherry picked from commit 4dbf4c0fdb188e1c348688de91e010f696cd59fc)

Daily bump.

options: Adjust cl_optimization_compare to avoid checking ICE [PR115913]

At the end of a sequence like:
#pragma GCC push_options
...
#pragma GCC pop_options

the handler for pop_options calls cl_optimization_compare() (as generated by
optc-save-gen.awk) to make sure that all global state has been restored to
the value it had prior to the push_options call. The verification is
performed for almost all entries in the global_options struct. This leads to
unexpected checking asserts, as discussed in the PR, in case the state of
warnings-related options has been intentionally modified in between
push_options and pop_options via a call to #pragma GCC diagnostic. Address
that by skipping the verification for CL_WARNING-flagged options.

gcc/ChangeLog:

PR middle-end/115913
* optc-save-gen.awk (cl_optimization_compare): Skip options with
CL_WARNING flag.

gcc/testsuite/ChangeLog:

PR middle-end/115913
* c-c++-common/cpp/pr115913.c: New test.

Daily bump.

Ada: Fix segfault on uninitialized variable as operand of primitive operator

...of derived real type. It comes from an unexpected internal adjustment.

gcc/ada/
PR ada/118712
* sem_warn.adb (Check_References): Deal with small adjustments of
references.

gcc/testsuite/
* gnat.dg/warn33.adb: New test.
* gnat.dg/warn33_pkg.ads: New helper.

Daily bump.

Fortran: fix bogus diagnostics on renamed interface import [PR110993]

PR fortran/110993

gcc/fortran/ChangeLog:

* frontend-passes.cc (check_externals_procedure): Do not compare
interfaces of a non-bind(C) procedure against a bind(C) global one.
(check_against_globals): Use local name from rename-on-use in the
search for interfaces.

gcc/testsuite/ChangeLog:

* gfortran.dg/use_rename_14.f90: New test.

(cherry picked from commit 9104472b645f76a212af9f9c58378500f9ba937e)

Daily bump.

Fortran: fix passing of component ref to assumed-rank dummy [PR118683]

While the fix for pr117774 addressed the passing of an inquiry reference
to an assumed-rank dummy, it missed the similar case of passing a component
reference. The newer testcase gfortran.dg/pr81978.f90 uncovered this
latent issue with a UBSAN instrumented compiler.

PR fortran/118683

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_conv_procedure_call): The bounds update for
passing to assumed-rank dummies shall also handle component
references besides inquiry references.

(cherry picked from commit 5d001fa122bf04cfd0611ff7723969a0421b3094)

testsuite/118127: Pass fortran tests on ppc64le for IEEE128 long doubles

Denormal behaviour is well defined for IEEE128 long doubles, so
XFAIL some gfortran tests only for targets with the IBM128 long double
ABI.

gcc/testsuite/ChangeLog:

PR testsuite/118127
* lib/target-supports.exp
(check_effective_target_long_double_is_ibm128): New
procedure.
* gfortran.dg/default_format_2.f90: xfail for
long_double_is_ibm128.
* gfortran.dg/default_format_denormal_2.f90: Likewise.
* gfortran.dg/large_real_kind_form_io_2.f90: Likewise.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
(cherry picked from commit d4d4e874dee2d5b0abe5ceb9f2a78e5602e86030)

libstdc++: Fix views::transform(move_only_fn{}) forwarding [PR118413]

The range adaptor perfect forwarding simplification mechanism is currently
only enabled for trivially copyable bound arguments, to prevent undesirable
copies of complex objects. But "trivially copyable" is the wrong property
to check for here, since a move-only type with a trivial move constructor
is considered trivially copyable, and after P2492R2 we can't assume copy
constructibility of the bound arguments. This patch makes the mechanism
more specifically check for trivial copy constructibility instead so
that it's properly disabled for move-only bound arguments.

PR libstdc++/118413

libstdc++-v3/ChangeLog:

* include/std/ranges (views::__adaptor::_Partial): Adjust
constraints on the "simple" partial specializations to require
is_trivially_copy_constructible_v instead of
is_trivially_copyable_v.
* testsuite/std/ranges/adaptors/adjacent_transform/1.cc (test04):
Extend P2494R2 test.
* testsuite/std/ranges/adaptors/transform.cc (test09): Likewise.

Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
(cherry picked from commit 09d1cbee10b8c51aed48f047f30717f622d6f811)

c++: re-enable NSDMI CONSTRUCTOR folding [PR118355]

In c++/102990 we had a problem where massage_init_elt got {},
digest_nsdmi_init turned that {} into { .value = (int) 1.0e+0 },
and we crashed in the call to fold_non_dependent_init because
a FIX_TRUNC_EXPR/FLOAT_EXPR got into tsubst*.  So we avoided
calling fold_non_dependent_init for a CONSTRUCTOR.

But that broke the following test, where we no longer fold the
CONST_DECL in
  { .type = ZERO }
to
  { .type = 0 }
and then process_init_constructor_array does:

            if (next != error_mark_node
                && (initializer_constant_valid_p (next, TREE_TYPE (next))
                    != null_pointer_node))
              {
                /* Use VEC_INIT_EXPR for non-constant initialization of
                   trailing elements with no explicit initializers.  */
                picflags |= PICFLAG_VEC_INIT;

because { .type = ZERO } isn't initializer_constant_valid_p.  Then we
create a VEC_INIT_EXPR and say we can't convert the argument.

So we have to fold the elements of the CONSTRUCTOR.  We just can't
instantiate the elements in a template.

This also fixes c++/118047.

PR c++/118047
PR c++/118355

gcc/cp/ChangeLog:

* typeck2.cc (massage_init_elt): Call fold_non_dependent_init
unless for a CONSTRUCTOR in a template.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/nsdmi-list10.C: New test.
* g++.dg/cpp0x/nsdmi-list9.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit e939005c496dfd4058fa57b6860bfadcabe4a111)

Daily bump.

c++: friend vs inherited guide confusion [PR117855]

We recently started using the lang_decl_fn::context field to track
inheritedness of a deduction guide (for C++23 inherited CTAD). This
new overloading of the field accidentally made DECL_FRIEND_CONTEXT
return non-NULL for inherited guides, which breaks the below testcase
during overload resolution with an inherited guide.

This patch fixes this by refining DECL_FRIEND_CONTEXT appropriately.

PR c++/117855

gcc/cp/ChangeLog:

* cp-tree.h (DECL_FRIEND_CONTEXT): Exclude deduction guides.

gcc/testsuite/ChangeLog:

* g++.dg/cpp23/class-deduction-inherited7.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit ea578dd251eaf6304b0c95acc107f9a4d63bee8f)

AArch64: don't override march to assembler with mcpu if march is specified [PR110901]

When both -mcpu and -march are specified, the value of -march wins out.

This is done correctly for the calls to cc1 and for the assembler directives we
put out in assembly files.

However in the call to as we don't do this and instead use the arch from the
cpu. This leads to a situation that GCC cannot reliably be used to compile
assembly files which don't have a .arch directive.

This is quite common with .S files which use macros to selectively enable
codepath based on what the preprocessor sees.

The fix is to change MCPU_TO_MARCH_SPEC to not override the march if an march
is already specified.

gcc/ChangeLog:

PR target/110901
* config/aarch64/aarch64.h (MCPU_TO_MARCH_SPEC): Don't override if
march is set.

gcc/testsuite/ChangeLog:

PR target/110901
* gcc.target/aarch64/options_set_29.c: New test.

(cherry picked from commit 773beeaafb0ea31bd4e308b64781731d64b571ce)

AArch64: have -mcpu=native detect architecture extensions for unknown non-homogenous systems [PR113257]

in g:e91a17fe39c39e98cebe6e1cbc8064ee6846a3a7 we added the ability for
-mcpu=native on unknown CPUs to still enable architecture extensions.

This has worked great but was only added for homogenous systems.

However the same thing works for big.LITTLE as in such system the cores must
have the same extensions otherwise it doesn't fundamentally work.

i.e. task migration from one core to the other wouldn't work.

This extends the same handling to non-homogenous systems.

gcc/ChangeLog:

PR target/113257
* config/aarch64/driver-aarch64.cc (get_cpu_from_id, DEFAULT_CPU): New.
(host_detect_local_cpu): Use it.

gcc/testsuite/ChangeLog:

PR target/113257
* gcc.target/aarch64/cpunative/info_34: New test.
* gcc.target/aarch64/cpunative/native_cpu_34.c: New test.
* gcc.target/aarch64/cpunative/info_35: New test.
* gcc.target/aarch64/cpunative/native_cpu_35.c: New test.

Co-authored-by: Richard Sandiford <richard.sandiford@arm.com>
(cherry picked from commit 1ff85affe46623fe1a970de95887df22f4da9d16)

asan: Fix missing FakeStack flag cleanup

The FakeStack flag is not zeroed out when can_store_by_pieces()
returns false. Over time, this causes FakeStack::Allocate() to perform
the maximum number of loop iterations, significantly slowing down the
instrumented program.

Link: https://inbox.sourceware.org/gcc-patches/20250109001702.154685-1-iii@linux.ibm.com/
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
gcc/ChangeLog:

* asan.cc (asan_emit_stack_protection): Always zero the flag
unless it is cleared by the __asan_stack_free_N() libcall.

(cherry picked from commit 13d971601175541146fda54682cde4d3f9cd2759)

Daily bump.

libstdc++: perfectly forward std::ranges::clamp arguments

As reported in PR118185, std::ranges::clamp does not correctly forward
the projected value to the comparator. Add the missing forward.

libstdc++-v3/ChangeLog:

PR libstdc++/118185
PR libstdc++/100249
* include/bits/ranges_algo.h (__clamp_fn): Correctly forward the
projected value to the comparator.
* testsuite/25_algorithms/clamp/118185.cc: New test.

Signed-off-by: Giuseppe D'Angelo <giuseppe.dangelo@kdab.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
(cherry picked from commit b342614139c0a981b369176980663941b9c27f39)

c++: explicit spec of constrained member tmpl [PR107522]

When defining a explicit specialization of a constrained member template
(of a class template) such as f and g in the below testcase, the
DECL_TEMPLATE_PARMS of the corresponding TEMPLATE_DECL are partially
instantiated, whereas its associated constraints are carried over
from the original template and thus are in terms of the original
DECL_TEMPLATE_PARMS. So during normalization for such an explicit
specialization we need to consider the (parameters of) the most general
template, since that's what the constraints are in terms of and since we
always use the full set of template arguments during satisfaction.

PR c++/107522

gcc/cp/ChangeLog:

* constraint.cc (get_normalized_constraints_from_decl): Use the
most general template for an explicit specialization of a
member template.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-explicit-spec7.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 62daa81308c6c187059fcad98377146e30725fa5)

Daily bump.

Fortran: Fix UTF-8 output with A edit descriptor.

This adjusts the source len for the case where the user has
specified a field width with the A descriptor.

PR libfortran/118571

libgfortran/ChangeLog:

* io/write.c (write_utf8_char4): Adjust the src_len to the
format width w_len when greater than zero.

gcc/testsuite/ChangeLog:

* gfortran.dg/utf8_3.f03: New test.

(cherry picked from commit 4daf088123b2c4c3114a4b96d5353c3d72eb8ac9)

Fortran: do not copy back for parameter actual arguments [PR81978]

When an array is packed for passing as an actual argument, and the array
has the PARAMETER attribute (i.e., it is a named constant that can reside
in read-only memory), do not copy back (unpack) from the temporary.

PR fortran/81978

gcc/fortran/ChangeLog:

* trans-array.cc (gfc_conv_array_parameter): Do not copy back data
if actual array parameter has the PARAMETER attribute.
* trans-expr.cc (gfc_conv_subref_array_arg): Likewise.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr81978.f90: New test.

(cherry picked from commit 0d1e62b83561baa185bf080515750a89dd3ac410)

c++: ICE with nested anonymous union [PR117153]

In a template, for

  union {
    union {
      T d;
    };
  };

build_anon_union_vars crates a malformed COMPONENT_REF: we have no
DECL_NAME for the nested anon union so we create something like "object.".
Most of the front end doesn't seem to care, but if such a tree gets to
potential_constant_expression, it can cause a crash.

We can use FIELD directly for the COMPONENT_REF's member.  tsubst_stmt
should build up a proper one in:

    if (VAR_P (decl) && !DECL_NAME (decl)
&& ANON_AGGR_TYPE_P (TREE_TYPE (decl)))
      /* Anonymous aggregates are a special case.  */
      finish_anon_union (decl);

PR c++/117153

gcc/cp/ChangeLog:

* decl2.cc (build_anon_union_vars): Use FIELD for the second operand
of a COMPONENT_REF.

gcc/testsuite/ChangeLog:

* g++.dg/other/anon-union6.C: New test.
* g++.dg/other/anon-union7.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit 10850f92b2a618ef1b1ad399530943ef4847823d)

testsuite: arm: Use -Os -fno-math-errno in vfp-1.c [PR116448]

gcc/testsuite/ChangeLog:

PR testsuite/116448
* gcc.target/arm/vfp-1.c: Use -Os -fno-math-errno.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
(cherry picked from commit 062c04c45b2156b27ddc3b6541d8484e9db63bc3)

rs6000: Fix ICE for invalid constants in built-in functions

For invalid constant operand values used in built-in functions, return
const0_rtx to signify an error occurred during expansion.

2025-01-16 Peter Bergner <bergner@linux.ibm.com>

gcc/
* config/rs6000/rs6000-builtin.cc (rs6000_expand_builtin): Return
const0_rtx when there is an error.

gcc/testsuite/
* gcc.target/powerpc/mma-builtin-error.c: New test.

(cherry picked from commit 0696af74b3392e2178215607337b116d1bb53e34)

rs6000: Fix loop limit for built-in constant checking

The loop checking for built-in constant operand restrictions was missing
some operands due to the loop limit being too small. Fixing that exposed
a testsuite failure which is caused by a typo in the pmxvi4ger8pp definition
where we had made the PMASK field too small.

2025-01-16 Peter Bergner <bergner@linux.ibm.com>

gcc/
* config/rs6000/rs6000-builtin.cc (rs6000_expand_builtin): Use correct
array size for the loop limit.
* config/rs6000/rs6000-builtins.def: Fix field size for PMASK operand.

(cherry picked from commit 1a2d63a78f99b7fdc2eff5bf9065682d5bbbaaca)

Daily bump.

hppa: Fix typo in ADDITIONAL_REGISTER_NAMES in pa32-regs.h

2025-01-23 John David Anglin <danglin@gcc.gnu.org>

gcc/ChangeLog:

* config/pa/pa32-regs.h (ADDITIONAL_REGISTER_NAMES): Change
register 86 name to "%fr31L".

Daily bump.

rtl: Remove invalid compare simplification [PR117186]

g:d882fe5150fbbeb4e44d007bb4964e5b22373021, posted at
https://gcc.gnu.org/pipermail/gcc-patches/2000-July/033786.html ,
added code to treat:

  (set (reg:CC cc) (compare:CC (gt:M (reg:CC cc) 0) (lt:M (reg:CC cc) 0)))

as a nop.  This PR shows that that isn't always correct.
The compare in the set above is between two 0/1 booleans (at least
on STORE_FLAG_VALUE==1 targets), whereas the unknown comparison that
produced the incoming (reg:CC cc) is unconstrained; it could be between
arbitrary integers, or even floats.  The fold is therefore replacing a
cc that is valid for both signed and unsigned comparisons with one that
is only known to be valid for signed comparisons.

  (gt (compare (gt cc 0) (lt cc 0) 0)

does simplify to:

  (gt cc 0)

but:

  (gtu (compare (gt cc 0) (lt cc 0) 0)

does not simplify to:

  (gtu cc 0)

The optimisation didn't come with a testcase, but it was added for
i386's cmpstrsi, now cmpstrnsi.  That probably doesn't matter as much
as it once did, since it's now conditional on -minline-all-stringops.
But the patch is almost 25 years old, so whatever the original
motivation was, it seems likely that other things now rely on it.

It therefore seems better to try to preserve the optimisation on rtl
rather than get rid of it.  To do that, we need to look at how the
result of the outer compare is used.  We'd therefore be looking at four
instructions (the gt, the lt, the compare, and the use of the compare),
but combine already allows that for 3-instruction combinations thanks
to:

  /* If the source is a COMPARE, look for the use of the comparison result
     and try to simplify it unless we already have used undobuf.other_insn.  */

When applied to boolean inputs, a comparison operator is
effectively a boolean logical operator (AND, ANDNOT, XOR, etc.).
simplify_logical_relational_operation already had code to simplify
logical operators between two comparison results, but:

* It only handled IOR, which doesn't cover all the cases needed here.
  The others are easily added.

* It treated comparisons of integers as having an ORDERED/UNORDERED result.
  Therefore:

  * it would not treat "true for LT + EQ + GT" as "always true" for
    comparisons between integers, because the mask excluded the UNORDERED
    condition.

  * it would try to convert "true for LT + GT" into LTGT even for comparisons
    between integers.  To prevent an ICE later, the code used:

       /* Many comparison codes are only valid for certain mode classes.  */
       if (!comparison_code_valid_for_mode (code, mode))
         return 0;

    However, this used the wrong mode, since "mode" is here the integer
    result of the comparisons (and the mode of the IOR), not the mode of
    the things being compared.  Thus the effect was to reject all
    floating-point-only codes, even when comparing floats.

  I think instead the code should detect whether the comparison is between
  integer values and remove UNORDERED from consideration if so.  It then
  always produces a valid comparison (or an always true/false result),
  and so comparison_code_valid_for_mode is not needed.  In particular,
  "true for LT + GT" becomes NE for comparisons between integers but
  remains LTGT for comparisons between floats.

* There was a missing check for whether the comparison inputs had
  side effects.

While there, it also seemed worth extending
simplify_logical_relational_operation to unsigned comparisons, since
that makes the testing easier.

As far as that testing goes: the patch exhaustively tests all
combinations of integer comparisons in:

  (cmp1 (cmp2 X Y) (cmp3 X Y))

for the 10 integer comparisons, giving 1000 fold attempts in total.
It then tries all combinations of (X in {-1,0,1} x Y in {-1,0,1})
on the result of the fold, giving 9 checks per fold, or 9000 in total.
That's probably more than is typical for self-tests, but it seems to
complete in neglible time, even for -O0 builds.

gcc/
PR rtl-optimization/117186
* rtl.h (simplify_context::simplify_logical_relational_operation): Add
an invert0_p parameter.
* simplify-rtx.cc (unsigned_comparison_to_mask): New function.
(mask_to_unsigned_comparison): Likewise.
(comparison_code_valid_for_mode): Delete.
(simplify_context::simplify_logical_relational_operation): Add
an invert0_p parameter.  Handle AND and XOR.  Handle unsigned
comparisons.  Handle always-false results.  Ignore the low bit
of the mask if the operands are always ordered and remove the
then-redundant check of comparison_code_valid_for_mode.  Check
for side-effects in the operands before simplifying them away.
(simplify_context::simplify_binary_operation_1): Remove
simplification of (compare (gt ...) (lt ...)) and instead...
(simplify_context::simplify_relational_operation_1): ...handle
comparisons of comparisons here.
(test_comparisons): New function.
(test_scalar_ops): Call it.

gcc/testsuite/
PR rtl-optimization/117186
* gcc.dg/torture/pr117186.c: New test.
* gcc.target/aarch64/pr117186.c: Likewise.

aarch64: Detect word-level modification in early-ra [PR118184]

REGMODE_NATURAL_SIZE is set to 64 bits for everything except
VLA SVE modes. This means that it's possible to modify (say)
the highpart of a TI pseudo or a V2DI pseudo independently
of the lowpart. Modifying such highparts requires a reload
if the highpart ends up in the upper 64 bits of an FPR,
since RTL semantics do not allow the highpart of a single
hard register to be modified independently of the lowpart.

early-ra missed a check for this case, which meant that it
effectively treated an assignment to (subreg:DI (reg:TI R) 0)
as an assignment to the whole of R.

gcc/
PR target/118184
* config/aarch64/aarch64-early-ra.cc (allocno_assignment_is_rmw):
New function.
(early_ra::record_insn_defs): Mark the live range information as
untrustworthy if an assignment would change part of an allocno
but preserve the rest.

gcc/testsuite/
* gcc.dg/torture/pr118184.c: New test.

Daily bump.

c++: Wrap force_target_expr in get_member_function_from_ptrfunc with save_expr [PR118509]

My October PR117259 fix to get_member_function_from_ptrfunc to use a
TARGET_EXPR rather than SAVE_EXPR unfortunately caused some regressions as
well as the following testcase shows.
What happens is that
get_member_function_from_ptrfunc -> build_base_path calls save_expr,
so since the PR117259 change in mnay cases it will call save_expr on
a TARGET_EXPR.  And, for some strange reason a TARGET_EXPR is not considered
an invariant, so we get a SAVE_EXPR wrapped around the TARGET_EXPR.
That SAVE_EXPR <TARGET_EXPR <...>> gets initially added only to the second
operand of ?:, so at that point it would still work fine during expansion.
But unfortunately an expression with that subexpression is handed to the
caller also through *instance_ptrptr = instance_ptr; and gets evaluated
once again when computing the first argument to the method.
So, essentially, we end up with
(TARGET_EXPR <D.2907, ...>, (... ? ... SAVE_EXPR <TARGET_EXPR <D.2907, ...>
... : ...)) (... SAVE_EXPR <TARGET_EXPR <D.2907, ...> ..., ...);
and while D.2907 is initialized during gimplification in the code dominating
everything that uses it, the extra temporary created for the SAVE_EXPR
is initialized only conditionally (if the ?: condition is true) but then
used unconditionally, so we get
pmf-4.C: In function ‘void foo(C, B*)’:
pmf-4.C:12:11: warning: ‘<anonymous>’ may be used uninitialized [-Wmaybe-uninitialized]
   12 |   (y->*x) ();
      |   ~~~~~~~~^~
pmf-4.C:12:11: note: ‘<anonymous>’ was declared here
   12 |   (y->*x) ();
      |   ~~~~~~~~^~
diagnostic and wrong-code issue too.

As the trunk fix to just treat TARGET_EXPR as invariant seems a little bit risky
and I'd like to get it tested on the trunk for a while, for 14.2.1 this patch
instead wraps those TARGET_EXPRs into SAVE_EXPRs.  Eventually that can be reverted
and the trunk fix backported.

2025-01-21  Jakub Jelinek  <jakub@redhat.com>

PR c++/118509
* typeck.cc (get_member_function_from_ptrfunc): Wrap force_target_expr
with save_expr.

* g++.dg/expr/pmf-4.C: New test.

c++/modules: Propagate FNDECL_USED_AUTO when propagating deduced return types [PR118049]

In the linked testcase, we're erroring because the declared return types
of the functions do not appear to match. This is because when merging
the deduced return types for 'foo' in 'auto-5_b.C', we overwrote the
return type for the declaration with the deduced return type from
'auto-5_a.C' but neglected to track that we were originally declared
with 'auto'.

As a drive-by improvement to QOI, also add checks for if the deduced
return types do not match; this is currently useful because we do not
check the equivalence of the bodies of functions yet.

PR c++/118049

gcc/cp/ChangeLog:

* module.cc (trees_in::is_matching_decl): Propagate
FNDECL_USED_AUTO as well.

gcc/testsuite/ChangeLog:

* g++.dg/modules/auto-5_a.C: New test.
* g++.dg/modules/auto-5_b.C: New test.
* g++.dg/modules/auto-5_c.C: New test.
* g++.dg/modules/auto-6_a.H: New test.
* g++.dg/modules/auto-6_b.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
(cherry picked from commit f054c36c4fcb693e04411dc691ef4172479143d6)

Daily bump.

Update gcc zh_CN.po

* zh_CN.po: Update.

d: Fix failing test with 32-bit compiler [PR114434]

Since the introduction of gdc.test/runnable/test23514.d, it's exposed an
incorrect compilation when adding a 64-bit constant to a link-time
address. The current cast to size_t causes a loss of precision, which
can result in incorrect compilation.

PR d/114434

gcc/d/ChangeLog:

* expr.cc (ExprVisitor::visit (PtrExp *)): Get the offset as a
dinteger_t rather than a size_t.
(ExprVisitor::visit (SymOffExp *)): Likewise.

(cherry picked from commit 9ab38952a2033d6d4a8e31c3c4d2ab1a25a406c6)

Daily bump.

i386: Reorder *movdi_internal ISA attribute by ascending alternative index

Reorder ISA attribute by ascending alternative index. No functional change.

gcc/ChangeLog:

* config/i386/i386.md (*movdi_internal): Reorder ISA attribute
by ascending alternative index.

(cherry picked from commit 9d4b1e3772547c8c836638d09fc9a84c3c73e277)

i386: Disable SImode/DImode moves from/to mask regs without avx512bw [PR118067]

SImode and DImode moves from/to mask registers are valid only with AVX512BW,
so mark relevant alternatives in *movsi_internal and *movdi_internal as such.

Even with the patch, the testcase still fails, but now with:

pr118067.c: In function ‘foo’:
pr118067.c:13:1: internal compiler error: maximum number of generated reload insns per insn achieved (90)
   13 | }
      | ^
0x2c3b581 internal_error(char const*, ...)
        ../../git/gcc/gcc/diagnostic-global-context.cc:517
0xb68938 lra_constraints(bool)
        ../../git/gcc/gcc/lra-constraints.cc:5411
0xb51a0d lra(_IO_FILE*, int)
        ../../git/gcc/gcc/lra.cc:2449
0xaf9f4d do_reload
        ../../git/gcc/gcc/ira.cc:5977
0xafa462 execute
        ../../git/gcc/gcc/ira.cc:6165

PR target/118067

gcc/ChangeLog:

* config/i386/i386.md (*movdi_internal):
Disable alternatives from/to mask registers without AVX512BW.
(*movsi_internal): Ditto.

(cherry picked from commit 219ddae16f9d724baeff86934f8981aa5ef7b95f)

c++: Friend classes don't shadow enclosing template class paramater [PR118255]

We currently reject the following code

=== code here ===
template <int non_template> struct S { friend class non_template; };
class non_template {};
S<0> s;
=== code here ===

While EDG agrees with the current behaviour, clang and MSVC don't (see
https://godbolt.org/z/69TGaabhd), and I believe that this code is valid,
since the friend clause does not actually declare a type, so it cannot
shadow anything. The fact that we didn't error out if the non_template
class was declared before S backs this up as well.

This patch fixes this by skipping the call to check_template_shadow for
hidden bindings.

PR c++/118255

gcc/cp/ChangeLog:

* name-lookup.cc (pushdecl): Don't call check_template_shadow
for hidden bindings.

gcc/testsuite/ChangeLog:

* g++.dg/lookup/pr99116-1.C: Adjust test expectation.
* g++.dg/template/friend84.C: New test.

(cherry picked from commit b5a069203fc074ab75d994c4a7e0f2db6a0a00fd)

Daily bump.

d: Fix ICE in expand_d_format when diagnosing empty enum [PR117115]

This was fixed in upstream dmd, and merged in r15-6824. Backport the
individual fix from the upstream merge for releases/gcc-14.

PR d/117115

gcc/testsuite/ChangeLog:

* gdc.dg/pr117115.d: New test.

(cherry picked from commit 975c4f1a5de4ede89ee9499cd1a73d613a4aeae4)

AVR: Use INT_N to built-in define __int24.

This patch uses the INT_N interface to define __int24 in avr-modes.def.

Since the testsuite uses -Wpedantic and __int24 is a C/C++ extension,
uses of __int24 and __uint24 is now marked as __extension__.

PR target/118329
gcc/
* config/avr/avr-modes.def: Add INT_N (PSI, 24).
* config/avr/avr.cc (avr_init_builtin_int24)
<__int24>: Remove definition.
<__uint24>: Adjust definition to INT_N interface.
gcc/testsuite/
* gcc.target/avr/torture/get-mem.c: (__int24, __uint24):
Add __extension__ to respective typedefs.
* gcc.target/avr/torture/set-mem.c: Same.
* gcc.target/avr/torture/int24-mul.c: Same.
* gcc.target/avr/torture/pr109907-2.c: Same.
* gcc.target/avr/torture/pr61443.c: Same.
* gcc.target/avr/torture/pr63633-ice-mult.c: Same.

(cherry picked from commit 6580b89957ccabbb5aaf43736b36b9bd399fbc13)

c++: Allow pragmas in NSDMIs [PR118147]

This patch removes the (unnecessary) CPP_PRAGMA_EOL case from
cp_parser_cache_defarg, which currently has the result that any pragmas
in the NSDMI cause an error.

PR c++/118147

gcc/cp/ChangeLog:

* parser.cc (cp_parser_cache_defarg): Don't error when
CPP_PRAGMA_EOL.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/nsdmi-defer7.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
(cherry picked from commit f3ccc57e5f044031a1b07e79330de9220e93afe7)

c++: Make sure fold_sizeof_expr returns the correct type [PR117775]

We currently ICE upon the following code, that is valid under
-Wno-pointer-arith:

=== cut here ===
int main() {
decltype( [](auto) { return sizeof(void); } ) x;
return x.operator()(0);
}
=== cut here ===

The problem is that "fold_sizeof_expr (sizeof(void))" returns
size_one_node, that has a different TREE_TYPE from that of the sizeof
expression, which later triggers an assert in cxx_eval_store_expression.

This patch makes sure that fold_sizeof_expr always returns a tree with
the size_type_node type.

PR c++/117775

gcc/cp/ChangeLog:

* decl.cc (fold_sizeof_expr): Make sure the folded result has
type size_type_node.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/constexpr-117775.C: New test.

(cherry picked from commit 37f38b0f97374476a4818b68c8df991886428787)

Fix setting of call graph node AutoFDO count

We are initializing both the call graph node count and
the entry block count of the function with the head_count value
from the profile.

Count propagation algorithm may refine the entry block count
and we may end up with a case where the call graph node count
is set to zero but the entry block count is non-zero. That becomes
a problem because we have this code in execute_fixup_cfg:

profile_count num = node->count;
profile_count den = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count;
bool scale = num.initialized_p () && !(num == den);

Here if num is 0 but den is not 0, scale becomes true and we
lose the counts in

if (scale)
bb->count = bb->count.apply_scale (num, den);

This is what happened in the issue reported in PR116743
(a 10% regression in MySQL HAMMERDB tests).
3d9e6767939e9658260e2506e81ec32b37cba041 made an improvement in
AutoFDO count propagation, which caused a mismatch between
the call graph node count (zero) and the entry block count (non-zero)
and subsequent loss of counts as described above.

The fix is to update the call graph node count once we've done count propagation.

Tested on x86_64-pc-linux-gnu.

gcc/ChangeLog:
PR gcov-profile/116743
* auto-profile.cc (afdo_annotate_cfg): Fix mismatch between the call graph node count
and the entry block count.

(cherry picked from commit e683c6b029f809c7a1981b4341c95d9652c22e18)

Daily bump.

d: Fix record layout of compiler-generated TypeInfo_Class [PR115249]

In r14-8766, the layout of TypeInfo_Class changed in the runtime
library, but didn't get reflected in the compiler-generated data,
causing a corruption of runtime type introspection on BigEndian targets.

This adjusts the size of the `ClassFlags' field from uint to ushort, and
adds a new ushort `depth' field in the space where ClassFlags used to
occupy.

PR d/115249

gcc/d/ChangeLog:

* typeinfo.cc (create_tinfo_types): Update internal Typenfo
representation.
(TypeInfoVisitor::visit (TypeInfoClassDeclaration *)): Likewise.

(cherry picked from commit d740694ff89ab5c78652a1f66b058ca16634ddbc)

d: Fix ICE in dmd.expressionsem.resolveLoc

This was fixed in upstream, and merged in r15-6559-g332cf038fda109.
Backport the individual fix from the upstream merge for releases/gcc-14.

PR d/116373

gcc/d/ChangeLog:

* dmd/expressionsem.d (resolveLoc): Check for null pointer before
resolving bounds of slice.

gcc/testsuite/ChangeLog:

* gdc.dg/pr116373.d: New test.

(cherry picked from commit c7dab40d7569c51ac4e62ceea05c7c049da426bb)

libstdc++: Fix reversed args in unreachable assumption [PR109849]

libstdc++-v3/ChangeLog:

PR libstdc++/109849
* include/bits/vector.tcc (vector::_M_range_insert): Fix
reversed args in length calculation.

(cherry picked from commit 6f85a97248fdff15aadc9514c1118eee0293d256)

libstdc++: Fix more pedwarns in headers for C++98

Some tests e.g. 17_intro/headers/c++1998/all_pedantic_errors.cc FAIL
with GLIBCXX_TESTSUITE_STDS=98 due to numerous C++11 extensions still in
use in the library headers. The recent changes to not make them system
headers means we get warnings now.

This change adds more diagnostic pragmas to suppress those warnings.

libstdc++-v3/ChangeLog:

* include/bits/istream.tcc: Add diagnostic pragmas around uses
of long long and extern template.
* include/bits/locale_facets.h: Likewise.
* include/bits/locale_facets.tcc: Likewise.
* include/bits/locale_facets_nonio.tcc: Likewise.
* include/bits/ostream.tcc: Likewise.
* include/bits/stl_algobase.h: Likewise.
* include/c_global/cstdlib: Likewise.
* include/ext/pb_ds/detail/resize_policy/hash_prime_size_policy_imp.hpp:
Likewise.
* include/ext/pointer.h: Likewise.
* include/ext/stdio_sync_filebuf.h: Likewise.
* include/std/istream: Likewise.
* include/std/ostream: Likewise.
* include/tr1/cmath: Likewise.
* include/tr1/type_traits: Likewise.
* include/tr1/functional_hash.h: Likewise. Remove semi-colons
at namespace scope that aren't needed after macro expansion.
* include/tr1/tuple: Remove semi-colon at namespace scope.
* include/bits/vector.tcc: Change LL suffix to just L.

(cherry picked from commit 68854071236d3a1064b46a5b22546956d3be32cd)

libstdc++: Fix std::deque::emplace calling wrong _M_insert_aux [PR90389]

We have several overloads of std::deque::_M_insert_aux, one of which is
variadic and called by std::deque::emplace. With a suitable set of
arguments to emplace, it's possible for one of the non-variadic
_M_insert_aux overloads to be selected by overload resolution, making
emplace ill-formed.

Rename the variadic _M_insert_aux to _M_emplace_aux so that calls to
emplace never select an _M_insert_aux overload. Also add an inline
_M_insert_aux for the const lvalue overload that is called from
insert(const_iterator, const value_type&).

libstdc++-v3/ChangeLog:

PR libstdc++/90389
* include/bits/deque.tcc (_M_insert_aux): Rename variadic
overload to _M_emplace_aux.
* include/bits/stl_deque.h (_M_insert_aux): Define inline.
(_M_emplace_aux): Declare.
* testsuite/23_containers/deque/modifiers/emplace/90389.cc: New
test.

(cherry picked from commit 5f44b1776e748a7528020557036740905a11b1df)

Daily bump.

match: Keep conditional in simplification to constant [PR118140].

In PR118140 we simplify

  _ifc__33 = .COND_IOR (_41, d_lsm.7_11, _46, d_lsm.7_11);

to 1:

Match-and-simplified .COND_IOR (_41, d_lsm.7_11, _46, d_lsm.7_11) to 1

when _46 == 1.  This happens by removing the conditional and applying
a | 1 = 1.  Normally we re-introduce the conditional and its else value
if needed but that does not happen here as we're not dealing with a
vector type.  For correctness's sake, we must not remove the conditional
even for non-vector types.

This patch re-introduces a COND_EXPR in such cases.  For PR118140 this
result in a non-vectorized loop.

PR middle-end/118140

gcc/ChangeLog:

* gimple-match-exports.cc (maybe_resimplify_conditional_op): Add
COND_EXPR when we simplified to a scalar gimple value but still
have an else value.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr118140.c: New test.
* gcc.target/riscv/rvv/autovec/pr118140.c: New test.

(cherry picked from commit 14cb0610559fa33f211e1546260458496fdc5e71)

Daily bump.

testsuite: The expect framework might introduce CR in output

When running tests using the "sim" config, the command is launched in
non-readonly mode and the text retrieved from the expect command will
then replace all LF with CRLF. (The problem can be found in sim_load
where it calls remote_spawn without an input file).

libstdc++-v3/ChangeLog:

* testsuite/27_io/print/1.cc: Allow both LF and CRLF in test.
* testsuite/27_io/print/3.cc: Likewise.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
(cherry picked from commit 4b29be7216daa5921aae340388ef6416b1631f0a)

testsuite: libstdc++: Use effective-target libatomic

Test assumes libatomic.a is always available, but for some embedded
targets, there is no libatomic.a and the test thus fail.

libstdc++-v3/ChangeLog:

* testsuite/29_atomics/atomic_float/compare_exchange_padding.cc:
Use effective-target libatomic_available.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
(cherry picked from commit 4b0ef49d02f1f16325aa53fee3b0f6860728d2f8)

Daily bump.

testsuite: Fix flag used for modules test

GCC14 doesn't have the new spelling '-fmodules' for enabling modules;
use the old '-fmodules-ts' spelling instead.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr114630_a.C: Use -fmodules-ts instead of
-fmodules in testcase.
* g++.dg/modules/pr114630_b.C: Likewise.
* g++.dg/modules/pr114630_c.C: Likewise.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>

c++/modules: Handle chaining already-imported local types [PR114630]

In the linked testcase, an ICE occurs because when reading the
(duplicate) function definition for _M_do_parse from module Y, the local
type definitions have already been streamed from module X and setup as
regular backreferences, rather than being found with find_duplicate,
causing issues with managing DECL_CHAIN.

It is tempting to just skip setting up the DECL_CHAIN for this case.
However, for the future it would be best to ensure that the block vars
for the duplicate definition are accurate, so that we could implement
ODR checking on function definitions at some point.

So to solve this, this patch creates a copy of the streamed-in local
type and chains that; it will be discarded along with the rest of the
duplicate function after we've finished processing.

A couple of suggested implementations from the discussion on the PR that
don't work:

- Replacing the `DECL_CHAIN` assertion with `(*chain && *chain != decl)`
  doesn't handle the case where type definitions are followed by regular
  local variables, since those won't have been imported as separate
  backreferences and so the chains will diverge.

- Correcting the purviewness of GMF template instantiations to force Y
  to emit copies of the local types rather than backreferences into X is
  insufficient, as it's still possible that the local types got streamed
  in a separate cluster to the function definition, and so will be again
  referred to via regular backreferences when importing.

- Likewise, preventing the emission of function definitions where an
  import has already provided that same definition also is insufficient,
  for much the same reason.

PR c++/114630

gcc/cp/ChangeLog:

* module.cc (trees_in::core_vals) <BLOCK>: Chain a new node if
DECL_CHAIN already is set.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr114630.h: New test.
* g++.dg/modules/pr114630_a.C: New test.
* g++.dg/modules/pr114630_b.C: New test.
* g++.dg/modules/pr114630_c.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>

Daily bump.

Fortran: Cray pointer comparison wrongly optimized away [PR106692]

PR fortran/106692

gcc/fortran/ChangeLog:

* trans-expr.cc (gfc_conv_expr_op): Inhibit excessive optimization
of Cray pointers by treating them as volatile in comparisons.

gcc/testsuite/ChangeLog:

* gfortran.dg/cray_pointers_13.f90: New test.

(cherry picked from commit c7754a2fb2e60987524947fe189f3ffac035ea1d)

libstdc++: backport inline keyword on std::find

This is a backport version of the same patch as
g:18aff7644ad1e44dc146d36a2b7e397977aa47ac

In GCC 12 there was a ~40% regression in the performance of hashmap->find.

This regression came about accidentally:

Before GCC 12 the find function was small enough that IPA would inline it even
though it wasn't marked inline.  In GCC-12 an optimization was added to perform
a linear search when the entries in the hashmap are small.

This increased the size of the function enough that IPA would no longer inline.
Inlining had two benefits:

1.  The return value is a reference. so it has to be returned and dereferenced
    even though the search loop may have already dereference it.
2.  The pattern is a hard pattern to track for branch predictors.  This causes
    a large number of branch misses if the value is immediately checked and
    branched on. i.e. if (a != m.end()) which is a common pattern.

The patch fixes both these issues by adding the inline keyword to _M_locate
to allow the inliner to consider inlining again.

This and the other patches have been ran through serveral benchmarks where
the size, number of elements searched for and type (reference vs value) etc
were tested.

The change shows no statistical regression, but an average find improvement of
~27% and a range between ~10-60% improvements.

Thanks,
Tamar

libstdc++-v3/ChangeLog:

* include/bits/hashtable.h (find): Add inline keyword.

AArch64: correct Cortex-X4 MIDR

The Parts Num field for the MIDR for Cortex-X4 is wrong. It's currently the
parts number for a Cortex-A720 (which does have the right number).

The correct number can be found in the Cortex-X4 Technical Reference Manual [1]
on page 382 in Issue Number 5.

[1] https://developer.arm.com/documentation/102484/latest/

gcc/ChangeLog:

* config/aarch64/aarch64-cores.def (AARCH64_CORE): Fix cortex-x4 parts
num.

testsuite: arm: Add pattern for armv8-m.base to cmse-15.c test

Since armv8-m.base uses thumb1 that does not suport sibcall/tailcall,
a pattern is needed that uses PUSH/BL/POP sequence instead of a single
B instruction to reuse an already existing function in the compile unit.

gcc/testsuite/ChangeLog:

* gcc.target/arm/cmse/cmse-15.c: Added pattern for armv8-m.base.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>
(cherry picked from commit cfd7c54bdfe109f7e801122a093d0d2a85324fc5)

Disable a broken multiversioning optimisation

This patch skips redirect_to_specific clone for aarch64 and riscv,
because the optimisation has two flaws:

1. It checks the value of the "target" attribute, even on targets that
don't use this attribute for multiversioning.

2. The algorithm used is too aggressive, and will eliminate the
indirection in some cases where the runtime choice of callee version
can't be determined statically at compile time.  A correct would need to
verify that:
- if the current caller version were selected at runtime, then the
   chosen callee version would be eligible for selection.
- if any higher priority callee version were selected at runtime, then
   a higher priority caller version would have been eligble for
   selection (and hence the current caller version wouldn't have been
   selected).

The current checks only verify a more restrictive version of the first
condition, and don't check the second condition at all.

Fixing the optimisation properly would require implementing target hooks
to check for implications between version attributes, which is too
complicated for this stage.  However, I would like to see this hook
implemented in the future, since it could also help deduplicate other
multiversioning code.

Since this behavior has existed for x86 and powerpc for a while, I
think it's best to preserve the existing behavior on those targets,
unless any maintainer for those targets disagrees.

gcc/ChangeLog:

* multiple_target.cc
(redirect_to_specific_clone): Assert that "target" attribute is
used for FMV before checking it.
(ipa_target_clone): Skip redirect_to_specific_clone on some
targets.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/mv-pragma.C: New test.

tree-optimization/117912 - bogus address equivalences for __builtin_object_size

VN again is the culprit for exploiting address equivalences before
__builtin_object_size got the chance to do its job. This time
it isn't about union members but adjacent structure fields where
an address to one after the last element of an array field can
spill over to the next field.

The following protects all out-of-bound accesses on the upper bound
side (singling out TYPE_MAX_VALUE + 1 is more expensive). It
ignores other out-of-bound addresses that would invoke UB.

Zero-sized arrays are a bit awkward because the C++ represents them
with a -1U upper bound.

There's a similar issue for zero-sized components whose address can
be the same as the adjacent field in C.

PR tree-optimization/117912
* tree-ssa-sccvn.cc (copy_reference_ops_from_ref): For addresses
of zero-sized components do not set ->off if the object size pass
didn't run.
For OOB ARRAY_REF accesses in address expressions avoid setting
->off if the object size pass didn't run.
(valueize_refs_1): Likewise.

* c-c++-common/torture/pr117912-1.c: New testcase.
* c-c++-common/torture/pr117912-2.c: Likewise.
* c-c++-common/torture/pr117912-3.c: Likewise.

(cherry picked from commit 233972ab3b5338d7a5d1d7af9108c1f366170e44)

doc: cpp: fix version test example syntax

gcc/ChangeLog:

* doc/cpp.texi (Common Predefined Macros): Fix syntax.

Daily bump.

libstdc++: Use feature test macro for pmr::polymorphic_allocator<>

Check the __glibcxx_polymorphic_allocator macro instead of just checking
whether __cplusplus > 201703L.

libstdc++-v3/ChangeLog:

* include/bits/memory_resource.h (polymoprhic_allocator): Use
feature test macro for P0339R6 features.

(cherry picked from commit b26d92f4f71594206385d6f645ff626c0bf9b59c)

libstdc++: Improve Doxygen docs for std::allocator_traits specializations

The main fix here is to use @header so that the docs show the correct
header file instead of an internal header like alloc_traits.h.

libstdc++-v3/ChangeLog:

* include/bits/alloc_traits.h: Improve doxygen docs for
allocator_traits specializations.
* include/bits/memory_resource.h: Likewise.

(cherry picked from commit cd8e0ea7273425931a0843f4355ad61e177e1bf2)

libstdc++: Undeprecate std::pmr::polymorphic_allocator::destroy (P2875R4)

This member function was previously deprecated, but that was reverted by
P2875R4, approved earlier this year in Tokyo. Since it's not going to be
deprecated in C++26, and so presumably not removed, there is no point in
giving deprecated warnings for C++23 mode.

libstdc++-v3/ChangeLog:

* include/bits/memory_resource.h (polymorphic_allocator::destroy):
Remove deprecated attribute.

(cherry picked from commit c3e237338eb7ffc90f3cc8d32a3971d17f6d0b31)

libstdc++: Give std::memory_order a fixed underlying type [PR89624]

Prior to C++20 this enum type doesn't have a fixed underlying type,
which means it can be modified by -fshort-enums, which then means the
HLE bits are outside the range of valid values for the type.

As it has a fixed type of int in C++20 and later, do the same for
earlier standards too. This is technically a change for C++17 down,
because the implicit underlying type (without -fshort-enums) was
unsigned before. I doubt it matters in practice. That incompatibility
already exists between C++17 and C++20 and nobody has noticed or
complained. Now at least the underlying type will be int for all -std
modes.

libstdc++-v3/ChangeLog:

PR libstdc++/89624
* include/bits/atomic_base.h (memory_order): Use int as
underlying type.
* testsuite/29_atomics/atomic/89624.cc: New test.

(cherry picked from commit 99dd1be14172445795f0012b935359e7014a2215)

libstdc++: Fix typo in comment in src/c++17/fs_dir.cc

libstdc++-v3/ChangeLog:

* src/c++17/fs_dir.cc: Fix typo in comment.

(cherry picked from commit 93069606949f45cb5f60bc7f4efc4860c23c8e1c)

libstdc++: Make std::println use locale from ostream (LWG 4088)

This was just approved in Wrocław.

libstdc++-v3/ChangeLog:

* include/std/ostream (println): Pass stream's locale to
std::format, as per LWG 4088.
* testsuite/27_io/basic_ostream/print/1.cc: Check std::println
with custom locale. Remove unused brit_punc class.

(cherry picked from commit 1fd7e36e990396c22823cedd068def2aa3b112ce)

libstdc++: Fix some typos and grammatical errors in docs

Also remove some redundant 'void' parameters from code examples.

libstdc++-v3/ChangeLog:

* doc/xml/manual/using_exceptions.xml: Fix typos and grammatical
errors.
* doc/html/manual/using_exceptions.html: Regenerate.

(cherry picked from commit 96566cc46d633c2026976e585b5743e880a8f99b)