git.ipfire.org Git - thirdparty/gcc.git/log

fortran: Fix up function types for realloc and sincos{,f,l} builtins [PR108349]

As reported in the PR, the FUNCTION_TYPE for __builtin_realloc in the
Fortran FE is wrong since r0-100026-gb64fca63690ad which changed
-  tmp = tree_cons (NULL_TREE, pvoid_type_node, void_list_node);
-  tmp = tree_cons (NULL_TREE, size_type_node, tmp);
-  ftype = build_function_type (pvoid_type_node, tmp);
+  ftype = build_function_type_list (pvoid_type_node,
+                                    size_type_node, pvoid_type_node,
+                                    NULL_TREE);
   gfc_define_builtin ("__builtin_realloc", ftype, BUILT_IN_REALLOC,
                      "realloc", false);
The return type is correct, void *, but the first argument should be
void * too and only second one size_t, while the above change changed
realloc to be void *__builtin_realloc (size_t, void *);
I went through all other changes from that commit and found that
__builtin_sincos{,f,l} got broken as well, instead of the former
void __builtin_sincos{,f,l} (ftype, ftype *, ftype *);
where ftype is {double,float,long double} it is now incorrectly
void __builtin_sincos{,f,l} (ftype *, ftype *);

The following patch fixes that, plus some formatting issues around
the spots I've changed.
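
As a rough illustration of the argument order at issue, the sketch below stores the C library realloc in a function pointer whose type puts the pointer first and the size second. The names realloc_fn and grow_copy are made up for this example and are not part of the patch.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* The shape realloc must have: old pointer first, new size second.  */
typedef void *(*realloc_fn) (void *, size_t);

/* Hypothetical helper: copy SRC into a heap buffer, then grow it by EXTRA
   bytes through a pointer of the type above.  Returns NULL on failure.  */
static char *
grow_copy (const char *src, size_t extra)
{
  realloc_fn fn = realloc;  /* Types agree, so no conversion is needed.  */
  size_t len = strlen (src) + 1;
  char *p = malloc (len);
  if (p == NULL)
    return NULL;
  memcpy (p, src, len);
  char *q = fn (p, len + extra);
  if (q == NULL)
    {
      free (p);
      return NULL;
    }
  return q;
}
```

A front end that declares the builtin as (size_t, void *) instead describes a different function type than the one the C library actually provides.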

2023-01-11  Jakub Jelinek  <jakub@redhat.com>

PR fortran/108349
* f95-lang.c (gfc_init_builtin_function): Fix up function types
for BUILT_IN_REALLOC and BUILT_IN_SINCOS{F,,L}.  Formatting fixes.

(cherry picked from commit 0986c351aa8a9f08b3cb614baec13564dd62c114)

generic-match-head: Don't assume GENERIC folding is done only early [PR108237]

We ICE on the following testcase, because a valid V2DImode
!= comparison is folded into an unsupported V2DImode > comparison.
The match.pd pattern which does this looks like:
/* Transform comparisons of the form (X & Y) CMP 0 to X CMP2 Z
   where ~Y + 1 == pow2 and Z = ~Y.  */
(for cst (VECTOR_CST INTEGER_CST)
(for cmp (eq ne)
      icmp (le gt)
  (simplify
   (cmp (bit_and:c@2 @0 cst@1) integer_zerop)
    (with { tree csts = bitmask_inv_cst_vector_p (@1); }
     (if (csts && (VECTOR_TYPE_P (TREE_TYPE (@1)) || single_use (@2)))
      (with { auto optab = VECTOR_TYPE_P (TREE_TYPE (@1))
                         ? optab_vector : optab_default;
              tree utype = unsigned_type_for (TREE_TYPE (@1)); }
       (if (target_supports_op_p (utype, icmp, optab)
            || (optimize_vectors_before_lowering_p ()
                && (!target_supports_op_p (type, cmp, optab)
                    || !target_supports_op_p (type, BIT_AND_EXPR, optab))))
        (if (TYPE_UNSIGNED (TREE_TYPE (@1)))
         (icmp @0 { csts; })
         (icmp (view_convert:utype @0) { csts; })))))))))
and that optimize_vectors_before_lowering_p () guarded stuff there
already deals with this problem, not trying to fold a supported comparison
into a non-supported one.  The reason it doesn't work in this case is that
it isn't GIMPLE folding which does this, but GENERIC folding done during
forwprop4 - forward_propagate_into_comparison -> forward_propagate_into_comparison_1
-> combine_cond_expr_cond -> fold_binary_loc -> generic_simplify
and we simply assumed that GENERIC folding happens only before
gimplification.

The following patch fixes that by checking cfun properties instead of
always returning true in those cases.

2023-01-04  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/108237
* generic-match-head.c: Include tree-pass.h.
(canonicalize_math_p): Define to false if cfun and
cfun->curr_properties has PROP_gimple_opt_math resp. PROP_gimple_lvec
property set.

* gcc.c-torture/compile/pr108237.c: New test.

(cherry picked from commit 345dffd0d4ebff7e705dfff1a8a72017a167120a)

tree-ssa-dom: can_infer_simple_equiv fixes [PR108068]

As reported in the PR, tree-ssa-dom.cc uses real_zerop call to find
if a floating point constant is zero and it shouldn't try to infer
equivalences from comparison against it if signed zeros are honored.
This doesn't work at all for decimal types, because real_zerop always
returns false for them (one can have different representations of decimal
zero beyond -0/+0), and it doesn't work for vector compares either,
as real_zerop checks if all elements are zero, while we need to avoid
inferring equivalences from comparison against vector constants which have
at least one zero element in it (if signed zeros are honored).
Furthermore, as mentioned by Joseph, for decimal types many other values
aren't singleton.

So, this patch stops inferring anything if element mode is decimal, and
otherwise uses instead of real_zerop a new function, real_maybe_zerop,
which will work even for decimal types and for complex or vector will
return true if any element is or might be zero (so it returns true
for anything but constants for now).
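
To see why a comparison against zero is not enough to infer an equivalence when signed zeros are honored, consider the small sketch below. It assumes IEEE 754 doubles and no -ffast-math; the function names are hypothetical.

```c
#include <math.h>

/* Returns 1 when X compares equal to 0.0 but is not the value +0.0,
   which is the case for -0.0.  */
static int
equal_to_zero_but_negative (double x)
{
  return x == 0.0 && signbit (x) != 0;
}

/* 1.0 / x tells the two zeros apart: +inf for +0.0, -inf for -0.0.
   So replacing x by the constant 0.0 after x == 0.0 could change this.  */
static int
reciprocal_is_negative (double x)
{
  return 1.0 / x < 0.0;
}
```

Both functions give different answers for 0.0 and -0.0 even though the two values compare equal.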

2022-12-23 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/108068
* tree.h (real_maybe_zerop): Declare.
* tree.c (real_maybe_zerop): Define.
* tree-ssa-dom.c (record_edge_info): Use it instead of
real_zerop or TREE_CODE (op1) == SSA_NAME || real_zerop. Always set
can_infer_simple_equiv to false for decimal floating point types.

* gcc.dg/dfp/pr108068.c: New test.

(cherry picked from commit fd1b0aefda5b65f3f841ca6e61ccea6a72daa060)

cse: Fix up CSE const_anchor handling [PR108193]

The following testcase ICEs on aarch64, because insert_const_anchor
inserts invalid CONST_INT into the CSE tables - 0x80000000 for SImode.
The second hunk of the patch fixes that, the first one is to avoid
triggering undefined behavior at compile time during compute_const_anchors
computations - performing those additions and subtractions in
HOST_WIDE_INT means it can overflow for certain constants.
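
The two points above can be modelled outside the compiler. The sketch below is not the cse.c code; canon_si and wrapping_add are hypothetical names, int64_t stands in for HOST_WIDE_INT, and the conversion of an out-of-range unsigned value back to int64_t is assumed to wrap (as GCC documents).

```c
#include <stdint.h>

/* Sign-extend the low 32 bits of V into a 64-bit host integer.  This models
   the form a 32-bit constant is expected to have; 0x80000000 becomes
   -2147483648 rather than staying a positive 64-bit number.  */
static int64_t
canon_si (int64_t v)
{
  uint64_t u = (uint64_t) v & UINT64_C (0xffffffff);
  /* Flip bit 31, then subtract it again; unsigned arithmetic wraps.  */
  u = (u ^ UINT64_C (0x80000000)) - UINT64_C (0x80000000);
  return (int64_t) u;
}

/* Add in the unsigned type, where wraparound is defined, instead of in the
   signed type, where overflow is undefined behavior.  */
static uint64_t
wrapping_add (int64_t a, int64_t b)
{
  return (uint64_t) a + (uint64_t) b;
}
```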

2022-12-22 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/108193
* cse.c (compute_const_anchors): Change n type to
unsigned HOST_WIDE_INT, adjust comparison against it to avoid
warnings. Formatting fix.
(insert_const_anchor): Use gen_int_mode instead of GEN_INT.

* gfortran.dg/pr108193.f90: New test.

(cherry picked from commit 0cb5d7cdbab8e5f8359764ef5f62d93c2bc88552)

openmp: Don't try to destruct DECL_OMP_PRIVATIZED_MEMBER vars [PR108180]

DECL_OMP_PRIVATIZED_MEMBER vars are artificial vars with DECL_VALUE_EXPR
of this->field used just during gimplification and omp lowering/expansion
to privatize individual fields in methods when needed.
As the following testcase shows, when not in templates, they were handled
right, but in templates we actually called cp_finish_decl on them and
that can result in their destruction, which is obviously undesirable,
we should only destruct the privatized copies of them created in omp
lowering.

Fixed thusly.

2022-12-21 Jakub Jelinek <jakub@redhat.com>

PR c++/108180
* pt.c (tsubst_expr): Don't call cp_finish_decl on
DECL_OMP_PRIVATIZED_MEMBER vars.

* testsuite/libgomp.c++/pr108180.C: New test.

(cherry picked from commit 1119902b6c7c1c50123ed85ec1def8be4772d68c)

testsuite: Fix up pr64536.c for LLP64 targets [PR108151]

Apparently llp64 had 2 further warnings, fixed thusly.

2022-12-19 Jakub Jelinek <jakub@redhat.com>

PR testsuite/108151
* gcc.dg/pr64536.c (bar): Cast long to __INTPTR_TYPE__
before casting to long *.

(cherry picked from commit 6e85f89a7d59a99a3395b6e153b99262a58b2f6c)

testsuite: Fix up pr64536.c for LLP64 targets [PR108151]

The test casts a pointer to long, which is ok for ilp32 and lp64
targets but not for llp64 targets. Nothing reads the values later,
it is a link test, so all we care about is that it is the same
cast on s390x-linux where it used to fail before the PR64536 fix,
and that we don't warn about it.
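
A minimal sketch of the portable form of such a cast follows. It uses intptr_t from <stdint.h> rather than the __INTPTR_TYPE__ macro the test uses, and assumes a hosted target that provides intptr_t; roundtrip_ok is a hypothetical name.

```c
#include <stdint.h>

/* Round-trip P through an integer type that is wide enough for a pointer.
   On LLP64 targets long is only 32 bits, so casting through long there
   would truncate and draw a warning.  */
static int
roundtrip_ok (int *p)
{
  intptr_t v = (intptr_t) (void *) p;
  return (int *) (void *) v == p;
}
```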

2022-12-19 Jakub Jelinek <jakub@redhat.com>

PR testsuite/108151
* gcc.dg/pr64536.c (bar): Use casts to __INTPTR_TYPE__ rather than
long when casting pointer to integral type.

(cherry picked from commit ea37e96a37b50dad17b91d46edc518bbb9132d8e)

loop-invariant: Split preheader edge if the preheader bb ends with jump [PR106751]

The RTL loop passes only request simple preheaders, but don't require
fallthru preheaders, while move_invariant_reg apparently assumes the
latter, that it can just append instruction(s) to the end of the preheader
basic block.

The following patch fixes that by splitting the preheader edge if
the preheader bb ends with a JUMP_INSN (asm goto in this case).
Without that we get control flow in the middle of a bb.

2022-12-16 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/106751
* loop-invariant.c (move_invariant_reg): If preheader bb ends
with a JUMP_INSN, split the preheader edge and emit invariants
into the new preheader basic block.

* gcc.c-torture/compile/pr106751.c: New test.

(cherry picked from commit ddcaa60983b50378bde1b7e327086fe0ce101795)

c++: Ensure !!var is not an lvalue [PR107065]

The TRUTH_NOT_EXPR case in cp_build_unary_op is one of the spots where
we somewhat fold immediately using invert_truthvalue_loc.
I've tried using
  return build1_loc (location, TRUTH_NOT_EXPR, boolean_type_node, arg);
in there instead, but unfortunately that regressed
Wlogical-not-parentheses-*.c pr49706.c pr62199.c pr65120.c sequence-pt-1.C
tests, so at least for backporting that doesn't seem to be a way to go.

So, this patch instead wraps it into NON_LVALUE_EXPR if needed (which also
needs a tweak for some tests in the pr47906.c test, but nothing major),
with the intent to make it backportable, and later I'll try to do further
steps to avoid folding here prematurely.  Most of the problems with
build1 TRUTH_NOT_EXPR are that it doesn't even invert comparisons as most
common case and lots of warning code isn't able to deal with ! around
comparisons; so perhaps one way to do this would be fold by hand only
invertible comparisons and for the rest create TRUTH_NOT_EXPR.

2022-12-15  Jakub Jelinek  <jakub@redhat.com>

PR c++/107065
gcc/cp/
* typeck.c (cp_build_unary_op) <case TRUTH_NOT_EXPR>: If
invert_truthvalue_loc returns obvalue_p, wrap it into NON_LVALUE_EXPR.
* parser.c (cp_parser_binary_expression): Don't call
warn_logical_not_parentheses if current.lhs is a NON_LVALUE_EXPR
of a decl with boolean type.
gcc/testsuite/
* g++.dg/cpp0x/pr107065.C: New test.

(cherry picked from commit 8b775b4c48a3cc4ef5c50e56144aea02da2e9cc6)

ivopts: Fix IP_END handling for asm goto [PR107997]

The following testcase ICEs, because the latch bb ends with
asm goto which has both fallthrough to the header and one or more labels
in the header too. In that case there is just a single edge out of the
latch block, but still the asm goto is stmt_ends_bb_p statement, yet
ivopts decides to emit an IV bump at the IP_END position and inserts
it into the same bb as the asm goto after it, which then fails verification
(control flow in the middle of bb).

The following patch fixes it by splitting the latch -> header edge in that
case and inserting into the newly created bb, where split_edge ->
redirect_edge_and_branch is able to deal with this case correctly.

2022-12-10 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/107997
* tree-ssa-loop-ivopts.c: Include cfganal.h.
(create_new_iv) <case IP_END>: If ip_end_pos bb is non-empty and ends
with a stmt which ends bb, instead of adding iv update after it split
the latch edge and insert iterator into the new latch bb.

* gcc.c-torture/compile/pr107997.c: New test.

(cherry picked from commit 7676235f690e624b7ed41a22b22ce8ccfac1492f)

cfgbuild: Fix DEBUG_INSN handling in find_bb_boundaries [PR106719]

The following testcase FAILs on aarch64-linux.  We have some atomic
instruction followed by 2 DEBUG_INSNs (if -g only of course) followed
by NOTE_INSN_EPILOGUE_BEG followed by some USE insn.
Now, split3 pass replaces the atomic instruction with a code sequence
which ends with a conditional jump and the split3 pass calls
find_many_sub_basic_blocks.
For -g0, find_bb_boundaries sees the flow_transfer_insn (the new conditional
jump), then NOTE_INSN_EPILOGUE_BEG which can live in between basic blocks
and then the USE insn, so splits block after the NOTE_INSN_EPILOGUE_BEG
and puts the NOTE in between the blocks.
For -g, it sees a DEBUG_INSN after the flow_transfer_insn, so sets
debug_insn to it, then walks over another DEBUG_INSN, NOTE_INSN_EPILOGUE_BEG
until it finally sees the USE insn, and triggers the:
          rtx_insn *prev = PREV_INSN (insn);

          /* If the first non-debug inside_basic_block_p insn after a control
             flow transfer is not a label, split the block before the debug
             insn instead of before the non-debug insn, so that the debug
             insns are not lost.  */
          if (debug_insn && code != CODE_LABEL && code != BARRIER)
            prev = PREV_INSN (debug_insn);
code I've added for PR81325.  If there are only DEBUG_INSNs, that is
the right thing to do, but if in between debug_insn and insn there are
notes which can stay in between basic blocks or similarly JUMP_TABLE_DATA
or their associated CODE_LABELs, it causes -fcompare-debug differences.

The following patch fixes it by clearing debug_insn if JUMP_TABLE_DATA
or associated CODE_LABEL is seen (I'm afraid there is no good answer
what to do with DEBUG_INSNs before those; the code then removes them:
              /* Clean up the bb field for the insns between the blocks.  */
              for (x = NEXT_INSN (flow_transfer_insn);
                   x != BB_HEAD (fallthru->dest);
                   x = next)
                {
                  next = NEXT_INSN (x);
                  /* Debug insns should not be in between basic blocks,
                     drop them on the floor.  */
                  if (DEBUG_INSN_P (x))
                    delete_insn (x);
                  else if (!BARRIER_P (x))
                    set_block_for_insn (x, NULL);
                }
but if there are NOTEs, the patch just reorders the NOTEs and DEBUG_INSNs,
such that the NOTEs come first (so that they stay in between basic blocks
like with -g0) and DEBUG_INSNs after those (so that bb is split before
them, so they will be in the basic block after NOTE_INSN_BASIC_BLOCK).

2022-12-08  Jakub Jelinek  <jakub@redhat.com>

PR debug/106719
* cfgbuild.c (find_bb_boundaries): If there are NOTEs in between
debug_insn (seen after flow_transfer_insn) and insn, move NOTEs
before all the DEBUG_INSNs and split after NOTEs.  If there are
other insns like jump table data, clear debug_insn.

* gcc.dg/pr106719.c: New test.

(cherry picked from commit d9f9d5d30feb33c359955d7030cc6be50ef6dc0a)

asan: Fix up error recovery for too large frames [PR107317]

asan_emit_stack_protection and functions it calls have various asserts that
verify sanity of the stack protection instrumentation. But, that
verification can easily fail if we've diagnosed a frame offset overflow.
asan_emit_stack_protection just emits some extra code in the prologue,
if we've reported errors, we aren't producing assembly, so it doesn't
really matter if we don't include the protection code, compilation
is going to fail anyway.

2022-11-24 Jakub Jelinek <jakub@redhat.com>

PR middle-end/107317
* asan.c: Include diagnostic-core.h.
(asan_emit_stack_protection): Return NULL early if seen_error ().

* gcc.dg/asan/pr107317.c: New test.

(cherry picked from commit b6330a7685476fc30b8ae9bbf3fca1a9b0d4be95)

i386: Uglify some local identifiers in *intrin.h [PR107748]

While reporting PR107748 (where there is a problem with non-uglified names,
but I've left it out because it needs fixing anyway), I've noticed
various spots where identifiers in *intrin.h headers weren't uglified.
The following patch fixed those that are related to unions (I've grepped
for [a-zA-Z]\.[a-zA-Z] spots).
The reason we need those to be uglified is the same as why the arguments
of the inlines are __ prefixed and most of automatic vars in the inlines
- say a, v or u aren't part of implementation namespace and so users could
#define u whatever->something
#include <x86intrin.h>
and it should still work, as long as u is not e.g. one of the names
of the functions/macros the header provides (_mm* etc.).
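
The effect can be sketched without the real header. In the example below the user macros and the function float_bits are hypothetical; the function only mimics the style of a header inline that uses reserved (double underscore) names, and it assumes a 32-bit int and IEEE single precision float.

```c
/* A user may define macros with non-reserved names before including
   a header.  */
#define u broken->expansion
#define i also.broken

/* Header-style inline that uses only reserved names, so the macros above
   cannot change its meaning.  Had the union been named u with members
   i and f, the definitions above would have broken it.  */
static inline int
float_bits (float __x)
{
  union { int __i; float __f; } __u;
  __u.__f = __x;
  return __u.__i;
}

/* Remove the illustrative macros again.  */
#undef u
#undef i
```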

2022-11-21 Jakub Jelinek <jakub@redhat.com>

PR target/107748
* config/i386/smmintrin.h (_mm_extract_ps): Uglify names of local
variables and union members.

(cherry picked from commit ec8ec09f9414be871e322fecf4ebf53e3687bd22)

reg-stack: Fix a -fcompare-debug bug in reg-stack [PR107183]

As the following testcase shows, the swap_rtx_condition function
in reg-stack can result in different code generation between -g and -g0.
The function is doing the changes as it goes, so does analysis and
changes together, which makes it harder to deal with DEBUG_INSNs,
where normally analysis phase ignores them and the later phase
doesn't.
swap_rtx_condition walks instructions two different ways, one is
using next_flags_user function which stops on non-call instructions
that mention the flags register, and the other is a loop on fnstsw
where it stops on instructions mentioning it and tries to find
sahf instruction that uses it (in both cases calls stop it and so
does end of basic block).
Now both of these currently stop on DEBUG_INSNs that mention
the flags register resp. the fnstsw result register.
On success the function recurses on next flags user instruction
if still live and if the recursion failed, reverts the changes
it did too and fails.
If it were just for the next_flags_user case, the fix could be
just not doing
      INSN_CODE (insn) = -1;
      if (recog_memoized (insn) == -1)
        fail = 1;
on DEBUG_INSNs (assuming all changes to those are fine),
swap_rtx_condition_1 just changes one comparison to a different
one.  But due to the possibility of fnstsw result being used
in theory before sahf in some DEBUG_INSNs, this patch takes
a different approach.  swap_rtx_condition has now a new argument
and two modes.  The first mode is when debug_seen is >= 0, in this
case both next_flags_user and the loop for fnstsw -> sahf will
ignore but note DEBUG_INSNs (that mention flags register or fnstsw
result).  If no such DEBUG_INSN is found during the whole call
including recursive invocations (so e.g. for -g0 but probably most
often for -g as well), it behaves as before, if it returns true
all the changes are done and nothing further needs to be done later.
If any DEBUG_INSNs are seen along the way, even when returning success
all the changes are reverted, so it just reports that the function
would be successful if DEBUG_INSNs were ignored.
In this case, compare_for_stack_reg needs to call it again in
debug_seen = -1 mode, which tells the function to update everything
including DEBUG_INSNs.  For the fnstsw -> sahf case which I hope
will be very rare I just reset the DEBUG_INSNs, I don't really
know how to express it easily otherwise.  For the rest
swap_rtx_condition_1 is done even on the DEBUG_INSNs.

2022-11-20  Jakub Jelinek  <jakub@redhat.com>

PR target/107183
* reg-stack.c (next_flags_user): Add DEBUG_SEEN argument.
If >= 0 and a DEBUG_INSN would be otherwise returned, set
DEBUG_SEEN to 1 and ignore it.
(swap_rtx_condition): Add DEBUG_SEEN argument.  In >= 0
mode only set DEBUG_SEEN to 1 if problematic DEBUG_INSNs
were seen and revert all changes on success in that case.
Don't try to recog_memoized DEBUG_INSNs.
(compare_for_stack_reg): Adjust swap_rtx_condition caller.
If it returns true and debug_seen is 1, call swap_rtx_condition
again with debug_seen -1.

* gcc.dg/ubsan/pr107183.c: New test.

(cherry picked from commit 6b5c98c1c0003bd470a4428bede6c862637a94b8)

c, c++: Fix up excess precision handling of scalar_to_vector conversion [PR107358]

As mentioned earlier in the C++ excess precision support mail, the following
testcase is broken with excess precision both in C and C++ (though just in C++
it was triggered in real-world code).
scalar_to_vector is called in both FEs after the excess precision promotions
(or stripping of EXCESS_PRECISION_EXPR), so we can then get invalid
diagnostics that say float vector + float involves truncation (on ia32
from long double to float).

The following patch fixes that by calling scalar_to_vector on the operands
before the excess precision promotions, let scalar_to_vector just do the
diagnostics (it does e.g. fold_for_warn so it will fold
EXCESS_PRECISION_EXPR around REAL_CST to constants etc.) but will then
do the actual conversions using the excess precision promoted operands
(so say if we have vector double + (float + float) we don't actually do
vector double + (float) ((long double) float + (long double) float)
but
vector double + (double) ((long double) float + (long double) float)

2022-10-24 Jakub Jelinek <jakub@redhat.com>

PR c++/107358
gcc/c/
* c-typeck.c (build_binary_op): Pass operands before excess precision
promotions to scalar_to_vector call.
gcc/testsuite/
* c-c++-common/pr107358.c: New test.

(cherry picked from commit 65e3274e363cb2c6bfe6b5e648916eb7696f7e2f)

c++: Fix up constexpr handling of char/signed char/short pre/post inc/decrement [PR105774]

signed char, char or short int pre/post inc/decrement are represented by
normal {PRE,POST}_{INC,DEC}REMENT_EXPRs in the FE and only gimplification
ensures that the {PLUS,MINUS}_EXPR is done in unsigned version of those
types:
    case PREINCREMENT_EXPR:
    case PREDECREMENT_EXPR:
    case POSTINCREMENT_EXPR:
    case POSTDECREMENT_EXPR:
      {
        tree type = TREE_TYPE (TREE_OPERAND (*expr_p, 0));
        if (INTEGRAL_TYPE_P (type) && c_promoting_integer_type_p (type))
          {
            if (!TYPE_OVERFLOW_WRAPS (type))
              type = unsigned_type_for (type);
            return gimplify_self_mod_expr (expr_p, pre_p, post_p, 1, type);
          }
        break;
      }
This means during constant evaluation we need to do it similarly (either
using unsigned_type_for or using widening to integer_type_node).
The following patch does the latter.
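
The two options mentioned above can be sketched in plain C as follows. This is not the constexpr.c code; the function names are hypothetical and an 8-bit char with a wider int is assumed.

```c
#include <limits.h>

/* Widening variant: promote to int and add 1 there.  The sum cannot
   overflow because int is wider than signed char on the targets assumed
   here, so SCHAR_MAX + 1 is simply 128.  */
static int
incremented_in_int (signed char c)
{
  int wide = c;
  return wide + 1;
}

/* Unsigned variant: do the addition in the unsigned counterpart of the
   type, where wraparound is defined.  */
static unsigned char
incremented_in_unsigned (signed char c)
{
  return (unsigned char) ((unsigned char) c + 1u);
}
```

In both variants the addition itself never overflows; only the final conversion back to the narrow signed type has to deal with an out-of-range value.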

2022-10-24  Jakub Jelinek  <jakub@redhat.com>

PR c++/105774
* constexpr.c (cxx_eval_increment_expression): For signed types
that promote to int, evaluate PLUS_EXPR or MINUS_EXPR in int type.

* g++.dg/cpp1y/constexpr-105774.C: New test.

(cherry picked from commit da8c362c4c18cff2f2dfd5c4706bdda7576899a4)

libgomp: Fix up creation of artificial teams

When not in explicit parallel/target/teams construct, we in some cases create
an artificial parallel with a single thread (either to handle target nowait
or for task reduction purposes).  In those cases, the code handled an artificially
created implicit task (created by gomp_new_icv for cases where we needed to write
to some ICVs), but as the testcases show, didn't take into account the possibility
of this being done from explicit task(s).  The code would destroy/free the previous
task and replace it with the new implicit task.  If task is an explicit task
(when teams is NULL, all explicit tasks behave like if (0)), it is a pointer to
a local stack variable, so freeing it doesn't work, and additionally we shouldn't
lose the explicit tasks - the new implicit task should instead replace the
ancestor task which is the first implicit one.

2022-10-12  Jakub Jelinek  <jakub@redhat.com>

* task.c (gomp_create_artificial_team): Fix up handling of invocations
from within explicit task.
* target.c (GOMP_target_ext): Likewise.
* testsuite/libgomp.c/task-7.c: New test.
* testsuite/libgomp.c/task-8.c: New test.
* testsuite/libgomp.c-c++-common/task-reduction-17.c: New test.
* testsuite/libgomp.c-c++-common/task-reduction-18.c: New test.

(cherry picked from commit a58a965eb73253759f6a3e1c7380392557da89c8)

openmp: Fix ICE with taskgroup at -O0 -fexceptions [PR107001]

The following testcase ICEs because with -O0 -fexceptions GOMP_taskgroup_end
call isn't directly followed by GOMP_RETURN statement, but there are some
conditionals to handle exceptions and we fail to find the correct GOMP_RETURN.

The fix is to treat taskgroup similarly to target data, both of these constructs
emit a try { body } finally { end_call } around the construct's body during
gimplification and we need to see proper construct nesting during gimplification
and omp lowering (including nesting of regions checks), but during omp expansion
we don't really need their nesting anymore, all we need is emit something at
the start of the region and the end of the region is the end API call we've
already emitted during gimplification.  For target data, we weren't adding
GOMP_RETURN statement during omp lowering, so after that pass it is treated
merely like stand-alone omp directives.  This patch does the same for
taskgroup too.

2022-09-24  Jakub Jelinek  <jakub@redhat.com>

PR c/107001
* omp-low.c (lower_omp_taskgroup): Don't add GOMP_RETURN statement
at the end.
* omp-expand.c (build_omp_regions_1): Clarify GF_OMP_TARGET_KIND_DATA
is not stand-alone directive.  For GIMPLE_OMP_TASKGROUP, also don't
update parent.
(omp_make_gimple_edges) <case GIMPLE_OMP_TASKGROUP>: Reset
cur_region back after new_omp_region.

* c-c++-common/gomp/pr107001.c: New test.

(cherry picked from commit ad2aab5c816a6fd56b46210c0a4a4c6243da1de9)

openmp, c: Tighten up c_tree_equal [PR106981]

This patch changes c_tree_equal to work more like cp_tree_equal, be
more strict in what it accepts.  The ICE on the first testcase was
due to INTEGER_CST wi::wide (t1) == wi::wide (t2) comparison which
ICEs if the two constants have different precision, but as the second
testcase shows, being too lenient in it can also lead to miscompilation
of valid OpenMP programs where we think certain expression is the same
even when it isn't and can be guaranteed at runtime to represent different
memory location.  So, the patch looks through only NON_LVALUE_EXPRs
and for constants as well as casts requires that the types match before
actually comparing the constant values or recursing on the cast operands.

2022-09-24  Jakub Jelinek  <jakub@redhat.com>

PR c/106981
gcc/c/
* c-typeck.c (c_tree_equal): Only strip NON_LVALUE_EXPRs at the
start.  For CONSTANT_CLASS_P or CASE_CONVERT: return false if t1 and
t2 have different types.
gcc/testsuite/
* c-c++-common/gomp/pr106981.c: New test.
libgomp/
* testsuite/libgomp.c-c++-common/pr106981.c: New test.

(cherry picked from commit 3c5bccb608c665ac3f62adb1817c42c845812428)

c++: Implement P2327R1 - De-deprecating volatile compound operations

From what I can see, this has been voted in as a DR and as it means
we warn less often than before in -std={gnu,c}++2{0,3} modes or with
-Wvolatile, I wonder if it shouldn't be backported to affected release
branches as well.

2022-08-16 Jakub Jelinek <jakub@redhat.com>

* typeck.c (cp_build_modify_expr): Implement
P2327R1 - De-deprecating volatile compound operations. Don't warn
for |=, &= or ^= with volatile lhs.
* expr.c (mark_use) <case MODIFY_EXPR>: Adjust warning wording,
leave out simple.

* g++.dg/cpp2a/volatile1.C: Adjust for de-deprecation of volatile
compound |=, &= and ^= operations.
* g++.dg/cpp2a/volatile3.C: Likewise.
* g++.dg/cpp2a/volatile5.C: Likewise.

(cherry picked from commit 6e790ca4615443fa395ac5cdba1ab6c87810985c)

cgraphunit: Don't emit asm thunks for -dx [PR106261]

When the -dx option is used (didn't know we have it and no idea what it is
useful for), we just expand functions to RTL and then omit all further
RTL passes, so the normal functions aren't actually emitted into assembly,
just variables.
The following testcase ICEs, because we don't emit the methods, but do
emit thunks pointing to that and those thunks have unwind info and rely on
at least some real functions to be emitted (which is normally the case,
thunks are only emitted for locally defined functions) because otherwise
there are no CIEs, only FDEs and dwarf2out is upset about it.

The following patch fixes that by not emitting assembly thunks for -dx
either.

2022-07-27 Jakub Jelinek <jakub@redhat.com>

PR debug/106261
* cgraphunit.c (cgraph_node::assemble_thunks_and_aliases): Don't
output asm thunks for -dx.

* g++.dg/debug/pr106261.C: New test.

(cherry picked from commit f9671b60f9395cb1dca128b92f5dd215f5aeaae1)

wide-int: Fix up wi::shifted_mask [PR106144]

As the following self-test testcase shows, wi::shifted_mask sometimes
doesn't create canonicalized wide_ints, which then fail to compare equal
to canonicalized wide_ints with the same value.
In particular, wi::mask (128, false, 128) gives { -1 } with len 1 and prec 128,
while wi::shifted_mask (0, 128, false, 128) gives { -1, -1 } with len 2
and prec 128.
The problem is that the code is written with the assumption that there are
3 bit blocks (or 2 if start is 0), but doesn't consider the possibility
where there are 2 bit blocks (or 1 if start is 0) where the highest block
isn't present.  In that case, there is the optional block of negate ? 0 : -1
elts, followed by just one elt (either one from the if (shift) or just
negate ? -1 : 0) and the rest is implicit sign-extension.
Only if end < prec there is 1 or more bits above it that have different bit
value and so we need to emit all the elts till end and then one more elt.

if (end == prec) would work too, because we have:
  if (width > prec - start)
    width = prec - start;
  unsigned int end = start + width;
so end is guaranteed to be end <= prec, dunno what is preferred.
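
The canonical form in question can be modelled roughly as below. This is a simplified stand-in, not the wide-int.cc code: canonical_len is a hypothetical name, elements are 64-bit with the least significant first, and precisions that are not a multiple of 64 are ignored.

```c
#include <stdint.h>

/* Return the canonical length of the LEN elements in ELTS: drop every top
   element that merely repeats the sign-extension of the element below it.  */
static unsigned int
canonical_len (const int64_t *elts, unsigned int len)
{
  while (len > 1)
    {
      int64_t top = elts[len - 1];
      int64_t sign_ext = elts[len - 2] < 0 ? -1 : 0;
      if (top != sign_ext)
        break;
      len--;
    }
  return len;
}
```

Under this model { -1, -1 } shrinks to length 1, matching the { -1 } that wi::mask produces in the example above, while { 0, -1 } keeps length 2.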

2022-07-01  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/106144
* wide-int.cc (wi::shifted_mask): If end >= prec, return right after
emitting element for shift or if shift is 0 first element after start.
(wide_int_cc_tests): Add tests for equivalency of wi::mask and
wi::shifted_mask with 0 start.

(cherry picked from commit e52592073f6df3d7a3acd9f0436dcc32a8b7493d)

ifcvt: Don't introduce trapping or faulting reads in noce_try_sign_mask [PR106032]

noce_try_sign_mask as documented will optimize
  if (c < 0)
    x = t;
  else
    x = 0;
into x = (c >> bitsm1) & t;
The optimization is done if either t is unconditional
(e.g. for
  x = t;
  if (c >= 0)
    x = 0;
) or if it is cheap.  We already check that t doesn't have side-effects,
but if t is conditional, we need to punt also if it may trap or fault,
as we make it unconditional.
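
A source-level sketch of the transformation and of the case that must be left alone follows. The function names are hypothetical, and the masked form assumes that >> on a negative int32_t is an arithmetic shift (implementation-defined in ISO C, which is what GCC does).

```c
#include <stddef.h>
#include <stdint.h>

/* The conditional form.  */
static int32_t
select_branchy (int32_t c, int32_t t)
{
  return c < 0 ? t : 0;
}

/* The sign-mask form: c >> 31 is all ones for negative c, else zero.  */
static int32_t
select_masked (int32_t c, int32_t t)
{
  return (c >> 31) & t;
}

/* Here t is a load that only happens when c < 0.  Turning this into
   (c >> 31) & *p would read *p unconditionally, which can fault when
   p is NULL and c is non-negative.  */
static int32_t
load_if_negative (int32_t c, const int32_t *p)
{
  return c < 0 ? *p : 0;
}
```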

I've briefly skimmed other noce_try* optimizations and didn't find one that
would suffer from the same problem.

2022-06-21  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/106032
* ifcvt.c (noce_try_sign_mask): Punt if !t_unconditional, and
t may_trap_or_fault_p, even if it is cheap.

* gcc.c-torture/execute/pr106032.c: New test.

(cherry picked from commit a0c30fe3b888f20215f3e040d21b62b603804ca9)

expand: Fix up expand_cond_expr_using_cmove [PR106030]

If expand_cond_expr_using_cmove can't find a cmove optab for a particular
mode, it tries to promote the mode and perform the cmove in the promoted
mode.

The testcase in the patch ICEs on arm because in that case we pass temp which
has the promoted mode (SImode) as target to expand_operands where the
operands have the non-promoted mode (QImode).
Later on the function uses paradoxical subregs:
  if (GET_MODE (op1) != mode)
    op1 = gen_lowpart (mode, op1);

  if (GET_MODE (op2) != mode)
    op2 = gen_lowpart (mode, op2);
to change the operand modes.

The following patch fixes it by passing NULL_RTX as target if it has
promoted mode.

2022-06-21  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/106030
* expr.c (expand_cond_expr_using_cmove): Pass NULL_RTX instead of
temp to expand_operands if mode has been promoted.

* gcc.c-torture/compile/pr106030.c: New test.

(cherry picked from commit 2df1df945fac85d7b3d084001414a66a2709d8fe)

libgomp: Fix up target-31.c test [PR106045]

The i variable is used inside of the parallel in:
      #pragma omp simd safelen(32) private (v)
      for (i = 0; i < 64; i++)
        {
          v = 3 * i;
          ll[i] = u1 + v * u2[0] + u2[1] + x + y[0] + y[1] + v + h[0] + u3[i];
        }
where i is predetermined linear (so while inside of the body
it is safe, private per SIMD lane var) the final value is written to
the shared variable, and in:
      for (i = 0; i < 64; i++)
        if (ll[i] != u1 + 3 * i * u2[0] + u2[1] + x + y[0] + y[1] + 3 * i + 13 + 14 + i)
          #pragma omp atomic write
            err = 1;
which is a normal loop and so it isn't in any way privatized there.
So we have a data race, fixed by adding private (i) clause to the
parallel.
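
As a reduced illustration of the fix (a hypothetical sketch, not the
target-31.c test itself; count_mismatches, ll and the thread count are
made up for this example), the pattern with the private (i) clause
looks roughly like:

```c
/* Hypothetical reduction of the pattern described above.  Without
   private (i) on the parallel, every thread would write the shared i,
   both as the final value of the simd loop and in the plain loop.  */
int
count_mismatches (void)
{
  int err = 0;
  int i;                        /* declared outside the region */

  #pragma omp parallel private (i) num_threads (4)
  {
    int ll[64];                 /* block-local, so private per thread */

    /* i is predetermined linear here; its final value is written back
       to the i visible in the parallel, which is now a private copy.  */
    #pragma omp simd safelen (32)
    for (i = 0; i < 64; i++)
      ll[i] = 3 * i;

    /* A normal loop: i is whatever the enclosing region says it is.  */
    for (i = 0; i < 64; i++)
      if (ll[i] != 3 * i)
        {
          #pragma omp atomic write
          err = 1;
        }
  }
  return err;
}
```

Compiled without -fopenmp the pragmas are ignored and the function
simply runs serially, so the sketch only shows where the clause goes;
it does not demonstrate the race.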

2022-06-21  Jakub Jelinek  <jakub@redhat.com>
    Paul Iannetta  <piannetta@kalrayinc.com>

PR libgomp/106045
* testsuite/libgomp.c/target-31.c: Add private (i) clause.

(cherry picked from commit 85d613da341b76308edea48359a5dbc7061937c4)

Daily bump.

libstdc++: Throw instead of segfaulting in std::thread constructor [PR 67791]

This turns a mysterious segfault into an exception with a more useful
message. If the exception isn't caught, the user sees this instead of
just a segfault:

terminate called after throwing an instance of 'std::system_error'
what(): Enable multithreading to use std::thread: Operation not permitted
Aborted (core dumped)

libstdc++-v3/ChangeLog:

PR libstdc++/67791
* src/c++11/thread.cc (thread::_M_start_thread(_State_ptr, void (*)())):
Check that gthreads is available before calling __gthread_create.

(cherry picked from commit 4bbd5d0c5fb2b7527938ad44a6d8a2f2ef8bbe12)

Daily bump.

libstdc++: Fix outdated docs about demangling exception messages

The string returned by std::bad_exception::what() hasn't been a mangled
name since PR libstdc++/14493 was fixed for GCC 4.2.0, so remove the
docs showing how to demangle it.

libstdc++-v3/ChangeLog:

* doc/xml/manual/extensions.xml: Remove std::bad_exception from
example program.
* doc/html/manual/ext_demangling.html: Regenerate.

(cherry picked from commit 688d126b69215db29774c249b052e52d765782b3)

libstdc++: Reduce Doxygen output for PDF

Including the header source code in the doxygen-generated PDF file makes
it too large, and causes pdflatex to run out of memory. If we only set
SOURCE_BROWSER=YES for the HTML docs then we won't include the sources
in the PDF file.

There are several macros defined for std::valarray that are only used to
generate repetitive code and then #undef'd. Those aren't useful in the
doxygen docs, especially the ones that reuse the same name in different
files. Omitting them avoids warnings about duplicate labels in the
refman.tex file.

libstdc++-v3/ChangeLog:

* doc/doxygen/user.cfg.in (SOURCE_BROWSER): Only set to YES for
HTML docs.
* include/bits/gslice_array.h (_DEFINE_VALARRAY_OPERATOR): Omit
from doxygen docs.
* include/bits/indirect_array.h (_DEFINE_VALARRAY_OPERATOR):
Likewise.
* include/bits/mask_array.h (_DEFINE_VALARRAY_OPERATOR):
Likewise.
* include/bits/slice_array.h (_DEFINE_VALARRAY_OPERATOR):
Likewise.
* include/std/valarray (_DEFINE_VALARRAY_UNARY_OPERATOR)
(_DEFINE_VALARRAY_AUGMENTED_ASSIGNMENT)
(_DEFINE_VALARRAY_EXPR_AUGMENTED_ASSIGNMENT)
(_DEFINE_BINARY_OPERATOR): Likewise.

(cherry picked from commit afa69618d1627435841c9164b019ef98000e0365)

libstdc++: Fix dangling reference in filesystem::path::filename()

The new -Wdangling-reference warning noticed this.

libstdc++-v3/ChangeLog:

* include/bits/fs_path.h (path::filename()): Fix dangling
reference.

(cherry picked from commit 49237fe6ef677a81eae701f937546210c90b5914)

libstdc++: Fix GDB Xmethod for std::shared_ptr::use_count() [PR109064]

libstdc++-v3/ChangeLog:

PR libstdc++/109064
* python/libstdcxx/v6/xmethods.py (SharedPtrUseCountWorker):
Remove self-recursion in __init__. Add missing _supports.
* testsuite/libstdc++-xmethods/shared_ptr.cc: Check use_count()
and unique().

libstdc++: Fix uses_allocator_construction_args for pair<T&&, U&&> [PR108952]

This implements LWG 3527 which fixes the handling of pair<T&&, U&&> in
std::uses_allocator_construction_args.

libstdc++-v3/ChangeLog:

PR libstdc++/108952
* include/std/memory (uses_allocator_construction_args):
Implement LWG 3527.
* testsuite/20_util/pair/astuple/get-2.cc: New test.
* testsuite/20_util/scoped_allocator/108952.cc: New test.
* testsuite/20_util/uses_allocator/lwg3527.cc: New test.

(cherry picked from commit 8e342c04550466ab088c33746091ce7f3498ee44)

libstdc++: Fix name of <experimental/optional> in comment

libstdc++-v3/ChangeLog:

* include/experimental/optional: Fix header name in comment.

(cherry picked from commit 38f321793ae18d25399f0396ac1371caa7cc7043)

libstdc++: Fix std::common_iterator assignment [PR100823]

This fixes the following conformance problems reported in the PR:

- Move constructor and move assignment should be defined.
- Copy assignment from a valueless object should be allowed.

Assignment is completely rewritten by this patch, as the previous
version had a number of problems. The converting assignment failed to
handle the case of assigning a new value to a valueless object, which
should work. It only accepted lvalue arguments, so wasn't usable to
implement the move assignment operator. Finally, it enforced the
precondition that the argument is not valueless, which is correct for
the converting assignment but not for the copy assignment.

A new _M_assign member is added to handle all cases of assignment
(copying from an lvalue, moving from an rvalue, and converting from a
different type). The not valueless precondition is checked in the
converting assignment before calling _M_assign, so isn't enforced for
copy and move assignment. The new function no longer uses a switch, so
handles valueless objects as the LHS or RHS of the assignment.

libstdc++-v3/ChangeLog:

PR libstdc++/100823
* include/bits/stl_iterator.h (common_iterator): Define move
constructor and move assignment operator.
(common_iterator::_M_assign): New function implementing
assignment.
(common_iterator::operator=): Use _M_assign.
(common_iterator::_S_valueless): New constant.
* testsuite/24_iterators/common_iterator/100823.cc: New test.

(cherry picked from commit 56c999860bbbb2fd5091ba0985e2e5eaa90c6478)

libstdc++: Fix minor bugs in std::common_iterator

The noexcept-specifier for some std::common_iterator constructors was
incorrectly using an rvalue as the first argument of
std::is_nothrow_assignable_v. This gave the wrong answer for some types,
e.g. std::common_iterator<int*, S>, because an rvalue of scalar type
cannot be assigned to.

Also fix the friend declaration to use the same constraints as on the
definition of the class template. G++ fails to diagnose this error, due
to PR c++/96830.

Finally, the copy constructor was using std::move for its argument
in some cases, which should be removed.

libstdc++-v3/ChangeLog:

* include/bits/stl_iterator.h (common_iterator): Fix incorrect
uses of is_nothrow_assignable_v. Fix inconsistent constraints on
friend declaration. Do not move argument in copy constructor.
* testsuite/24_iterators/common_iterator/1.cc: Check for
noexcept constructible/assignable.

(cherry picked from commit 3b5567c3ec7e5759bdecc6a6fc0be2b65a93636e)

libstdc++: Fix unsafe use of dirent::d_name [PR107814]

Copy the fix for PR 104731 to the equivalent experimental::filesystem
test.

libstdc++-v3/ChangeLog:

PR libstdc++/107814
* testsuite/experimental/filesystem/iterators/error_reporting.cc:
Use a static buffer with space after it.

(cherry picked from commit 1cac00d013856fea4cee0f13c4959c8e21afd2d9)

Daily bump.

testsuite: remove stray ';' [PR109608]

GCC 10 is still pedantic about empty declarations.

PR testsuite/109608

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-pmf3.C: Remove stray ';'.

Daily bump.

c-family: -Wsequence-point and COMPONENT_REF [PR107163]

The patch for PR91415 fixed -Wsequence-point to treat shifts and ARRAY_REF
as sequenced in C++17, and COMPONENT_REF as well. But this is unnecessary
for COMPONENT_REF, since the RHS is just a FIELD_DECL with no actual
evaluation, and in this testcase handling COMPONENT_REF as sequenced blows
up fast in a deep inheritance tree. Instead, look through it.

PR c++/107163

gcc/c-family/ChangeLog:

* c-common.c (verify_tree): Don't use sequenced handling
for COMPONENT_REF.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wsequence-point-5.C: New test.

c++: constexpr PMF conversion [PR105996]

Here, we were calling build_reinterpret_cast regardless of whether there was
actually a cast, and that now sets REINTERPRET_CAST_P. But that
optimization seems dodgy anyway, as it involves NOP_EXPR from one
RECORD_TYPE to another and we try to reserve NOP_EXPR for fundamental types.
And the generated code seems the same, so let's drop it. And also strip
location wrappers.

PR c++/105996

gcc/cp/ChangeLog:

* typeck.c (build_ptrmemfunc): Drop 0-offset optimization
and location wrappers.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-pmf3.C: New test.

c++: constant, array, lambda, template [PR108975]

When a lambda refers to a constant local variable in the enclosing scope, we
tentatively capture it, but if we end up pulling out its constant value, we
go back at the end of the lambda and prune any unneeded captures. Here
while parsing the template we decided that the dim capture was unneeded,
because we folded it away, but then we brought back the use in the template
trees that try to preserve the source representation with added type info.
So then when we tried to instantiate that use, we couldn't find what it was
trying to use, and crashed.

Fixed by not trying to prune when parsing a template; we'll prune at
instantiation time.

PR c++/108975

gcc/cp/ChangeLog:

* lambda.c (prune_lambda_captures): Don't bother in a template.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-const11.C: New test.

c++: namespace-scoped friend in local class [PR69410]

do_friend was only considering class-qualified identifiers for the
qualified-id case, but we also need to skip local scope when there's an
explicit namespace scope.

PR c++/69410

gcc/cp/ChangeLog:

* friend.c (do_friend): Handle namespace as scope argument.
* decl.c (grokdeclarator): Pass down in_namespace.

gcc/testsuite/ChangeLog:

* g++.dg/lookup/friend24.C: New test.

c++: &enum::enumerator [PR101869]

We don't want to call build_offset_ref with an enum.

PR c++/101869

gcc/cp/ChangeLog:

* semantics.c (finish_qualified_id_expr): Don't try to build a
pointer-to-member if the scope is an enumeration.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/enum43.C: New test.

Daily bump.

PR target/108589 - Check REG_P for AARCH64_FUSE_ADDSUB_2REG_CONST1

This adds a check for REG_P on SET_DEST for the new idiom recognizer
for AARCH64_FUSE_ADDSUB_2REG_CONST1. The reported ICE is only
observable with checking=rtl.

Bootstrapped/regtested aarch64-linux, committed.

PR target/108589

gcc/ChangeLog:

* config/aarch64/aarch64.c (aarch_macro_fusion_pair_p): Check
REG_P on SET_DEST.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/pr108589.c: New test.

(cherry picked from commit a39c6ec97906766ad65d15d4856fd41121ee7a45)

aarch64: disable LDP via tuning structure for -mcpu=ampere1

AmpereOne (-mcpu=ampere1) breaks LDP instructions into two uops.
Given the chance that this causes instructions to slip into the next
decoding cycle, and the additional overhead of handling
cacheline-crossing LDP instructions, we use the tuning structure to
disable the generation of LDP instructions by instruction combining
(such as in peephole2).

Given the code-density benefits in builtins and prologue/epilogue
expansion, we allow LDPs there.

This commit:
* adds a new tuning option AARCH64_EXTRA_TUNE_NO_LDP_COMBINE
* allows -moverride=tune=... to override this

These changes are benchmark-driven, yielding the following changes
(with a net-overall improvement):
   503.bwaves_r       -0.88%
   507.cactuBSSN_r     0.35%
   508.namd_r          3.09%
   510.parest_r       -2.99%
   511.povray_r        5.54%
   519.lbm_r          15.83%
   521.wrf_r           0.56%
   526.blender_r       2.47%
   527.cam4_r          0.70%
   538.imagick_r       0.00%
   544.nab_r          -0.33%
   549.fotonik3d_r    -0.42%
   554.roms_r          0.00%
   -------------------------
   = total             1.79%

Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Co-Authored-By: Di Zhao <di.zhao@amperecomputing.com>
gcc/ChangeLog:

* config/aarch64/aarch64-tuning-flags.def (AARCH64_EXTRA_TUNING_OPTION):
Add AARCH64_EXTRA_TUNE_NO_LDP_COMBINE.
* config/aarch64/aarch64.c (aarch64_operands_ok_for_ldpstp):
Check for the above tuning option when processing loads.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/ampere1-no_ldp_combine.c: New test.

(cherry picked from commit f200c56787f2c6f93ffb739d57d01a294ab72f68)

aarch64: update ampere1 vectorization cost

The original submission of AmpereOne (-mcpu=ampere1) costs occurred
prior to exhaustive testing of vectorizable workloads against
hardware.

Adjust the vector costs to achieve the best results and more closely
match the underlying hardware.

gcc/ChangeLog:

* config/aarch64/aarch64.c: Update vector costs for ampere1.

Co-Authored-By: Jiangning Liu <jiangning.liu@amperecomputing.com>
Co-Authored-By: Manolis Tsamis <manolis.tsamis@vrull.eu>
(cherry picked from commit ff1f2f2412bda118f7ddc10e69bd4284d9b24b9e)

aarch64: Add support for Ampere-1A (-mcpu=ampere1a) CPU

This patch adds support for the Ampere-1A CPU:
- recognizes the name of the core and provides detection for -mcpu=native,
- updates extra_costs,
- adds a new fusion pair for (A+B+1 and A-B-1).

Ampere-1A and Ampere-1 differ in timing more than the extra
costs indicate, but these differences don't propagate through to the
headline items in our extra costs (e.g. the change in latency for scalar
sqrt doesn't have a corresponding table entry).

gcc/ChangeLog:

* config/aarch64/aarch64-cores.def (AARCH64_CORE): Add ampere1a.
* config/aarch64/aarch64-cost-tables.h: Add ampere1a_extra_costs.
* config/aarch64/aarch64-fusion-pairs.def (AARCH64_FUSION_PAIR):
Define a new fusion pair for A+B+1/A-B-1 (i.e., add/subtract two
registers and then +1/-1).
* config/aarch64/aarch64-tune.md: Regenerate.
* config/aarch64/aarch64.c (aarch_macro_fusion_pair_p): Implement
idiom-matcher for the new fusion pair.
* doc/invoke.texi: Add ampere1a.

(cherry picked from commit 590a06afbf0e96813b5879742f38f3665512c854)

aarch64: update Ampere-1 core definition

This brings the extensions detected by -mcpu=native on Ampere-1 systems
in sync with the defaults generated for -mcpu=ampere1.

Note that some early kernel versions on Ampere-1 may misreport the
presence of PAUTH and PREDRES (i.e., -mcpu=native will add 'nopauth'
and 'nopredres').

gcc/ChangeLog:

* config/aarch64/aarch64-cores.def (AARCH64_CORE): Update
Ampere-1 core entry.

(cherry picked from commit db2f5d661239737157cf131de7d4df1c17d8d88d)

aarch64: fix off-by-one in reading cpuinfo

Fixes: 341573406b39
Don't subtract one from the result of strnlen() when trying to point
to the first character after the current string. This issue would
cause individual characters (where the 128 byte buffers are stitched
together) to be lost.
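
The failure mode can be sketched as follows. This is a hypothetical
stand-alone model, not the driver-aarch64.c code: stitch_chunks and its
buggy flag are names invented for the illustration, and the chunk size
stands in for the 128 byte buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of a reader that appends fixed-size chunks to DST.
   DST must have room for strlen (SRC) + 1 bytes.  If BUGGY is nonzero,
   the write pointer is advanced by strnlen () - 1, as before the fix.  */
void
stitch_chunks (char *dst, const char *src, size_t chunk, int buggy)
{
  char *ptr = dst;
  size_t remaining = strlen (src);

  assert (chunk > 0);
  *ptr = '\0';
  while (remaining > 0)
    {
      size_t n = remaining < chunk ? remaining : chunk;

      /* "Read" the next chunk and NUL-terminate it.  */
      memcpy (ptr, src, n);
      ptr[n] = '\0';
      src += n;
      remaining -= n;

      /* Point to the first character after the current string.
         Subtracting one makes the next chunk overwrite the last
         character of this one.  */
      size_t len = strnlen (ptr, chunk);
      ptr += buggy ? len - 1 : len;
    }
}
```

With a 4 byte chunk, stitching "abcdefghij" in the buggy mode yields
"abcefgij": the 'd' and 'h' at the seams are lost, while the fixed
mode reproduces the input.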

gcc/ChangeLog:

* config/aarch64/driver-aarch64.c (readline): Fix off-by-one.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/cpunative/info_18: New test.
* gcc.target/aarch64/cpunative/native_cpu_18.c: New test.

(cherry picked from commit b1cfbccc41de6aec950c0f662e7e85ab34bfff8a)

aarch64: enable Ampere-1 CPU

This adds support and a basic tuning model for the Ampere Computing
"Ampere-1" CPU.

The Ampere-1 implements the ARMv8.6 architecture in A64 mode and is
modelled as a 4-wide issue (as with all modern micro-architectures,
the chosen issue rate is a compromise between the maximum dispatch
rate and the maximum rate of uops issued to the scheduler).

This adds the -mcpu=ampere1 command-line option and the relevant cost
information/tuning tables for the Ampere-1.

gcc/ChangeLog:

* config/aarch64/aarch64-cores.def (AARCH64_CORE): New Ampere-1
core.
* config/aarch64/aarch64-tune.md: Regenerate.
* config/aarch64/aarch64-cost-tables.h: Add extra costs for
Ampere-1.
* config/aarch64/aarch64.c: Add tuning structures for Ampere-1.
* doc/invoke.texi: Add documentation for Ampere-1 core.

(cherry picked from commit 67b0d47e20e655c0dd53a76ea88aab60fafb2059)

Daily bump.

rs6000: Fix vector parity support [PR108699]

The failures on the original failed case builtin-bitops-1.c
and the associated test case pr108699.c here show that the
current support of parity vector mode is wrong on Power.
The hardware insns vprtyb[wdq] operate on the least significant
bit of each byte per element, which doesn't match what the RTL
opcode parity needs, but the current implementation wrongly
expands it with them.

This patch fixes the handling by using one more insn, vpopcntb.
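
A scalar model of why the extra insn is needed, for one 32-bit
element. This is a sketch for illustration only; prtyb_word,
popcntb_word and parity_word are hypothetical helpers that mimic the
per-element behaviour described above, not GCC or hardware code.

```c
#include <stdint.h>

/* Model of vprtybw on one word: XOR of the least significant bit of
   each of the four bytes.  */
uint32_t
prtyb_word (uint32_t x)
{
  uint32_t r = 0;
  for (int i = 0; i < 4; i++)
    r ^= (x >> (8 * i)) & 1;
  return r;
}

/* Model of vpopcntb on one word: each byte is replaced by the number
   of set bits in that byte.  */
uint32_t
popcntb_word (uint32_t x)
{
  uint32_t r = 0;
  for (int i = 0; i < 4; i++)
    {
      uint32_t byte = (x >> (8 * i)) & 0xff;
      uint32_t cnt = 0;
      while (byte)
        {
          cnt += byte & 1;
          byte >>= 1;
        }
      r |= cnt << (8 * i);
    }
  return r;
}

/* Full parity: the low bit of each byte's popcount is that byte's
   parity, and XORing those low bits gives the parity of the word.  */
uint32_t
parity_word (uint32_t x)
{
  return prtyb_word (popcntb_word (x));
}
```

For x = 0x3, prtyb_word alone returns 1 because it only sees the low
bit of byte 0, whereas parity_word returns 0, the parity of two set
bits.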

PR target/108699

gcc/ChangeLog:

* config/rs6000/altivec.md (*p9v_parity<mode>2): Rename to ...
(rs6000_vprtyb<mode>2): ... this.
* config/rs6000/rs6000-builtin.def (VPRTYBD): Replace parityv2di2 with
rs6000_vprtybv2di2.
(VPRTYBW): Replace parityv4si2 with rs6000_vprtybv4si2.
(VPRTYBQ): Replace parityv1ti2 with rs6000_vprtybv1ti2.
* config/rs6000/vector.md (parity<mode>2 with VEC_IP): Expand with
popcountv16qi2 and the corresponding rs6000_vprtyb<mode>2.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/p9-vparity.c: Add scan-assembler-not for vpopcntb
to distinguish parity byte from parity.
* gcc.target/powerpc/pr108699.c: New test.

(cherry picked from commit cdd2d6643f7fef40e335a7027edfea7276cde608)

Daily bump.

Fortran: fix compile-time simplification of SET_EXPONENT [PR109511]

gcc/fortran/ChangeLog:

PR fortran/109511
* simplify.c (gfc_simplify_set_exponent): Fix implementation of
compile-time simplification of intrinsic SET_EXPONENT for argument
X < 1 and for I < 0.

gcc/testsuite/ChangeLog:

PR fortran/109511
* gfortran.dg/set_exponent_1.f90: New test.

(cherry picked from commit fa4cb42870df60deb8888dbd51e2ddc6d6ab9e6a)

Daily bump.

Fortran: simplification of NEAREST for large argument [PR109186]

gcc/fortran/ChangeLog:

PR fortran/109186
* simplify.c (gfc_simplify_nearest): Fix off-by-one error in setting
up real kind-specific maximum exponent for mpfr.

gcc/testsuite/ChangeLog:

PR fortran/109186
* gfortran.dg/nearest_6.f90: New test.

(cherry picked from commit 4410a08b80cc40342eeaa5b6af824cd4352b218c)

Fortran: procedures with BIND(C) attribute require explicit interface [PR85877]

gcc/fortran/ChangeLog:

PR fortran/85877
* resolve.c (resolve_fl_procedure): Check for an explicit interface
of procedures with the BIND(C) attribute (F2018:15.4.2.2).

gcc/testsuite/ChangeLog:

PR fortran/85877
* gfortran.dg/pr85877.f90: New test.

(cherry picked from commit 5426ab34643d9e6502f3ee572891a03471fa33ed)

Daily bump.

Fortran: fix bounds check for copying of class expressions [PR106945]

In the bounds check for copying of class expressions, the number of elements
determined from a descriptor, returned as type gfc_array_index_type (i.e. a
signed type), should be converted to the type of the passed element count,
which is of type size_type_node (i.e. unsigned), for use in comparisons.

gcc/fortran/ChangeLog:

PR fortran/106945
* trans-expr.c (gfc_copy_class_to_class): Convert element counts in
bounds check to common type for comparison.

gcc/testsuite/ChangeLog:

PR fortran/106945
* gfortran.dg/pr106945.f90: New test.

(cherry picked from commit 2cf5f485e0351bb1faf46196a99e524688f3966e)