git.ipfire.org Git - thirdparty/gcc.git/log

Daily bump.

libstdc++: Fix constraints for rvalue stream insertion/extraction

The __rval_streamable() function was an attempt to test for
convertibility cheaply and without confusing diagnostics. It doesn't
work with Clang though, and is probably ill-formed.

Replace that helper function with a check for derivation from ios_base,
and use that in the alias templates __rvalue_stream_insertion_t and
__rvalue_stream_extraction_t. Use concepts for the constraints when
available.

libstdc++-v3/ChangeLog:

* include/std/istream (__rvalue_stream_extraction_t): Replace
use of __rval_streamable.
* include/std/ostream (__rvalue_stream_insertion_t): Likewise.
(__rval_streamable): Remove.
(_Require_derived_from_ios_base, __derived_from_ios_base): New
helper for checking constraints.
* testsuite/27_io/basic_istream/extractors_other/char/4.cc: Fix
reference to the wrong subclause of the standard.
* testsuite/27_io/basic_istream/extractors_other/wchar_t/4.cc:
Likewise.
* testsuite/27_io/basic_ostream/inserters_other/char/6.cc:
Likewise.
* testsuite/27_io/basic_ostream/inserters_other/wchar_t/6.cc:
Likewise.
* testsuite/27_io/basic_ostream/inserters_other/char/99692.cc:
New test.
* testsuite/27_io/filesystem/path/io/dr2989.cc: Adjust pruned
errors.

(cherry picked from commit a87ceadf17b4a899f3e74e2da8b6b209461d2742)

libcpp: Fix up pragma preprocessing [PR100450]

Since the r0-85991-ga25a8f3be322fe0f838947b679f73d6efc2a412c
https://gcc.gnu.org/legacy-ml/gcc-patches/2008-02/msg01329.html
changes, so that we handle macros inside of pragmas that should expand
macros, during preprocessing we print those pragmas token by token,
with CPP_PRAGMA printed as
      fputs ("#pragma ", print.outf);
      if (space)
        fprintf (print.outf, "%s %s", space, name);
      else
        fprintf (print.outf, "%s", name);
where name is some identifier (so e.g. print
#pragma omp parallel
or
#pragma omp for
etc.).  Because it ends in an identifier, we need to handle it like
an identifier (i.e. CPP_NAME) for the decision whether a space needs
to be emitted in between that #pragma whatever or #pragma whatever whatever
and following token, otherwise the attached testcase is preprocessed as
#pragma omp forreduction(+:red)
rather than
#pragma omp for reduction(+:red)
The cpp_avoid_paste function is only called for this purpose.

2021-05-07  Jakub Jelinek  <jakub@redhat.com>

PR c/100450
* lex.c (cpp_avoid_paste): Handle token1 CPP_PRAGMA like CPP_NAME.

* c-c++-common/gomp/pr100450.c: New test.

(cherry picked from commit 170c850e4bd46745e2a5130b5eb09f9fceb98416)

Daily bump.

libstdc++: Implement LWG 1203 for rvalue iostreams

This implements the resolution of LWG 1203 so that the constraints for
rvalue stream insertion/extraction are simpler, and the return type is
the original rvalue stream type not its base class.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

* include/std/istream (operator>>(Istream&&, x&)): Simplify, as
per LWG 1203.
* include/std/ostream (operator<<(Ostream&&, const x&)):
Likewise.
* testsuite/27_io/basic_istream/extractors_character/char/lwg2499_neg.cc:
Adjust dg-error pattern.
* testsuite/27_io/basic_istream/extractors_character/wchar_t/lwg2499_neg.cc:
Likewise.
* testsuite/27_io/basic_istream/extractors_other/char/4.cc: Define
is_extractable trait to replace std::__is_extractable. Make it
work with rvalue streams as well as lvalues, to replace f() and
g() helper functions.
* testsuite/27_io/basic_istream/extractors_other/wchar_t/4.cc:
Likewise.
* testsuite/27_io/basic_ostream/inserters_other/char/6.cc:
Define is_insertable trait to replace std::__is_insertable. Make
it work with rvalue streams as well as lvalues, to replace f()
and g() helper functions.
* testsuite/27_io/basic_ostream/inserters_other/wchar_t/6.cc:
Likewise.
* testsuite/27_io/filesystem/path/io/dr2989.cc: Prune additional
errors from new constraints.
* testsuite/27_io/rvalue_streams-2.cc: Remove PR 80675 checks,
which are no longer expected to compile.
* testsuite/27_io/rvalue_streams.cc: Adjust existing test.
Verify LWG 1203 changes.

(cherry picked from commit aa475c4ac80733f85ba47b109fc1900f05e810e2)

libstdc++: Add tests for std::invoke feature test macro

libstdc++-v3/ChangeLog:

* testsuite/20_util/function_objects/invoke/3.cc: Check feature
test macro.
* testsuite/20_util/function_objects/invoke/version.cc: New test.

(cherry picked from commit 29745bf06276b9628d08ef1c9e28890cc56df4aa)

libstdc++: Fix null dereferences in std::promise

This fixes some ubsan errors in std::promise:

future:1153:34: runtime error: member call on null pointer of type 'struct element_type'
future:1153:34: runtime error: member access within null pointer of type 'struct element_type'

The problem is that the check for a null pointer is done inside the
_State::__Setter function, which is only evaluated after evaluating the
_M_future->_M_set_result postfix-expression.

This change adds a new promise::_M_state() helper for accessing
_M_future, and moves the check for no shared state into there, instead
of inside the __setter functions. The __setter functions are made
always_inline, to avoid the situation where the linker selects the old
version of set_value (without the _S_check call) and the new version of
__setter (without the _S_check call) and so there is no check. With the
always_inline attribute the old version of set_value will either inline
the old __setter or call an extern definition of it, and the new
set_value will do the check itself, so both versions do the check.

libstdc++-v3/ChangeLog:

* include/std/future (promise::set_value): Check for existence
of shared state before dereferncing it.
(promise::set_exception, promise::set_value_at_thread_exit)
(promise::set_exception_at_thread_exit): Likewise.
(promise<R&>::set_value, promise<R&>::set_exception)
(promise<R&>::set_value_at_thread_exit)
(promise<R&>::set_exception_at_thread_exit): Likewise.
(promise<void>::set_value, promise<void>::set_exception)
(promise<void>::set_value_at_thread_exit)
(promise<void>::set_exception_at_thread_exit): Likewise.
* testsuite/30_threads/promise/members/at_thread_exit2.cc:
Remove unused variable.

(cherry picked from commit 058d6acefe8bac4a66c8e7fb4951276db188e2d8)

libstdc++: Fix undefined behaviour in std::string

This fixes a ubsan error when constructing a string with a null pointer:

bits/basic_string.h:534:21: runtime error: applying non-zero offset 18446744073709551615 to null pointer

The _M_construct function only cares whether the second pointer is
non-null, so create a non-null value without undefined arithmetic.

We can also pass the random_access_iterator_tag directly to the
_M_construct function, to avoid going via the tag dispatching
_M_construct_aux, because we know we have pointers not integers here.

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (basic_string(const CharT*, const A&)):
Do not do arithmetic on null pointer.

(cherry picked from commit 789c57bc5fe023fc6dc72ade4afcb0916ff788d3)

libstdc++: Fix null dereference in pb_ds containers

This fixes ubsan errors:

ext/pb_ds/detail/cc_hash_table_map_/cc_ht_map_.hpp:533:15: runtime error: member access within null pointer of type 'struct entry'

libstdc++-v3/ChangeLog:

* include/ext/pb_ds/detail/cc_hash_table_map_/cc_ht_map_.hpp
(find_key_pointer(key_const_reference, false_type))
(find_key_pointer(key_const_reference, true_type)): Do not
dereference null pointer.

(cherry picked from commit ca871701c2822c3c4537745d4aa44a7b8f408337)

libstdc++: Fix undefined behaviour in testsuite

Fix some test bugs found by ubsan.

libstdc++-v3/ChangeLog:

* testsuite/20_util/from_chars/3.cc: Use unsigned type to avoid
overflow.
* testsuite/24_iterators/reverse_iterator/2.cc: Do not add
non-zero value to null pointer.
* testsuite/25_algorithms/copy_backward/move_iterators/69478.cc:
Use past-the-end iterator for result.
* testsuite/25_algorithms/move_backward/69478.cc: Likewise.
* testsuite/25_algorithms/move_backward/93872.cc: Likewise.

(cherry picked from commit 6fb8b67089119b737ccb38f75f403b8d279e2229)

libstdc++: Do not use deduced return type for std::visit [PR 100384]

This avoids errors outside the immediate context when std::visit is an
overload candidate because of ADL, but not actually viable.

The solution is to give std::visit a non-deduced return type. New
helpers are introduced for that, and existing ones refactored slightly.

libstdc++-v3/ChangeLog:

PR libstdc++/100384
* include/std/variant (__get_t): New alias template yielding the
return type of std::get<N> on a variant.
(__visit_result_t): New alias template yielding the result of
std::visit.
(__same_types): Move into namespace __detail::__variant.
(__check_visitor_results): Likewise. Use __invoke_result_t and
__get_t.
(__check_visitor_result): Remove.
(visit): Use __visit_result_t for return type.
* testsuite/20_util/variant/100384.cc: New test.

(cherry picked from commit af5b2b911dd80ae9cc87404b7e7ab807cf6655d4)

IBM Z: Fix error checking for builtin vec_permi

The builtin vec_permi is peculiar in that its immediate operand is
encoded differently than the immediate operand that is backing the
builtin. This fixes the check for the immediate operand, adding a
regression test in the process.

This partially reverts commit 3191c1f4488d1f7563b563d7ae2a102a26f16d82

2021-05-06 Marius Hillenbrand <mhillen@linux.ibm.com>

gcc/ChangeLog:

* config/s390/s390-builtins.def (O_M5, O1_M5, ...): Remove unused macros.
(s390_vec_permi_s64, s390_vec_permi_b64, s390_vec_permi_u64)
(s390_vec_permi_dbl, s390_vpdi): Use the O3_U2 type for the immediate
operand.
* config/s390/s390.c (s390_const_operand_ok): Remove unused
values.

gcc/testsuite/ChangeLog:

* gcc.target/s390/zvector/imm-range-error-1.c: Fix test for
__builtin_s390_vpdi.
* gcc.target/s390/zvector/vec-permi.c: New test for builtin
vec_permi.

(cherry picked from commit 3c33c00f43bfe585d9414dfb620f0f518e55a457)

modulo-sched: skip loops with strange register defs [PR100225]

PR84878 fix adds an assertion which can fail, e.g. when stack pointer
is adjusted inside the loop. We have to prevent it and search earlier
for any 'strange' instruction. The solution is to skip the whole loop
if using 'note_stores' we found that one of hard registers is in
'df->regular_block_artificial_uses' set.

Also patch properly prohibit not single-set instruction in loop body.

gcc/ChangeLog:

PR rtl-optimization/100225
PR rtl-optimization/84878
* modulo-sched.c (sms_schedule): Use note_stores to skip loops
where we have an instruction which touches (writes) any hard
register from df->regular_block_artificial_uses set.
Allow not-single-set instruction only right before basic block
tail.

gcc/testsuite/ChangeLog:

PR rtl-optimization/100225
PR rtl-optimization/84878
* gcc.dg/pr100225.c: New test.

libgomp/ChangeLog:

* testsuite/libgomp.oacc-c-c++-common/atomic_capture-3.c: New test.

(cherry picked from commit 4cf3b10f27b1994cf4a9eb12079d85412ebc7cad)

IBM Z: Handle hard registers in s390_md_asm_adjust()

gen_fprx2_to_tf() and gen_tf_to_fprx2() cannot handle hard registers,
since the subregs they create do not pass validation. Change
s390_md_asm_adjust() to manually copy between hard VRs and FPRs instead
of using these two functions.

gcc/ChangeLog:

PR target/100217
* config/s390/s390.c (s390_hard_fp_reg_p): New function.
(s390_md_asm_adjust): Handle hard registers.

gcc/testsuite/ChangeLog:

PR target/100217
* gcc.target/s390/vector/long-double-asm-in-out-hard-fp-reg.c: New test.
* gcc.target/s390/vector/long-double-asm-inout-hard-fp-reg.c: New test.

Daily bump.

Fix PR target/100402

This is a regression for 64-bit Windows present from mainline down to the 9
branch and introduced by the fix for PR target/99234.  Again SEH, but with
a twist related to the way MinGW implements setjmp/longjmp, which turns out
to be piggybacked on SEH with recent versions of MinGW, i.e. the longjmp
performs a bona-fide unwinding of the stack, because it calls RtlUnwindEx
with the second argument initially passed to setjmp, which is the result of
__builtin_frame_address (0) in the MinGW header file:

  define setjmp(BUF) _setjmp((BUF), __builtin_frame_address (0))

This means that we directly expose the frame pointer to the SEH machinery
here (unlike with regular exception handling where we use an intermediate
CFA) and thus that we cannot do whatever we want with it.  The old code
would leave it unaligned, i.e. not multiple of 16, whereas the new code
aligns it, but this breaks for some reason; at least it appears that a
.seh_setframe directive with 0 as second argument always works, so the
fix aligns it this way.

gcc/
PR target/100402
* config/i386/i386.c (ix86_compute_frame_layout): For a SEH target,
always return the establisher frame for __builtin_frame_address (0).
gcc/testsuite/
* gcc.c-torture/execute/20210505-1.c: New test.

PR fortran/100274 - ICE in gfc_conv_procedure_call, at fortran/trans-expr.c:6131

When the check for the length of formal and actual character arguments
found a mismatch and emitted a warning, it would skip further checks
like that could lead to errors. Fix that by continuing the checking.
Also catch a NULL pointer dereference.

gcc/fortran/ChangeLog:

PR fortran/100274
* interface.c (gfc_compare_actual_formal): Continue checks after
emitting warning for argument length mismatch.
* trans-expr.c (gfc_conv_procedure_call): Check for NULL pointer
dereference.

gcc/testsuite/ChangeLog:

PR fortran/100274
* gfortran.dg/argument_checking_25.f90: New test.

(cherry picked from commit a8b79cc939d6786293f654c42a2d1b0ab040de0e)

libstdc++: Implement LWG 3517/3520 for join_view/transform_view

libstdc++-v3/ChangeLog:

* include/std/ranges (transform_view::_Iterator::iter_swap):
Remove as per LWG 3520.
(join_view::_Iterator::iter_swap): Add indirectly_swappable
constraint as per LWG 3517.

(cherry picked from commit 2663727d853438ee4d67b200a08f94a318745486)

PR rtl-optimization/100263: Ensure register can change mode

For move2add_valid_value_p we also have to ask the target whether a
register can be accessed in a different mode than it was set before.

gcc/ChangeLog:

PR rtl-optimization/100263
* postreload.c (move2add_valid_value_p): Ensure register can
change mode.

(cherry picked from commit bb283170e7a1f39bf533651418daf10ad18eccfc)

Fix PR rtl-optimization/100411

This is the bootstrap failure of GCC 11 on MinGW64 configured with --enable-
tune=nocona. The bottom line is that SEH does not support CFI for epilogues
but the x86 back-end nevertheless attaches it to instructions, so we have to
filter it out and this is done by detecting the end of the prologue by means
of the NOTE_INSN_PROLOGUE_END note.

But the compiler manages to generate a second epilogue before this note in
the RTL stream and this fools the aforementioned logic. The root cause is
cross-jumping, which inserts a jump before the end of the prologue, in fact
just before the note; the rest (CFG cleanup, BB reordering, etc) is downhill
from there.

gcc/
PR rtl-optimization/100411
* cfgcleanup.c (try_crossjump_to_edge): Also skip end of prologue
and beginning of function markers.

tree-optimization/100253 - fix bogus aligned vectorized loads/stores

At some point DR_MISALIGNMENT was supposed to be -1 when the
access was not element aligned. That's obviously not true at this
point so this adjusts both store and load vectorizing to no longer
assume this which in turn allows simplifying the code.

2021-04-29 Richard Biener <rguenther@suse.de>

PR tree-optimization/100253
* tree-vect-stmts.c (vectorizable_load): Do not assume
element alignment when DR_MISALIGNMENT is -1.
(vectorizable_store): Likewise.

* g++.dg/pr100253.C: New testcase.

(cherry picked from commit af4ccaa7515b8e72449448c509916575831e6292)

tree-optimization/100278 - handle mismatched code in TBAA adjust of PRE

PRE has code to adjust TBAA behavior for refs that expects the base
operation code to match. The testcase shows a case where we have
a VAR_DECL vs. a MEM_REF so add code to give up in such cases.

2021-04-27 Richard Biener <rguenther@suse.de>

PR tree-optimization/100278
* tree-ssa-pre.c (compute_avail): Give up when we cannot
adjust TBAA beacuse of mismatching bases.

* gcc.dg/tree-ssa/pr100278.c: New testcase.

(cherry picked from commit acfe5290406cc70485df8899d14982278a9371f8)

ipa/100308 - properly update the callgraph when pruning EH in IPA CP

This makes sure to fall into the delete_unreachable_blocks_update_callgraph
handling to remove blocks becoming unreachable when removing EH edges
by tracking blocks to need EH cleanup and doing that after releasing
dominance info.

This fixes an ICE seen with gfortran.dg/gomp/pr88933.f90 when enhancing
DSE.

2021-04-28 Richard Biener <rguenther@suse.de>

PR ipa/100308
* ipa-prop.c (ipcp_modif_dom_walker::before_dom_children):
Track blocks to cleanup EH in new m_need_eh_cleanup.
(ipcp_modif_dom_walker::cleanup_eh): New.
(ipcp_transform_function): Release dominator info before
doing EH cleanup.

(cherry picked from commit 8ddce3f7d0db060885df24e41dd289173ec774a0)

tree-optimization/100414 - compute dominance info in phiopt

phiopt now has dominator queries but fails to compute dominance
info.

2021-05-04 Richard Biener <rguenther@suse.de>

PR tree-optimization/100414
* tree-ssa-phiopt.c (get_non_trapping): Do not compute dominance
info here.
(tree_ssa_phiopt_worker): But unconditionally here.

* gcc.dg/pr100414.c: New testcase.

(cherry picked from commit 7a3897661151cf8cc77d11f7a98fc64259210748)

tree-optimization/100329 - avoid reassociating asm goto defs

This avoids reassociating asm goto defs because we have no idea
on which outgoing edge to insert defs.

2021-05-04 Richard Biener <rguenther@suse.de>

PR tree-optimization/100329
* tree-ssa-reassoc.c (can_reassociate_p): Do not reassociate
asm goto defs.
(insert_stmt_after): Assert we're not running into asm goto.

* gcc.dg/torture/pr100329.c: New testcase.

(cherry picked from commit a310bb73edc9548e08d1fa28e7a56246caf27757)

Daily bump.

libstdc++: Implement proposed resolution for LWG 3532

libstdc++-v3/ChangeLog:

* include/std/ranges (split_view::_InnerIter::operator++):
Depend on _Base instead of _Vp directly, as per LWG 3532.

(cherry picked from commit 71834be5b68e0c9839f0647e1bbf1eec4e4bbf49)

nvptx: Fix up nvptx build against latest libstdc++ [PR100375]

The r12-220-gd96db15967e78d7cecea3b1cf3169ceb924678ac change
deprecated some non-standard std::pair constructors and that apparently
broke nvptx.c build, where pseudo_node_t is std::pair<struct basic_block_def *, int>
and so nullptr (or NULL) needs to be used for the first argument of the
ctors instead of 0.

2021-05-02 Jakub Jelinek <jakub@redhat.com>

PR target/100375
* config/nvptx/nvptx.c (nvptx_sese_pseudo): Use nullptr instead of 0
as first argument of pseudo_node_t constructors.

(cherry picked from commit 7911a905276781c20f704f5a91b5125e0184d072)

Daily bump.

c++: base-clause parsing and implicit 'this' [PR100362]

My r11-6815 change to defer access checking when processing a
base-clause removed a pair of pushclass / popclass calls that seemed to
be unnecessary now that we'd also defer access checking while parsing
the base-clause.

But it turns out these calls make a difference in the below testcase,
where we have a local class whose base-clause implicitly uses the 'this'
of the enclosing class.  Before r11-6815, while parsing the base-clause
of the local class, maybe_resolve_dummy would fail to resolve the dummy
'this' object because the current scope would be the local class.  Now,
since the current scope is the lambda, maybe_resolve_dummy succeeds and
returns the 'this' for the enclosing class Qux.  Later, during deferred
instantiation of the local class, we get confused trying to resolve the
access of 'a_' through this non-dummy 'this'.

So this patch just reinstates the calls to pushclass / popclass that
were removed in r11-6815.

gcc/cp/ChangeLog:

PR c++/100362
* parser.c (cp_parser_class_head): Reinstate calls to pushclass
and popclass when parsing the base-clause that were removed in
r11-6815.

gcc/testsuite/ChangeLog:

PR c++/100362
* g++.dg/cpp1y/lambda-generic-100362.C: New test.

(cherry picked from commit 2a6fc19e655e696bf0df9b7aaedf9848b23f07f3)

Fortran: Async I/O - avoid unlocked unlocking [PR100352]

Follow up to PR100352, which moved unit unlocking to st_*_done_worker to
avoid lock order reversal; however, as async_io uses a different lock,
the (unlocked locked) unit lock shall not be unlocked there.

libgfortran/ChangeLog:

PR libgomp/100352
* io/transfer.c (st_read_done_worker, st_write_done_worker): Add new
arg whether to unlock unit.
(st_read_done, st_write_done): Call it with true.
* io/async.c (async_io): Call it with false.
* io/io.h (st_write_done_worker, st_read_done_worker): Update prototype.

(cherry picked from commit a13a50047ef1814a7bda2392f728bf28f81b17ce)

Daily bump.

VAX: Accept ASHIFT in address expressions

Fix regressions:

FAIL: gcc.c-torture/execute/20090113-2.c   -O1  (internal compiler error)
FAIL: gcc.c-torture/execute/20090113-2.c   -O1  (test for excess errors)
FAIL: gcc.c-torture/execute/20090113-3.c   -O1  (internal compiler error)
FAIL: gcc.c-torture/execute/20090113-3.c   -O1  (test for excess errors)

triggering if LRA is used rather than old reload and caused by:

(plus:SI (plus:SI (mult:SI (reg:SI 30 [ _10 ])
            (const_int 4 [0x4]))
        (reg/f:SI 26 [ _6 ]))
    (const_int 12 [0xc]))

coming from:

(insn 58 57 59 10 (set (reg:SI 33 [ _13 ])
        (zero_extract:SI (mem:SI (plus:SI (plus:SI (mult:SI (reg:SI 30 [ _10 ])
                            (const_int 4 [0x4]))
                        (reg/f:SI 26 [ _6 ]))
                    (const_int 12 [0xc])) [4 _6->bits[_10]+0 S4 A32])
            (reg:QI 56)
            (reg:SI 53)))
".../gcc/testsuite/gcc.c-torture/execute/20090113-2.c":64:12 490 {*extzv_non_const}
     (expr_list:REG_DEAD (reg:QI 56)
        (expr_list:REG_DEAD (reg:SI 53)
            (expr_list:REG_DEAD (reg:SI 30 [ _10 ])
                (expr_list:REG_DEAD (reg/f:SI 26 [ _6 ])
                    (nil))))))

being converted into:

(plus:SI (plus:SI (ashift:SI (reg:SI 30 [ _10 ])
            (const_int 2 [0x2]))
        (reg/f:SI 26 [ _6 ]))
    (const_int 12 [0xc]))

which is an rtx the VAX backend currently does not recognize as a valid
machine address, although apparently it is only inside MEM rtx's that
indexed addressing is supposed to be canonicalized to a MULT rather than
ASHIFT form.  Handle the ASHIFT form too throughout the backend then.

The change appears to also improve code generation with old reload and
code size stats are as follows, collected from 18153 executables built
in `check-c' GCC testing:

              samples average  median
--------------------------------------
regressions        47  0.702%  0.521%
unchanged       17503  0.000%  0.000%
progressions      603 -0.920% -0.403%
--------------------------------------
total           18153 -0.029%  0.000%

with a small number of outliers (over 5% size change):

old     new     change  %change filename
----------------------------------------------------
1885    1645    -240   -12.7320 pr53505.exe
1331    1221    -110    -8.2644 pr89634.exe
1553    1473    -80     -5.1513 stdatomic-vm.exe
1413    1341    -72     -5.0955 pr45830.exe
1415    1343    -72     -5.0883 stdatomic-vm.exe
25765   24463   -1302   -5.0533 strlen-5.exe
25765   24463   -1302   -5.0533 strlen-5.exe
25765   24463   -1302   -5.0533 strlen-5.exe
1191    1131    -60     -5.0377 20050527-1.exe

(all changes on the expansion side are below 5%).

gcc/
* config/vax/vax.c (print_operand_address, vax_address_cost_1)
(index_term_p): Handle ASHIFT too.

(cherry picked from commit c605a8bf92708e81d771426a87b3baddc32082dd)

Daily bump.

libstdc++: Fix inconsistent feature test macros

The __cpp_lib_constexpr_string and __cpp_lib_semaphore feature test
macros are not defined consistently in <version> and the relevant header
for the feature.

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (__cpp_lib_constexpr_string):
Only define for C++17 and later.
* include/std/version (__cpp_lib_semaphore): Fix condition
to match the one in <semaphore>.

(cherry picked from commit 3215d4f5b3d08e0087a88df9e155c779927ace1a)

aix: Alias -m64 to -maix64 and -m32 to -maix32.

GCC on AIX historically has used -maix64 and -maix32 to switch to 64 bit mode
or 32 bit mode, unlike other ports that use -m64 and -m32. The Alias()
directive for options cannot be used because aix64 is expected in multiple
parts of the compiler infrastructure and one cannot switch to -m64 due to
backward compatibility.

This patch defines DRIVER_SELF_SPECS to translate -m64 to -maix64 and
-m32 to -maix32 so that the command line option compatible with other
targets can be used while continuing to allow the historical options.

gcc/ChangeLog:

* config/rs6000/aix.h (SUBTARGET_DRIVER_SELF_SPECS): New.
* config/rs6000/aix64.opt (m64): New.
(m32): New.

(cherry picked from commit 0366e2b40e9ea5fc61c9a694de0c8c76a238b03c)

early-remat.c: Fix new/delete mismatch [PR100230]

This simple patch fixes a mistmatched operator new/delete in
early-remat.c which triggers ASan errors on (at least) AArch64 when
compiling SVE code.

gcc/ChangeLog:

PR rtl-optimization/100230
* early-remat.c (early_remat::sort_candidates): Use delete[]
instead of delete for array allocated with new[].

(cherry picked from commit 5d87c2251c441f056e0a44f928ffcb8a8a679b6b)

c++/98032 - add testcase

This adds another testcase for PR95719.

2021-04-30 Richard Biener <rguenther@suse.de>

PR c++/98032
* g++.dg/pr98032.C: New testcase.

(cherry picked from commit dfc70841eb0ca42637826177f329cf6c98ee00ad)

tree-optimization/96513 - add testcase for fixed bug

This adds a testcase for a bug that was fixed with the
hybrid SLP detection rewrite.

2021-04-30 Richard Biener <rguenther@suse.de>

PR tree-optimization/96513
* gcc.dg/torture/pr96513.c: New testcase.

Daily bump.

Update gcc sv.po.

* sv.po: Update.

Update gcc fr.po.

* fr.po: Update.

libstdc++: Add missing 'inline' specifiers to net::ip functions [PR 100259]

libstdc++-v3/ChangeLog:

PR libstdc++/100259
* include/experimental/internet (net::ip::make_error_code)
(net::ip::make_error_condition, net::ip::make_network_v4)
(net::ip::operator==(const udp&, const udp&)): Add 'inline'.

(cherry picked from commit 3f4aa4579a6c03e0a0b0a6aec68aa5a301264d45)

libstdc++: Define __cpp_lib_constexpr_string macro

As noted in r11-1339-gb6ab9ecd550227684643b41e9e33a4d3466724d8 we define
a non-standard __cpp_lib_constexpr_char_traits feature test macro to
indicate support for P0426R1 and P1032R1. At some point last year the
__cpp_lib_constexpr_string macro was retconned to indicate support for
those papers. This adds the new macro (which we didn't previously
define, because it referred to P0980R1 "Making std::string constexpr"
which we don't support).

libstdc++-v3/ChangeLog:

* include/bits/basic_string.h (__cpp_lib_constexpr_string): Define.
* include/std/version (__cpp_lib_constexpr_string): Define.
* testsuite/21_strings/char_traits/requirements/constexpr_functions_c++17.cc:
Check for __cpp_lib_constexpr_string.
* testsuite/21_strings/char_traits/requirements/constexpr_functions_c++20.cc:
Likewise.
* testsuite/21_strings/char_traits/requirements/version.cc: New test.

(cherry picked from commit 3da80ed7efd582575e7850a403ce693ec882d087)

arm: fix UB due to missing mode check [PR100311]

Some places in the compiler iterate over all the fixed registers to
check if that register can be used in a particular mode. The idiom is
to iterate over the register and then for that register, if it
supports the current mode to check all that register and any
additional registers needed (HARD_REGNO_NREGS). If these two checks
are not fully aligned then it is possible to generate a buffer overrun
when testing data objects that are sized by the number of hard regs in
the machine.

The VPR register is a case where these checks were not consistent and
because this is the last HARD register the result was that we ended up
overflowing the fixed_regs array.

gcc:
PR target/100311
* config/arm/arm.c (arm_hard_regno_mode_ok): Only allow VPR to be
used in HImode.
(cherry picked from commit 59f5d16f2c5db4d9592c8ce6453afe81334bb012)

testsuite: Remove dg-options from pr100305.c [PR100305]

The test FAILs on i?86-linux (due to -Wpsabi warnings).  But, on closer
inspection it seems there is another problem, the dg-options in the testcase
means that the test is compiled with -O0 -O, -O1 -O, -O2 -O, -O3 -O, -Os -O
etc. options, so effectively is tested multiple times with the same options.

Fixed by dropping the dg-options line, then we have -w by default and iterate
over all the optimization levels (including the -O).

2021-04-29  Jakub Jelinek  <jakub@redhat.com>

PR target/100305
* gcc.c-torture/compile/pr100305.c: Remove dg-options.  Add PR line.

(cherry picked from commit 62a44a9797edce11b1f7051ea0016ee975d41233)

aarch64: Fix ICE in aarch64_add_offset_1_temporaries [PR100302]

In PR94121 I've changed aarch64_add_offset_1 to use absu_hwi instead of
abs_hwi because offset can be HOST_WIDE_INT_MIN. As can be seen with
the testcase below, aarch64_add_offset_1_temporaries suffers from the same
problem and should be in sync with aarch64_add_offset_1, i.e. for
HOST_WIDE_INT_MIN it needs a temporary.

2021-04-29 Jakub Jelinek <jakub@redhat.com>

PR target/100302
* config/aarch64/aarch64.c (aarch64_add_offset_1_temporaries): Use
absu_hwi instead of abs_hwi.

* gcc.target/aarch64/sve/pr100302.c: New test.

(cherry picked from commit 1bb3e2c0ce6ed363c72caf814a6ba6d7b17c3e0a)

c++: Fix up detach clause vs. data-sharing clause checking [PR100319]

The standard says that "The event-handle will be considered as if it
was specified on a firstprivate clause." which means that it can't
be explicitly specified in some other data-sharing clause.
The checking is implemented correctly for C, but for C++ when detach_seen
is true (i.e. the construct had detach clause) we were comparing
OMP_CLAUSE_DECL (c) with t, which was previously initialized to
OMP_CLAUSE_DECL (c), which means it complained about any explicit
data-sharing clause on the same construct with a detach clause.

Fixed by remembering the detach clause in detach_seen (instead of a boolean
flag) and comparing against its OMP_CLAUSE_DECL.

2021-04-29 Jakub Jelinek <jakub@redhat.com>

PR c++/100319
* semantics.c (finish_omp_clauses): Fix up check that variable
mentioned in detach clause doesn't appear in data-sharing clauses.

* c-c++-common/gomp/task-detach-3.c: New test.

(cherry picked from commit 1b462deabf70e0f4bebb1f85118827d9c2eeffb5)

[omp, simt] Fix expand_GOMP_SIMT_*

When running the test-case included in this patch using an
nvptx accelerator, it fails in execution.

The problem is that the expansion of GOMP_SIMT_XCHG_BFLY is optimized away
during pass_jump as "trivially dead insns".

This is caused by this code in expand_GOMP_SIMT_XCHG_BFLY:
...
  class expand_operand ops[3];
  create_output_operand (&ops[0], target, mode);
  ...
  expand_insn (targetm.code_for_omp_simt_xchg_bfly, 3, ops);
...
which doesn't guarantee that target is assigned to by the expanded insn.

F.i., if target is:
...
(gdb) call debug_rtx ( target )
(subreg/s/u:QI (reg:SI 40 [ _61 ]) 0)
...
then after expand_insn, we have:
...
(gdb) call debug_rtx ( ops[0].value )
(reg:QI 57)
...

See commit 3af3bec2e4d "internal-fn: Avoid dropping the lhs of some
calls [PR94941]" for a similar problem.

Fix this in the same way, by adding:
...
  if (!rtx_equal_p (target, ops[0].value))
    emit_move_insn (target, ops[0].value);
...
where applicable in the expand_GOMP_SIMT_* functions.

Tested libgomp on x86_64 with nvptx accelerator.

gcc/ChangeLog:

2021-04-28  Tom de Vries  <tdevries@suse.de>

PR target/100232
* internal-fn.c (expand_GOMP_SIMT_ENTER_ALLOC)
(expand_GOMP_SIMT_LAST_LANE, expand_GOMP_SIMT_ORDERED_PRED)
(expand_GOMP_SIMT_VOTE_ANY, expand_GOMP_SIMT_XCHG_BFLY)
(expand_GOMP_SIMT_XCHG_IDX): Ensure target is assigned to.

(cherry picked from commit 4d7c874e2c64ebf7631049ace642d246843febae)

aarch64: Fix address mode for vec_concat pattern [PR100305]

The load_pair_lanes<mode> patterns match a vec_concat of two
adjacent 64-bit memory locations as a single 128-bit load.
The Utq constraint made sure that the address was suitable
for a 128-bit vector, but this meant that it allowed some
addresses that aren't valid for the 64-bit element mode.

Two obvious fixes were:

(1) Continue to accept addresses that aren't valid for the element
    modes.  This would mean changing the mode of operands[1] before
    printing it.  It would also mean using a custom predicate instead
    of the current memory_operand.

(2) Restrict addresses to the intersection of those that are valid
    element and vector addresses.

The problem with (1) is that, as well as being more complicated,
it doesn't deal with the fact that we still have a memory_operand
for the second element.  If we encourage the first operand to be
outside the range of a normal element memory_operand, we'll have
to reload the second operand to make it valid.  This reload will
often be dead code, but will be kept around because the RTL
pattern makes it look as though the second element address
is still needed.

This patch therefore does (2) instead.

As mentioned in the PR notes, I think we have a general problem
with the way that the aarch64 port deals with paired addresses.
There's nothing to guarantee that the two addresses will be
reloaded in a way that keeps them “obviously” adjacent, so the
rtx_equal_p conditions could fail if something rechecked them
later.

For this particular pattern, I think it would be better to teach
simplify-rtx.c to fold the vec_concat to a normal vector memory
reference, to remove any suggestion that targets should try to
match the unsimplified form.  That obviously wouldn't be suitable
for backports though.

gcc/
PR target/100305
* config/aarch64/constraints.md (Utq): Require the address to
be valid for both the element mode and for V2DImode.

gcc/testsuite/
PR target/100305
* gcc.c-torture/compile/pr100305.c: New test.

(cherry picked from commit 668df9e769e7d89bcefa07f72b68dcae9a8f3970)

aarch64: Handle SVE attributes in comp_type_attributes [PR100270]

Even though "SVE type" and "SVE sizeless type" are marked as
affecting type identity, the middle end doesn't truly believe
it unless we also handle them in comp_type_attributes.

gcc/
PR target/100270
* config/aarch64/aarch64.c (aarch64_comp_type_attributes): Handle
SVE attributes.

gcc/testsuite/
PR target/100270
* gcc.target/aarch64/sve/acle/general-c/pr100270_1.c: New test.
* gcc.target/aarch64/sve/acle/general-c/sizeless-2.c: Change
expected error message when subtracting pointers to different
vector types. Expect warnings when mixing them elsewhere.
* gcc.target/aarch64/sve/acle/general/attributes_7.c: Remove
XFAILs. Tweak error messages for some cases.

(cherry picked from commit 4cea5b8cb715e40e10174e6de405f26202fa3d6a)

Fortran/OpenMP: Fix var-list expr parsing with array/dt

gcc/fortran/ChangeLog:

* openmp.c (gfc_match_omp_variable_list): Gobble whitespace before
checking whether a '%' or parenthesis-open follows as next character.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/map-5.f90: New test.

(cherry picked from commit e4aefface2a0e34d84b85844b11652eb28f2cf0c)

Daily bump.

Update gcc .po files.

* be.po, da.po, de.po, el.po, es.po, fi.po, fr.po, hr.po, id.po,
ja.po, nl.po, ru.po, sr.po, sv.po, tr.po, uk.po, vi.po, zh_CN.po,
zh_TW.po: Update.

bpf: allow BSS symbols to be global symbols

Prior to this, a BSS declaration such as:

  int foo;
  static int bar;

Generates:

  .global foo
  .local  foo
  .comm   foo,4,4
  .local  bar
  .comm   bar,4,4

Creating symbols:

  0000000000000000 b foo
  0000000000000004 b bar

Both symbols are local. However, libbpf bpf_object__variable_offset
rquires symbols to be STB_GLOBAL & STT_OBJECT for data section lookup.
This patch makes the same declaration generate:

  .global foo
  .type   foo, @object
  .lcomm  foo,4,4
  .local  bar
  .comm   bar,4,4

Creating symbols:

  0000000000000000 B foo
  0000000000000004 b bar

And libbpf will be okay with looking up the global symbol "foo".

2021-04-22  YiFei Zhu  <zhuyifei1999@gmail.com>

gcc/

* config/bpf/bpf.h (ASM_OUTPUT_ALIGNED_BSS): Use .type and .lcomm.

(cherry picked from commit 886b6c1e8af502b69e3f318b9830b73b88215878)

bpf: align function entry point to 64 bits

Libbpf does not treat paddings after functions well. If function
symbols does not cover a whole text section, it will emit error
similar to:

libbpf: sec '.text': failed to find program symbol at offset 56

Each instruction in BPF is a multiple of 8 bytes, so align the
functions to 8 bytes, similar to how clang does it.

2021-04-22 YiFei Zhu <zhuyifei1999@gmail.com>

gcc/

* config/bpf/bpf.h (FUNCTION_BOUNDARY): Set to 64.

(cherry picked from commit 0a662e103e911af935aa5c601051c135986ce3de)

libstdc++: Add missing noexcept on std::thread member function [PR 100298]

The new inline definition of std::thread::hardware_concurrency() for
non-gthreads targets is missing the noexcept-specifier that is on the
declaration.

libstdc++-v3/ChangeLog:

PR libstdc++/100298
* include/bits/std_thread.h (thread::hardware_concurrency): Add
missing noexcept to inline definition for non-gthreads targets.

(cherry picked from commit 5cc28000cfcc219fb4c45dbc5388ec05109049af)

arm: fix UB when compiling thumb2 with PIC [PR100236]

arm_compute_save_core_reg_mask contains UB in that the saved PIC
register number is used to create a bit mask. However, for some target
options this register is undefined and we end up with a shift of ~0.

On native compilations this is benign since the shift will still be
large enough to move the bit outside of the range of the mask, but if
cross compiling from a system that truncates out-of-range shifts to
zero (or worse, raises a trap for such values) we'll get potentially
wrong code (or a fault).

gcc:
PR target/100236
* config/arm/arm.c (THUMB2_WORK_REGS): Check PIC_OFFSET_TABLE_REGNUM
is valid before including it in the mask.
(cherry picked from commit 01d0bda8bdf3cd804e1e00915d432ad0cdc49399)

Revert "libstdc++: Add workaround for ia32 floating atomics miscompilations [PR100184]"

This reverts commit a21f3b38c3b9a5c28c79be37b040e7d06d827d76.

i386: Fix atomic FP peepholes [PR100182]

64bit loads to/stores from x87 and SSE registers are atomic also on 32-bit
targets, so there is no need for additional atomic moves to a temporary
register.

Introduced load peephole2 patterns assume that there won't be any additional
loads from the load location outside the peepholed sequence and wrongly
removed the source location initialization.

OTOH, introduced store peephole2 patterns assume there won't be any additional
loads from the stored location outside the peepholed sequence and wrongly
removed the destination location initialization.  Note that we can't use plain
x87 FST instruction to initialize destination location because FST converts
the value to the double-precision format, changing bits during move.

The patch restores removed initializations in load and store patterns.
Additionally, plain x87 FST in store peephole2 patterns is prevented by
limiting the store operand source to SSE registers.

2021-04-27  Uroš Bizjak  <ubizjak@gmail.com>

gcc/
PR target/100182
* config/i386/sync.md (FILD_ATOMIC/FIST_ATOMIC FP load peephole2):
Copy operand 3 to operand 4.  Use sse_reg_operand
as operand 3 predicate.
(FILD_ATOMIC/FIST_ATOMIC FP load peephole2 with mem blockage): Ditto.
(LDX_ATOMIC/STX_ATOMIC FP load peephole2): Ditto.
(LDX_ATOMIC/LDX_ATOMIC FP load peephole2 with mem blockage): Ditto.
(FILD_ATOMIC/FIST_ATOMIC FP store peephole2):
Copy operand 1 to operand 0.
(FILD_ATOMIC/FIST_ATOMIC FP store peephole2 with mem blockage): Ditto.
(LDX_ATOMIC/STX_ATOMIC FP store peephole2): Ditto.
(LDX_ATOMIC/LDX_ATOMIC FP store peephole2 with mem blockage): Ditto.

gcc/testsuite/
PR target/100182
* gcc.target/i386/pr100182.c: New test.
* gcc.target/i386/pr71245-1.c (dg-final): Xfail scan-assembler-not.
* gcc.target/i386/pr71245-2.c (dg-final): Ditto.

Synchronize Rocket Lake's processor_names and processor_cost_table with processor_type

gcc/ChangeLog

* common/config/i386/i386-common.c (processor_names):
Sync processor_names with processor_type.
* config/i386/i386-options.c (processor_cost_table):
Sync processor_cost_table with processor_type.

Daily bump.

libstdc++: Fix up lambda in join_view::_Iterator::operator++ [PR100290]

Currently, the return type of this lambda is decltype(auto), so the
lambda ends up returning a copy of _M_parent->_M_inner rather than a
reference to it when _S_ref_glvalue is false. This means _M_inner and
ranges::end(__inner_range) are respectively an iterator and sentinel for
different ranges, so comparing them is undefined.

libstdc++-v3/ChangeLog:

PR libstdc++/100290
* include/std/ranges (join_view::_Iterator::operator++): Correct
the return type of the lambda to avoid returning a copy of
_M_parent->_M_inner.
* testsuite/std/ranges/adaptors/join.cc (test10): New test.

(cherry picked from commit 85ef4b8d4eb3313a48b79c7e752891f9646bb246)

c++: do_class_deduction and dependent init [PR93383]

Here we're crashing during CTAD with a dependent initializer (performed
from convert_template_argument) because one of the initializer's
elements has an empty TREE_TYPE, which ends up making resolve_args
unhappy.

Besides the case where we're initializing one template placeholder
from another, which is already specifically handled earlier in
do_class_deduction, it seems we can't in general correctly resolve a
template placeholder using a dependent initializer, so this patch makes
the function just punt until instantiation time instead.

gcc/cp/ChangeLog:

PR c++/89565
PR c++/93383
PR c++/95291
PR c++/99200
PR c++/99683
* pt.c (do_class_deduction): Punt if the initializer is
type-dependent.

gcc/testsuite/ChangeLog:

PR c++/89565
PR c++/93383
PR c++/95291
PR c++/99200
PR c++/99683
* g++.dg/cpp2a/nontype-class39.C: Remove dg-ice directive.
* g++.dg/cpp2a/nontype-class45.C: New test.
* g++.dg/cpp2a/nontype-class46.C: New test.
* g++.dg/cpp2a/nontype-class47.C: New test.
* g++.dg/cpp2a/nontype-class48.C: New test.

(cherry picked from commit bcd77b7b9f35bd5b559ed593c3b3e346c1e6f364)

Fortran - allow target of pointer from evaluation of function-reference

Fortran allows the target of a pointer from the evaluation of a
function-reference in a variable definition context (e.g. F2018:R902).

gcc/fortran/ChangeLog:

PR fortran/100218
* expr.c (gfc_check_vardef_context): Extend check to allow pointer
from a function reference.

gcc/testsuite/ChangeLog:

PR fortran/100218
* gfortran.dg/ptr-func-4.f90: New test.

(cherry picked from commit 32c4d970ea3a9fc330d6aa8fd83f9dae0b9afc64)

PR fortran/100154 - ICE in gfc_conv_procedure_call, at fortran/trans-expr.c:6131

Add appropriate static checks for the character and status arguments to
the GNU Fortran intrinsic extensions fget[c], fput[c]. Extend variable
check to allow a function reference having a data pointer result.

gcc/fortran/ChangeLog:

PR fortran/100154
* check.c (variable_check): Allow function reference having a data
pointer result.
(arg_strlen_is_zero): New function.
(gfc_check_fgetputc_sub): Add static check of character and status
arguments.
(gfc_check_fgetput_sub): Likewise.
* intrinsic.c (add_subroutines): Fix argument name for the
character argument to intrinsic subroutines fget[c], fput[c].

gcc/testsuite/ChangeLog:

PR fortran/100154
* gfortran.dg/pr100154.f90: New test.

(cherry picked from commit d0e7833b94953ba6b4a915150666969ad9fc66af)

c++: Prevent bogus -Wtype-limits warning with NTTP [PR100161]

Recently, we made sure that we never call value_dependent_expression_p
on an expression that isn't potential_constant_expression.  That caused
this bogus warning with a non-type template parameter, something that
users don't want to see.

The problem is that in tsubst_copy_and_build/LE_EXPR 't' is "i < n",
which, due to 'i', is not p_c_e, therefore we call t_d_e_p.  But the
type of 'n' isn't dependent, so we think the whole 't' expression is
not dependent.  It seems we need to test both op0 and op1 separately
to suppress this warning.

gcc/cp/ChangeLog:

PR c++/100161
* pt.c (tsubst_copy_and_build) <case PLUS_EXPR>: Test op0 and
op1 separately for value- or type-dependence.

gcc/testsuite/ChangeLog:

PR c++/100161
* g++.dg/warn/Wtype-limits6.C: New test.

(cherry picked from commit 244dfb95119106e9267f37583caac565c39eb0ec)

c++: Don't allow defining types in enum-base [PR96380]

In r11-2064 I made cp_parser_enum_specifier commit to tentative parse
when seeing a '{'.  That still looks like the correct thing to do, but
it caused an ICE-on-invalid as well as accepts-invalid.

When we have something sneaky like this, which is broken in multiple
ways:

  template <class>
  enum struct c : union enum struct c { e = b, f = a };

we parse the "enum struct c" part (that's OK) and then we see that
we have an enum-base, so we consume ':' and then parse the type-specifier
that follows the :.  "union enum" is clearly invalid, but we're still
parsing tentatively and we parse everything up to the ;, and then
throw away the underlying type.  We parsed everything because we were
tricked into parsing an enum-specifier in an enum-base of another
enum-specifier!  Not good.

Since the grammar for enum-base doesn't allow a defining-type-specifier,
only a type-specifier, we should set type_definition_forbidden_message
which fixes all the problems in this PR.

gcc/cp/ChangeLog:

PR c++/96380
* parser.c (cp_parser_enum_specifier): Don't allow defining
types in enum-base.

gcc/testsuite/ChangeLog:

PR c++/96380
* g++.dg/cpp0x/enum_base4.C: New test.
* g++.dg/cpp0x/enum_base5.C: New test.

(cherry picked from commit 001c63d15e31bc0a1545426d889a0b9f671b4961)

libgomp/testsuite: Fix checks for dg-excess-errors

For the tests modified below, the effective target line has to be effective
when compiling for an offload target, except that variable-not-offloaded.c
would compile with unified-share memory and pr86416-*.c if long double/float128
is supported.
The previous check used a run-time device ability check. This new variant
now enables those dg- lines when _compiling_ for nvptx or gcn.

libgomp/ChangeLog:

* testsuite/lib/libgomp.exp (offload_target_to_openacc_device_type):
New, based on check_effective_target_offload_target_nvptx.
(check_effective_target_offload_target_nvptx): Call it.
(check_effective_target_offload_target_amdgcn): New.
* testsuite/libgomp.c-c++-common/function-not-offloaded.c:
Require target offload_target_nvptx || offload_target_amdgcn.
* testsuite/libgomp.c-c++-common/variable-not-offloaded.c: Likewise.
* testsuite/libgomp.c/pr86416-1.c: Likewise.
* testsuite/libgomp.c/pr86416-2.c: Likewise.

(cherry picked from commit 95dfc3ac7baca19157bb976ee4c8c69753e1e178)

aarch64: Fix up last commit [PR100200]

Pedantically signed vs. unsigned mismatches in va_arg are only well defined
if the value can be represented in both signed and unsigned integer types.

2021-04-27 Jakub Jelinek <jakub@redhat.com>

PR target/100200
* config/aarch64/aarch64.c (aarch64_print_operand): Cast -UINTVAL
back to HOST_WIDE_INT.

(cherry picked from commit 1c0c371d0ea297af2e3180c64cd18f2bfce919b1)

[PATCH] Backport fix for PR target/98952

The test in the PowerPC 32-bit trampoline support is backwards.  It aborts
if the trampoline size is greater than the expected size.  It should abort
when the trampoline size is less than the expected size.  I fixed the test
so the operands are reversed.  I then folded the load immediate into the
compare instruction.

I verified this by creating a 32-bit trampoline program and manually
changing the size of the trampoline to be 48 instead of 40.  The program
aborted with the larger size.  I updated this code and ran the test again
and it passed.

I added a test case that runs on PowerPC 32-bit Linux systems and it calls
the __trampoline_setup function with a larger buffer size than the
compiler uses.  The test is not run on 64-bit systems, since the function
__trampoline_setup is not called.  I also limited the test to just Linux
systems, in case trampolines are handled differently in other systems.

libgcc/
2021-04-27  Michael Meissner  <meissner@linux.ibm.com>

PR target/98952
* config/rs6000/tramp.S (__trampoline_setup, elfv1 #ifdef): Fix
trampoline size comparison in 32-bit by reversing test and
combining load immediate with compare.  Fix backported from trunk
change on 4/23, 886b6c1e8af502b69e3f318b9830b73b88215878.
(__trampoline_setup, elfv2 #ifdef): Fix trampoline size comparison
in 32-bit by reversing test and combining load immediate with
compare.

gcc/testsuite/
2021-04-27  Michael Meissner  <meissner@linux.ibm.com>

PR target/98952
* gcc.target/powerpc/pr98952.c: New test.  Test backported from
trunk change on 4/23, 886b6c1e8af502b69e3f318b9830b73b88215878.

aarch64: Fix UB in the compiler [PR100200]

The following patch fixes UBs in the compiler when negativing
a CONST_INT containing HOST_WIDE_INT_MIN. I've changed the spots where
there wasn't an obvious earlier condition check or predicate that
would fail for such CONST_INTs.

2021-04-27 Jakub Jelinek <jakub@redhat.com>

PR target/100200
* config/aarch64/predicates.md (aarch64_sub_immediate,
aarch64_plus_immediate): Use -UINTVAL instead of -INTVAL.
* config/aarch64/aarch64.md (casesi, rotl<mode>3): Likewise.
* config/aarch64/aarch64.c (aarch64_print_operand,
aarch64_split_atomic_op, aarch64_expand_subvti): Likewise.

(cherry picked from commit 618ae596ebcd1de03857d20485d1324931852569)

veclower: Fix up vec_shl matching of VEC_PERM_EXPR [PR100239]

The following testcase ICEs at -O0, because lower_vec_perm sees the
  _1 = { 0, 0, 0, 0, 0, 0, 0, 0 };
  _2 = VEC_COND_EXPR <_1, { -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0 }>;
  _3 = { 6, 0, 0, 0, 0, 0, 0, 0 };
  _4 = VEC_PERM_EXPR <{ 0, 0, 0, 0, 0, 0, 0, 0 }, _2, _3>;
and as the ISA is SSE2, there is no support for the particular permutation
nor for variable mask permutation.  But, the code to match vec_shl matches
it, because the permutation has the first operand a zero vector and the
mask picks all elements randomly from that vector.
So, in the end that isn't a vec_shl, but the permutation could be in theory
optimized into the first argument.  As we keep it as is, it will fail
during expansion though, because that for vec_shl correctly requires that
it actually is a shift:
      unsigned firstidx = 0;
      for (unsigned int i = 0; i < nelt; i++)
        {
          if (known_eq (sel[i], nelt))
            {
              if (i == 0 || firstidx)
                return NULL_RTX;
              firstidx = i;
            }
          else if (firstidx
                   ? maybe_ne (sel[i], nelt + i - firstidx)
                   : maybe_ge (sel[i], nelt))
            return NULL_RTX;
        }

      if (firstidx == 0)
        return NULL_RTX;
      first = firstidx;
The if (firstidx == 0) return NULL; is what is missing a counterpart
on the lower_vec_perm side.
As with optimize != 0 we fold it in other spots, I think it is not needed
to optimize this cornercase in lower_vec_perm (which would mean we'd need
to recurse on the newly created _4 = { 0, 0, 0, 0, 0, 0, 0, 0 };
whether it is supported or not).

2021-04-27  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/100239
* tree-vect-generic.c (lower_vec_perm): Don't accept constant
permutations with all indices from the first zero element as vec_shl.

* gcc.dg/pr100239.c: New test.

(cherry picked from commit 83d26d0e1b3625ab6c2d83610a13976b52f63e0a)

cfgcleanup: Fix -fcompare-debug issue in outgoing_edges_match [PR100254]

The following testcase fails with -fcompare-debug. The problem is that
outgoing_edges_match behaves differently between -g0 and -g, if
some load/store with REG_EH_REGION is followed by DEBUG_INSNs, the
REG_EH_REGION check is not done, while when there are no DEBUG_INSNs, it is
done.

We already compute last1 and last2 as BB_END (bb{1,2}) with skipped debug
insns and notes, so this patch just uses those.

2021-04-27 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/100254
* cfgcleanup.c (outgoing_edges_match): Check REG_EH_REGION on
last1 and last2 insns rather than BB_END (bb1) and BB_END (bb2) insns.

* g++.dg/opt/pr100254.C: New test.

(cherry picked from commit e600df51a15b2ec7a72731921a2464ffe59cf5ab)

vmsdbgout: Remove useless register keywords

register keyword was removed in C++17, and in vmsdbgout.c it served no
useful purpose.

2021-04-26 Jakub Jelinek <jakub@redhat.com>

PR debug/100255
* vmsdbgout.c (ASM_OUTPUT_DEBUG_STRING, vmsdbgout_begin_block,
vmsdbgout_end_block, lookup_filename, vmsdbgout_source_line): Remove
register keywords.

(cherry picked from commit 297bfacdb448c0d29b8dfac2818350b90902bc75)

testsuite: Add -fchecking to dg-ice tests

In --enable-checking=release builds (which is the default on release
branches), I'm getting various extra FAILs that don't appear in
--enable-checking=yes builds.

XPASS: g++.dg/cpp0x/constexpr-52830.C  -std=c++14 (internal compiler error)
FAIL: g++.dg/cpp0x/constexpr-52830.C  -std=c++14 (test for excess errors)
XPASS: g++.dg/cpp0x/constexpr-52830.C  -std=c++17 (internal compiler error)
FAIL: g++.dg/cpp0x/constexpr-52830.C  -std=c++17 (test for excess errors)
XPASS: g++.dg/cpp0x/constexpr-52830.C  -std=c++2a (internal compiler error)
FAIL: g++.dg/cpp0x/constexpr-52830.C  -std=c++2a (test for excess errors)
FAIL: g++.dg/cpp0x/vt-88982.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/cpp0x/vt-88982.C  -std=c++17 (test for excess errors)
FAIL: g++.dg/cpp0x/vt-88982.C  -std=c++2a (test for excess errors)
FAIL: g++.dg/cpp1y/auto-fn61.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/cpp1y/auto-fn61.C  -std=c++17 (test for excess errors)
FAIL: g++.dg/cpp1y/auto-fn61.C  -std=c++2a (test for excess errors)
FAIL: g++.dg/cpp1z/constexpr-lambda26.C  -std=c++17 (test for excess errors)
FAIL: g++.dg/cpp1z/constexpr-lambda26.C  -std=c++2a (test for excess errors)
FAIL: g++.dg/cpp2a/nontype-class39.C  -std=c++2a (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++14 (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++17 (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++2a (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-1.c  -std=c++98 (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++14 (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++17 (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++2a (test for excess errors)
FAIL: c-c++-common/goacc/kernels-decompose-ice-2.c  -std=c++98 (test for excess errors)

These are tests that have dg-ice and most of those ICEs are checking ICEs
which go away in release checking when -fno-checking is the default.

The following patch adds -fchecking option to those.

2021-04-21  Jakub Jelinek  <jakub@redhat.com>

* g++.dg/cpp1z/constexpr-lambda26.C: Add dg-additional-options
-fchecking.
* g++.dg/cpp1y/auto-fn61.C: Likewise.
* g++.dg/cpp2a/nontype-class39.C: Likewise.
* g++.dg/cpp0x/constexpr-52830.C: Likewise.
* g++.dg/cpp0x/vt-88982.C: Likewise.
* c-c++-common/goacc/kernels-decompose-ice-1.c: Add -fchecking to
dg-additional-options.
* c-c++-common/goacc/kernels-decompose-ice-2.c: Likewise.

(cherry picked from commit ca4bf1dd4398dc65d8fff8b9f5c67733729cee95)

cprop: Fix -fcompare-debug bug in constprop_register [PR100148]

The following testcase shows different behavior between -g and -g0
in constprop_register, if a flags register setter is separated
from a conditional jump using those flags with -g by a DEBUG_INSN.
As it uses just NEXT_INSN, for -g it will look at the DEBUG_INSN which is
not a conditional jump, while otherwise it would look at the conditional
jump and call cprop_jump.

2021-04-21 Jakub Jelinek <jakub@redhat.com>

PR rtl-optimization/100148
* cprop.c (constprop_register): Use next_nondebug_insn instead of
NEXT_INSN.

* g++.dg/opt/pr100148.C: New test.

(cherry picked from commit 022f6ee3ad67ee30f62c8c2aeeb4156494f3284e)

Bump BASE-VER

2021-05-27 Jakub Jelinek <jakub@redhat.com>

* BASE-VER: Set to 11.1.1.

c++: Remove #error for release builds

* module.cc: Remove #error that triggers if DEV-PHASE is empty.

Update ChangeLog and version files for release

Update gennews for GCC 10 and GCC 11.

2021-04-27 Jakub Jelinek <jakub@redhat.com>

* gennews (files): Add files for GCC 10 and GCC 11.

(cherry picked from commit bbadf83e5a2a1e3bf713dd41391e149aea2d61db)

Daily bump.

testsuite/substr_{9,10}.f90: Move to gfortran.dg/

gcc/testsuite/
* substr_9.f90: Move to ...
* gfortran.dg/substr_9.f90: ... here.
* substr_10.f90: Move to ...
* gfortran.dg/substr_10.f90: ... here.

(cherry picked from commit ac456fd981db6b0c2f7ee1ab0d17d36087a74dc2)

libstdc++: Fix semaphore to work with system_clock timeouts

The __cond_wait_until_impl function takes a steady_clock timeout, but
then sometimes tries to compare it to a time from the system_clock,
which won't compile. Additionally, that function gets called with
system_clock timeouts, which also won't compile. This makes the function
accept timeouts for either clock, and compare to the time from the right
clock.

This fixes the compilation error that was causing two tests to fail on
non-futex targets, so we can revert the r12-11 change to disable them.

libstdc++-v3/ChangeLog:

* include/bits/atomic_timed_wait.h (__cond_wait_until_impl):
Handle system_clock as well as steady_clock.
* testsuite/30_threads/semaphore/try_acquire_for.cc: Re-enable.
* testsuite/30_threads/semaphore/try_acquire_until.cc:
Re-enable.

(cherry picked from commit 6924588774a02dec63fb4b0ad19aed303accc2d2)

libstdc++: Add options for libatomic to test

This fixes a linker error on AIX:

FAIL: 30_threads/semaphore/try_acquire_posix.cc (test for excess errors)
Excess errors:
ld: 0711-317 ERROR: Undefined symbol: .__atomic_fetch_add_8
ld: 0711-317 ERROR: Undefined symbol: .__atomic_load_8
ld: 0711-317 ERROR: Undefined symbol: .__atomic_fetch_sub_8
ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more information.
collect2: error: ld returned 8 exit status

libstdc++-v3/ChangeLog:

* testsuite/30_threads/semaphore/try_acquire_posix.cc: Add
options for libatomic.

(cherry picked from commit 58871c03318e080962f022f5d77db3c4fde3e351)

Daily bump.

libstdc++: Fix "bare" notifications dropped by waiters check

For types that track whether or not there extant waiters (e.g.
semaphore) internally, the __atomic_notify_address_bare() call was
introduced to avoid the overhead of loading the atomic count of
waiters. For platforms that don't have Futex, however, there was
still a check for waiters, and seeing that there are none (because
in the bare case, the count is not incremented), the notification
is dropped. This commit addresses that case.

libstdc++-v3/ChangeLog:
* include/bits/atomic_wait.h: Always notify waiters in the
case of 'bare' address notification.

(cherry picked from commit 2d856af6f9f38fd730754e5024aa82aab3948069)

libstdc++: Remove #error from <semaphore> implementation [PR 100179]

This removes the #error from <bits/semaphore_base.h> for the case where
neither __atomic_semaphore nor __platform_semaphore is defined.

Also rename the _GLIBCXX_REQUIRE_POSIX_SEMAPHORE macro to
_GLIBCXX_USE_POSIX_SEMAPHORE for consistency with the similar
_GLIBCXX_USE_CXX11_ABI macro that can be used to request an alternative
(ABI-changing) implementation.

libstdc++-v3/ChangeLog:

PR libstdc++/100179
* include/bits/semaphore_base.h: Remove #error.
* include/std/semaphore: Do not define anything unless one of
the implementations is available.

(cherry picked from commit 4b2db8077136d2f8b5a0db026e6161810be327b3)

libstdc++: Add workaround for ia32 floating atomics miscompilations [PR100184]

gcc on ia32 miscompiles various atomics involving floating point,
unfortunately I'm afraid it is too late to fix that for 11.1 and
as I'm quite lost on it, it might take a while for 12 too
(disabling all the 8 peephole2s would be easiest, but then we'd
run into optimization regressions).

While 1.cc just FAILs, with dejagnu 1.6.1 wait_notify.cc hangs the
make check even after the timeout fires.  The following patch therefore
xfails the former and skips the latter.

Tested on x86_64-linux where
make check RUNTESTFLAGS='conformance.exp=atomic_float/*.cc'
is still
                === libstdc++ Summary ===

# of expected passes            8
and on i686-linux, where it is now
                === libstdc++ Summary ===

# of expected passes            5
# of expected failures          1
# of unsupported tests          1

2021-04-22  Jakub Jelinek  <jakub@redhat.com>

PR target/100182
* testsuite/29_atomics/atomic_float/1.cc: Add dg-xfail-run-if for
ia32.
* testsuite/29_atomics/atomic_float/wait_notify.cc: Add dg-skip-if for
ia32.

(cherry picked from commit 0f4588141fcbe4e0f1fa12776b47200870f6c621)

gfortran.dg/pr68078.f90: Avoid increasing RLIMIT_AS

pr68078.f90 tests out-of-memory handling and calls set_vm_limit to set the
soft limit. However, setrlimit was then called with hard limit RLIM_INFINITY,
which failed when the current hard limit was lower.

gcc/testsuite/
* gfortran.dg/set_vm_limit.c (set_vm_limit): Call getrlimit, use
obtained hard limit, and only call setrlimit if new softlimit is lower.

(cherry picked from commit faf7d413a3f3337be1a3ac5cdf33e0e3b87b426e)

testsuite/100176 - fix struct-layout-1_generate.c compile

With -Werror=return-type we run into compile fails complaining about
missing return stmts.

2021-04-21 Richard Biener <rguenther@suse.de>

PR testsuite/100176
* objc.dg/gnu-encoding/struct-layout-encoding-1_generate.c: Add
missing return.

(cherry picked from commit 5668843346c74cabf830e46b45fad24db4566fd6)

Avoid -latomic for amdgcn offloading

libatomic isn't built for amdgcn but reduction-16.c adds it
via -foffload=-latomic when offloading for nvptx is enabled.
The following avoids linker errors when offloading to amdgcn is enabled
as well.

2021-04-21 Richard Biener <rguenther@suse.de>

libgomp/
* testsuite/libgomp.c-c++-common/reduction-16.c: Use -latomic
only on nvptx-none.

(cherry picked from commit d42088e453042f4f8ba9190a7e29efd937ea2181)

[libstdc++] Fix test timeout in stop_calback/destroy.cc

A change was made to __atomic_semaphore::_S_do_try_acquire() to
(ideally) let the compare_exchange reload the value of __old rather than
always reloading it twice. This causes _M_acquire to spin indefinitely
if the value of __old is already 0.

libstdc++-v3/ChangeLog:
* include/bits/semaphore_base.h: Always reload __old in
__atomic_semaphore::_S_do_try_acquire().
* testsuite/30_threads/stop_token/stop_callback/destroy.cc:
re-enable testcase.

(cherry picked from commit 7eeb8c04e53fa880ee559efb727517ce778d17a0)

Daily bump.