git.ipfire.org Git - thirdparty/gcc.git/log
2 years agoMerge branch 'releases/gcc-12' into devel/omp/gcc-12
Tobias Burnus [Thu, 2 Feb 2023 09:03:35 +0000 (10:03 +0100)] 
Merge branch 'releases/gcc-12' into devel/omp/gcc-12

Merge up to r12-9097-gd31bd7138610a883310dce212bb0bdaaa8da7304 (2nd Feb 2023)

2 years agoDaily bump.
GCC Administrator [Thu, 2 Feb 2023 00:21:59 +0000 (00:21 +0000)] 
Daily bump.

2 years agoipa: Release body more carefully when removing nodes (PR 107944)
Martin Jambor [Wed, 1 Feb 2023 17:58:09 +0000 (18:58 +0100)] 
ipa: Release body more carefully when removing nodes (PR 107944)

The code removing function bodies when the last call graph clone of a
node is removed is too aggressive when there are nodes up the
clone_of chain which still need them.  Fixed by expanding the check.

gcc/ChangeLog:

2023-01-18  Martin Jambor  <mjambor@suse.cz>

PR ipa/107944
* cgraph.cc (cgraph_node::remove): Check whether nodes up the
clone_of chain also do not need the body.

(cherry picked from commit db959e250077ae6b4fc08f53fb322719582c5de6)

2 years agolibgomp.texi: Reverse-offload updates
Tobias Burnus [Wed, 1 Feb 2023 14:29:11 +0000 (15:29 +0100)] 
libgomp.texi: Reverse-offload updates

libgomp/
* libgomp.texi (5.0 Impl. Status): Update 'requires' and 'ancestor'.
(GCN): Add item about 'omp requires'.
(nvptx): Likewise; add item about reverse offload.

(cherry picked from commit eda38850a7980d78d966a39b58961349bea7c984)

2 years agoFortran: Extend align-clause checks of OpenMP's allocate directive
Tobias Burnus [Wed, 1 Feb 2023 14:27:42 +0000 (15:27 +0100)] 
Fortran: Extend align-clause checks of OpenMP's allocate directive

gcc/fortran/ChangeLog:

* openmp.cc (resolve_omp_clauses): Check also for
power of two.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/allocate-3.f90: Fix ALIGN
usage, remove unused -fdump-tree-original.
* testsuite/libgomp.fortran/allocate-4.f90: New.

(cherry picked from commit bf2cf6f3f1851054237ee7df99bdf60bf5a3e3ae)
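The power-of-two requirement mentioned in the ChangeLog entry above can be illustrated with the usual bit trick. This is only a sketch: `is_valid_alignment` is a hypothetical helper, not the check in `resolve_omp_clauses`, and it assumes the alignment is already available as an integer constant.

```cpp
#include <cassert>

// Hypothetical helper: true when n is a positive power of two.
// A power of two has exactly one bit set, so n & (n - 1) clears it to zero.
bool is_valid_alignment(long long n) {
  return n > 0 && (n & (n - 1)) == 0;
}

void demo_alignment() {
  assert(is_valid_alignment(1));
  assert(is_valid_alignment(64));
  assert(!is_valid_alignment(0));
  assert(!is_valid_alignment(24));   // 0b11000 has two bits set
  assert(!is_valid_alignment(-8));
}
```

Calling `demo_alignment()` exercises the accepted and rejected cases.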

2 years agoc++: ICE with -Wlogical-op [PR107755]
Marek Polacek [Tue, 31 Jan 2023 19:36:30 +0000 (14:36 -0500)] 
c++: ICE with -Wlogical-op [PR107755]

Here we crash in the middle end because warn_logical_operator calls
build_range_check which calls various fold_* functions and those
don't work too well when we're still processing template trees.  For
instance here we crash because we're converting a RECORD_TYPE to bool.
At this point VIEW_CONVERT_EXPR<struct Foo>(b) hasn't yet been converted
to Foo::operator bool (&b).

I was excited to fix this with instantiation_dependent_expression_p
which can now be called from c-family/ as well, but the problem isn't
that the expression is dependent.  So, p_t_d it is.

PR c++/107755

gcc/cp/ChangeLog:

* call.cc (build_new_op): Don't call warn_logical_operator when
processing a template.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wlogical-op-4.C: New test.

(cherry picked from commit 5ce8961b46f050a96e8c542b34b1cf024ba95f1b)

2 years agoc++, openmp: Handle some OMP_*/OACC_* constructs during constant expression evaluatio...
Jakub Jelinek [Wed, 1 Feb 2023 10:24:11 +0000 (11:24 +0100)] 
c++, openmp: Handle some OMP_*/OACC_* constructs during constant expression evaluation [PR108607]

While potential_constant_expression_1 handled most of OMP_* codes (by saying that
they aren't potential constant expressions), OMP_SCOPE was missing in that list.
I've also added OMP_SCAN, though that is less important (similarly to OMP_SECTION
it ought to appear solely inside of OMP_{FOR,SIMD} resp. OMP_SECTIONS).
As the testcase shows, that isn't enough: potential_constant_expression_1
can catch only some cases.  As soon as one uses switch or ifs where at least
one of the possible paths could be a constant expression, we can run into the
same codes during cxx_eval_constant_expression, so this patch handles those
there as well.

2023-02-01  Jakub Jelinek  <jakub@redhat.com>

PR c++/108607
* constexpr.cc (cxx_eval_constant_expression): Handle OMP_*
and OACC_* constructs as non-constant.
(potential_constant_expression_1): Handle OMP_SCAN and OMP_SCOPE.

* g++.dg/gomp/pr108607.C: New test.

(cherry picked from commit bfc070595bfb00abef88a002eee5d9117f5b86a7)

2 years agoDaily bump.
GCC Administrator [Wed, 1 Feb 2023 00:22:04 +0000 (00:22 +0000)] 
Daily bump.

2 years agoc++: fix ICE with -Wduplicated-cond [PR107593]
Marek Polacek [Tue, 31 Jan 2023 16:54:03 +0000 (11:54 -0500)] 
c++: fix ICE with -Wduplicated-cond [PR107593]

Here we crash because a CAST_EXPR, representing T(), doesn't have
its operand, and operand_equal_p's STRIP_ANY_LOCATION_WRAPPER doesn't
expect that.  (o_e_p is called from warn_duplicated_cond_add_or_warn.)

In the past we've adjusted o_e_p to better cope with template codes,
but in this case I think we just want to avoid attempting to warn
about inst-dependent expressions; I don't think I've ever envisioned
-Wduplicated-cond to warn about them.  Also destroy the chain when
an inst-dependent expression is encountered to not warn in
Wduplicated-cond4.C.

The ICE started with r12-6022, two-stage name lookup for overloaded
operators, which gave dependent operators a TREE_TYPE (in particular,
DEPENDENT_OPERATOR_TYPE), so we no longer bail out here in o_e_p:

  /* Similar, if either does not have a type (like a template id),
     they aren't equal.  */
  if (!TREE_TYPE (arg0) || !TREE_TYPE (arg1))
    return false;

PR c++/107593
PR c++/108597

gcc/c-family/ChangeLog:

* c-common.h (instantiation_dependent_expression_p): Declare.
* c-warn.cc (warn_duplicated_cond_add_or_warn): If the condition
is dependent, invalidate the chain.

gcc/c/ChangeLog:

* c-objc-common.cc (instantiation_dependent_expression_p): New.

gcc/cp/ChangeLog:

* cp-tree.h (instantiation_dependent_expression_p): Don't
declare here.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wduplicated-cond3.C: New test.
* g++.dg/warn/Wduplicated-cond4.C: New test.
* g++.dg/warn/Wduplicated-cond5.C: New test.

2 years agoDaily bump.
GCC Administrator [Tue, 31 Jan 2023 00:22:54 +0000 (00:22 +0000)] 
Daily bump.

2 years agoCorrectly detect shifts out of range
Andrew MacLeod [Mon, 30 Jan 2023 19:59:30 +0000 (14:59 -0500)] 
Correctly detect shifts out of range

get_shift_range was incorrectly communicating that it couldn't calculate
a range when the shift value was always out of range.  Fix this and
always return [0, 0] when the shift value is always out of range.

PR tree-optimization/108306
gcc/
* range-op.cc (operator_lshift::fold_range): Return [0, 0] not
varying for shifts that are always out of void range.
(operator_rshift::fold_range): Return [0, 0] not
varying for shifts that are always out of void range.

gcc/testsuite/
* gcc.dg/pr108306.c: New.
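The distinction drawn in the commit above, between "the shift amount is always out of range" and "nothing can be said", can be sketched with a toy interval type. Everything here (`Range`, `fold_lshift`, the fixed 32-bit precision) is a hypothetical simplification for illustration and not the code in range-op.cc.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified closed interval [lo, hi]; not GCC's range class.
struct Range {
  uint64_t lo;
  uint64_t hi;
};

constexpr unsigned kPrecision = 32;        // assumed operand width in bits
constexpr Range kVarying{0, UINT32_MAX};   // "nothing is known" for that width

// Result range of (op << shift) for unsigned 32-bit operands.
// Assumes op.hi fits in 32 bits.
Range fold_lshift(Range op, Range shift) {
  if (shift.lo >= kPrecision)
    return Range{0, 0};   // every possible shift amount is out of range
  if (shift.hi >= kPrecision)
    return kVarying;      // only some shift amounts are out of range
  uint64_t hi = op.hi << shift.hi;
  if (hi > UINT32_MAX)
    return kVarying;      // result may wrap; this sketch simply gives up
  return Range{op.lo << shift.lo, hi};
}

void demo_shift_ranges() {
  Range r = fold_lshift(Range{1, 5}, Range{32, 40});
  assert(r.lo == 0 && r.hi == 0);    // always out of range: [0, 0]
  r = fold_lshift(Range{1, 2}, Range{1, 3});
  assert(r.lo == 2 && r.hi == 16);   // in range: [1 << 1, 2 << 3]
}
```

The first case in `demo_shift_ranges()` is the one the commit is about: the sketch reports [0, 0] rather than falling back to the varying range.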

2 years agoMerge branch 'releases/gcc-12' into devel/omp/gcc-12
Tobias Burnus [Mon, 30 Jan 2023 09:09:50 +0000 (10:09 +0100)] 
Merge branch 'releases/gcc-12' into devel/omp/gcc-12

Merge up to r12-9090-g591ec4820aa4e6d757ddc76cae1d92d445daf72c (30th Jan 2023)

2 years agoOpenMP/Fortran: Fix has_device_addr clause splitting [PR108558]
Tobias Burnus [Fri, 27 Jan 2023 10:32:19 +0000 (11:32 +0100)] 
OpenMP/Fortran: Fix has_device_addr clause splitting [PR108558]

gcc/fortran/ChangeLog:

PR fortran/108558
* trans-openmp.cc (gfc_split_omp_clauses): Handle has_device_addr.

libgomp/ChangeLog:

PR fortran/108558
* testsuite/libgomp.fortran/has_device_addr.f90: New test.

(cherry picked from commit 2325c8920bbc99edcc9fffaa79577c528df41eb8)

2 years agoChange AVX512FP16 to AVX512-FP16 in the document.
liuhongt [Mon, 30 Jan 2023 01:38:38 +0000 (09:38 +0800)] 
Change AVX512FP16 to AVX512-FP16 in the document.

The official name is AVX512-FP16.

gcc/ChangeLog:

* config/i386/i386.opt: Change AVX512FP16 to AVX512-FP16.
* doc/invoke.texi: Ditto.

2 years agoDaily bump.
GCC Administrator [Mon, 30 Jan 2023 00:21:21 +0000 (00:21 +0000)] 
Daily bump.

2 years agoDisable gather/scatter for zen4
Jan Hubicka [Mon, 16 Jan 2023 14:40:45 +0000 (15:40 +0100)] 
Disable gather/scatter for zen4

This patch adds more tunes for zen4:
 - new tunes for avx512 scatter instructions.
   In micro benchmarks these seem to be a consistent loss compared to open-coded code
 - disable use of gather for zen4
   While these are a win for micro benchmarks (based on TSVC), enabling gather
   is a loss for parest. So for now it seems safe to keep it off.
 - disable pass to avoid FMA chains for znver4 since fmadd was optimized and does not seem
   to cause regressions.

* config/i386/i386.cc (ix86_vectorize_builtin_scatter): Guard scatter
by TARGET_USE_SCATTER.
* config/i386/i386.h (TARGET_USE_SCATTER_2PARTS,
TARGET_USE_SCATTER_4PARTS, TARGET_USE_SCATTER): New macros.
* config/i386/x86-tune.def (TARGET_USE_SCATTER_2PARTS,
TARGET_USE_SCATTER_4PARTS, TARGET_USE_SCATTER): New tunes.
(X86_TUNE_AVOID_256FMA_CHAINS, X86_TUNE_AVOID_512FMA_CHAINS): Disable
for znver4.  (X86_TUNE_USE_GATHER): Disable for zen4.

(cherry picked from commit 967592488c64a86f37bef3dabebb56364f14acdd)

2 years agoZen4 tuning part 2
Jan Hubicka [Thu, 22 Dec 2022 09:55:46 +0000 (10:55 +0100)] 
Zen4 tuning part 2

Adds tunes needed for zen4 microarchitecture.  I added two new knobs.
TARGET_AVX512_SPLIT_REGS which is used to specify that internally 512 vectors
are split to 256 vectors.  This affects vectorization costs and reassociation
width. It probably should also affect RTX costs however I doubt it is very useful
since RTL optimizers are usually not judging between 256 and 512 vectors.

I also added X86_TUNE_AVOID_256FMA_CHAINS. Since fma has improved in zen4 this
flag may not be a win except for very specific benchmarks. I am still doing some
more detailed testing here.

Otherwise I disabled gathers on zen4 for 2 parts and 4 parts. We can open code them
and since the latencies have only increased since zen3 opencoding is better than
the actual instruction.  This shows in 4 tsvc benchmarks.

I ended up setting AVX256_OPTIMAL. This is a compromise.  There are some tsvc
benchmarks that increase noticeably (up to 250%) however there are also a few
regressions.  Most of these can be solved by increasing vec_perm cost in the
vectorizer.  However this does not cure about 14% regression on x264 that is
quite important.  Here we produce vectorized loops for avx512 that probably
would be faster if the loops in question had high enough iteration count.
We hit this problem with avx256 too: since the loop iterates few times, only
prologues/epilogues are used.  Adding another round of prologue/epilogue
code does not make it better.

Finally I enabled avx stores for constant sized memcpy and memset.  I am not
sure why this is an opt-in feature.  I think for most hardware this is a win.

gcc/ChangeLog:

2022-12-22  Jan Hubicka  <hubicka@ucw.cz>

* config/i386/i386-expand.cc (ix86_expand_set_or_cpymem): Add
TARGET_AVX512_SPLIT_REGS
* config/i386/i386-options.cc (ix86_option_override_internal):
Honor X86_TUNE_AVOID_256FMA_CHAINS.
* config/i386/i386.cc (ix86_vec_cost): Honor TARGET_AVX512_SPLIT_REGS.
(ix86_reassociation_width): Likewise.
* config/i386/i386.h (TARGET_AVX512_SPLIT_REGS): New tune.
* config/i386/x86-tune.def (X86_TUNE_USE_GATHER_2PARTS): Disable
for znver4.
(X86_TUNE_USE_GATHER_4PARTS): Likewise.
(X86_TUNE_AVOID_256FMA_CHAINS): Set for znver4.
(X86_TUNE_AVOID_512FMA_CHAINS): New tune; set for znver4.
(X86_TUNE_AVX256_OPTIMAL): Add znver4.
(X86_TUNE_AVX512_SPLIT_REGS): New tune.
(X86_TUNE_AVX256_MOVE_BY_PIECES): Add znver1-3.
(X86_TUNE_AVX256_STORE_BY_PIECES): Add znver1-3.
(X86_TUNE_AVX512_MOVE_BY_PIECES): Add znver4.
(X86_TUNE_AVX512_STORE_BY_PIECES): Add znver4.

(cherry picked from commit eef81eefcdc2a58111e50eb2162ea1f5becc8004)

2 years agoUpdate znver4 costs
Jan Hubicka [Thu, 22 Dec 2022 01:16:24 +0000 (02:16 +0100)] 
Update znver4 costs

Update cost of znver4 mostly based on data measured by Agner Fog.
Compared to previous generations x87 became a bit slower which is probably not
a big deal (and we have minimal benchmarking coverage for it).  One interesting
improvement is the reduction of FMA cost.  I also updated costs of AVX256
loads/stores based on latencies (not throughput which is twice of avx256).
Overall AVX512 vectorization seems to improve noticeably some of TSVC
benchmarks but since internally 512 vectors are split to 256 vectors it is
somewhat risky and does not win in SPEC scores (mostly by regressing benchmarks
with loops that have a small trip count like x264 and exchange), so for now I am
going to set AVX256_OPTIMAL tune but I am still playing with it.  We improved
since ZNVER1 on choosing vectorization size and also have vectorized
prologues/epilogues so it may be possible to make avx512 a small win overall.

2022-12-22  Jan Hubicka  <hubicka@ucw.cz>

* config/i386/x86-tune-costs.h (znver4_cost): Update costs of FP and SSE
moves, division, multiplication, gathers, L2 cache size, and more
complex FP instructions.

(cherry picked from commit bbe04bade0cc3b17e62c2af3d89b899367e7d2d1)

2 years agoDaily bump.
GCC Administrator [Sun, 29 Jan 2023 00:22:12 +0000 (00:22 +0000)] 
Daily bump.

2 years agoAdd AMD znver4 instruction reservations
Tejas Joshi [Tue, 8 Nov 2022 18:40:59 +0000 (00:10 +0530)] 
Add AMD znver4 instruction reservations

This adds znver4 automata units and reservations separately from other
znver automata, avoiding the insn-automata.cc size blow-up.

gcc/ChangeLog:

* common/config/i386/i386-common.cc (processor_alias_table):
Use CPU_ZNVER4 for znver4.
* config/i386/i386.md: Add znver4.md.
* config/i386/znver4.md: New.

(cherry picked from commit 72ce780a497eb3e5efe7a79ea5f21f8dd6858f7f)

2 years agoRemove znver4 instruction reservations
Tejas Joshi [Fri, 21 Oct 2022 15:35:39 +0000 (21:05 +0530)] 
Remove znver4 instruction reservations

This reverts the changes made to znver.md in:
commit bf3b532b524ecacb3202ab2c8af419ffaaab7cff

2022-10-21  Tejas Joshi <TejasSanjay.Joshi@amd.com>

gcc/ChangeLog:

* common/config/i386/i386-common.cc (processor_alias_table): Use
CPU_ZNVER3 for znver4.
* config/i386/znver.md: Remove znver4 reservations.

(cherry picked from commit d93171509aa7ca23148508b96f1c1f70b941d808)

2 years agoEnable AMD znver4 support and add instruction reservations
Tejas Joshi [Tue, 28 Jun 2022 11:03:53 +0000 (16:33 +0530)] 
Enable AMD znver4 support and add instruction reservations

2022-09-28  Tejas Joshi <TejasSanjay.Joshi@amd.com>

gcc/ChangeLog:

* common/config/i386/cpuinfo.h (get_amd_cpu): Recognize znver4.
* common/config/i386/i386-common.cc (processor_names): Add znver4.
(processor_alias_table): Add znver4 and modularize old znvers.
* common/config/i386/i386-cpuinfo.h (processor_subtypes):
AMDFAM19H_ZNVER4.
* config.gcc (x86_64-*-* |...): Likewise.
* config/i386/driver-i386.cc (host_detect_local_cpu): Let
-march=native recognize znver4 cpus.
* config/i386/i386-c.cc (ix86_target_macros_internal): Add znver4.
* config/i386/i386-options.cc (m_ZNVER4): New definition.
(m_ZNVER): Include m_ZNVER4.
(processor_cost_table): Add znver4.
* config/i386/i386.cc (ix86_reassociation_width): Likewise.
* config/i386/i386.h (processor_type): Add PROCESSOR_ZNVER4.
(PTA_ZNVER1): New definition.
(PTA_ZNVER2): Likewise.
(PTA_ZNVER3): Likewise.
(PTA_ZNVER4): Likewise.
* config/i386/i386.md (define_attr "cpu"): Add znver4 and rename
md file.
* config/i386/x86-tune-costs.h (znver4_cost): New definition.
* config/i386/x86-tune-sched.cc (ix86_issue_rate): Add znver4.
(ix86_adjust_cost): Likewise.
* config/i386/znver1.md: Rename to znver.md.
* config/i386/znver.md: Add new reservations for znver4.
* doc/extend.texi: Add details about znver4.
* doc/invoke.texi: Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/i386/funcspec-56.inc: Handle new march.
* g++.target/i386/mv29.C: Likewise.

(cherry picked from commit bf3b532b524ecacb3202ab2c8af419ffaaab7cff)

2 years agoFortran: ICE in transformational_result [PR108529]
Harald Anlauf [Tue, 24 Jan 2023 20:39:43 +0000 (21:39 +0100)] 
Fortran: ICE in transformational_result [PR108529]

gcc/fortran/ChangeLog:

PR fortran/108529
* simplify.cc (simplify_transformation): Do not try to simplify
transformational intrinsic when the ARRAY argument has a NULL shape.

gcc/testsuite/ChangeLog:

PR fortran/108529
* gfortran.dg/pr108529.f90: New test.

(cherry picked from commit 6c96382eed96a9285611f2e3e2e59557094172b8)

2 years agoFortran: error recovery for bad initializers of implied-shape arrays [PR106209]
Harald Anlauf [Thu, 14 Jul 2022 20:24:55 +0000 (22:24 +0200)] 
Fortran: error recovery for bad initializers of implied-shape arrays [PR106209]

gcc/fortran/ChangeLog:

PR fortran/106209
* decl.cc (add_init_expr_to_sym): Handle bad initializers for
implied-shape arrays.

gcc/testsuite/ChangeLog:

PR fortran/106209
* gfortran.dg/pr106209.f90: New test.

Co-authored-by: Steven G. Kargl <kargl@gcc.gnu.org>
(cherry picked from commit 748f8a8b145dde59c7b63aa68b5a59515b7efc49)

2 years agoFortran: fix ICE in get_expr_storage_size [PR108421]
Harald Anlauf [Mon, 16 Jan 2023 20:30:56 +0000 (21:30 +0100)] 
Fortran: fix ICE in get_expr_storage_size [PR108421]

gcc/fortran/ChangeLog:

PR fortran/108421
* interface.cc (get_expr_storage_size): Check that we actually have
an integer value before trying to extract it with mpz_get_si.

gcc/testsuite/ChangeLog:

PR fortran/108421
* gfortran.dg/pr108421.f90: New test.

(cherry picked from commit a75760374ee54768e5fd6a27080698bfbbd041ab)

2 years agoFortran: fix ICE in check_charlen_present [PR108420]
Harald Anlauf [Mon, 16 Jan 2023 20:41:09 +0000 (21:41 +0100)] 
Fortran: fix ICE in check_charlen_present [PR108420]

gcc/fortran/ChangeLog:

PR fortran/108420
* iresolve.cc (check_charlen_present): Preserve character length if
there is no array constructor.

gcc/testsuite/ChangeLog:

PR fortran/108420
* gfortran.dg/pr108420.f90: New test.

(cherry picked from commit e6669c0a50ed8aee9e5997d61e6271668d149218)

2 years agoFortran: avoid ICE on invalid array subscript triplets [PR108501]
Harald Anlauf [Mon, 23 Jan 2023 20:19:03 +0000 (21:19 +0100)] 
Fortran: avoid ICE on invalid array subscript triplets [PR108501]

gcc/fortran/ChangeLog:

PR fortran/108501
* interface.cc (get_expr_storage_size): Check array subscript triplets
that we actually have integer values before trying to extract with
mpz_get_si.

gcc/testsuite/ChangeLog:

PR fortran/108501
* gfortran.dg/pr108501.f90: New test.

(cherry picked from commit 771d793df1622a476e1cf8d05f0a6aee350fa56b)

2 years agoFortran: fix NULL pointer dereference in gfc_check_dependency [PR108502]
Harald Anlauf [Mon, 23 Jan 2023 21:13:44 +0000 (22:13 +0100)] 
Fortran: fix NULL pointer dereference in gfc_check_dependency [PR108502]

gcc/fortran/ChangeLog:

PR fortran/108502
* dependency.cc (gfc_check_dependency): Prevent NULL pointer
dereference while recursively checking expressions.

gcc/testsuite/ChangeLog:

PR fortran/108502
* gfortran.dg/pr108502.f90: New test.

(cherry picked from commit 51767f31878a95161142254dca7119b409699670)

2 years agoDaily bump.
GCC Administrator [Sat, 28 Jan 2023 00:22:03 +0000 (00:22 +0000)] 
Daily bump.

2 years agoarm: Fix MVE's vcmp vector-scalar patterns [PR107987]
Andre Vieira [Tue, 6 Dec 2022 12:06:33 +0000 (12:06 +0000)] 
arm: Fix MVE's vcmp vector-scalar patterns [PR107987]

This patch surrounds the scalar operand of the MVE vcmp patterns with a
vec_duplicate to ensure both operands of the comparison operator have the same
(vector) mode.

gcc/ChangeLog:

PR target/107987
* config/arm/mve.md (mve_vcmp<mve_cmp_op>q_n_<mode>,
@mve_vcmp<mve_cmp_op>q_n_f<mode>): Apply vec_duplicate to scalar
operand.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/pr107987.c: New test.

(cherry picked from commit ed34c3bc3428bce663d42e9eeda10bc0c5d56d5c)

2 years agoDaily bump.
GCC Administrator [Fri, 27 Jan 2023 00:22:41 +0000 (00:22 +0000)] 
Daily bump.

2 years agoopts: SANITIZE_ADDRESS wrongly cleared [PR108543]
Marek Polacek [Wed, 25 Jan 2023 22:19:54 +0000 (17:19 -0500)] 
opts: SANITIZE_ADDRESS wrongly cleared [PR108543]

Here we crash on a null fndecl ultimately because we haven't defined
the built-ins described in sanitizer.def.  So
builtin_decl_explicit (BUILT_IN_ASAN_POINTER_SUBTRACT);
returns NULL_TREE, causing an ICE later.

DEF_SANITIZER_BUILTIN only actually defines the built-ins when flag_sanitize
has SANITIZE_ADDRESS, or some of the other SANITIZE_*, but it doesn't check
SANITIZE_KERNEL_ADDRESS or SANITIZE_USER_ADDRESS.  Unfortunately, with
-fsanitize=address -fno-sanitize=kernel-address
or
-fsanitize=kernel-address -fno-sanitize=address
SANITIZE_ADDRESS ends up being unset from flag_sanitize even though
_USER/_KERNEL are set.  That's because -fsanitize=address means
SANITIZE_ADDRESS | SANITIZE_USER_ADDRESS and -fsanitize=kernel-address
is SANITIZE_ADDRESS | SANITIZE_KERNEL_ADDRESS but parse_sanitizer_options
does
  flags &= ~sanitizer_opts[i].flag;
so the subsequent -fno- unsets SANITIZE_ADDRESS.  Then no sanitizer
built-ins are actually defined.

I'm not sure why SANITIZE_ADDRESS isn't just SANITIZE_USER_ADDRESS |
SANITIZE_KERNEL_ADDRESS, I don't think we need 3 bits.

PR middle-end/108543

gcc/ChangeLog:

* opts.cc (parse_sanitizer_options): Don't always clear SANITIZE_ADDRESS
if it was previously set.

gcc/testsuite/ChangeLog:

* c-c++-common/asan/pointer-subtract-5.c: New test.
* c-c++-common/asan/pointer-subtract-6.c: New test.
* c-c++-common/asan/pointer-subtract-7.c: New test.
* c-c++-common/asan/pointer-subtract-8.c: New test.

(cherry picked from commit a82ce9c8d155ecda2d1c647d5c588f29e21ef4a3)
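The flag arithmetic described in the commit above can be replayed with plain bit masks. This is a sketch only: the bit positions are illustrative, and `disable_keeping_shared` is one hypothetical way to preserve the shared bit, not necessarily what the patch to `parse_sanitizer_options` does.

```cpp
#include <cassert>

// Illustrative bit assignments; the real values are defined inside GCC.
constexpr unsigned SANITIZE_ADDRESS        = 1u << 0;
constexpr unsigned SANITIZE_USER_ADDRESS   = 1u << 1;
constexpr unsigned SANITIZE_KERNEL_ADDRESS = 1u << 2;

// What each command-line option sets, as described in the commit message.
constexpr unsigned OPT_ADDRESS = SANITIZE_ADDRESS | SANITIZE_USER_ADDRESS;
constexpr unsigned OPT_KERNEL_ADDRESS =
    SANITIZE_ADDRESS | SANITIZE_KERNEL_ADDRESS;

unsigned enable(unsigned flags, unsigned opt) { return flags | opt; }

// Mirrors "flags &= ~sanitizer_opts[i].flag": also clears the shared bit.
unsigned disable_naive(unsigned flags, unsigned opt) { return flags & ~opt; }

// Hypothetical repair: keep SANITIZE_ADDRESS while either the user or the
// kernel variant is still enabled.
unsigned disable_keeping_shared(unsigned flags, unsigned opt) {
  flags &= ~opt;
  if (flags & (SANITIZE_USER_ADDRESS | SANITIZE_KERNEL_ADDRESS))
    flags |= SANITIZE_ADDRESS;
  return flags;
}

void demo_sanitizer_flags() {
  // -fsanitize=address -fno-sanitize=kernel-address
  unsigned flags = enable(0, OPT_ADDRESS);

  unsigned naive = disable_naive(flags, OPT_KERNEL_ADDRESS);
  assert((naive & SANITIZE_USER_ADDRESS) != 0);
  assert((naive & SANITIZE_ADDRESS) == 0);   // shared bit was lost

  unsigned kept = disable_keeping_shared(flags, OPT_KERNEL_ADDRESS);
  assert((kept & SANITIZE_ADDRESS) != 0);    // shared bit survives
  assert((kept & SANITIZE_KERNEL_ADDRESS) == 0);
}
```

With the naive clearing, only the user-address bit remains, which is the state in which no sanitizer built-ins end up defined.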

2 years agopru: Fix CLZ expansion for QI and HI modes
Dimitar Dimitrov [Sat, 21 Jan 2023 16:10:59 +0000 (18:10 +0200)] 
pru: Fix CLZ expansion for QI and HI modes

The recent gcc.dg/tree-ssa/clz-char.c test case failed for PRU target,
exposing a wrong code generation bug in the PRU backend.  The "clz"
pattern did not produce correct output for QI and HI input operand
modes.  SI mode is ok.

The "clz" pattern is expanded to an LMBD instruction to get the
left-most bit position having value "1".  In turn, to get the correct
"clz" value, that bit position must be subtracted from the MSB bit
position of the input operand.  The old behaviour of hard-coding 31
for MSB bit position is wrong.

The LMBD instruction returns 32 if input operand is zero, irrespective
of its register mode.  This maps nicely for SI mode, where the "clz"
pattern outputs -1.  It also leads to peculiar (but valid!) output
values from the "clz" pattern for QI and HI zero-valued inputs.

The corresponding commit in trunk contains two new test cases, which
have been removed here because they depend on r13-5195-g4798080d4a3530.
Regtested for pru-unknown-elf.

gcc/ChangeLog:

* config/pru/pru.h (CLZ_DEFINED_VALUE_AT_ZERO): Fix value for QI
and HI input modes.
* config/pru/pru.md (clz): Fix generated code for QI and HI
input modes.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>
(cherry picked from commit c517295940a23db8ca165dfd5d0edea4457eda49)
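The arithmetic behind the commit above can be modelled in a few lines. This is a sketch under stated assumptions: `lmbd` is a software stand-in for the instruction as the commit message describes it (left-most set bit position, or 32 for zero), the operand is assumed to be zero-extended to 32 bits, and `clz_for_width` is a hypothetical helper rather than the pattern in pru.md.

```cpp
#include <cassert>
#include <cstdint>

// Software model of LMBD as described above: position of the left-most
// set bit, or 32 when the input is zero.
int lmbd(uint32_t x) {
  for (int bit = 31; bit >= 0; --bit)
    if ((x >> bit) & 1u)
      return bit;
  return 32;
}

// clz for an operand of the given width (8, 16 or 32 bits): subtract the
// left-most bit position from the MSB position of that width.
int clz_for_width(uint32_t x, int width) {
  return (width - 1) - lmbd(x);
}

void demo_clz() {
  assert(clz_for_width(0x80u, 8) == 0);    // QI: MSB set, no leading zeros
  assert(clz_for_width(1u, 16) == 15);     // HI: only bit 0 set
  assert(31 - lmbd(0x80u) == 24);          // hard-coded 31 is wrong for QI
  assert(clz_for_width(0u, 32) == -1);     // SI zero input
  assert(clz_for_width(0u, 8) == -25);     // QI zero input: peculiar value
}
```

The last two assertions show why the value at zero differs per mode once the MSB position is no longer hard-coded.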

2 years agolibgomp.texi: Impl. status - non-rect loop nest only partial
Tobias Burnus [Thu, 26 Jan 2023 10:14:58 +0000 (11:14 +0100)] 
libgomp.texi: Impl. status - non-rect loop nest only partial

libgomp/
* libgomp.texi (OpenMP 5.0): Set non-rectangular
loop nest back to 'P' as Fortran support is incomplete.

(cherry picked from commit 20552407ae11b61fccb46b3e96a8814e790254e7)

2 years agoinstall.texi: Bump newlib version for nvptx + gcn
Tobias Burnus [Thu, 26 Jan 2023 10:13:52 +0000 (11:13 +0100)] 
install.texi: Bump newlib version for nvptx + gcn

Before, newlib 3.2 was required for amdgcn and 3.1 for nvptx.
Now recommended is 4.3.0 which was just released on 2023-01-20.

While currently the old versions would work fine, upcoming GCC
changes depend on a newer newlib. Thus, the minimal version is
bumped instead of just recommending the new version.

For GCN, the bump is in preparation for permitting non-threadlocal
stack variables and vectorized math functions - both scheduled for
GCC 13 and added to newlib in 4.3.0.

For nvptx, this includes an emulated clock (commit 6bb96d13a),
a calloc fix (5fca4e0f1) and changes to permit libgfortran to be
compiled with I/O support instead of only in minimal mode.
(Patch approved for GCC 13 but pending on an nvptx patch,
for which review is pending.)

gcc/ChangeLog:

* doc/install.texi (amdgcn, nvptx): Require newlib 4.3.0.

(cherry picked from commit e94e9944f59b00de455bb719fd0c5281c5509be6)

2 years agoMerge branch 'releases/gcc-12' into devel/omp/gcc-12
Tobias Burnus [Thu, 26 Jan 2023 10:08:00 +0000 (11:08 +0100)] 
Merge branch 'releases/gcc-12' into devel/omp/gcc-12

Merge up to r12-9069-g6484fc2bf682ccf50c11675773cf72d32a079426 (26th Jan 2023)

2 years agorestrict gcc.dg/pr107554.c to 64bit platforms
Richard Biener [Mon, 14 Nov 2022 07:08:26 +0000 (08:08 +0100)] 
restrict gcc.dg/pr107554.c to 64bit platforms

The following avoids exceeding the maximum object size on 32bit
platforms.

* gcc.dg/pr107554.c: Restrict to lp64.

(cherry picked from commit e7ebdf51ea514ad0b2272ecfa97d6ec72a527e40)

2 years agoDaily bump.
GCC Administrator [Thu, 26 Jan 2023 00:21:48 +0000 (00:21 +0000)] 
Daily bump.

2 years agoaarch64: fix warning emission for ABI break since GCC 9.1
Christophe Lyon [Tue, 14 Jun 2022 21:08:33 +0000 (21:08 +0000)] 
aarch64: fix warning emission for ABI break since GCC 9.1

While looking at PR 105549, which is about fixing the ABI break
introduced in GCC 9.1 in parameter alignment with bit-fields, we
noticed that the GCC 9.1 warning is not emitted in all the cases where
it should be.  This patch fixes that and the next patch in the series
fixes the GCC 9.1 break.

We split this into two patches since patch #2 introduces a new ABI
break starting with GCC 13.1.  This way, patch #1 can be back-ported
to release branches if needed to fix the GCC 9.1 warning issue.

The main idea is to add a new global boolean that indicates whether
we're expanding the start of a function, so that aarch64_layout_arg
can emit warnings for callees as well as callers.  This removes the
need for aarch64_function_arg_boundary to warn (with its incomplete
information).  However, in the first patch there are still cases where
we emit warnings where we should not; this is fixed in patch #2 where
we can distinguish between GCC 9.1 and GCC 13.1 ABI breaks properly.

The fix in aarch64_function_arg_boundary (replacing & with &&) looks
like an oversight of a previous commit in this area which changed
'abi_break' from a boolean to an integer.

We also take the opportunity to fix the comment above
aarch64_function_arg_alignment since the value of the abi_break
parameter was changed in a previous commit, no longer matching the
description.

2022-11-28  Christophe Lyon  <christophe.lyon@arm.com>
    Richard Sandiford  <richard.sandiford@arm.com>

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_function_arg_alignment): Fix
comment.
(aarch64_layout_arg): Factorize warning conditions.
(aarch64_function_arg_boundary): Fix typo.
* function.cc (currently_expanding_function_start): New variable.
(expand_function_start): Handle
currently_expanding_function_start.
* function.h (currently_expanding_function_start): Declare.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/bitfield-abi-warning-align16-O2.c: New test.
* gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c: New
test.
* gcc.target/aarch64/bitfield-abi-warning-align32-O2.c: New test.
* gcc.target/aarch64/bitfield-abi-warning-align32-O2-extra.c: New
test.
* gcc.target/aarch64/bitfield-abi-warning-align8-O2.c: New test.
* gcc.target/aarch64/bitfield-abi-warning.h: New test.
* g++.target/aarch64/bitfield-abi-warning-align16-O2.C: New test.
* g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C: New
test.
* g++.target/aarch64/bitfield-abi-warning-align32-O2.C: New test.
* g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C: New
test.
* g++.target/aarch64/bitfield-abi-warning-align8-O2.C: New test.
* g++.target/aarch64/bitfield-abi-warning.h: New test.

(cherry picked from commit 3df1a115be22caeab3ffe7afb12e71adb54ff132)
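The "& versus &&" oversight mentioned in the commit above is easy to reproduce once a former boolean becomes an integer. The sketch below uses hypothetical names and values (`abi_break` holding some non-zero integer such as 16) and is not the code in aarch64_function_arg_boundary.

```cpp
#include <cassert>

// With an integer abi_break, bitwise AND against a bool (promoted to 1)
// only tests the lowest bit of abi_break.
bool should_warn_bitwise(unsigned abi_break, bool warn_enabled) {
  return (abi_break & static_cast<unsigned>(warn_enabled)) != 0;
}

// Logical AND tests "abi_break is non-zero", which is the intended meaning.
bool should_warn_logical(unsigned abi_break, bool warn_enabled) {
  return abi_break && warn_enabled;
}

void demo_abi_break_test() {
  assert(!should_warn_bitwise(16, true));   // 16 & 1 == 0: warning is lost
  assert(should_warn_logical(16, true));
  assert(should_warn_bitwise(1, true));     // worked while it was a boolean
  assert(!should_warn_logical(16, false));
}
```

While `abi_break` was a boolean, both forms agreed, which is presumably why the bitwise form went unnoticed.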

2 years agoDaily bump.
GCC Administrator [Wed, 25 Jan 2023 00:22:00 +0000 (00:22 +0000)] 
Daily bump.

2 years agotree-optimization/108164 - undefined overflow with IV vectorization
Richard Biener [Mon, 19 Dec 2022 13:55:45 +0000 (14:55 +0100)] 
tree-optimization/108164 - undefined overflow with IV vectorization

vect_update_ivs_after_vectorizer can end up emitting a signed
IV update when the loop body performed an unsigned computation.
The following makes sure to perform that update in the type
of the loop update type to avoid undefined behavior on overflow.

PR tree-optimization/108164
* tree-vect-loop-manip.cc (vect_update_ivs_after_vectorizer):
Perform vect_step_op_add update in the appropriate type.

* gcc.dg/pr108164.c: New testcase.

(cherry picked from commit ec459469f8a75d96a9b26694554efcc900d411de)
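The idea of doing the induction-variable update in an unsigned type, so that wrap-around is well defined, can be sketched as follows. `advance_iv` is a hypothetical helper and only mirrors the principle; it is not the vectorizer code.

```cpp
#include <cassert>

// Compute start + niters * step without signed overflow: do the arithmetic
// in unsigned (which wraps modulo 2^N) and convert back at the end.
// The final conversion is modular in C++20 and in GCC for earlier
// standards; strictly it was implementation-defined before C++20.
int advance_iv(int start, unsigned niters, int step) {
  unsigned result =
      static_cast<unsigned>(start) + niters * static_cast<unsigned>(step);
  return static_cast<int>(result);
}

void demo_advance_iv() {
  assert(advance_iv(10, 3, 4) == 22);
  assert(advance_iv(5, 2, -1) == 3);   // negative steps work via wrap-around
}
```

Writing the same update as `start + (int) niters * step` would be undefined whenever the intermediate value exceeds the range of `int`.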

2 years agotree-optimization/108076 - if-conversion and forced labels
Richard Biener [Mon, 12 Dec 2022 16:52:46 +0000 (17:52 +0100)] 
tree-optimization/108076 - if-conversion and forced labels

When doing if-conversion we simply throw away labels without checking
whether they are possibly targets of non-local gotos or have their
address taken.  The following rectifies this and refuses to if-convert
such loops.

PR tree-optimization/108076
* tree-if-conv.cc (if_convertible_loop_p_1): Reject blocks
with non-local or forced labels that we later remove
labels from.

* gcc.dg/torture/pr108076.c: New testcase.

(cherry picked from commit b4fddbe9592e9feb37ce567d90af822b75995531)

2 years agomiddle-end/107994 - ICE after error with comparison gimplification
Richard Biener [Wed, 21 Dec 2022 11:27:58 +0000 (12:27 +0100)] 
middle-end/107994 - ICE after error with comparison gimplification

The following avoids passing down error_mark_node to fold_convert.

PR middle-end/107994
* gimplify.cc (gimplify_expr): Catch erroneous comparison
operand.

(cherry picked from commit 845b514e8a150447ba041294586af76a6ac05158)

2 years agotree-optimization/107554 - fix ICE in strlen optimization
Richard Biener [Fri, 11 Nov 2022 13:28:52 +0000 (14:28 +0100)] 
tree-optimization/107554 - fix ICE in strlen optimization

The following fixes a wrongly typed variable causing an ICE.

PR tree-optimization/107554
* tree-ssa-strlen.cc (strlen_pass::count_nonzero_bytes):
Use unsigned HOST_WIDE_INT type for the strlen.

* gcc.dg/pr107554.c: New testcase.

Co-Authored-By: Nikita Voronov <nik_1357@mail.ru>
(cherry picked from commit 81de4037454275f8ed6d858fbc129e832c6147ef)

2 years agodriver: fix environ corruption after putenv() [PR106624]
Sergei Trofimovich [Tue, 16 Aug 2022 11:35:07 +0000 (12:35 +0100)] 
driver: fix environ corruption after putenv() [PR106624]

The bug appeared after r13-2010-g1270ccda70ca09 "Factor out
jobserver_active_p" slightly changed `putenv()` use from allocating
to non-allocating:

    -xputenv (concat ("MAKEFLAGS=", dup, NULL));
    +xputenv (jinfo.skipped_makeflags.c_str ());

`xputenv()` (and `putenv()`) don't copy strings and only store the
pointer in the `environ` global table. As a result `environ` got
corrupted as soon as `jinfo.skipped_makeflags` store got deallocated.

This started causing bootstrap crashes in `execv()` calls:

    xgcc: fatal error: cannot execute '/build/build/./prev-gcc/collect2': execv: Bad address

The change restores memory allocation for `xputenv()` argument.

gcc/

PR driver/106624
* gcc.cc (driver::detect_jobserver): Allocate storage for the
xputenv() argument using xstrdup().

(cherry picked from commit 2b403297b111c990c331b5bbb6165b061ad2259b)
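The pointer-storing behaviour of `putenv()` that the commit above relies on can be observed directly on a POSIX system. The sketch below assumes POSIX `putenv` and `strdup`; the variable names `GR_DEMO_VAR` and `GR_DEMO_OWNED` and the helper `put_owned` are made up for the demonstration, and the driver itself uses `xputenv`/`xstrdup` rather than these calls.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>
#include <stdlib.h>  // putenv (POSIX)
#include <string.h>  // strdup (POSIX)

// putenv() stores the pointer it is given, so a later write to the buffer
// is visible through getenv().
bool environment_aliases_buffer() {
  static char buf[] = "GR_DEMO_VAR=first";
  putenv(buf);
  std::memcpy(buf + 12, "again", 5);  // overwrite "first" in place
  const char *value = std::getenv("GR_DEMO_VAR");
  return value != nullptr && std::strcmp(value, "again") == 0;
}

// Safer pattern: hand putenv() a heap copy that is intentionally never
// freed, so the environment entry outlives the caller's string.
bool put_owned(const std::string &assignment) {
  char *copy = strdup(assignment.c_str());
  return copy != nullptr && putenv(copy) == 0;
}

void demo_putenv() {
  bool aliased = environment_aliases_buffer();
  assert(aliased);
  (void)aliased;

  bool stored = false;
  {
    std::string temp = "GR_DEMO_OWNED=ok";
    stored = put_owned(temp);
  }  // temp is destroyed here; the environment still points at the copy
  assert(stored);
  (void)stored;
  assert(std::strcmp(std::getenv("GR_DEMO_OWNED"), "ok") == 0);
}
```

Passing `temp.c_str()` straight to `putenv()` in the inner scope would leave `environ` pointing at freed storage, which is the corruption the commit describes.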

2 years agoUpdate 'libgomp/libgomp.texi' for 'nvptx, libgfortran: Switch out of "minimal" mode...
Thomas Schwinge [Tue, 24 Jan 2023 09:47:12 +0000 (10:47 +0100)] 
Update 'libgomp/libgomp.texi' for 'nvptx, libgfortran: Switch out of "minimal" mode': 'libgomp/ChangeLog.omp'

2 years agoUpdate 'libgomp/libgomp.texi' for 'nvptx, libgfortran: Switch out of "minimal" mode'
Thomas Schwinge [Tue, 24 Jan 2023 09:29:01 +0000 (10:29 +0100)] 
Update 'libgomp/libgomp.texi' for 'nvptx, libgfortran: Switch out of "minimal" mode'

libgomp/
* libgomp.texi (nvptx): Update for
'nvptx, libgfortran: Switch out of "minimal" mode'.

2 years agoMake 'libgcc/config/nvptx/crt0.c' build '--without-headers'
Thomas Schwinge [Tue, 24 Jan 2023 08:49:34 +0000 (09:49 +0100)] 
Make 'libgcc/config/nvptx/crt0.c' build '--without-headers'

..., where it currently fails:

    [...]/libgcc/config/nvptx/crt0.c:22:10: fatal error: stdlib.h: No such file or directory
       22 | #include <stdlib.h>
          |          ^~~~~~~~~~

Fix-up for "nvptx: Support global constructors/destructors via 'collect2'".

libgcc/
* config/nvptx/crt0.c [!HAVE_STDLIB_H]: Don't '#include <stdlib.h>'.
(atexit): Prototype.

2 years agoDaily bump.
GCC Administrator [Tue, 24 Jan 2023 00:21:31 +0000 (00:21 +0000)] 
Daily bump.

2 years agoFortran: error recovery for invalid CLASS component [PR108434]
Harald Anlauf [Wed, 18 Jan 2023 21:13:29 +0000 (22:13 +0100)] 
Fortran: error recovery for invalid CLASS component [PR108434]

gcc/fortran/ChangeLog:

PR fortran/108434
* expr.cc (class_allocatable): Prevent NULL pointer dereference
or invalid read.
(class_pointer): Likewise.

gcc/testsuite/ChangeLog:

PR fortran/108434
* gfortran.dg/pr108434.f90: New test.

(cherry picked from commit 117848f425a3c0eda85517b4bdaf2ebe3bc705c2)

2 years agoMerge branch 'releases/gcc-12' into devel/omp/gcc-12
Tobias Burnus [Mon, 23 Jan 2023 08:28:53 +0000 (09:28 +0100)] 
Merge branch 'releases/gcc-12' into devel/omp/gcc-12

Merge up to r12-9058-ge1357577e6e39430869e294f94c2c547717b960f (23rd Jan 2023)

2 years agoPR 106101: IBM zSystems: Fix strict_low_part problem
Andreas Krebbel [Mon, 23 Jan 2023 07:56:05 +0000 (08:56 +0100)] 
PR 106101: IBM zSystems: Fix strict_low_part problem

This avoids generating illegal (strict_low_part (reg ...)) RTXs. This
required two changes:

1. Do not use gen_lowpart to generate the inner expression of a
STRICT_LOW_PART.  gen_lowpart might fold the SUBREG either because
there is already a paradoxical subreg or because it can directly be
applied to the register. A new wrapper function makes sure that we
always end up having an actual SUBREG.

2. Change the movstrict patterns to enforce a SUBREG as inner operand
of the STRICT_LOW_PARTs.  The new predicate introduced for the
destination operand requires a SUBREG expression with a
register_operand as inner operand.  However, since reload strips away
the majority of the SUBREGs we have to accept single registers as well
once we reach reload.

Bootstrapped and regression tested on IBM zSystems 64 bit.

gcc/ChangeLog:

PR target/106101
* config/s390/predicates.md (subreg_register_operand): New
predicate.
* config/s390/s390-protos.h (s390_gen_lowpart_subreg): New
function prototype.
* config/s390/s390.cc (s390_gen_lowpart_subreg): New function.
(s390_expand_insv): Use s390_gen_lowpart_subreg instead of
gen_lowpart.
* config/s390/s390.md ("*get_tp_64", "*zero_extendhisi2_31")
("*zero_extendqisi2_31", "*zero_extendqihi2_31"): Likewise.
("movstrictqi", "movstricthi", "movstrictsi"): Use the
subreg_register_operand predicate instead of register_operand.

gcc/testsuite/ChangeLog:

PR target/106101
* gcc.c-torture/compile/pr106101.c: New test.

(cherry picked from commit 585a21bab3ec688c2039bff2922cc372d8558283)

2 years agoDaily bump.
GCC Administrator [Mon, 23 Jan 2023 00:21:32 +0000 (00:21 +0000)] 
Daily bump.

2 years agoDaily bump.
GCC Administrator [Sun, 22 Jan 2023 00:21:24 +0000 (00:21 +0000)] 
Daily bump.

2 years agoBackported from master:
Jerry DeLisle [Sat, 21 Jan 2023 22:58:05 +0000 (14:58 -0800)] 
Backported from master:

PR fortran/106731

gcc/fortran/ChangeLog:

* trans-array.cc (gfc_trans_auto_array_allocation): Remove gcc_assert (!TREE_STATIC()).

gcc/testsuite/ChangeLog:

* gfortran.dg/pr106731.f90: New test.

2 years agoDaily bump.
GCC Administrator [Sat, 21 Jan 2023 00:20:54 +0000 (00:20 +0000)] 
Daily bump.

2 years agonvptx, libgfortran: Switch out of "minimal" mode
Thomas Schwinge [Wed, 21 Sep 2022 16:58:34 +0000 (18:58 +0200)] 
nvptx, libgfortran: Switch out of "minimal" mode

..., in order to enable (portions of) Fortran I/O, for example.

libgfortran/ChangeLog:

* configure: Regenerate.
* configure.ac: No longer set LIBGFOR_MINIMAL for nvptx.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/target-print-1.f90: Adjust.
* testsuite/libgomp.fortran/target-print-1-nvptx.f90: Remove.
* testsuite/libgomp.oacc-fortran/print-1.f90: Adjust.
* testsuite/libgomp.oacc-fortran/print-1-nvptx.f90: Remove.
* testsuite/libgomp.oacc-fortran/error_stop-2.f: Adjust.
* testsuite/libgomp.oacc-fortran/stop-2.f: Likewise.

Co-authored-by: Andrew Stubbs <ams@codesourcery.com>

2 years agonvptx, libgcc: Stub unwinding implementation
Thomas Schwinge [Wed, 21 Sep 2022 16:58:34 +0000 (18:58 +0200)] 
nvptx, libgcc: Stub unwinding implementation

Adding stub '_Unwind_Backtrace', '_Unwind_GetIPInfo' functions is necessary
for linking libbacktrace, as a normal (non-'LIBGFOR_MINIMAL') configuration
of libgfortran wants to do, for example.

The file 'libgcc/config/nvptx/unwind-nvptx.c' is copied from
'libgcc/config/gcn/unwind-gcn.c'.

libgcc/ChangeLog:

* config/nvptx/t-nvptx: Add unwind-nvptx.c.
* config/nvptx/unwind-nvptx.c: New file.

Co-authored-by: Andrew Stubbs <ams@codesourcery.com>

2 years agonvptx: Support global constructors/destructors via 'collect2' for offloading
Thomas Schwinge [Wed, 30 Nov 2022 21:09:35 +0000 (22:09 +0100)] 
nvptx: Support global constructors/destructors via 'collect2' for offloading

This extends "nvptx: Support global constructors/destructors via 'collect2'"
for offloading.

libgcc/
* config/nvptx/crtstuff.c ["mgomp"]
(__do_global_ctors__entry__mgomp)
(__do_global_dtors__entry__mgomp): New.
[!"mgomp"] (__do_global_ctors__entry, __do_global_dtors__entry):
New.
libgomp/
* plugin/plugin-nvptx.c (nvptx_do_global_cdtors): New.
(nvptx_close_device, GOMP_OFFLOAD_load_image)
(GOMP_OFFLOAD_unload_image): Call it.

2 years agonvptx: Support global constructors/destructors via 'collect2'
Thomas Schwinge [Sun, 13 Nov 2022 13:19:30 +0000 (14:19 +0100)] 
nvptx: Support global constructors/destructors via 'collect2'

The function attributes 'constructor', 'destructor', and 'init_priority' now
work, as do the C++ features making use of this.  Test cases with effective
target 'global_constructor' and 'init_priority' now generally work, and
'check-gcc-c++' test results greatly improve; no more "sorry, unimplemented:
global constructors not supported on this target".
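
As an illustration of the attributes mentioned above, here is a minimal sketch using GCC's `constructor` attribute with explicit priorities. The function and variable names are hypothetical; nothing here is taken from the nvptx test cases.

```cpp
#include <cassert>

// Records the order in which the constructor functions ran.
static int init_log[2];
static int init_count;

// Priorities up to 100 are reserved for the implementation; among the
// rest, a lower number runs earlier.
__attribute__((constructor(101))) static void early_init()
{
  init_log[init_count++] = 101;
}

__attribute__((constructor(102))) static void late_init()
{
  init_log[init_count++] = 102;
}

// Returns true if both constructors have run, in priority order.
bool constructors_ran_in_order()
{
  return init_count == 2 && init_log[0] == 101 && init_log[1] == 102;
}
```

Both functions run before `main` is entered, so a call to `constructors_ran_in_order()` from `main` is expected to return true on a target with global constructor support.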

This depends on <https://github.com/MentorEmbedded/nvptx-tools/pull/40> "'nm'"
generally, and for global destructors support: newlib
<https://inbox.sourceware.org/newlib/878rjqaku5.fsf@dem-tschwing-1.ger.mentorg.com/>
"nvptx: Implement '_exit' instead of 'exit'".

gcc/
* collect2.cc (write_c_file_glob): Allow for
'COLLECT2_MAIN_REFERENCE' override.
* config.gcc <case ${target} in nvptx-*>: Set 'use_collect2=yes'.
* config/nvptx/nvptx.h: Adjust.
gcc/testsuite/
* gcc.dg/no_profile_instrument_function-attr-1.c: GCC/nvptx is
'NO_DOT_IN_LABEL' but not 'NO_DOLLAR_IN_LABEL', so '$' may appear
in identifiers.
* lib/target-supports.exp
(check_effective_target_global_constructor): Enable for nvptx.
libgcc/
* config.host <case ${host} in nvptx-*>: Add 'crtbegin.o',
'crtend.o' to 'extra_parts'.
* config/nvptx/crt0.c: Invoke '__do_global_ctors',
'__do_global_dtors'.
* config/nvptx/crtstuff.c: New.
* config/nvptx/t-nvptx: Adjust.

2 years agonvptx: Prevent emitting duplicate declarations for '__nvptx_stacks', '__nvptx_uni'
Thomas Schwinge [Mon, 19 Dec 2022 16:19:19 +0000 (17:19 +0100)] 
nvptx: Prevent emitting duplicate declarations for '__nvptx_stacks', '__nvptx_uni'

As I have reported to Nvidia in 2022-12-01 'NVIDIA Incident Report (3891704):
ptxas: Duplicate declaration error: "cannot be resolved by a '.static'"',
'ptxas' has an inscrutable error mode for duplicate declarations:

    ptxas softstack-decl-1.o, line 11; error   : '.extern' variable '__nvptx_stacks' cannot be resolved by a '.static'
    ptxas fatal   : Ptx assembly aborted due to errors
    nvptx-as: ptxas returned 255 exit status

    ptxas uniform-simt-decl-1.o, line 12; error   : '.extern' variable '__nvptx_uni' cannot be resolved by a '.static'
    ptxas fatal   : Ptx assembly aborted due to errors
    nvptx-as: ptxas returned 255 exit status

This is inscrutable, because (a) what is "cannot be resolved by a '.static'"
supposed to tell me (there is no '.static' in PTX?), and (b) why aren't
repeated declarations just verified to match the first, but otherwise a no-op
(like in other programming languages)?

gcc/
* config/nvptx/nvptx.cc (nvptx_assemble_undefined_decl): Notice
'__nvptx_stacks', '__nvptx_uni' declarations.
(nvptx_file_end): Don't emit duplicate declarations for those.
gcc/testsuite/
* gcc.target/nvptx/softstack-decl-1.c: Make 'dg-do assemble',
adjust.
* gcc.target/nvptx/uniform-simt-decl-1.c: Likewise.

2 years agoAdd 'gcc.target/nvptx/softstack-decl-1.c', 'gcc.target/nvptx/uniform-simt-decl-1.c'
Thomas Schwinge [Mon, 19 Dec 2022 16:10:52 +0000 (17:10 +0100)] 
Add 'gcc.target/nvptx/softstack-decl-1.c', 'gcc.target/nvptx/uniform-simt-decl-1.c'

... to document the status quo re implicit (via 'need_softstack_decl',
'need_unisimt_decl') and explicit declarations of '__nvptx_stacks',
'__nvptx_uni'.

gcc/testsuite/
* gcc.target/nvptx/softstack-decl-1.c: New.
* gcc.target/nvptx/uniform-simt-decl-1.c: Likewise.

2 years agonvptx: Make 'nvptx_uniform_warp_check' fit for non-full-warp execution
Thomas Schwinge [Mon, 12 Dec 2022 21:05:37 +0000 (22:05 +0100)] 
nvptx: Make 'nvptx_uniform_warp_check' fit for non-full-warp execution

For example, this allows for '-muniform-simt' code to be executed
single-threaded, which currently fails (device-side 'trap'), as the 0xffffffff
mask isn't correct if not all 32 threads of a warp are active.  The same
issue/fix, I suppose but have not verified, would apply if we were to allow for
OpenACC 'vector_length' smaller than 32, for example for OpenACC 'serial'.

We use 'nvptx_uniform_warp_check' only for PTX ISA version less than 6.0.
Otherwise we're using 'nvptx_warpsync', which emits 'bar.warp.sync 0xffffffff',
which evidently appears to do the right thing.  (I've tested '-muniform-simt'
code executing single-threaded.)

gcc/
* config/nvptx/nvptx.md (nvptx_uniform_warp_check): Make fit for
non-full-warp execution.
gcc/testsuite/
* gcc.target/nvptx/nvptx.exp
(check_effective_target_default_ptx_isa_version_at_least_6_0):
New.
* gcc.target/nvptx/uniform-simt-5.c: New.
libgomp/
* plugin/plugin-nvptx.c (nvptx_exec): Assert what we know about
'blockDimX'.

2 years agoClean up after newlib "nvptx: In offloading execution, map '_exit' to 'abort' [GCC...
Thomas Schwinge [Thu, 19 Jan 2023 19:25:45 +0000 (20:25 +0100)] 
Clean up after newlib "nvptx: In offloading execution, map '_exit' to 'abort' [GCC PR85463]"

PR target/85463
libgfortran/
* runtime/minimal.c [__nvptx__] (exit): Don't override.
libgomp/
* config/nvptx/error.c (exit): Don't override.
* testsuite/libgomp.oacc-fortran/error_stop-1.f: Update.
* testsuite/libgomp.oacc-fortran/error_stop-2.f: Likewise.
* testsuite/libgomp.oacc-fortran/error_stop-3.f: Likewise.
* testsuite/libgomp.oacc-fortran/stop-1.f: Likewise.
* testsuite/libgomp.oacc-fortran/stop-2.f: Likewise.
* testsuite/libgomp.oacc-fortran/stop-3.f: Likewise.

2 years agoFix 'libgomp.c/simd-math-1.c' configuration, again
Thomas Schwinge [Fri, 20 Jan 2023 16:17:21 +0000 (17:17 +0100)] 
Fix 'libgomp.c/simd-math-1.c' configuration, again

Tobias pointed out that as of my recent
og12 commit e7d4bcb974915bfe95be6c385641fc66a4201581
"Fix 'libgomp.c/simd-math-1.c' configuration",
in GCC configurations without GCN offloading configured, we'd get:

    xgcc: error: GCC is not configured to support 'amdgcn-amdhsa' as '-foffload=' argument

("Interestingly", GCC doesn't complain for '-foffload-options=-lm' if there are
no offload targets configured...)

libgomp/
* testsuite/libgomp.c/simd-math-1.c: Fix configuration, again.

2 years agoForce '--param openacc-kernels=parloops' in 'libgomp.oacc-c-c++-common/abort-3.c'
Thomas Schwinge [Tue, 17 Jan 2023 08:56:15 +0000 (09:56 +0100)] 
Force '--param openacc-kernels=parloops' in 'libgomp.oacc-c-c++-common/abort-3.c'

libgomp/
* testsuite/libgomp.oacc-c-c++-common/abort-3.c: Force
'--param openacc-kernels=parloops'.

2 years agoFix 'libgomp.c/simd-math-1.c' configuration
Thomas Schwinge [Sat, 14 Jan 2023 09:28:09 +0000 (10:28 +0100)] 
Fix 'libgomp.c/simd-math-1.c' configuration

If nvptx offloading is configured in addition to GCN, we see:

    FAIL: libgomp.c/simd-math-1.c (test for excess errors)
    UNRESOLVED: libgomp.c/simd-math-1.c compilation failed to produce executable

    x86_64-pc-linux-gnu-accel-nvptx-none-gcc: error: unrecognized command-line option '-mstack-size=3000000'

Thus, restrict that option to GCN offloading compilation, and on the other
hand, there's no reason to skip this test for non-GCN offloading execution:
even if not SIMD-vectorized there, we still benefit from correctness testing.

libgomp/
* testsuite/libgomp.c/simd-math-1.c: Fix configuration.

2 years agoDaily bump.
GCC Administrator [Fri, 20 Jan 2023 00:21:02 +0000 (00:21 +0000)] 
Daily bump.

2 years agoMerge branch 'releases/gcc-12' into devel/omp/gcc-12
Tobias Burnus [Thu, 19 Jan 2023 20:23:08 +0000 (21:23 +0100)] 
Merge branch 'releases/gcc-12' into devel/omp/gcc-12

Merge up to r12-9052-g61ef24af3ce8ec9c5eb65770f8047d98f42a93bf (19th Jan 2023)

2 years agoopenmp: Fix up OpenMP expansion of non-rectangular loops [PR108459]
Jakub Jelinek [Thu, 19 Jan 2023 20:22:22 +0000 (21:22 +0100)] 
openmp: Fix up OpenMP expansion of non-rectangular loops [PR108459]

For the case where the collapse(2) inner loop has an init expression that
depends on a non-constant multiple of the outer iterator and the condition
upper bound expression doesn't depend on the outer iterator,
expand_omp_for_init_counts was using fold_unary (NEGATE_EXPR, ...).  That
just returns NULL if the expression can't be folded; we need fold_build1
instead.
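
The difference between the two helpers can be sketched with a toy expression type. Everything below (`Expr`, `try_fold_negate`, `fold_build_negate`) is hypothetical and only mirrors the contract described above; it is not GCC's `tree` API.

```cpp
#include <cassert>
#include <memory>

// Toy expression node; not GCC's 'tree'.
struct Expr {
  enum Kind { CONST, VAR, NEGATE } kind;
  long value;                     // used by CONST
  std::shared_ptr<Expr> operand;  // used by NEGATE
};
using ExprPtr = std::shared_ptr<Expr>;

ExprPtr make_const(long v)
{
  return std::make_shared<Expr>(Expr{Expr::CONST, v, nullptr});
}

ExprPtr make_var()
{
  return std::make_shared<Expr>(Expr{Expr::VAR, 0, nullptr});
}

// Analogue of fold_unary: returns a simplified node, or null when no
// simplification applies.
ExprPtr try_fold_negate(const ExprPtr &op)
{
  if (op->kind == Expr::CONST)
    return make_const(-op->value);
  return nullptr;
}

// Analogue of fold_build1: tries to fold, otherwise builds the NEGATE node,
// so the caller always gets a usable expression.
ExprPtr fold_build_negate(const ExprPtr &op)
{
  if (ExprPtr folded = try_fold_negate(op))
    return folded;
  return std::make_shared<Expr>(Expr{Expr::NEGATE, 0, op});
}
```

A caller that uses the result unconditionally, as the loop-count computation does, needs the second form; the first is only safe when the null case is checked.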

2023-01-19  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/108459
* omp-expand.cc (expand_omp_for_init_counts): Use fold_build1 rather
than fold_unary for NEGATE_EXPR.

* testsuite/libgomp.c/pr108459.c: New test.

(cherry picked from commit 46644ec99cb355845b23bb1d02775c057ed8ee88)

2 years agoDaily bump.
GCC Administrator [Thu, 19 Jan 2023 00:21:17 +0000 (00:21 +0000)] 
Daily bump.

2 years agolibstdc++: Avoid recursion in __nothrow_wait_cv::wait [PR105730]
Jonathan Wakely [Thu, 22 Dec 2022 09:56:47 +0000 (09:56 +0000)] 
libstdc++: Avoid recursion in __nothrow_wait_cv::wait [PR105730]

The commit r12-5877-g9e18a25331fa25 removed the incorrect
noexcept-specifier from std::condition_variable::wait and gave the new
symbol version @@GLIBCXX_3.4.30. It also redefined the original symbol
std::condition_variable::wait(unique_lock<mutex>&)@GLIBCXX_3.4.11 as an
alias for a new symbol, __gnu_cxx::__nothrow_wait_cv::wait, which still
has the incorrect noexcept guarantee. That __nothrow_wait_cv::wait is
just a wrapper around the real condition_variable::wait which adds
noexcept and so terminates on a __forced_unwind exception.

This doesn't work on uclibc, possibly due to a dynamic linker bug. When
__nothrow_wait_cv::wait calls the condition_variable::wait function it
binds to the alias symbol, which means it just calls itself recursively
until the stack overflows.

This change avoids the possibility of a recursive call by changing the
__nothrow_wait_cv::wait function so that instead of calling
condition_variable::wait it re-implements it. This requires accessing
the private _M_cond member of condition_variable, so we need to use the
trick of instantiating a template with the member-pointer of the private
member.
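
The member-pointer trick relies on the rule that access checks are not applied to names used in an explicit template instantiation. A minimal sketch follows, with hypothetical names (`Widget`, `WidgetSecret`, `Expose`); it is not the code in compatibility-condvar.cc.

```cpp
#include <cassert>

// Stand-in for a class with a private data member (hypothetical).
class Widget {
  int secret_ = 42;
};

// Tag that names the pointer-to-member type and declares the accessor,
// so that 'get' can be found by argument-dependent lookup.
struct WidgetSecret {
  typedef int Widget::*type;
  friend type get(WidgetSecret);
};

// Defining the friend inside the template ties its return value to Ptr.
template<typename Tag, typename Tag::type Ptr>
struct Expose {
  friend typename Tag::type get(Tag) { return Ptr; }
};

// Access checks are not applied to names used in an explicit instantiation,
// so &Widget::secret_ is accepted here.
template struct Expose<WidgetSecret, &Widget::secret_>;

// Reads the private member through the exposed member pointer.
int read_secret(Widget &w)
{
  return w.*get(WidgetSecret());
}
```

GCC may warn about the non-template friend declaration, but the code is well-formed.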

libstdc++-v3/ChangeLog:

PR libstdc++/105730
* src/c++11/compatibility-condvar.cc (__nothrow_wait_cv::wait):
Access private data member of base class and call its wait
member.

(cherry picked from commit ee4af2ed0b7322884ec4ff537564683c3749b813)

2 years agoDaily bump.
GCC Administrator [Wed, 18 Jan 2023 00:21:32 +0000 (00:21 +0000)] 
Daily bump.

2 years agolibgomp: Add forgotten Changelog.omp entries
Andrew Stubbs [Tue, 17 Jan 2023 11:22:21 +0000 (11:22 +0000)] 
libgomp: Add forgotten Changelog.omp entries

2 years agoDaily bump.
GCC Administrator [Tue, 17 Jan 2023 00:21:46 +0000 (00:21 +0000)] 
Daily bump.

2 years agoAdd cpplib ka.po
Joseph Myers [Mon, 16 Jan 2023 22:44:28 +0000 (22:44 +0000)] 
Add cpplib ka.po

* ka.po: New.

2 years agolibstdc++: Unblock atomic wait on non-futex platforms [PR106183]
Jonathan Wakely [Thu, 28 Jul 2022 15:15:58 +0000 (16:15 +0100)] 
libstdc++: Unblock atomic wait on non-futex platforms [PR106183]

When using a mutex and condition variable, the notifying thread needs to
increment _M_ver while holding the mutex lock, and the waiting thread
needs to re-check after locking the mutex. This avoids a missed
notification as described in the PR.

By moving the increment of _M_ver to the base _M_notify we can make the
use of the mutex local to the use of the condition variable, and
simplify the code a little. We can use a relaxed store because the mutex
already provides sequential consistency. Also we don't need to check
whether __addr == &_M_ver because we know that's always true for
platforms that use a condition variable, and so we also know that we
always need to use notify_all() not notify_one().
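
The locking protocol described above can be sketched as follows. This is a simplified illustration with a hypothetical class name (`VersionedEvent`), not the code in atomic_wait.h; it uses a relaxed `fetch_add` for the increment on the assumption that the mutex provides the required ordering.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for a waiter pool on a platform without futexes.
class VersionedEvent {
  std::mutex mtx_;
  std::condition_variable cv_;
  std::atomic<unsigned> ver_{0};

public:
  unsigned version() const
  {
    return ver_.load(std::memory_order_relaxed);
  }

  // Notifier: bump the version while holding the mutex, then wake everyone.
  void notify()
  {
    {
      std::lock_guard<std::mutex> lock(mtx_);
      ver_.fetch_add(1, std::memory_order_relaxed);
    }
    cv_.notify_all();
  }

  // Waiter: re-check the version after taking the mutex, so a bump that
  // happened between the caller's read of 'old' and the lock is not missed.
  void wait_until_changed(unsigned old)
  {
    std::unique_lock<std::mutex> lock(mtx_);
    while (ver_.load(std::memory_order_relaxed) == old)
      cv_.wait(lock);
  }
};
```

If the notifier incremented the version without the lock, the waiter could observe the old value, be preempted, and then block after the notification had already been delivered.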

Reviewed-by: Thomas Rodgers <trodgers@redhat.com>

libstdc++-v3/ChangeLog:

PR libstdc++/106183
* include/bits/atomic_wait.h (__waiter_pool_base::_M_notify):
Move increment of _M_ver here.
[!_GLIBCXX_HAVE_PLATFORM_WAIT]: Lock mutex around increment.
Use relaxed memory order and always notify all waiters.
(__waiter_base::_M_do_wait) [!_GLIBCXX_HAVE_PLATFORM_WAIT]:
Check value again after locking mutex.
(__waiter_base::_M_notify): Remove increment of _M_ver.

(cherry picked from commit af98cb88eb4be6a1668ddf966e975149bf8610b1)

2 years agoFortran/OpenMP: Reject non-scalar 'holds' expr in 'omp assume(s)' [PR107706]
Tobias Burnus [Mon, 16 Jan 2023 11:37:41 +0000 (12:37 +0100)] 
Fortran/OpenMP: Reject non-scalar 'holds' expr in 'omp assume(s)' [PR107706]

gcc/fortran/ChangeLog:

PR fortran/107706
* openmp.cc (gfc_resolve_omp_assumptions): Reject nonscalars.

gcc/testsuite/ChangeLog:

PR fortran/107706
* gfortran.dg/gomp/assume-2.f90: Update dg-error.
* gfortran.dg/gomp/assumes-2.f90: Likewise.
* gfortran.dg/gomp/assume-5.f90: New test.

(cherry picked from commit 2ce55247a8bf32985a96ed63a7a92d36746723dc)

2 years agoMerge branch 'releases/gcc-12' into devel/omp/gcc-12
Tobias Burnus [Mon, 16 Jan 2023 11:26:42 +0000 (12:26 +0100)] 
Merge branch 'releases/gcc-12' into devel/omp/gcc-12

Merge up to r12-9046-gd369eb486bdc720e4c50563226dbbb11a0226b5d (16th Jan 2023)

2 years agoDaily bump.
GCC Administrator [Mon, 16 Jan 2023 00:21:08 +0000 (00:21 +0000)] 
Daily bump.

2 years agoDaily bump.
GCC Administrator [Sun, 15 Jan 2023 00:20:59 +0000 (00:20 +0000)] 
Daily bump.

2 years agoDaily bump.
GCC Administrator [Sat, 14 Jan 2023 00:20:27 +0000 (00:20 +0000)] 
Daily bump.

2 years agolibgomp, amdgcn: Switch USM to 128-byte alignment
Andrew Stubbs [Fri, 13 Jan 2023 17:38:39 +0000 (17:38 +0000)] 
libgomp, amdgcn: Switch USM to 128-byte alignment

This should optimize cache-lines on the AMD GPUs somewhat.
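
For illustration, rounding a size up to a 128-byte boundary can be written as below. The constant and helper names are hypothetical; this is not the ALIGN macro from usm-allocator.c.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical constant and helper; not the ALIGN macro from usm-allocator.c.
constexpr std::size_t kUsmAlign = 128;

// Rounds n up to the next multiple of kUsmAlign, which must be a power of two.
constexpr std::size_t align_up(std::size_t n)
{
  return (n + kUsmAlign - 1) & ~(kUsmAlign - 1);
}
```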

libgomp/ChangeLog:

* usm-allocator.c (ALIGN): Use 128-byte alignment.

2 years agoDaily bump.
GCC Administrator [Fri, 13 Jan 2023 00:21:17 +0000 (00:21 +0000)] 
Daily bump.

2 years agoDaily bump.
GCC Administrator [Thu, 12 Jan 2023 00:24:45 +0000 (00:24 +0000)] 
Daily bump.

2 years agoamdgcn, libgomp: custom USM allocator
Andrew Stubbs [Tue, 13 Dec 2022 23:31:21 +0000 (23:31 +0000)] 
amdgcn, libgomp: custom USM allocator

There were problems with critical driver data sharing pages with USM data, so
this new allocator implementation moves USM to entirely different pages.

libgomp/ChangeLog:

* plugin/plugin-gcn.c: Include sys/mman.h and unistd.h.
(usm_heap_create): New function.
(struct usm_splay_tree_key_s): Delete.
(usm_splay_compare): Delete function.
(splay_tree_prefix): Delete define.
(GOMP_OFFLOAD_usm_alloc): Use new allocator.
(GOMP_OFFLOAD_usm_free): Likewise.
(GOMP_OFFLOAD_is_usm_ptr): Likewise.
(gomp_fatal): Delete macro.
(splay_tree_c): Delete.
* usm-allocator.c: New file.

2 years agoFix problematic interaction between bitfields, unions, SSO and SRA
Eric Botcazou [Wed, 11 Jan 2023 14:58:47 +0000 (15:58 +0100)] 
Fix problematic interaction between bitfields, unions, SSO and SRA

The handling of bitfields by the SRA pass is peculiar and this must be taken
into account to support the scalar_storage_order attribute.

gcc/
PR tree-optimization/108199
* tree-sra.cc (sra_modify_expr): Deal with reverse storage order
for bit-field references.

gcc/testsuite/
* gcc.dg/sso-17.c: New test.

2 years agostrlen: do not use cond_expr for boundaries
Martin Liska [Fri, 23 Dec 2022 14:27:32 +0000 (15:27 +0100)] 
strlen: do not use cond_expr for boundaries

PR tree-optimization/108137

gcc/ChangeLog:

* tree-ssa-strlen.cc (get_range_strlen_phi): Reject anything
different from INTEGER_CST.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr108137.c: New test.

(cherry picked from commit ee6f262b87fef590729e96e999f1c3b207c251c0)

2 years agoDaily bump.
GCC Administrator [Wed, 11 Jan 2023 00:22:09 +0000 (00:22 +0000)] 
Daily bump.

2 years agoFix memory constraint on MVE v[ld/st][2/4] instructions [PR107714]
Stam Markianos-Wright [Fri, 30 Dec 2022 11:25:22 +0000 (11:25 +0000)] 
Fix memory constraint on MVE v[ld/st][2/4] instructions [PR107714]

In the M-Class Arm-ARM:

https://developer.arm.com/documentation/ddi0553/bu/?lang=en

these MVE instructions only have a '!' writeback variant, and at:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107714

we found that the Um constraint would also allow through a
register offset writeback, resulting in an assembler error.

Here I have added a new constraint and predicate for these
instructions, which (uniquely, AFAICT), only support a `!` writeback
increment by the data size (inside the compiler this is a POST_INC).

No regressions in arm-none-eabi with MVE and MVE.FP.

gcc/ChangeLog:
PR target/107714
* config/arm/arm-protos.h (mve_struct_mem_operand): New prototype.
* config/arm/arm.cc (mve_struct_mem_operand): New function.
* config/arm/constraints.md (Ug): New constraint.
* config/arm/mve.md (mve_vst4q<mode>): Change constraint.
(mve_vst2q<mode>): Likewise.
(mve_vld4q<mode>): Likewise.
(mve_vld2q<mode>): Likewise.
* config/arm/predicates.md (mve_struct_operand): New predicate.

gcc/testsuite/ChangeLog:
PR target/107714
* gcc.target/arm/mve/intrinsics/vldst24q_reg_offset.c: New test.

(cherry picked from commit 4269a6567eb991e6838f40bda5be9e3a7972530c)

2 years agoaarch64: PR target/108140 Handle NULL target in data intrinsic expansion
Kyrylo Tkachov [Mon, 19 Dec 2022 11:16:47 +0000 (11:16 +0000)] 
aarch64: PR target/108140 Handle NULL target in data intrinsic expansion

In this PR we ICE when expanding the __rbit builtin with a NULL target rtx.
I *think* that only happens when the result is unused and hence maybe we shouldn't be expanding
any RTL at all, but the ICE here is easily fixed by deriving the mode from the type of the expression
rather than the target.

This patch does that.
Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/ChangeLog:

PR target/108140
* config/aarch64/aarch64-builtins.cc
(aarch64_expand_builtin_data_intrinsic): Handle NULL target.

gcc/testsuite/ChangeLog:

PR target/108140
* gcc.target/aarch64/acle/pr108140.c: New test.

(cherry picked from commit 98756bcbe27647f263f2b312d1d933d70cf56ba9)

2 years agoDaily bump.
GCC Administrator [Tue, 10 Jan 2023 00:22:35 +0000 (00:22 +0000)] 
Daily bump.

2 years agoUpdate cpplib eo.po
Joseph Myers [Mon, 9 Jan 2023 20:17:37 +0000 (20:17 +0000)] 
Update cpplib eo.po

* eo.po: Update.

2 years agoopenmp: Fix up finish_omp_target_clauses [PR108286]
Jakub Jelinek [Mon, 9 Jan 2023 10:54:33 +0000 (11:54 +0100)] 
openmp: Fix up finish_omp_target_clauses [PR108286]

The comment in the loop says that we shouldn't add a map clause if such
a clause exists already, but the loop was actually using OMP_CLAUSE_DECL
on any clause.  Target construct can have various clauses which don't
have OMP_CLAUSE_DECL at all (e.g. nowait, device or if) or clause
where it means something different (e.g. privatization clauses, allocate,
depend).

So, only check OMP_CLAUSE_DECL on OMP_CLAUSE_MAP clauses.

2023-01-05  Jakub Jelinek  <jakub@redhat.com>

PR c++/108286
* semantics.cc (finish_omp_target_clauses): Ignore clauses other than
OMP_CLAUSE_MAP.

* testsuite/libgomp.c++/pr108286.C: New test.

(cherry picked from commit 29c3218618ef6177dc33871b26c8fbd9b21eabe1)

2 years agoMerge branch 'releases/gcc-12' into devel/omp/gcc-12
Tobias Burnus [Mon, 9 Jan 2023 09:20:42 +0000 (10:20 +0100)] 
Merge branch 'releases/gcc-12' into devel/omp/gcc-12

Merge up to r12-9034-g4494965932fc5d005e2482bbe58cf9e138c830bd (9th Jan 2023)

2 years agoDaily bump.
GCC Administrator [Mon, 9 Jan 2023 00:21:52 +0000 (00:21 +0000)] 
Daily bump.

2 years agoDaily bump.
GCC Administrator [Sun, 8 Jan 2023 00:21:14 +0000 (00:21 +0000)] 
Daily bump.

2 years agoDaily bump.
GCC Administrator [Sat, 7 Jan 2023 00:21:38 +0000 (00:21 +0000)] 
Daily bump.

2 years agoDaily bump.
GCC Administrator [Fri, 6 Jan 2023 00:21:03 +0000 (00:21 +0000)] 
Daily bump.

2 years agolibstdc++: Fix std::chrono::hh_mm_ss with unsigned rep [PR108265]
Jonathan Wakely [Wed, 4 Jan 2023 16:43:51 +0000 (16:43 +0000)] 
libstdc++: Fix std::chrono::hh_mm_ss with unsigned rep [PR108265]

libstdc++-v3/ChangeLog:

PR libstdc++/108265
* include/std/chrono (hh_mm_ss): Do not use chrono::abs if
duration rep is unsigned. Remove incorrect noexcept-specifier.
* testsuite/std/time/hh_mm_ss/1.cc: Check unsigned rep. Check
floating-point representations. Check default construction.
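
The idea behind the first item can be sketched as a small helper that only calls `chrono::abs` when the representation is signed. The helper name `magnitude` is hypothetical and the code assumes C++17 or later; it is not the libstdc++ implementation.

```cpp
#include <cassert>
#include <chrono>
#include <type_traits>

// Hypothetical helper: std::chrono::abs is only available for durations
// with a signed rep, so skip it when the rep is unsigned.
template<typename Rep, typename Period>
constexpr std::chrono::duration<Rep, Period>
magnitude(std::chrono::duration<Rep, Period> d)
{
  if constexpr (std::is_unsigned_v<Rep>)
    return d;                    // already non-negative
  else
    return std::chrono::abs(d);  // signed integer or floating-point rep
}
```

Because the call to `chrono::abs` sits in a discarded `if constexpr` branch for unsigned representations, it is never instantiated there.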

(cherry picked from commit e36e57b032b2d70eaa1294d5921e4fd8ce12a74d)