git.ipfire.org Git - thirdparty/gcc.git/log
Paul E. Murphy [Thu, 22 Jun 2023 22:53:46 +0000 (17:53 -0500)] 
go: Update usage of TARGET_AIX to TARGET_AIX_OS

TARGET_AIX is defined to a non-zero value on linux and maybe other
powerpc64le targets.  This leads to unexpected behavior such as
dropping the .go_export section when linking a shared library
on linux/powerpc64le.

Instead, use TARGET_AIX_OS to toggle AIX specific behavior.

Fixes golang/go#60798.

2023-06-22  Paul E. Murphy  <murphyp@linux.ibm.com>

gcc/go/
* go-backend.c [TARGET_AIX]: Rename and update usage to TARGET_AIX_OS.
* go-lang.c: Likewise.

(cherry picked from commit b76cd1ec361712e1ac9ca5e0246da24ea2b78916)

liuhongt [Mon, 26 Jun 2023 13:07:09 +0000 (21:07 +0800)] 
Refine maskstore patterns with UNSPEC_MASKMOV.

Similar to r14-2070-gc79476da46728e.

If mem_addr points to a memory region with less than whole vector size
bytes of accessible memory and k is a mask that would prevent reading
the inaccessible bytes from mem_addr, add UNSPEC_MASKMOV to prevent
it from being transformed into any other whole-memory-access instruction.
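
The constraint described above can be pictured with a scalar model of a masked store. This is only a sketch under stated assumptions: `masked_store_i32` and `store_into_short_buffer` are hypothetical names, and the loop stands in for the vector instruction rather than reproducing the i386 patterns.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a masked vector store (hypothetical helper): lane i
   of src is written to dst[i] only when bit i of mask is set.  Lanes
   whose bit is clear are never touched in dst.  */
void masked_store_i32(int32_t *dst, const int32_t *src,
                      uint8_t mask, size_t lanes)
{
  for (size_t i = 0; i < lanes && i < 8; i++)
    if (mask & (1u << i))
      dst[i] = src[i];
}

/* Store an 8-lane "vector" into a 3-element buffer with mask 0x07.
   Only buf[0..2] are accessed, so the call stays in bounds.  */
int32_t store_into_short_buffer(void)
{
  int32_t buf[3] = { 0, 0, 0 };
  const int32_t vec[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
  masked_store_i32(buf, vec, 0x07, 8);
  return buf[0] + buf[1] + buf[2];
}
```

Rewriting such a store as a whole-vector load, blend and store would access all eight lanes of the destination, including the five that lie outside the buffer.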

gcc/ChangeLog:

PR rtl-optimization/110237
* config/i386/sse.md (<avx512>_store<mode>_mask): Refine with
UNSPEC_MASKMOV.
(maskstore<mode><avx512fmaskmodelower>): Ditto.
(*<avx512>_store<mode>_mask): New define_insn, it's renamed
from original <avx512>_store<mode>_mask.

liuhongt [Tue, 20 Jun 2023 07:41:00 +0000 (15:41 +0800)] 
Refine maskloadmn pattern with UNSPEC_MASKLOAD.

If mem_addr points to a memory region with less than whole vector size
bytes of accessible memory and k is a mask that would prevent reading
the inaccessible bytes from mem_addr, add UNSPEC_MASKLOAD to prevent
it from being transformed into vpblendd.

gcc/ChangeLog:

PR target/110309
* config/i386/sse.md (maskload<mode><avx512fmaskmodelower>):
Refine pattern with UNSPEC_MASKLOAD.
(maskload<mode><avx512fmaskmodelower>): Ditto.

GCC Administrator [Thu, 29 Jun 2023 00:19:33 +0000 (00:19 +0000)] 
Daily bump.

Thomas Schwinge [Mon, 15 May 2023 18:00:07 +0000 (20:00 +0200)] 
Support parallel testing in libgomp: fallback Perl 'flock' [PR66005]

Follow-up to commit 6c3b30ef9e0578509bdaf59c13da4a212fe6c2ba
"Support parallel testing in libgomp, part II [PR66005]"
("..., and enable if 'flock' is available for serializing execution testing"),
where we saw:

> On my Dell Precision 7530 laptop:
>
>     $ uname -srvi
>     Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64
>     $ grep '^model name' < /proc/cpuinfo | uniq -c
>          12 model name      : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
>     $ nvidia-smi -L
>     GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)
>
> ... [...]: case (c) standard configuration, no offloading
> configured, [...]

>     $ \time make check-target-libgomp
>
> Case (c), baseline; [...]:
>
>     1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata 505148maxresident)k
>     1133.22user 111.08system 19:35.75elapsed 105%CPU (0avgtext+0avgdata 505212maxresident)k
>
> Case (c), parallelized [using 'flock']:
>
> [...]
>     -j12 GCC_TEST_PARALLEL_SLOTS=12
>     2591.04user 192.64system 4:44.98elapsed 976%CPU (0avgtext+0avgdata 505216maxresident)k
>     2581.23user 195.21system 4:47.51elapsed 965%CPU (0avgtext+0avgdata 505212maxresident)k

Much the same when using this fallback Perl 'flock' instead of 'flock':

    2565.23user 194.35system 4:46.77elapsed 962%CPU (0avgtext+0avgdata 505216maxresident)k
    2549.38user 200.20system 4:46.08elapsed 961%CPU (0avgtext+0avgdata 505216maxresident)k

PR testsuite/66005
gcc/
* doc/install.texi: Document (optional) Perl usage for parallel
testing of libgomp.
libgomp/
* testsuite/lib/libgomp.exp: 'flock' through stdout.
* testsuite/flock: New.
* configure.ac (FLOCK): Point to that if no 'flock' available, but
'perl' is.
* configure: Regenerate.

(cherry picked from commit 04abe1944d30eb18a2060cfcd9695d085f7b4752)

Thomas Schwinge [Tue, 25 Apr 2023 21:53:12 +0000 (23:53 +0200)] 
Support parallel testing in libgomp, part II [PR66005]

..., and enable if 'flock' is available for serializing execution testing.

Regarding the default of 19 parallel slots, this turned out to be a local
minimum for wall time when testing this on:

    $ uname -srvi
    Linux 4.2.0-42-generic #49~14.04.1-Ubuntu SMP Wed Jun 29 20:22:11 UTC 2016 x86_64
    $ grep '^model name' < /proc/cpuinfo | uniq -c
         32 model name      : Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz

... in two configurations: case (a) standard configuration, no offloading
configured, case (b) offloading for GCN and nvptx configured but no devices
available.  For both cases, default plus '-m32' variant.

    $ \time make check-target-libgomp RUNTESTFLAGS="--target_board=unix\{,-m32\}"

Case (a), baseline:

    6432.23user 332.38system 47:32.28elapsed 237%CPU (0avgtext+0avgdata 505044maxresident)k
    6382.43user 319.21system 47:06.04elapsed 237%CPU (0avgtext+0avgdata 505172maxresident)k

This is what people have been complaining about, rightly so, in
<https://gcc.gnu.org/PR66005> "libgomp make check time is excessive" and
elsewhere.

Case (a), parallelized:

    -j12 GCC_TEST_PARALLEL_SLOTS=10
    3088.49user 267.74system 6:43.82elapsed 831%CPU (0avgtext+0avgdata 505188maxresident)k
    -j15 GCC_TEST_PARALLEL_SLOTS=15
    3308.08user 294.79system 5:56.04elapsed 1011%CPU (0avgtext+0avgdata 505360maxresident)k
    -j17 GCC_TEST_PARALLEL_SLOTS=17
    3539.93user 298.99system 5:27.86elapsed 1170%CPU (0avgtext+0avgdata 505112maxresident)k
    -j18 GCC_TEST_PARALLEL_SLOTS=18
    3697.50user 317.18system 5:14.63elapsed 1275%CPU (0avgtext+0avgdata 505360maxresident)k
    -j19 GCC_TEST_PARALLEL_SLOTS=19
    3765.94user 324.27system 5:13.22elapsed 1305%CPU (0avgtext+0avgdata 505128maxresident)k
    -j20 GCC_TEST_PARALLEL_SLOTS=20
    3684.66user 312.32system 5:15.26elapsed 1267%CPU (0avgtext+0avgdata 505100maxresident)k
    -j23 GCC_TEST_PARALLEL_SLOTS=23
    4040.59user 347.10system 5:29.12elapsed 1333%CPU (0avgtext+0avgdata 505200maxresident)k
    -j26 GCC_TEST_PARALLEL_SLOTS=26
    3973.24user 377.96system 5:24.70elapsed 1340%CPU (0avgtext+0avgdata 505160maxresident)k
    -j32 GCC_TEST_PARALLEL_SLOTS=32
    4004.42user 346.10system 5:16.11elapsed 1376%CPU (0avgtext+0avgdata 505160maxresident)k

Yay!
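
As a cross-check of the "local minimum" claim, the case (a) wall-clock times above can be compared mechanically. This is a small sketch, not part of the patch; `struct run`, `best_slots` and `best_speedup` are hypothetical names, and the numbers are copied from the table and the first baseline run.

```c
#include <stddef.h>

struct run { int slots; int minutes; double seconds; };

/* Case (a) elapsed times, copied from the table above.  */
static const struct run case_a[] = {
  { 10, 6, 43.82 }, { 15, 5, 56.04 }, { 17, 5, 27.86 },
  { 18, 5, 14.63 }, { 19, 5, 13.22 }, { 20, 5, 15.26 },
  { 23, 5, 29.12 }, { 26, 5, 24.70 }, { 32, 5, 16.11 },
};

/* Convert a minutes:seconds reading into seconds.  */
double elapsed_seconds(const struct run *r)
{
  return r->minutes * 60.0 + r->seconds;
}

/* Index of the run with the smallest elapsed time.  */
size_t best_index(void)
{
  size_t best = 0;
  for (size_t i = 1; i < sizeof case_a / sizeof case_a[0]; i++)
    if (elapsed_seconds(&case_a[i]) < elapsed_seconds(&case_a[best]))
      best = i;
  return best;
}

/* Slot count of the fastest run.  */
int best_slots(void)
{
  return case_a[best_index()].slots;
}

/* Speed-up of the fastest run over the first baseline run (47:32.28).  */
double best_speedup(void)
{
  const double baseline = 47 * 60.0 + 32.28;
  return baseline / elapsed_seconds(&case_a[best_index()]);
}
```

With these figures the fastest run is the one with 19 slots, roughly nine times faster than the baseline in wall time.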

Case (b), baseline; 2+ h:

    7227.58user 700.54system 2:14:33elapsed 98%CPU (0avgtext+0avgdata 994264maxresident)k

Case (b), parallelized:

    -j12 GCC_TEST_PARALLEL_SLOTS=10
    7377.46user 777.52system 16:06.63elapsed 843%CPU (0avgtext+0avgdata 994344maxresident)k
    -j15 GCC_TEST_PARALLEL_SLOTS=15
    8019.18user 721.42system 12:13.56elapsed 1191%CPU (0avgtext+0avgdata 994228maxresident)k
    -j17 GCC_TEST_PARALLEL_SLOTS=17
    8530.11user 716.95system 10:45.92elapsed 1431%CPU (0avgtext+0avgdata 994176maxresident)k
    -j18 GCC_TEST_PARALLEL_SLOTS=18
    8776.79user 645.89system 10:27.20elapsed 1502%CPU (0avgtext+0avgdata 994248maxresident)k
    -j19 GCC_TEST_PARALLEL_SLOTS=19
    9332.37user 641.76system 10:15.09elapsed 1621%CPU (0avgtext+0avgdata 994260maxresident)k
    -j20 GCC_TEST_PARALLEL_SLOTS=20
    9609.54user 789.88system 10:26.94elapsed 1658%CPU (0avgtext+0avgdata 994284maxresident)k
    -j23 GCC_TEST_PARALLEL_SLOTS=23
    10362.40user 911.14system 10:44.47elapsed 1749%CPU (0avgtext+0avgdata 994208maxresident)k
    -j26 GCC_TEST_PARALLEL_SLOTS=26
    11159.44user 850.99system 11:09.25elapsed 1794%CPU (0avgtext+0avgdata 994256maxresident)k
    -j32 GCC_TEST_PARALLEL_SLOTS=32
    11453.50user 939.52system 11:00.38elapsed 1876%CPU (0avgtext+0avgdata 994240maxresident)k

On my Dell Precision 7530 laptop:

    $ uname -srvi
    Linux 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64
    $ grep '^model name' < /proc/cpuinfo | uniq -c
         12 model name      : Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
    $ nvidia-smi -L
    GPU 0: Quadro P1000 (UUID: GPU-e043973b-b52a-d02b-c066-a8fdbf64e8ea)

... in two configurations: case (c) standard configuration, no offloading
configured, case (d) offloading for nvptx configured and device available.
For both cases, only default variant, no '-m32'.

    $ \time make check-target-libgomp

Case (c), baseline; roughly half of case (a) (just one variant):

    1180.98user 110.80system 19:36.40elapsed 109%CPU (0avgtext+0avgdata 505148maxresident)k
    1133.22user 111.08system 19:35.75elapsed 105%CPU (0avgtext+0avgdata 505212maxresident)k

Case (c), parallelized:

    -j12 GCC_TEST_PARALLEL_SLOTS=2
    1143.83user 110.76system 10:20.46elapsed 202%CPU (0avgtext+0avgdata 505216maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=6
    1737.08user 143.94system 4:59.48elapsed 628%CPU (0avgtext+0avgdata 505200maxresident)k
    1730.31user 143.02system 4:58.75elapsed 627%CPU (0avgtext+0avgdata 505152maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=8
    2192.63user 169.34system 4:52.96elapsed 806%CPU (0avgtext+0avgdata 505216maxresident)k
    2219.04user 167.67system 4:53.19elapsed 814%CPU (0avgtext+0avgdata 505152maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=10
    2463.93user 184.98system 4:48.39elapsed 918%CPU (0avgtext+0avgdata 505200maxresident)k
    2455.62user 183.68system 4:47.40elapsed 918%CPU (0avgtext+0avgdata 505216maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=12
    2591.04user 192.64system 4:44.98elapsed 976%CPU (0avgtext+0avgdata 505216maxresident)k
    2581.23user 195.21system 4:47.51elapsed 965%CPU (0avgtext+0avgdata 505212maxresident)k
    -j20 GCC_TEST_PARALLEL_SLOTS=20 [oversubscribe]
    2613.18user 199.51system 4:44.06elapsed 990%CPU (0avgtext+0avgdata 505216maxresident)k

Case (d), baseline (compared to case (b): only nvptx offloading compilation,
but also nvptx offloading execution); ~1 h:

    2841.93user 653.68system 1:02:26elapsed 93%CPU (0avgtext+0avgdata 909792maxresident)k
    2842.03user 654.39system 1:02:24elapsed 93%CPU (0avgtext+0avgdata 909880maxresident)k

Case (d), parallelized:

    -j12 GCC_TEST_PARALLEL_SLOTS=2
    2856.39user 606.87system 33:58.64elapsed 169%CPU (0avgtext+0avgdata 909948maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=6
    3444.90user 666.86system 18:37.57elapsed 367%CPU (0avgtext+0avgdata 909856maxresident)k
    3462.13user 667.13system 18:36.87elapsed 369%CPU (0avgtext+0avgdata 909872maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=8
    3929.74user 716.22system 18:02.36elapsed 429%CPU (0avgtext+0avgdata 909832maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=10
    4152.84user 736.16system 17:43.05elapsed 459%CPU (0avgtext+0avgdata 909872maxresident)k
    -j12 GCC_TEST_PARALLEL_SLOTS=12
    4209.60user 749.00system 17:35.20elapsed 469%CPU (0avgtext+0avgdata 909840maxresident)k
    -j20 GCC_TEST_PARALLEL_SLOTS=20 [oversubscribe]
    4255.54user 756.78system 17:29.06elapsed 477%CPU (0avgtext+0avgdata 909868maxresident)k

Worth noting is that with nvptx offloading, there is one execution test case
that times out ('libgomp.fortran/reverse-offload-5.f90').  This effectively
stalls progress for almost 5 min: other execution test cases quickly queue up
on the lock for all parallel slots.  That's working as expected; just noting
this as it accordingly does skew the wall time numbers.

PR testsuite/66005
libgomp/
* configure.ac: Look for 'flock'.
* testsuite/Makefile.am (gcc_test_parallel_slots): Enable parallel testing.
* testsuite/config/default.exp: Don't 'load_lib "standard.exp"' here...
* testsuite/lib/libgomp.exp: ... but here, instead.
(libgomp_load): Override for parallel testing.
* testsuite/libgomp-site-extra.exp.in (FLOCK): Set.
* configure: Regenerate.
* Makefile.in: Regenerate.
* testsuite/Makefile.in: Regenerate.

(cherry picked from commit 6c3b30ef9e0578509bdaf59c13da4a212fe6c2ba)

Rainer Orth [Thu, 7 May 2015 11:26:57 +0000 (13:26 +0200)] 
Support parallel testing in libgomp, part I [PR66005]

..., while still hard-coding the number of parallel slots to one.

PR testsuite/66005
libgomp/
* testsuite/Makefile.am (PWD_COMMAND): New variable.
(%/site.exp): New target.
(check_p_numbers0, check_p_numbers1, check_p_numbers2)
(check_p_numbers3, check_p_numbers4, check_p_numbers5)
(check_p_numbers6, check_p_numbers, gcc_test_parallel_slots)
(check_p_subdirs)
(check_DEJAGNU_libgomp_targets): New variables.
($(check_DEJAGNU_libgomp_targets)): New target.
($(check_DEJAGNU_libgomp_targets)): New dependency.
(check-DEJAGNU $(check_DEJAGNU_libgomp_targets)): New targets.
* testsuite/Makefile.in: Regenerate.
* testsuite/lib/libgomp.exp: For parallel testing,
'load_file ../libgomp-test-support.exp'.

Co-authored-by: Thomas Schwinge <thomas@codesourcery.com>
(cherry picked from commit e797db5c744f7b4e110f23a495fca8e6b8aebe83)

liuhongt [Mon, 26 Jun 2023 01:50:25 +0000 (09:50 +0800)] 
Make option mvzeroupper independent of optimization level.

pass_insert_vzeroupper is under condition

TARGET_AVX && TARGET_VZEROUPPER
&& flag_expensive_optimizations && !optimize_size

But the documentation of -mvzeroupper doesn't mention that the insertion
requires -O2 or above, which may confuse users when they explicitly
use -Os -mvzeroupper.

------------
mvzeroupper
Target Mask(VZEROUPPER) Save
Generate vzeroupper instruction before a transfer of control flow out of
the function.
------------

The patch moves flag_expensive_optimizations && !optimize_size to
ix86_option_override_internal. It makes -mvzeroupper independent of
optimization level, but still keeps the behavior of architecture
tuning(emit_vzeroupper) unchanged.
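
The effect of moving the condition can be modelled as follows. This is a simplified sketch under stated assumptions, not the code in i386-options.c or i386-features.c: `struct opts` and all function names are hypothetical stand-ins for the real option state.

```c
#include <stdbool.h>

/* Hypothetical, simplified option state; field names are illustrative.  */
struct opts {
  bool avx;                     /* TARGET_AVX */
  bool vzeroupper;              /* -mvzeroupper in effect */
  bool vzeroupper_explicit;     /* user passed -m[no-]vzeroupper */
  bool tune_emit_vzeroupper;    /* architecture tuning asks for it */
  bool expensive_optimizations; /* flag_expensive_optimizations */
  bool optimize_size;           /* -Os */
};

/* Before: the pass gate itself checks the optimization flags.  */
bool gate_before(const struct opts *o)
{
  return o->avx && o->vzeroupper
         && o->expensive_optimizations && !o->optimize_size;
}

/* After: the optimization flags only decide whether the tuning
   default turns the option on; an explicit user choice is kept.  */
void override_after(struct opts *o)
{
  if (!o->vzeroupper_explicit && o->tune_emit_vzeroupper
      && o->expensive_optimizations && !o->optimize_size)
    o->vzeroupper = true;
}

bool gate_after(const struct opts *o)
{
  return o->avx && o->vzeroupper;
}

/* Model of "-Os -mvzeroupper" on an AVX target.  */
bool os_mvzeroupper_runs_pass(bool after_patch)
{
  struct opts o = {
    .avx = true, .vzeroupper = true, .vzeroupper_explicit = true,
    .tune_emit_vzeroupper = true, .expensive_optimizations = true,
    .optimize_size = true,
  };
  if (!after_patch)
    return gate_before(&o);
  override_after(&o);
  return gate_after(&o);
}
```

In this model the pass is skipped for -Os -mvzeroupper before the change and runs after it, while a build that relies only on the tuning default still depends on the optimization flags.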

gcc/ChangeLog:

* config/i386/i386-features.c (pass_insert_vzeroupper:gate):
Move flag_expensive_optimizations && !optimize_size to ..
* config/i386/i386-options.c (ix86_option_override_internal):
.. this, it makes -mvzeroupper independent of optimization
level, but still keeps the behavior of architecture
tuning(emit_vzeroupper) unchanged.
(rest_of_handle_insert_vzeroupper): Remove
flag_expensive_optimizations && !optimize_size.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx-vzeroupper-29.c: New testcase.

GCC Administrator [Wed, 28 Jun 2023 00:20:10 +0000 (00:20 +0000)] 
Daily bump.

GCC Administrator [Tue, 27 Jun 2023 00:20:37 +0000 (00:20 +0000)] 
Daily bump.

Iain Buclaw [Mon, 26 Jun 2023 01:24:27 +0000 (03:24 +0200)] 
d: Suboptimal codegen for __builtin_expect(cond, false)

Since PR96435, both boolean objects and expressions have been evaluated
in the following way.

    (*(ubyte*)&obj_or_expr) & 1

It has been noted that sometimes this can cause the back-end to optimize
in non-obvious ways - in particular with __builtin_expect.

This @safe feature is now restricted to just when reading the value of a
bool field that comes from a union.
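
For readers less familiar with D, the masked read can be rendered in C roughly as below. This is an illustrative sketch only; `union flag` and the function names are hypothetical, and the C version merely mirrors the shape of the D expression.

```c
#include <stdbool.h>

/* A union in which a bool shares storage with a raw byte, so the
   bool's storage can hold a bit pattern other than 0 or 1.  */
union flag {
  unsigned char raw;
  bool b;
};

/* C rendering of (*(ubyte*)&obj) & 1: read the first byte of the
   object and keep only its lowest bit.  */
int read_bool_masked(const union flag *f)
{
  return (*(const unsigned char *)f) & 1;
}

/* Convenience wrapper: store a raw byte, then do the masked read.  */
int masked_read_of_raw(unsigned char raw)
{
  union flag f;
  f.raw = raw;
  return read_bool_masked(&f);
}
```

The mask guarantees a result of 0 or 1 whatever byte the union holds, which is why the conversion is only worth its cost for bool fields of a union.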

PR d/110359

gcc/d/ChangeLog:

* d-convert.cc (convert_for_rvalue): Only apply the @safe boolean
conversion to boolean fields of a union.
(convert_for_condition): Call convert_for_rvalue in the default case.

gcc/testsuite/ChangeLog:

* gdc.dg/pr110359.d: New test.

(cherry picked from commit ab98db1e8c1b997414539f41b7fb814019497d8d)

GCC Administrator [Mon, 26 Jun 2023 00:19:02 +0000 (00:19 +0000)] 
Daily bump.

GCC Administrator [Sun, 25 Jun 2023 00:18:30 +0000 (00:18 +0000)] 
Daily bump.

GCC Administrator [Sat, 24 Jun 2023 00:18:57 +0000 (00:18 +0000)] 
Daily bump.

Jonathan Wakely [Mon, 15 May 2023 20:41:56 +0000 (21:41 +0100)] 
libstdc++: Document removal of implicit allocator rebinding extensions

Traditionally libstdc++ allowed containers to be
instantiated with allocators that have the wrong value type, implicitly
rebinding the allocator to the container's value type. Since C++20 that
has been explicitly ill-formed, so the extension is no longer supported
in strict modes (e.g. -std=c++17) and in C++20 and later.

libstdc++-v3/ChangeLog:

* doc/xml/manual/evolution.xml: Document removal of implicit
allocator rebinding extensions in strict mode and for C++20.
* doc/html/*: Regenerate.

(cherry picked from commit 8cbaf679a3c1875c5475bd1cb0fb86fb9d03b2d4)

Jonathan Wakely [Fri, 18 Mar 2022 13:10:01 +0000 (13:10 +0000)] 
libstdc++: Simplify constraints for std::any construction [PR104242]

Partially revert r12-4190-g6da36b7d0e43b6f9281c65c19a025d4888a25b2d
because using __and_<..., is_copy_constructible<T>> when T is incomplete
results in an error about deriving from is_copy_constructible<T> when
that is incomplete. I don't know how to fix that, so this simply
restores the previous constraint which worked in this case (even though
I think it's technically undefined to use is_copy_constructible<T> with
incomplete T). This doesn't restore exactly what we had before, but uses
the is_copy_constructible_v and __is_in_place_type_v variable templates
instead of the ::value member.

libstdc++-v3/ChangeLog:

PR libstdc++/104242
* include/std/any (any(T&&)): Revert change to constraints.
* testsuite/20_util/any/cons/104242.cc: New test.

(cherry picked from commit 7a42b1fa1a090ead96cc0f94a8060a9650c810d5)

GCC Administrator [Fri, 23 Jun 2023 00:19:13 +0000 (00:19 +0000)] 
Daily bump.

GCC Administrator [Thu, 22 Jun 2023 00:18:49 +0000 (00:18 +0000)] 
Daily bump.

GCC Administrator [Wed, 21 Jun 2023 00:19:11 +0000 (00:19 +0000)] 
Daily bump.

Kewen Lin [Tue, 20 Jun 2023 06:40:52 +0000 (01:40 -0500)] 
rs6000: Guard __builtin_{un,}pack_vector_int128 with vsx [PR109932]

As PR109932 shows, builtins __builtin_{un,}pack_vector_int128
should be guarded under vsx rather than power7, as their
corresponding bif patterns have the conditions TARGET_VSX
and VECTOR_MEM_ALTIVEC_OR_VSX_P (V1TImode).  This patch ensures that
__builtin_{un,}pack_vector_int128 are only available under
vsx.

PR target/109932

gcc/ChangeLog:

* config/rs6000/rs6000-builtin.def (BU_VSX_MISC_2): New macro.
({un,}pack_vector_int128): Use BU_VSX_MISC_2 instead of
BU_P7_MISC_2.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr109932-1.c: New test.
* gcc.target/powerpc/pr109932-2.c: New test.

Kewen Lin [Mon, 12 Jun 2023 06:07:52 +0000 (01:07 -0500)] 
rs6000: Don't use TFmode for 128 bits fp constant in toc [PR110011]

As PR110011 shows, when encoding a 128-bit fp constant into the
toc, we use REAL_VALUE_TO_TARGET_LONG_DOUBLE, which finds the
first float mode with LONG_DOUBLE_TYPE_SIZE bits of precision;
that would be TFmode here.  But the 128-bit fp constant can
have mode IFmode or KFmode, which doesn't necessarily have the
same underlying float format as TFmode.  As this PR exposes,
with option -mabi=ibmlongdouble TFmode has ibm_extended_format
while KFmode has ieee_quad_format, and mixing up the formats
(the encoding/decoding ways) causes unexpected results.

This patch makes it use the constant's own mode instead
of TFmode for the real_to_target call.

PR target/110011

gcc/ChangeLog:

* config/rs6000/rs6000.c (output_toc): Use the mode of the 128-bit
floating constant itself for real_to_target call.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr110011.c: New test.

(cherry picked from commit 388809f2afde874180da0669c669e241037eeba0)

GCC Administrator [Tue, 20 Jun 2023 00:19:45 +0000 (00:19 +0000)] 
Daily bump.

GCC Administrator [Mon, 19 Jun 2023 00:19:11 +0000 (00:19 +0000)] 
Daily bump.

GCC Administrator [Sun, 18 Jun 2023 00:19:08 +0000 (00:19 +0000)] 
Daily bump.

GCC Administrator [Sat, 17 Jun 2023 00:19:27 +0000 (00:19 +0000)] 
Daily bump.

GCC Administrator [Fri, 16 Jun 2023 00:19:27 +0000 (00:19 +0000)] 
Daily bump.

GCC Administrator [Thu, 15 Jun 2023 00:19:09 +0000 (00:19 +0000)] 
Daily bump.

GCC Administrator [Wed, 14 Jun 2023 00:19:29 +0000 (00:19 +0000)] 
Daily bump.

GCC Administrator [Tue, 13 Jun 2023 00:19:58 +0000 (00:19 +0000)] 
Daily bump.

Richard Biener [Mon, 12 Jun 2023 08:17:26 +0000 (10:17 +0200)] 
middle-end/110200 - genmatch force-leaf and convert interaction

The following fixes GENERIC code generation for (convert! ...)
which currently generates

  if (TREE_TYPE (_o1[0]) != type)
    _r1 = fold_build1_loc (loc, NOP_EXPR, type, _o1[0]);
    if (EXPR_P (_r1))
      goto next_after_fail867;
  else
    _r1 = _o1[0];

where obviously braces are missing.
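
How the unbraced form is actually parsed can be shown with a reduced model. This is a sketch with hypothetical names (`enum result`, `without_braces`, `with_braces`), not genmatch output; the first function is indented the way the compiler reads the code, not the way it was emitted.

```c
#include <stdbool.h>

enum result { R_UNSET, R_CONVERTED, R_ORIGINAL, R_FAIL };

/* Shape of the generated code before the fix.  Without braces the
   first if controls only one statement, and the else binds to the
   second if.  */
enum result without_braces(bool type_differs, bool folded_is_expr)
{
  enum result r = R_UNSET;
  if (type_differs)
    r = R_CONVERTED;
  if (folded_is_expr)      /* runs even when no conversion happened */
    return R_FAIL;
  else
    r = R_ORIGINAL;        /* overwrites a successful conversion */
  return r;
}

/* Shape after the fix: the check belongs to the conversion arm.  */
enum result with_braces(bool type_differs, bool folded_is_expr)
{
  enum result r = R_UNSET;
  if (type_differs) {
    r = R_CONVERTED;
    if (folded_is_expr)
      return R_FAIL;
  } else {
    r = R_ORIGINAL;
  }
  return r;
}
```

In the model, a conversion that folds successfully is discarded by the unbraced version, and the failure check is evaluated even when the types already match.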

PR middle-end/110200
* genmatch.c (expr::gen_transform): Put braces around
the if arm for the (convert ...) short-cut.

(cherry picked from commit 820d1aec89c43dbbc70d3d0b888201878388454c)

GCC Administrator [Mon, 12 Jun 2023 00:19:22 +0000 (00:19 +0000)] 
Daily bump.

GCC Administrator [Sun, 11 Jun 2023 00:19:33 +0000 (00:19 +0000)] 
Daily bump.

GCC Administrator [Sat, 10 Jun 2023 00:19:43 +0000 (00:19 +0000)] 
Daily bump.

Alex Coplan [Thu, 25 May 2023 12:34:46 +0000 (13:34 +0100)] 
arm: Fix ICE due to infinite splitting [PR109800]

In r11-966-g9a182ef9ee011935d827ab5c6c9a7cd8e22257d8 we introduced a
simplification to emit_move_insn that attempts to simplify moves of the form:

(set (subreg:M1 (reg:M2 ...)) (constant C))

where M1 and M2 are of equal mode size. That is problematic for the splitter
vfp.md:no_literal_pool_df_immediate in the arm backend, which tries to pun an
lvalue DFmode pseudo into DImode and assign a constant to it with
emit_move_insn, as the new transformation simply undoes this, and we end up
splitting indefinitely.

This patch changes things around in the arm backend so that we use a
DImode temporary (instead of DFmode) and first load the DImode constant
into the pseudo, and then pun the pseudo into DFmode as an rvalue in a
reg -> reg move. I believe this should be semantically equivalent but
avoids the pathological behaviour seen in the PR.

gcc/ChangeLog:

PR target/109800
* config/arm/arm.md (movdf): Generate temporary pseudo in DImode
instead of DFmode.
* config/arm/vfp.md (no_literal_pool_df_immediate): Rather than punning an
lvalue DFmode pseudo into DImode, use a DImode pseudo and pun it into
DFmode as an rvalue.

gcc/testsuite/ChangeLog:

PR target/109800
* gcc.target/arm/pure-code/pr109800.c: New test.

(cherry picked from commit f5298d9969b4fa34ff3aecd54b9630e22b2984a5)

Iain Sandoe [Thu, 1 Jun 2023 12:43:35 +0000 (13:43 +0100)] 
Darwin, PPC: Fix struct layout with pragma pack [PR110044].

This bug was essentially that darwin_rs6000_special_round_type_align()
was ignoring externally-imposed capping of field alignment.
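
What "externally-imposed capping of field alignment" means in practice can be seen with a generic `#pragma pack` example. This is a sketch assuming a compiler that supports `#pragma pack(push, n)` (GCC, Clang and MSVC do); it does not reproduce the Darwin PowerPC rounding rule itself, and the struct and function names are hypothetical.

```c
#include <stddef.h>

/* Natural layout: the double sits at its own alignment.  */
struct natural { char c; double d; };

/* Field alignment capped at 2 bytes by the pragma.  */
#pragma pack(push, 2)
struct capped { char c; double d; };
#pragma pack(pop)

size_t natural_offset(void) { return offsetof(struct natural, d); }
size_t capped_offset(void)  { return offsetof(struct capped, d); }
size_t capped_size(void)    { return sizeof(struct capped); }
```

Under the cap the double starts at offset 2 and the struct needs no tail padding, so any later rounding of the struct based on the natural alignment of a member would contradict what the pragma asked for.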

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR target/110044

gcc/ChangeLog:

* config/rs6000/rs6000.c (darwin_rs6000_special_round_type_align):
Make sure that we do not have a cap on field alignment before altering
the struct layout based on the type alignment of the first entry.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/darwin-abi-13-0.c: New test.
* gcc.target/powerpc/darwin-abi-13-1.c: New test.
* gcc.target/powerpc/darwin-abi-13-2.c: New test.
* gcc.target/powerpc/darwin-structs-0.h: New test.

(cherry picked from commit 84d080a29a780973bef47171ba708ae2f7b4ee47)

Jakub Jelinek [Fri, 9 Jun 2023 07:10:29 +0000 (09:10 +0200)] 
fortran: Fix ICE on pr96024.f90 on big-endian hosts [PR96024]

The pr96024.f90 testcase ICEs on big-endian hosts.  The problem is
that length->val.integer is accessed after checking
length->expr_type == EXPR_CONSTANT, but it is a CHARACTER constant,
which uses the length->val.character union member instead.  On
big-endian hosts we end up reading the constant 0x100000000 rather
than the small number read on little-endian ones, and if the target
doesn't have enough memory for 4 times that (i.e. a 16GB allocation),
it ICEs.
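
Why the wrong union member yields a small number on one byte order and 0x100000000 on the other can be reproduced with a reduced analogy. This is a sketch only: `union value` is a hypothetical layout and not gfortran's gfc_expr, and the function names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative union: 'chars' is the member that was stored, 'wide'
   stands in for the member that was read by mistake.  */
union value {
  uint64_t wide;
  struct { uint32_t length; uint32_t pad; } chars;
};

/* Write through one member, read through the other.  */
uint64_t misread_as_wide(uint32_t length)
{
  union value v;
  v.wide = 0;
  v.chars.length = length;
  v.chars.pad = 0;
  return v.wide;
}

/* True when the lowest-addressed byte holds the least significant bits.  */
bool host_is_little_endian(void)
{
  const uint16_t probe = 1;
  return *(const unsigned char *)&probe == 1;
}
```

On a little-endian host `misread_as_wide(1)` returns 1, while on a big-endian host the same bytes read back as 0x100000000, so the mistake stays hidden on one byte order and becomes a huge length on the other.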

2023-06-09  Jakub Jelinek  <jakub@redhat.com>

PR fortran/96024
* primary.c (gfc_convert_to_structure_constructor): Only do
constant string ctor length verification and truncation/padding
if constant length has INTEGER type.

(cherry picked from commit 4cf6e322adc19f927859e0a5edfa93cec4b8c844)

GCC Administrator [Fri, 9 Jun 2023 00:18:58 +0000 (00:18 +0000)] 
Daily bump.

GCC Administrator [Thu, 8 Jun 2023 00:19:38 +0000 (00:19 +0000)] 
Daily bump.

GCC Administrator [Wed, 7 Jun 2023 00:20:10 +0000 (00:20 +0000)] 
Daily bump.

GCC Administrator [Tue, 6 Jun 2023 00:19:56 +0000 (00:19 +0000)] 
Daily bump.

GCC Administrator [Mon, 5 Jun 2023 00:18:58 +0000 (00:18 +0000)] 
Daily bump.

Steve Kargl [Fri, 2 Jun 2023 17:44:11 +0000 (19:44 +0200)] 
Fortran: fix diagnostics for SELECT RANK [PR100607]

gcc/fortran/ChangeLog:

PR fortran/100607
* resolve.c (resolve_select_rank): Remove duplicate error.
(resolve_fl_var_and_proc): Prevent NULL pointer dereference and
suppress error message for temporary.

gcc/testsuite/ChangeLog:

PR fortran/100607
* gfortran.dg/select_rank_6.f90: New test.

(cherry picked from commit fae09dfc0e6bf4cfe35d817558827aea78c6426f)

GCC Administrator [Sun, 4 Jun 2023 00:18:53 +0000 (00:18 +0000)] 
Daily bump.

GCC Administrator [Sat, 3 Jun 2023 00:19:18 +0000 (00:19 +0000)] 
Daily bump.

Jakub Jelinek [Sun, 21 May 2023 11:36:56 +0000 (13:36 +0200)] 
match.pd: Ensure (op CONSTANT_CLASS_P CONSTANT_CLASS_P) is simplified [PR109505]

On the following testcase we hang, because POLY_INT_CST is CONSTANT_CLASS_P,
but BIT_AND_EXPR with it and INTEGER_CST doesn't simplify and the
(x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2)
simplification actually relies on the (CST1 & CST2) simplification,
otherwise it is a deoptimization, trading 2 ops for 3 and furthermore
running into
/* Given a bit-wise operation CODE applied to ARG0 and ARG1, see if both
   operands are another bit-wise operation with a common input.  If so,
   distribute the bit operations to save an operation and possibly two if
   constants are involved.  For example, convert
     (A | B) & (A | C) into A | (B & C)
   Further simplification will occur if B and C are constants.  */
simplification which simplifies that
(x & CST2) | (CST1 & CST2) back to
CST2 & (x | CST1).
I went through all other places I could find where we have a simplification
with 2 CONSTANT_CLASS_P operands and perform some operation on those two,
while the other spots aren't that severe (just trade 2 operations for
another 2 if the two constants don't simplify, rather than as in the above
case trading 2 ops for 3), I still think all those spots really intend
to optimize only if the 2 constants simplify.

So, the following patch adds a ! modifier to those to ensure that;
even at GENERIC that modifier means !EXPR_P, which is exactly what
we want IMHO.
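
The identity behind the first simplification, and why it only pays off when the constant part folds, can be checked directly. This is a small sketch over 8-bit values with hypothetical function names; it says nothing about POLY_INT_CST itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Left-hand side: two operations.  */
uint8_t before(uint8_t x, uint8_t cst1, uint8_t cst2)
{
  return (uint8_t)((x | cst1) & cst2);
}

/* Right-hand side: three operations, unless (cst1 & cst2) folds to a
   single constant, in which case two remain.  */
uint8_t after(uint8_t x, uint8_t cst1, uint8_t cst2)
{
  return (uint8_t)((x & cst2) | (cst1 & cst2));
}

/* Exhaustively compare both forms over all 8-bit inputs.  */
bool identity_holds_for_all_bytes(void)
{
  for (unsigned x = 0; x < 256; x++)
    for (unsigned c1 = 0; c1 < 256; c1++)
      for (unsigned c2 = 0; c2 < 256; c2++)
        if (before((uint8_t)x, (uint8_t)c1, (uint8_t)c2)
            != after((uint8_t)x, (uint8_t)c1, (uint8_t)c2))
          return false;
  return true;
}
```

Both forms always agree in value, so the rewrite is purely a question of operation count: if `cst1 & cst2` cannot be folded, the result is larger than the input and the reverse simplification can undo it again.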

2023-05-21  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/109505
* match.pd ((x | CST1) & CST2 -> (x & CST2) | (CST1 & CST2),
Combine successive equal operations with constants,
(A +- CST1) +- CST2 -> A + CST3, (CST1 - A) +- CST2 -> CST3 - A,
CST1 - (CST2 - A) -> CST3 + A): Use ! on ops with 2 CONSTANT_CLASS_P
operands.

* gcc.target/aarch64/sve/pr109505.c: New test.

(cherry picked from commit f211757f6fa9515e3fd1a4f66f1a8b48e500c9de)

Richard Biener [Wed, 23 Feb 2022 12:47:01 +0000 (13:47 +0100)] 
middle-end/109505 - backport match.pd ! support for GENERIC

The patch adds support for the ! modifier to GENERIC, backported
from r12-7361-gfdc46830f1b793.

2023-06-02  Richard Biener  <rguenther@suse.de>

PR tree-optimization/109505
* doc/match-and-simplify.texi: Amend ! documentation.
* genmatch.c (expr::gen_transform): Code-generate ! support
for GENERIC.
(parser::parse_expr): Allow ! for GENERIC.

GCC Administrator [Fri, 2 Jun 2023 00:19:46 +0000 (00:19 +0000)] 
Daily bump.

Jonathan Wakely [Thu, 1 Jun 2023 10:30:10 +0000 (11:30 +0100)] 
doc: Fix description of x86 -m32 option [PR109954]

This option does not imply -march=i386 so it's incorrect to say it
generates code that will run on "any i386 system".

gcc/ChangeLog:

PR target/109954
* doc/invoke.texi (x86 Options): Fix description of -m32 option.

(cherry picked from commit eeb92704967875411416b0b9508aa6f49e8192fd)

GCC Administrator [Thu, 1 Jun 2023 00:19:07 +0000 (00:19 +0000)] 
Daily bump.

GCC Administrator [Wed, 31 May 2023 00:19:12 +0000 (00:19 +0000)] 
Daily bump.

Matthias Kretz [Fri, 26 May 2023 10:23:44 +0000 (12:23 +0200)] 
libstdc++: Correct NTTP and simd_mask ctor call

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/109822
* include/experimental/bits/simd.h (to_native): Use int NTTP
as specified in PTS2.
(to_compatible): Likewise. Add missing tag to call mask
generator ctor.
* testsuite/experimental/simd/pr109822_cast_functions.cc: New
test.

(cherry picked from commit 668d43502f465d48adbc1fe2956b979f36657e5f)

Matthias Kretz [Thu, 25 May 2023 10:53:06 +0000 (12:53 +0200)] 
libstdc++: Simplify calculation of expected value in simd test

This avoids a failure on PR109964.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* testsuite/experimental/simd/tests/integer_operators.cc:
Compute expected value differently to avoid getting turned into
a vector shift.

(cherry picked from commit 3e2689e568425f14d6728504ad6f5d32b90320ad)

Matthias Kretz [Thu, 25 May 2023 10:07:45 +0000 (12:07 +0200)] 
libstdc++: Fix test assumptions on long and long double

Expect that long might not fit into the long double mantissa bits.
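
Whether long fits can be decided from the standard limits macros. This is a sketch in C rather than the C++ of the test itself, with hypothetical function names, and it assumes FLT_RADIX is 2 and that long has no padding bits.

```c
#include <float.h>
#include <limits.h>
#include <stdbool.h>

/* Number of value bits in long (the sign bit is excluded).  */
int long_value_bits(void)
{
  return (int)(sizeof(long) * CHAR_BIT) - 1;
}

/* True when the long double mantissa has at least as many bits as
   long has value bits, so every long converts exactly.  */
bool long_fits_in_long_double(void)
{
  return LDBL_MANT_DIG >= long_value_bits();
}

/* Empirical cross-check: the two largest long values stay distinct
   after conversion only if nothing is rounded away.  */
bool largest_longs_stay_distinct(void)
{
  volatile long hi = LONG_MAX;
  volatile long lo = LONG_MAX - 1;
  return (long double)hi != (long double)lo;
}
```

On targets where long double has the same format as double and long is 64 bits wide, the first check is false and both conversions round to the same value, which is the situation the conditional tests guard against.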

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* testsuite/experimental/simd/tests/operator_cvt.cc: Make long
double <-> (u)long conversion tests conditional on sizeof(long
double) and sizeof(long).

(cherry picked from commit 291549d43e823f163fa9961e42a751b5ce0d57fb)

Matthias Kretz [Thu, 25 May 2023 08:45:21 +0000 (10:45 +0200)] 
libstdc++: Resolve -Wsign-compare issue

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_ppc.h (_S_bit_shift_left):
Negative __y is UB, so prefer signed compare.

(cherry picked from commit 1a1abec1d618cde709c585fcce89330bb33b07ac)

GCC Administrator [Tue, 30 May 2023 00:18:41 +0000 (00:18 +0000)] 
Daily bump.

GCC Administrator [Mon, 29 May 2023 11:16:48 +0000 (11:16 +0000)] 
Daily bump.

Jakub Jelinek [Mon, 29 May 2023 09:47:49 +0000 (11:47 +0200)] 
Bump BASE-VER

2023-05-29  Jakub Jelinek  <jakub@redhat.com>

* BASE-VER: Set to 11.4.1.

2 years agoUpdate ChangeLog and version files for release releases/gcc-11.4.0
Jakub Jelinek [Mon, 29 May 2023 08:46:51 +0000 (08:46 +0000)] 
Update ChangeLog and version files for release

2 years agoDaily bump.
GCC Administrator [Sun, 28 May 2023 00:18:31 +0000 (00:18 +0000)] 
Daily bump.

2 years agoDaily bump.
GCC Administrator [Sat, 27 May 2023 00:18:22 +0000 (00:18 +0000)] 
Daily bump.

2 years agoDaily bump.
GCC Administrator [Fri, 26 May 2023 00:19:03 +0000 (00:19 +0000)] 
Daily bump.

2 years agolibstdc++: Add missing constexpr to simd
Matthias Kretz [Thu, 23 Mar 2023 08:32:58 +0000 (09:32 +0100)] 
libstdc++: Add missing constexpr to simd

The constexpr API is only available with -std=gnu++XX (and proposed for
C++26). The proposal is to have the complete simd API usable in constant
expressions.

This patch resolves several issues with using simd in constant
expressions.

Issues why constant_evaluated branches are necessary:
* subscripting vector builtins is not allowed in constant expressions
* if the implementation needs/uses memcpy
* if the implementation would otherwise call SIMD intrinsics/builtins

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/109261
* include/experimental/bits/simd.h (_SimdWrapper::_M_set):
Avoid vector builtin subscripting in constant expressions.
(resizing_simd_cast): Avoid memcpy if constant_evaluated.
(const_where_expression, where_expression, where)
(__extract_part, simd_mask, _SimdIntOperators, simd): Add either
_GLIBCXX_SIMD_CONSTEXPR (on public APIs), or constexpr (on
internal APIs).
* include/experimental/bits/simd_builtin.h (__vector_permute)
(__vector_shuffle, __extract_part, _GnuTraits::_SimdCastType1)
(_GnuTraits::_SimdCastType2, _SimdImplBuiltin)
(_MaskImplBuiltin::_S_store): Add constexpr.
(_CommonImplBuiltin::_S_store_bool_array)
(_SimdImplBuiltin::_S_load, _SimdImplBuiltin::_S_store)
(_SimdImplBuiltin::_S_reduce, _MaskImplBuiltin::_S_load): Add
constant_evaluated case.
* include/experimental/bits/simd_fixed_size.h
(_S_masked_load): Reword comment.
(__tuple_element_meta, __make_meta, _SimdTuple::_M_apply_r)
(_SimdTuple::_M_subscript_read, _SimdTuple::_M_subscript_write)
(__make_simd_tuple, __optimize_simd_tuple, __extract_part)
(__autocvt_to_simd, _Fixed::__traits::_SimdBase)
(_Fixed::__traits::_SimdCastType, _SimdImplFixedSize): Add
constexpr.
(_SimdTuple::operator[], _M_set): Add constexpr and add
constant_evaluated case.
(_MaskImplFixedSize::_S_load): Add constant_evaluated case.
* include/experimental/bits/simd_scalar.h: Add constexpr.
* include/experimental/bits/simd_x86.h (_CommonImplX86): Add
constexpr and add constant_evaluated case.
(_SimdImplX86::_S_equal_to, _S_not_equal_to, _S_less)
(_S_less_equal): Value-initialize to satisfy constexpr
evaluation.
(_MaskImplX86::_S_load): Add constant_evaluated case.
(_MaskImplX86::_S_store): Add constexpr and constant_evaluated
case. Value-initialize local variables.
(_MaskImplX86::_S_logical_and, _S_logical_or, _S_bit_not)
(_S_bit_and, _S_bit_or, _S_bit_xor): Add constant_evaluated
case.
* testsuite/experimental/simd/pr109261_constexpr_simd.cc: New
test.

(cherry picked from commit da579188807ede4ee9466d0b5bf51559c96a0b51)
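
A minimal sketch of the kind of constant_evaluated branch listed above,
assuming C++14 or later and a compiler that provides
__builtin_is_constant_evaluated (GCC and Clang do).  Vec4, load4 and
kInput are hypothetical names; this is not the libstdc++ implementation,
only the general pattern of avoiding memcpy during constant evaluation.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical four-lane value type standing in for a simd object.
struct Vec4 { int data[4]; };

// Hypothetical load: element-wise copy during constant evaluation,
// memcpy at run time.
constexpr Vec4 load4(const int* src)
{
  Vec4 r{};
  if (__builtin_is_constant_evaluated())
    {
      // memcpy is not usable in a constant expression, so copy by hand.
      for (std::size_t i = 0; i < 4; ++i)
        r.data[i] = src[i];
    }
  else
    std::memcpy(r.data, src, sizeof(r.data));
  return r;
}

// Sample input with static storage, usable in constant expressions.
constexpr int kInput[4] = {1, 2, 3, 4};
```

With this shape, "constexpr Vec4 v = load4(kInput);" is accepted, while
ordinary run-time calls still take the memcpy path.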

2 years agolibstdc++: Fix type of first argument to vec_cntm call
Matthias Kretz [Wed, 24 May 2023 14:43:07 +0000 (16:43 +0200)] 
libstdc++: Fix type of first argument to vec_cntm call

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/109949
* include/experimental/bits/simd.h (__intrinsic_type): If
__ALTIVEC__ is defined, map gnu::vector_size types to their
corresponding __vector T types without losing unsignedness of
integer types. Also prefer long long over long.
* include/experimental/bits/simd_ppc.h (_S_popcount): Cast mask
object to the expected unsigned vector type.

(cherry picked from commit efd2b55d8562c6e80cb7ee8b9b1f9418f0c00cd9)

2 years agolibstdc++: Fix SFINAE for __is_intrinsic_type on ARM
Matthias Kretz [Wed, 24 May 2023 10:50:46 +0000 (12:50 +0200)] 
libstdc++: Fix SFINAE for __is_intrinsic_type on ARM

On ARM NEON doesn't support double, so __is_intrinsic_type_v<double,
whatever> should say false (instead of being ill-formed).

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/109261
* include/experimental/bits/simd.h (__intrinsic_type):
Specialize __intrinsic_type<double, 8> and
__intrinsic_type<double, 16> in any case, but provide the member
type only with __aarch64__.

(cherry picked from commit aa8b363171a95b8f867a74f29c75f9577e9087e1)
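
The SFINAE pattern described above can be sketched as follows, assuming
C++17.  native_vec, has_native_vec and SKETCH_HAS_DOUBLE_VECTORS are
hypothetical stand-ins for __intrinsic_type, __is_intrinsic_type and
__aarch64__; the static_assert in the primary template is an assumption
made to show how a query can become ill-formed, and is not taken from
libstdc++.

```cpp
#include <type_traits>

// Hypothetical per-target switch; left undefined here, so double is
// treated as unsupported.
// #define SKETCH_HAS_DOUBLE_VECTORS 1

// Primary template: instantiating it is a hard error (assumption).
template <typename T>
struct native_vec
{
  static_assert(sizeof(T) == 0, "no native vector type for T");
};

// float is always supported in this sketch (placeholder member type).
template <>
struct native_vec<float> { using type = float; };

// double is specialized in any case, but the member type is provided
// only when the target supports it.
template <>
struct native_vec<double>
{
#ifdef SKETCH_HAS_DOUBLE_VECTORS
  using type = double;
#endif
};

// Detection trait: false when native_vec<T>::type does not exist.
template <typename T, typename = void>
struct has_native_vec : std::false_type {};

template <typename T>
struct has_native_vec<T, std::void_t<typename native_vec<T>::type>>
  : std::true_type {};
```

Because the double specialization always exists, asking
has_native_vec<double> never instantiates the primary template; the
missing member type is an ordinary substitution failure and the trait
yields false.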

2 years agolibstdc++: Add missing constexpr to simd_neon
Matthias Kretz [Tue, 23 May 2023 21:48:49 +0000 (23:48 +0200)] 
libstdc++: Add missing constexpr to simd_neon

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/109261
* include/experimental/bits/simd_neon.h (_S_reduce): Add
constexpr and make NEON implementation conditional on
not __builtin_is_constant_evaluated.

(cherry picked from commit b0a483b0a011f9cbc8b25053eae809c77dae2a12)

2 years agolibstdc++: Resolve -Wunused-variable warnings in stdx::simd and tests
Matthias Kretz [Mon, 22 May 2023 14:58:30 +0000 (16:58 +0200)] 
libstdc++: Resolve -Wunused-variable warnings in stdx::simd and tests

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_builtin.h (_S_fpclassify): Move
__infn into #ifdef'ed block.
* testsuite/experimental/simd/tests/fpclassify.cc: Declare
constants only when used.
* testsuite/experimental/simd/tests/frexp.cc: Likewise.
* testsuite/experimental/simd/tests/logarithm.cc: Likewise.
* testsuite/experimental/simd/tests/trunc_ceil_floor.cc:
Likewise.
* testsuite/experimental/simd/tests/ldexp_scalbn_scalbln_modf.cc:
Move totest and expect1 into #ifdef'ed block.

(cherry picked from commit a7129e82bed1bd4f513fc3c3f401721e2c96a865)

2 years agolibstdc++: Add missing trait is_simd_flag_type
Matthias Kretz [Wed, 22 Mar 2023 07:12:08 +0000 (08:12 +0100)] 
libstdc++: Add missing trait is_simd_flag_type

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd.h (is_simd_flag_type): New.
(_IsSimdFlagType): New.
(copy_from, copy_to, load ctors): Constrain _Flags using
_IsSimdFlagType.

(cherry picked from commit 97383b4116ea63486eb5bfb0a7140871bed75fb4)

2 years agolibstdc++: Fix operator% implementation for Clang
Matthias Kretz [Wed, 22 Mar 2023 07:12:08 +0000 (08:12 +0100)] 
libstdc++: Fix operator% implementation for Clang

This resolves a regression of my previous fix where Clang would ICE on
_S_divides.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_x86.h (_SimdImplX86): Use
_Base::_S_divides if the optimized _S_divides function is hidden
via the preprocessor.

(cherry picked from commit 1a62008123694b2ac07f28e25fc6e5ff371925f5)

2 years agolibstdc++: Fix simd compilation with Clang
Matthias Kretz [Thu, 23 Feb 2023 13:55:08 +0000 (14:55 +0100)] 
libstdc++: Fix simd compilation with Clang

Clang fails to compile some constant expressions involving simd.
Therefore, just disable this non-conforming extension for clang.

Fix AVX512 blend implementation for Clang. It was converting the bitmask
to bool before, which is obviously wrong. Instead use a Clang builtin to
convert the bitmask to vector-mask before using a vector blend ?:. A
similar change is required for the masked unary implementation, because
the GCC builtins do not exist on Clang.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd_detail.h: Don't declare the
simd API as constexpr with Clang.
* include/experimental/bits/simd_x86.h (__movm): New.
(_S_blend_avx512): Resolve FIXME. Implement blend using __movm
and ?:.
(_SimdImplX86::_S_masked_unary): Clang does not implement the
same builtins. Implement the function using __movm, ?:, and -
operators on vector_size types instead.

(cherry picked from commit 8ff3ca2d94721fab78f167d435d4ea4fa4fdca6a)
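
A scalar sketch of why the bitmask has to be expanded per lane before
the blend.  It uses plain arrays instead of vector types and does not
use the Clang builtin; expand_mask and blend_sketch are hypothetical
names, and taking b where the bit is set is an assumed operand order.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Expand bit i of k into lane i: all ones (-1) if set, 0 otherwise.
template <std::size_t N>
std::array<std::int32_t, N> expand_mask(std::uint32_t k)
{
  static_assert(N <= 32, "one mask bit per lane");
  std::array<std::int32_t, N> m{};
  for (std::size_t i = 0; i < N; ++i)
    m[i] = ((k >> i) & 1u) ? -1 : 0;
  return m;
}

// Lane-wise blend: take b[i] where the mask lane is set, else a[i].
template <std::size_t N>
std::array<std::int32_t, N>
blend_sketch(std::uint32_t k, const std::array<std::int32_t, N>& a,
             const std::array<std::int32_t, N>& b)
{
  const std::array<std::int32_t, N> m = expand_mask<N>(k);
  std::array<std::int32_t, N> r{};
  for (std::size_t i = 0; i < N; ++i)
    r[i] = m[i] ? b[i] : a[i];
  return r;
}
```

Writing "k ? b : a" with the raw bitmask would instead convert k to a
single bool and select all of b for any non-zero mask, which is the
error the commit message describes.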

2 years agolibstdc++: Fix formatting
Matthias Kretz [Tue, 21 Feb 2023 07:48:18 +0000 (08:48 +0100)] 
libstdc++: Fix formatting

Whitespace changes only.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* include/experimental/bits/simd.h: Line breaks and indenting
fixed to follow the libstdc++ standard.
* include/experimental/bits/simd_builtin.h: Likewise.
* include/experimental/bits/simd_fixed_size.h: Likewise.
* include/experimental/bits/simd_neon.h: Likewise.
* include/experimental/bits/simd_ppc.h: Likewise.
* include/experimental/bits/simd_scalar.h: Likewise.
* include/experimental/bits/simd_x86.h: Likewise.

(cherry picked from commit b31186e589caee43ac5720a538d9a41ebf514e81)

2 years agolibstdc++: Always-inline most of non-cmath fixed_size implementation
Matthias Kretz [Mon, 20 Feb 2023 16:49:37 +0000 (17:49 +0100)] 
libstdc++: Always-inline most of non-cmath fixed_size implementation

For simd, the inlining behavior should be similar to builtin types. (No
operator on builtin types is ever translated into a function call.)
Therefore, always_inline is the right choice (i.e. inline on -O0 as
well).

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/108030
* include/experimental/bits/simd_fixed_size.h
(_SimdImplFixedSize::_S_broadcast): Replace inline with
_GLIBCXX_SIMD_INTRINSIC.
(_SimdImplFixedSize::_S_generate): Likewise.
(_SimdImplFixedSize::_S_load): Likewise.
(_SimdImplFixedSize::_S_masked_load): Likewise.
(_SimdImplFixedSize::_S_store): Likewise.
(_SimdImplFixedSize::_S_masked_store): Likewise.
(_SimdImplFixedSize::_S_min): Likewise.
(_SimdImplFixedSize::_S_max): Likewise.
(_SimdImplFixedSize::_S_complement): Likewise.
(_SimdImplFixedSize::_S_unary_minus): Likewise.
(_SimdImplFixedSize::_S_plus): Likewise.
(_SimdImplFixedSize::_S_minus): Likewise.
(_SimdImplFixedSize::_S_multiplies): Likewise.
(_SimdImplFixedSize::_S_divides): Likewise.
(_SimdImplFixedSize::_S_modulus): Likewise.
(_SimdImplFixedSize::_S_bit_and): Likewise.
(_SimdImplFixedSize::_S_bit_or): Likewise.
(_SimdImplFixedSize::_S_bit_xor): Likewise.
(_SimdImplFixedSize::_S_bit_shift_left): Likewise.
(_SimdImplFixedSize::_S_bit_shift_right): Likewise.
(_SimdImplFixedSize::_S_remquo): Add inline keyword (to be
explicit about not always-inline, yet).
(_SimdImplFixedSize::_S_isinf): Likewise.
(_SimdImplFixedSize::_S_isfinite): Likewise.
(_SimdImplFixedSize::_S_isnan): Likewise.
(_SimdImplFixedSize::_S_isnormal): Likewise.
(_SimdImplFixedSize::_S_signbit): Likewise.

(cherry picked from commit e37b04328ae68f91efe1fb2c5de9122be34bc74a)

2 years agolibstdc++: More efficient masked inc-/decrement implementation
Matthias Kretz [Mon, 20 Feb 2023 15:33:31 +0000 (16:33 +0100)] 
libstdc++: More efficient masked inc-/decrement implementation

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/108856
* include/experimental/bits/simd_builtin.h
(_SimdImplBuiltin::_S_masked_unary): More efficient
implementation of masked inc-/decrement for integers and floats
without AVX2.
* include/experimental/bits/simd_x86.h
(_SimdImplX86::_S_masked_unary): New. Use AVX512 masked subtract
builtins for masked inc-/decrement.

(cherry picked from commit 6ce55180d494b616e2e3e68ffedfe9007e42ca06)
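
One way a masked increment can be expressed without a blend, shown as a
scalar sketch for integer lanes.  It assumes the mask is stored as a
vector whose lanes are -1 (selected) or 0 (not selected); whether the
patch does exactly this is an assumption, and masked_increment and
masked_decrement are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Subtracting a lane mask of 0 / -1 adds 1 only to the selected lanes.
template <std::size_t N>
std::array<std::int32_t, N>
masked_increment(const std::array<std::int32_t, N>& v,
                 const std::array<std::int32_t, N>& mask)
{
  std::array<std::int32_t, N> r{};
  for (std::size_t i = 0; i < N; ++i)
    r[i] = v[i] - mask[i];   // v - (-1) == v + 1, v - 0 == v
  return r;
}

// Adding the same mask subtracts 1 from the selected lanes.
template <std::size_t N>
std::array<std::int32_t, N>
masked_decrement(const std::array<std::int32_t, N>& v,
                 const std::array<std::int32_t, N>& mask)
{
  std::array<std::int32_t, N> r{};
  for (std::size_t i = 0; i < N; ++i)
    r[i] = v[i] + mask[i];   // v + (-1) == v - 1, v + 0 == v
  return r;
}
```

The sketch ignores overflow at the limits of the lane type, which a
real implementation has to consider for signed integers.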

2 years agolibstdc++: Test that integral simd reductions are precise
Matthias Kretz [Tue, 21 Feb 2023 09:43:13 +0000 (10:43 +0100)] 
libstdc++: Test that integral simd reductions are precise

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

* testsuite/experimental/simd/tests/reductions.cc: Introduce
max_distance as the type-dependent max error.

(cherry picked from commit 8fda668e0919af9ceda9435f02a1708b375b2913)

2 years agolibstdc++: Fix simd build failure on clang
Matthias Kretz [Mon, 20 Feb 2023 10:13:44 +0000 (11:13 +0100)] 
libstdc++: Fix simd build failure on clang

Clang does not support __attribute__ on lambdas. Therefore, only set
_GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA if __clang__ is not defined.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/108030
* include/experimental/bits/simd_detail.h
(_GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA): Define as empty for
__clang__.

(cherry picked from commit 92c47b15d5af3e7f93d11ad69a45b6d1cb8661c5)
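
A minimal sketch of the macro arrangement described above.
SKETCH_ALWAYS_INLINE_LAMBDA and sum_of_squares are hypothetical names
used for illustration; the real macro is
_GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA in simd_detail.h, and its exact
definition is not reproduced here.

```cpp
// Expand to the GNU attribute on GCC only; expand to nothing on Clang
// (and on any other compiler, as a conservative assumption).
#if defined(__GNUC__) && !defined(__clang__)
#  define SKETCH_ALWAYS_INLINE_LAMBDA __attribute__((__always_inline__))
#else
#  define SKETCH_ALWAYS_INLINE_LAMBDA
#endif

// The attribute follows the lambda's parameter list.
inline int sum_of_squares(int n)
{
  auto square = [](int x) SKETCH_ALWAYS_INLINE_LAMBDA { return x * x; };
  int total = 0;
  for (int i = 1; i <= n; ++i)
    total += square(i);
  return total;
}
```

The lambda behaves the same either way; the attribute only affects
whether the compiler is forced to inline the call, including at -O0.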

2 years agolibstdc++: Annotate most lambdas with always_inline
Matthias Kretz [Sat, 14 Jan 2023 16:07:59 +0000 (17:07 +0100)] 
libstdc++: Annotate most lambdas with always_inline

All of the annotated lambdas are simply a necessary means for
implementing these functions and should never result in an actual
function call. Many of these lambdas would go away if C++ had better
language support for packs.

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>
libstdc++-v3/ChangeLog:

PR libstdc++/108030
* include/experimental/bits/simd_detail.h: Define
_GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA.
* include/experimental/bits/simd.h: Annotate lambdas with
_GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA.
* include/experimental/bits/simd_builtin.h: Ditto.
* include/experimental/bits/simd_converter.h: Ditto.
* include/experimental/bits/simd_fixed_size.h: Ditto.
* include/experimental/bits/simd_math.h: Ditto.
* include/experimental/bits/simd_neon.h: Ditto.
* include/experimental/bits/simd_x86.h: Ditto.

(cherry picked from commit 53b55701aed6896f456cdec7997ac6bbef1d6074)

2 years agoDaily bump.
GCC Administrator [Thu, 25 May 2023 00:19:12 +0000 (00:19 +0000)] 
Daily bump.

2 years agoDaily bump.
GCC Administrator [Wed, 24 May 2023 00:19:43 +0000 (00:19 +0000)] 
Daily bump.

2 years agoDaily bump.
GCC Administrator [Tue, 23 May 2023 00:18:47 +0000 (00:18 +0000)] 
Daily bump.

2 years agoDo not generate vmaddfp and vnmsubfp
Michael Meissner [Mon, 22 May 2023 15:17:01 +0000 (11:17 -0400)] 
Do not generate vmaddfp and vnmsubfp

This is version 3 of the patch.  This is essentially version 1 with the removal
of changes to altivec.md, and cleanup of the comments.

Version 2 generated the vmaddfp and vnmsubfp instructions if -Ofast was used,
and those changes are deleted in this patch.

The Altivec instructions vmaddfp and vnmsubfp have different rounding behaviors
than the VSX xvmaddsp and xvnmsubsp instructions.  In particular, generating
these instructions seems to break Eigen on big endian systems.

I have done bootstrap builds on power9 little endian (with both IEEE long
double and IBM long double).  I have also done the builds and test on a power8
big endian system (testing both 32-bit and 64-bit code generation).  Chip has
verified that it fixes the problem that Eigen encountered.  Can I check this
into the master GCC branch?  After a burn-in period, can I check this patch
into the active GCC branches?

Thanks in advance.

2023-04-07   Michael Meissner  <meissner@linux.ibm.com>

gcc/

PR target/70243
* config/rs6000/vsx.md (vsx_fmav4sf4): Do not generate vmaddfp.  Back
port from master 04/10/2023.
(vsx_nfmsv4sf4): Do not generate vnmsubfp.

gcc/testsuite/

PR target/70243
* gcc.target/powerpc/pr70243.c: New test.  Back port from master
04/10/2023.

2 years agolibstdc++: Implement P2520R0 changes to move_iterator's iterator_concept
Patrick Palka [Tue, 14 Mar 2023 20:44:32 +0000 (16:44 -0400)] 
libstdc++: Implement P2520R0 changes to move_iterator's iterator_concept

libstdc++-v3/ChangeLog:

* include/bits/stl_iterator.h (move_iterator::_S_iter_concept):
Define.
(__cpp_lib_move_iterator_concept): Define for C++20.
(move_iterator::iterator_concept): Strengthen as per P2520R0.
* include/std/version (__cpp_lib_move_iterator_concept): Define
for C++20.
* testsuite/24_iterators/move_iterator/p2520r0.cc: New test.

(cherry picked from commit 2b204accd07a3185b58b1edc6e9b019472857a5d)

2 years agoc++: thinko in extract_local_specs [PR108998]
Patrick Palka [Fri, 3 Mar 2023 16:37:02 +0000 (11:37 -0500)] 
c++: thinko in extract_local_specs [PR108998]

In order to fix PR100295, r13-4730-g18499b9f848707 attempted to make
extract_local_specs walk the given pattern twice, ignoring unevaluated
operands the first time around so that we prefer to process a local
specialization in an evaluated context if it appears in one (we process
each local specialization once even if it appears multiple times in the
pattern).

But there's a thinko in the patch, namely that we don't actually walk
the pattern twice since we don't clear the visited set for the second
walk (to avoid processing a local specialization twice) and so the root
node (and any node leading up to an unevaluated operand) is considered
visited already.  So the patch effectively made extract_local_specs
ignore unevaluated operands altogether, which this testcase demonstrates
isn't quite safe (extract_local_specs never sees 'aa' and we don't record
its local specialization, so later we try to specialize 'aa' on the spot
with the args {{int},{17}} which causes us to nonsensically substitute
its auto with 17.)

This patch fixes this by refining the second walk to start from the
trees we skipped over during the first walk.

PR c++/108998

gcc/cp/ChangeLog:

* pt.c (el_data::skipped_trees): New data member.
(extract_locals_r): Push to skipped_trees any unevaluated
contexts that we skipped over.
(extract_local_specs): For the second walk, start from each
tree in skipped_trees.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/lambda-generic11.C: New test.

(cherry picked from commit 341e6cd8d603a334fd34657a6b454176be1c6437)
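
The two-walk scheme and its fix can be sketched on a toy tree.  Node,
WalkData, walk and collect are hypothetical names and bear no relation
to GCC's tree data structures; the sketch only models a visited set, a
flag for skipping unevaluated operands, and a list of skipped subtrees.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy tree node; "unevaluated" marks an unevaluated operand.
struct Node
{
  std::string name;
  bool unevaluated;
  std::vector<const Node*> children;
};

struct WalkData
{
  std::set<const Node*> visited;      // each node is processed once
  std::vector<const Node*> skipped;   // subtrees skipped by the first walk
  std::vector<std::string> seen;      // processing order, for inspection
  bool skip_unevaluated;
};

void walk(const Node* n, WalkData& d)
{
  if (d.skip_unevaluated && n->unevaluated)
    {
      d.skipped.push_back(n);         // remember it for the second walk
      return;
    }
  if (!d.visited.insert(n).second)
    return;                           // already processed
  d.seen.push_back(n->name);
  for (const Node* c : n->children)
    walk(c, d);
}

std::vector<std::string> collect(const Node* root)
{
  WalkData d;
  d.skip_unevaluated = true;
  walk(root, d);                      // first walk: evaluated contexts only
  d.skip_unevaluated = false;
  const std::vector<const Node*> pending = d.skipped;
  for (const Node* n : pending)       // second walk: start at skipped trees
    walk(n, d);
  return d.seen;
}
```

Restarting the second walk at the root without clearing the visited set
would return immediately, because the root is already marked visited;
that is the thinko the commit message describes.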

2 years agoc++: extract_local_specs and unevaluated contexts [PR100295]
Patrick Palka [Thu, 15 Dec 2022 21:02:05 +0000 (16:02 -0500)] 
c++: extract_local_specs and unevaluated contexts [PR100295]

Here during partial instantiation of the constexpr if, extract_local_specs
walks the statement looking for local specializations within to capture.
However, we're thwarted by the fact that 'ts' first appears inside an
unevaluated context, and so the calls to process_outer_var_ref for its
local specializations are a no-op.  And since we walk each tree exactly
once, we end up not capturing the local specializations despite 'ts'
later occurring in an evaluated context.

This patch fixes this by making extract_local_specs walk evaluated
contexts first before walking unevaluated contexts.  We could probably
get away with not walking unevaluated contexts at all, but this approach
seems more clearly safe.

PR c++/100295
PR c++/107579

gcc/cp/ChangeLog:

* pt.c (el_data::skip_unevaluated_operands): New data member.
(extract_locals_r): If skip_unevaluated_operands is true,
don't walk into unevaluated contexts.
(extract_local_specs): Walk the pattern twice, first with
skip_unevaluated_operands true followed by it set to false.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/constexpr-if-lambda5.C: New test.

(cherry picked from commit 18499b9f848707aee42d810e99ac0a4c9788433c)

2 years agoc++: explicit specialization and trailing requirements [PR107864]
Patrick Palka [Tue, 29 Nov 2022 14:55:21 +0000 (09:55 -0500)] 
c++: explicit specialization and trailing requirements [PR107864]

Here we're crashing when using the explicit specialization of the
function template g with trailing requirements ultimately because
earlier decls_match (called indirectly from register_specialization) for
the explicit specialization returned false since the template has
trailing requirements whereas the specialization doesn't.

In r12-2230-gddd25bd1a7c8f4, we fixed a similar issue concerning template
requirements instead of trailing requirements.  We could extend that fix
to ignore trailing requirement mismatches for explicit specializations
as well, but it seems cleaner to just propagate constraints from the
specialized template to the specialization when declaring an explicit
specialization so that decls_match will naturally return true in this
case.  And it looks like determine_specialization already does this,
albeit inconsistently (only when specializing a non-template member
function of a class template as in cpp2a/concepts-explicit-spec4.C).

So this patch makes determine_specialization consistently propagate
constraints from the specialized template to the specialization, which
in turn lets us get rid of the function_requirements_equivalent_p special
case added by r12-2230.

PR c++/107864

gcc/cp/ChangeLog:

* decl.c (function_requirements_equivalent_p): Don't check
DECL_TEMPLATE_SPECIALIZATION.
* pt.c (determine_specialization): Propagate constraints when
specializing a function template too.  Simplify by using
add_outermost_template_args.

gcc/testsuite/ChangeLog:

* g++.dg/concepts/explicit-spec1a.C: New test.

(cherry picked from commit 36cabc257dfb7dd4f7625896891f6c5b195a0241)

2 years agoc++: requires-expr and access checking [PR107179]
Patrick Palka [Thu, 3 Nov 2022 19:35:18 +0000 (15:35 -0400)] 
c++: requires-expr and access checking [PR107179]

Like during satisfaction, we also need to avoid deferring access checks
during substitution of a requires-expr because the outcome of an access
check can determine the value of the requires-expr.  Otherwise (in
deferred access checking contexts such as within a base-clause), the
requires-expr may evaluate to the wrong result, and along the way a
failed access check may leak out from it into a non-SFINAE context and
cause a hard error (as in the below testcase).

PR c++/107179

gcc/cp/ChangeLog:

* constraint.cc (tsubst_requires_expr): Make sure we're not
deferring access checks.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-requires31.C: New test.

(cherry picked from commit 40c34beef620ed13c4113c893ed4335ccc1b8f92)

2 years agoc++: ICE with failed __is_constructible constraint [PR100474]
Patrick Palka [Wed, 30 Mar 2022 14:13:11 +0000 (10:13 -0400)] 
c++: ICE with failed __is_constructible constraint [PR100474]

Here we're crashing when diagnosing an unsatisfied __is_constructible
constraint because diagnose_trait_expr doesn't recognize this trait
(along with a bunch of other traits).  Fix this by adding handling for
all remaining traits and removing the default case so that when adding a
new trait we'll get a warning that diagnose_trait_expr needs to handle it.

PR c++/100474

gcc/cp/ChangeLog:

* constraint.cc (diagnose_trait_expr): Handle all remaining
traits appropriately.  Remove default case.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-traits3.C: New test.

(cherry picked from commit 3aaf9bf77047aecc23072fe3db7f13ecff72a7cf)

2 years agoc++: return-type-req in constraint using only outer tparms [PR104527]
Patrick Palka [Sat, 12 Mar 2022 20:00:40 +0000 (15:00 -0500)] 
c++: return-type-req in constraint using only outer tparms [PR104527]

Here the template context for the atomic constraint has two levels of
template parameters, but since it depends only on the innermost parameter
T we use a single-level argument vector (built by get_mapped_args) during
substitution into the atom.  We eventually pass this vector to
do_auto_deduction as part of checking the return-type-requirement within
the atom, but do_auto_deduction expects outer_targs to be a full set of
arguments for sake of satisfaction.

This patch fixes this by making get_mapped_args always return an
argument vector whose depth corresponds to the template depth of the
context in which the atomic constraint expression was written, instead
of the highest parameter level that the expression happens to use.

PR c++/104527

gcc/cp/ChangeLog:

* constraint.cc (normalize_atom): Set
ATOMIC_CONSTR_EXPR_FROM_CONCEPT_P appropriately.
(get_mapped_args):  Make static, adjust parameters.  Always
return a vector whose depth corresponds to the template depth of
the context of the atomic constraint expression.  Micro-optimize
by passing false as exact to safe_grow_cleared and by collapsing
a multi-level depth-one argument vector.
(satisfy_atom): Adjust call to get_mapped_args and
diagnose_atomic_constraint.
(diagnose_atomic_constraint): Replace map parameter with an args
parameter.
* cp-tree.h (ATOMIC_CONSTR_EXPR_FROM_CONCEPT_P): Define.
(get_mapped_args): Remove declaration.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-return-req4.C: New test.

(cherry picked from commit 9413bb55185b9e88d84e91d5145d59f9f83b884a)

2 years agoc++: bogus warning with value init of const pmf [PR92752]
Patrick Palka [Fri, 28 Jan 2022 20:41:15 +0000 (15:41 -0500)] 
c++: bogus warning with value init of const pmf [PR92752]

Here we're emitting a -Wignored-qualifiers warning for an intermediate
compiler-generated cast of nullptr to 'method-type* const' as part of
value initialization of a const pmf.  This patch suppresses the warning
by instead casting to the corresponding unqualified type.

PR c++/92752

gcc/cp/ChangeLog:

* typeck.c (build_ptrmemfunc): Cast a nullptr constant to the
unqualified pointer type not the qualified one.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wignored-qualifiers2.C: New test.

Co-authored-by: Jason Merrill <jason@redhat.com>
(cherry picked from commit e971990cbda091b4caf5e1a5bded5121068934e4)

2 years agoDaily bump.
GCC Administrator [Mon, 22 May 2023 00:19:22 +0000 (00:19 +0000)] 
Daily bump.

2 years agoDarwin, libgcc : Adjust min version supported for the OS.
Iain Sandoe [Thu, 11 May 2023 22:24:02 +0000 (23:24 +0100)] 
Darwin, libgcc : Adjust min version supported for the OS.

Tools from later versions of the OS deprecate or fail to support
earlier OS revisions.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
libgcc/ChangeLog:

* config.host: Arrange to set min Darwin OS versions from
the configured host version.
* config/darwin10-unwind-find-enc-func.c: Do not use current
headers, but declare the necessary structures locally to the
versions in use for Mac OSX 10.6.
* config/t-darwin: Amend to handle configured min OS
versions.
* config/t-darwin-min-1: New.
* config/t-darwin-min-5: New.
* config/t-darwin-min-8: New.

(cherry picked from commit 20b8779ea9bd82b26eeb195b30f695168cd7ae1d)

2 years agoDaily bump.
GCC Administrator [Sun, 21 May 2023 00:18:51 +0000 (00:18 +0000)] 
Daily bump.

2 years agoFortran: CLASS pointer function result in variable definition context [PR109846]
Harald Anlauf [Sun, 14 May 2023 19:53:51 +0000 (21:53 +0200)] 
Fortran: CLASS pointer function result in variable definition context [PR109846]

gcc/fortran/ChangeLog:

PR fortran/109846
* expr.c (gfc_check_vardef_context): Check appropriate pointer
attribute for CLASS vs. non-CLASS function result in variable
definition context.

gcc/testsuite/ChangeLog:

PR fortran/109846
* gfortran.dg/ptr-func-5.f90: New test.

(cherry picked from commit fa0569e90efe8a5cb895a3f50dd502f849940828)

2 years agoDaily bump.
GCC Administrator [Sat, 20 May 2023 00:18:51 +0000 (00:18 +0000)] 
Daily bump.

2 years agoDaily bump.
GCC Administrator [Fri, 19 May 2023 00:19:51 +0000 (00:19 +0000)] 
Daily bump.

2 years agoc++tools, configury: Configure with C++; test checking status [PR98821].
Iain Sandoe [Tue, 20 Jul 2021 13:00:38 +0000 (14:00 +0100)] 
c++tools, configury: Configure with C++; test checking status [PR98821].

The c++tools configure fragments need to be built with a C++ compiler.

In addition, the stand-alone server uses diagnostic mechanisms in common
with GCC, but needs to define implementations for gcc_assert and
supporting output functions.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR c++/98821 - modules : c++tools configures with CC but code fragments assume CXX.

PR c++/98821

c++tools/ChangeLog:

* config.h.in: Regenerate.
* configure: Regenerate.
* configure.ac: Configure using C++.  Pull logic to
detect enabled checking modes; default to release
checking.
* server.cc (AI_NUMERICSERV): Define a fallback value.
(gcc_assert): New.
(gcc_unreachable): New.
(fancy_abort): Only build when checking is enabled.

Co-authored-by: Jakub Jelinek <jakub@redhat.com>
(cherry picked from commit e4d306cf706eef83f99d510c308eda1539d05875)

2 years agoDaily bump.
GCC Administrator [Thu, 18 May 2023 00:18:45 +0000 (00:18 +0000)] 
Daily bump.

2 years agoDaily bump.
GCC Administrator [Wed, 17 May 2023 00:19:16 +0000 (00:19 +0000)] 
Daily bump.

2 years agoc++, coroutines: Fix block nests when the function has no top-level bind.
Iain Sandoe [Sat, 1 Apr 2023 16:23:51 +0000 (21:53 +0530)] 
c++, coroutines: Fix block nests when the function has no top-level bind.

When the function contains no local vars and also no nested scopes, there
is no top-level bind expression.  Because the rewritten coroutine body will
require both local vars and contain nested scopes, we add a bind expression
to such functions.  When this was done the necessary scope blocks were
omitted which leads to disconnected function content.

Fixed by adding a new block to the added bind expression.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/cp/ChangeLog:

* coroutines.cc (coro_rewrite_function_body): Ensure that added
bind expressions have scope blocks.

(cherry picked from commit a8d7631d333c22e38a067d32d11fd2b60cf1d960)

2 years agoc++,coroutines: Stabilize names of promoted slot vars [PR101118].
Iain Sandoe [Thu, 30 Mar 2023 07:44:23 +0000 (13:14 +0530)] 
c++,coroutines: Stabilize names of promoted slot vars [PR101118].

When we need to 'promote' a value (i.e. store it in the coroutine frame) it
is given a frame entry name.  This was based on the DECL_UID for slot vars.
However, when LTO is used, the names from multiple TUs become visible at the
same time, and the DECL_UIDs usually differ between units.  This leads to a
"ODR mismatch" warning for the frame type.

The fix here is to use the current promoted temporaries count to produce
the name; this is stable between TUs and computed per coroutine.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
PR c++/101118

gcc/cp/ChangeLog:

* coroutines.cc (flatten_await_stmt): Use the current count of
promoted temporaries to build a unique name for the frame entries.

(cherry picked from commit fc4cde2e6aa4d6ebdf7f70b7b4359fb59a1915ae)

2 years agolibsanitizer, darwin: Unsupport Darwin >= 22 for now.
Iain Sandoe [Mon, 17 Apr 2023 09:23:16 +0000 (10:23 +0100)] 
libsanitizer, darwin: Unsupport Darwin >= 22 for now.

The mechanism for locating dyld has altered from Darwin22 since dyld is now
in the shared cache.  The implemented mechanism for walking the cache uses
Apple Blocks which GCC does not yet support, and the fallback to the original
mechanism does not work there.

Until a suitable work-around can be found, unsupport Darwin22+.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
libsanitizer/ChangeLog:

* configure.tgt: Unsupport Darwin22+ until a mechanism can be found
to locate dyld in the shared cache.

(cherry picked from commit e722a1f42b28092c9f709a3f758fc4fe57db32b0)

2 years agoDarwin, fixincludes: Handle Apple Blocks in objc/runtime.h.
Iain Sandoe [Wed, 18 Jan 2023 23:25:36 +0000 (23:25 +0000)] 
Darwin, fixincludes: Handle Apple Blocks in objc/runtime.h.

The macOS 13 SDK has unguarded Apple Blocks use in objc/runtime.h which
causes most of the objective-c tests to fail.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
fixincludes/ChangeLog:

* fixincl.x: Regenerate.
* inclhack.def (darwin_objc_runtime_1): New hack.
* tests/base/objc/runtime.h: New file.

(cherry picked from commit 046dc9d0d4683bab99d28983d8841ba3c56ef744)