Martin Sebor [Tue, 26 Oct 2021 20:34:16 +0000 (14:34 -0600)]
Detect overflow by atomic functions [PR102453].
Resolves:
PR middle-end/102453 - buffer overflow by atomic built-ins not diagnosed
gcc/ChangeLog:
PR middle-end/102453
* gimple-ssa-warn-access.cc (pass_waccess::check_atomic_builtin): New.
(pass_waccess::check_atomic_builtin): Call it.
gcc/testsuite/ChangeLog:
PR middle-end/102453
* gcc.dg/Warray-bounds-90.c: New test.
* gcc.dg/Wstringop-overflow-77.c: New test.
* gcc.dg/Wstringop-overflow-78.c: New test.
* gcc.dg/Wstringop-overflow-79.c: New test.
* gcc.dg/Wstringop-overflow-80.c: New test.
* c-c++-common/gomp/atomic-4.c: Avoid an out-of-bounds access.
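A minimal sketch of the kind of access this check concerns, assuming GCC's `__atomic` built-ins. The names `counter`, `bump_counter`, `small` and `bad_store` are hypothetical and do not come from the patch or its tests. The in-bounds call is compiled; the out-of-bounds one is left in a comment because executing it would be undefined behavior.

```c
#include <stdint.h>

/* Hypothetical counter used only for this illustration.  */
static int32_t counter;

/* In-bounds: the atomic built-in accesses exactly sizeof (int32_t)
   bytes of a 4-byte object.  */
int32_t
bump_counter (void)
{
  return __atomic_add_fetch (&counter, 1, __ATOMIC_SEQ_CST);
}

/* Out-of-bounds (kept commented out): a 4-byte atomic store into a
   1-byte object, the kind of access an overflow check on atomic
   built-ins is meant to catch.

   static char small[1];

   void
   bad_store (void)
   {
     __atomic_store_n ((int32_t *) small, 0, __ATOMIC_RELAXED);
   }
*/
```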
[PR102842] Consider all outputs in generation of matching reloads
Without considering all output insn operands (not only processed
before), in rare cases LRA can use the same hard register for
different outputs of the insn on different assignment subpasses. The
patch fixes the problem.
gcc/ChangeLog:
PR rtl-optimization/102842
* lra-constraints.c (match_reload): Ignore out in checking values
of outs.
(curr_insn_transform): Collect outputs before doing reloads of operands.
gcc/testsuite/ChangeLog:
PR rtl-optimization/102842
* g++.target/arm/pr102842.C: New test.
Paul A. Clarke [Mon, 25 Oct 2021 20:18:33 +0000 (15:18 -0500)]
rs6000: Fixes for tests including only <x86intrin.h>
Tests which only include <x86intrin.h> expect many other include files
to be brought in, but not enough are.
Try to increase compatibility with x86 headers by:
- Create new immintrin.h, including the analogous subset of intrinsics
headers available for powerpc.
- Create new x86gprintrin.h, serving exclusively as the umbrella for
bmiintrin.h and bmi2intrin.h.
- Modify x86intrin.h:
- Include new immintrin.h.
- Remove mmintrin.h, xmmintrin.h, emmintrin.h, now included indirectly
from immintrin.h.
- Remove bmiintrin.h, bmi2intrin.h, now included indirectly from
x86gprintrin.h (which is now included from immintrin.h).
Add the new files to gcc/config.gcc.
Also, fix up the testcase that provoked PR102719, which requires
Power8 vector support.
gcc
PR target/102719
* config/rs6000/x86intrin.h: Move some included headers to new
headers. Include new immintrin.h instead of those headers.
* config/rs6000/immintrin.h: New.
* config/rs6000/x86gprintrin.h: New.
* config.gcc (powerpc*-*-*): Add new headers to extra_headers.
Marek Polacek [Thu, 21 Oct 2021 15:10:02 +0000 (11:10 -0400)]
c++: P2360R0: Extend init-stmt to allow alias-decl [PR102617]
The following patch implements C++23 P2360R0. This proposal merely
extends init-statement to contain alias-declaration. init-statement
is used in if/for/switch. It also removes the unsightly duplication
of code by calling cp_parser_init_statement twice.
PR c++/102617
gcc/cp/ChangeLog:
* parser.c (cp_parser_for): Maybe call cp_parser_init_statement
twice. Warn about range-based for loops with initializer here.
(cp_parser_init_statement): Don't duplicate code. Allow
alias-declaration in init-statement.
gcc/testsuite/ChangeLog:
* g++.dg/cpp23/init-stmt1.C: New test.
* g++.dg/cpp23/init-stmt2.C: New test.
There are only 3 instances of the expected pattern because Solaris/x86
defaults to -mno-stv. Fixed by compiling with -mstv and
-mno-stackrealign. Tested on i386-pc-solaris2.11 and
x86_64-pc-linux-gnu.
This happens because Solaris defaults to -fno-omit-frame-pointer, so it
uses %ebp instead of the expected %esp. As Hongyu Wang suggested in the
PR, this can be fixed by accepting both forms, which this patch does.
Tested on i386-pc-solaris2.11 and x86_64-pc-linux-gnu.
Rainer Orth [Tue, 26 Oct 2021 12:07:57 +0000 (14:07 +0200)]
libstdc++: Fix 28_regex/basic_regex/84110.cc on Solaris
28_regex/basic_regex/84110.cc currently FAILs on Solaris:
FAIL: 28_regex/basic_regex/84110.cc (test for excess errors)
UNRESOLVED: 28_regex/basic_regex/84110.cc compilation failed to produce executable
Excess errors:
/vol/gcc/src/hg/master/local/libstdc++-v3/testsuite/28_regex/basic_regex/84110.cc:14: error: reference to 'extended' is ambiguous
The issue is seen in the full output:
/vol/gcc/src/hg/master/local/libstdc++-v3/testsuite/28_regex/basic_regex/84110.cc: In function ‘void test01()’:
/vol/gcc/src/hg/master/local/libstdc++-v3/testsuite/28_regex/basic_regex/84110.cc:14: error: reference to ‘extended’ is ambiguous
In file included from /var/gcc/regression/master/11.4-gcc-gas/build/gcc/include-fixed/math.h:391,
from /var/gcc/regression/master/11.4-gcc-gas/build/i386-pc-solaris2.11/libstdc++-v3/include/cmath:45,
from /vol/gcc/src/hg/master/local/libstdc++-v3/include/precompiled/stdc++.h:41:
/usr/include/floatingpoint.h:73: note: candidates are: ‘typedef unsigned int extended [3]’
Fixed by disambiguating extended. Tested on i386-pc-solaris2.11,
sparc-sun-solaris2.11, and x86_64-pc-linux-gnu.
Rainer Orth [Tue, 26 Oct 2021 12:00:18 +0000 (14:00 +0200)]
libstdc++: Fix 17_intro/names.cc on Solaris
17_intro/names.cc and experimental/names.cc currently FAIL on Solaris:
FAIL: 17_intro/names.cc (test for excess errors)
FAIL: experimental/names.cc (test for excess errors)
Excess errors:
/usr/include/sys/timespec_util.h:22: error: expected ')' before ';' token
/usr/include/stdlib.h:157: error: expected unqualified-id before '[' token
/usr/include/stdlib.h:157: error: expected ')' before '[' token
<sys/timespec_util.h> has
extern int timespeccompare(const struct timespec *l, const struct timespec *r);
while <stdlib.h> has
typedef struct drand48_data {
unsigned int _initialised;
unsigned short int x[3];
unsigned short int a[3];
unsigned int c;
unsigned short lastx[3];
} drand48_data;
both of which are broken because the testcase defines r and x, respectively, to (.
Fixed by undoing the defines. Tested on i386-pc-solaris2.11,
sparc-sun-solaris2.11, and x86_64-pc-linux-gnu.
Richard Biener [Mon, 25 Oct 2021 11:39:07 +0000 (13:39 +0200)]
Move negative stride bias out of dr_misalignment
This moves applying of a bias for negative stride accesses out of
dr_misalignment in favor of a more general optional offset argument.
The negative bias is now computed by get_load_store_type and applied
accordingly to determine the alignment support scheme. Likewise
the peeling/versioning code is adjusted, although it still assumes
we'll end up with VMAT_CONTIGUOUS_DOWN or VMAT_CONTIGUOUS_REVERSE.
At least when that is not the case (VMAT_STRIDED_SLP is one
possibility), get_load_store_type will _not_ falsely report an
aligned access but instead an access with known misalignment.
This fixes PR96109.
2021-10-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/96109
* tree-vectorizer.h (dr_misalignment): Add optional offset
parameter.
* tree-vect-data-refs.c (dr_misalignment): Likewise. Remove
offset applied for negative stride accesses.
(vect_enhance_data_refs_alignment): Compute negative stride
access offset and pass it to dr_misalignment.
* tree-vect-stmts.c (get_negative_load_store_type): Pass
negative offset to dr_misalignment.
(get_group_load_store_type): Likewise.
(get_load_store_type): Likewise.
(vectorizable_store): Remove asserts about alignment.
(vectorizable_load): Likewise.
Kewen Lin [Tue, 26 Oct 2021 09:09:38 +0000 (04:09 -0500)]
forwprop: Remove incorrect assertion [PR102897]
As PR102897 shows, there is one incorrect assertion in function
simplify_permutation, which is based on the wrong assumption that
all cases with op2_type == tgt_type are handled previously. The
proposed fix is to remove the assertion.
gcc/ChangeLog:
PR tree-optimization/102897
* tree-ssa-forwprop.c (simplify_permutation): Remove a wrong assertion.
Richard Biener [Tue, 26 Oct 2021 08:52:44 +0000 (10:52 +0200)]
Turn vect_create_addr_base_for_vector_ref offset into a byte offset
This changes the offset in elements for vect_create_addr_base_for_vector_ref
and vect_create_data_ref_ptr to an offset in bytes, easing a following
refactoring.
2021-10-26 Richard Biener <rguenther@suse.de>
* tree-vect-data-refs.c (vect_create_addr_base_for_vector_ref):
Take offset in bytes.
(vect_create_data_ref_ptr): Likewise.
* tree-vect-loop-manip.c (get_misalign_in_elems): Multiply
offset by element size.
(vect_create_cond_for_align_checks): Likewise.
* tree-vect-stmts.c (get_negative_load_store_type): Likewise.
(vectorizable_load): Remove duplicate leftover from merge
conflict.
Roger Sayle [Tue, 26 Oct 2021 07:33:41 +0000 (08:33 +0100)]
x86_64: Implement V1TI mode shifts/rotates by a constant
This patch provides RTL expanders to implement logical shifts and
rotates of 128-bit values (stored in vector integer registers) by
constant bit counts. Previously, GCC would transfer these values
to a pair of integer registers (TImode) via memory to perform the
operation, then transfer the result back via memory. Instead these
operations are now expanded using (between 1 and 5) SSE2 vector
instructions.
Logical shifts by multiples of 8 can be implemented using x86_64's
pslldq/psrldq instruction:
ashl_8:
pslldq $1, %xmm0
ret
lshr_32:
psrldq $4, %xmm0
ret
Logical shifts by greater than 64 can use pslldq/psrldq $8, followed
by a psllq/psrlq for the remaining bits:
ashl_111:
pslldq $8, %xmm0
psllq $47, %xmm0
ret
lshr_127:
psrldq $8, %xmm0
psrlq $63, %xmm0
ret
The remaining logical shifts make use of the following idiom:
ashl_1:
movdqa %xmm0, %xmm1
psllq $1, %xmm0
pslldq $8, %xmm1
psrlq $63, %xmm1
por %xmm1, %xmm0
ret
lshr_15:
movdqa %xmm0, %xmm1
psrlq $15, %xmm0
psrldq $8, %xmm1
psllq $49, %xmm1
por %xmm1, %xmm0
ret
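The idiom above can be modelled in portable C, as a sketch only. The `u128` struct and the helper names are hypothetical; each helper imitates the SSE2 instruction it is named after on a pair of 64-bit lanes, and the two public functions assume a shift count strictly between 0 and 64.

```c
#include <stdint.h>

/* A 128-bit value as two 64-bit lanes, mirroring an SSE register.  */
typedef struct { uint64_t lo, hi; } u128;

/* psllq/psrlq: shift each 64-bit lane independently (0 < n < 64).  */
static u128 psllq (u128 x, unsigned n) { return (u128){ x.lo << n, x.hi << n }; }
static u128 psrlq (u128 x, unsigned n) { return (u128){ x.lo >> n, x.hi >> n }; }

/* pslldq $8 / psrldq $8: move the whole register by 8 bytes.  */
static u128 pslldq8 (u128 x) { return (u128){ 0, x.lo }; }
static u128 psrldq8 (u128 x) { return (u128){ x.hi, 0 }; }

static u128 por (u128 a, u128 b) { return (u128){ a.lo | b.lo, a.hi | b.hi }; }

/* 128-bit logical left shift by n, 0 < n < 64 (the ashl_1 sequence).  */
u128
ashl_small (u128 x, unsigned n)
{
  /* Bits that cross from the low lane into the high lane.  */
  u128 carry = psrlq (pslldq8 (x), 64 - n);
  return por (psllq (x, n), carry);
}

/* 128-bit logical right shift by n, 0 < n < 64 (the lshr_15 sequence).  */
u128
lshr_small (u128 x, unsigned n)
{
  /* Bits that cross from the high lane into the low lane.  */
  u128 carry = psllq (psrldq8 (x), 64 - n);
  return por (psrlq (x, n), carry);
}
```

The per-lane shift loses the bits that cross the 64-bit boundary, so a second copy of the input is moved by 8 bytes and shifted the other way to recover them before the final OR.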
Rotates by multiples of 32 can use x86_64's pshufd:
rotr_32:
pshufd $57, %xmm0, %xmm0
ret
rotr_64:
pshufd $78, %xmm0, %xmm0
ret
rotr_96:
pshufd $147, %xmm0, %xmm0
ret
Rotates by multiples of 8 (other than multiples of 32) can make
use of both pslldq and psrldq, followed by por:
rotr_8:
movdqa %xmm0, %xmm1
psrldq $1, %xmm0
pslldq $15, %xmm1
por %xmm1, %xmm0
ret
rotr_112:
movdqa %xmm0, %xmm1
psrldq $14, %xmm0
pslldq $2, %xmm1
por %xmm1, %xmm0
ret
And the remaining rotates use one or two pshufd, followed by a
psrld/pslld/por sequence:
rotr_1:
movdqa %xmm0, %xmm1
pshufd $57, %xmm0, %xmm0
psrld $1, %xmm1
pslld $31, %xmm0
por %xmm1, %xmm0
ret
rotr_63:
pshufd $78, %xmm0, %xmm1
pshufd $57, %xmm0, %xmm0
pslld $1, %xmm1
psrld $31, %xmm0
por %xmm1, %xmm0
ret
rotr_111:
pshufd $147, %xmm0, %xmm1
pslld $17, %xmm0
psrld $15, %xmm1
por %xmm1, %xmm0
ret
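The rotr_1 sequence can be modelled the same way on four 32-bit lanes. This is a sketch under assumptions: the `v4si` struct and helper names are hypothetical, the `pshufd` model follows the usual immediate encoding (two selector bits per destination lane), and the rotate count is assumed to lie strictly between 0 and 32.

```c
#include <stdint.h>

/* A 128-bit value as four 32-bit lanes; w[0] is least significant.  */
typedef struct { uint32_t w[4]; } v4si;

/* pshufd: destination lane i takes source lane ((imm >> 2*i) & 3).  */
static v4si
pshufd (v4si x, unsigned imm)
{
  v4si r;
  for (int i = 0; i < 4; i++)
    r.w[i] = x.w[(imm >> (2 * i)) & 3];
  return r;
}

/* psrld/pslld: shift each 32-bit lane independently (0 < n < 32).  */
static v4si
psrld (v4si x, unsigned n)
{
  for (int i = 0; i < 4; i++)
    x.w[i] >>= n;
  return x;
}

static v4si
pslld (v4si x, unsigned n)
{
  for (int i = 0; i < 4; i++)
    x.w[i] <<= n;
  return x;
}

static v4si
por (v4si a, v4si b)
{
  for (int i = 0; i < 4; i++)
    a.w[i] |= b.w[i];
  return a;
}

/* 128-bit rotate right by n, 0 < n < 32 (the rotr_1 sequence).  */
v4si
rotr_small (v4si x, unsigned n)
{
  /* pshufd $57 rotates right by 32 bits, so every lane now holds its
     more significant neighbour (lane 3 wraps around to lane 0).  */
  v4si next = pshufd (x, 57);
  return por (psrld (x, n), pslld (next, 32 - n));
}
```

Larger rotate counts follow the same pattern with a different pair of shuffles: $78 and $147 rotate by 64 and 96 bits, which supply the two neighbouring lanes needed for counts such as 63 or 111.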
The new test case, sse2-v1ti-shift.c, is a run-time check to confirm that
the results of V1TImode shifts/rotates by constants exactly match the
expected results of TImode operations, for various input test vectors.
2021-10-26 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/i386-expand.c (ix86_expand_v1ti_shift): New helper
function to expand V1TI mode logical shifts by integer constants.
(ix86_expand_v1ti_rotate): New helper function to expand V1TI
mode rotations by integer constants.
* config/i386/i386-protos.h (ix86_expand_v1ti_shift,
ix86_expand_v1ti_rotate): Prototype new functions here.
* config/i386/sse.md (ashlv1ti3, lshrv1ti3, rotlv1ti3, rotrv1ti3):
New TARGET_SSE2 expanders to implement V1TI shifts and rotations.
gcc/testsuite/ChangeLog
* gcc.target/i386/sse2-v1ti-shift.c: New test case.
Aldy Hernandez [Sat, 23 Oct 2021 06:59:24 +0000 (08:59 +0200)]
[PR testsuite/102857] Tweak ssa-dom-thread-7.c for aarch64.
First, ssa-dom-thread-7 was looking at a dump file that was not
being generated. This probably happened in the detangling of the VRP
threader from VRP, and I didn't notice because the test came back as
UNRESOLVED instead of FAIL.
Second, aarch64 gets far more threads than other architectures (20
versus 12). The difference is large enough to make the
regex awkward.
We already have special casing for aarch64 in other parts of this
test, so perhaps it's simplest to have an arch specific test
for the thread3 count.
I don't know; perhaps there's a better way. I wake up with chills in
the middle of the night thinking about this test ;-).
Tested on x86-64 Linux and aarch64 Linux.
gcc/testsuite/ChangeLog:
PR testsuite/102857
* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Add -fdump-tree-vrp2-stats.
Tweak for aarch64.
Aldy Hernandez [Wed, 20 Oct 2021 16:52:45 +0000 (18:52 +0200)]
Avoid threading circular paths.
The backward threader keeps a hash of visited blocks to avoid crossing
the same block twice. Interestingly, we haven't been checking it for
the final block out of the path. This may be inherited from the old
code, as it was simple enough that it didn't matter. With the
upcoming changes enabling the fully resolving threader, it gets
tripped often enough to cause wrong code to be generated.
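A rough sketch of the idea, not the threader's actual data structures: a candidate path keeps a set of visited blocks, and the block the path finally exits to is checked against that set as well. The `struct path` type, the 32-block limit and both function names are assumptions made for illustration.

```c
#include <stdbool.h>

#define MAX_BLOCKS 32

/* Hypothetical candidate path: block indices plus a visited bitmask.  */
struct path
{
  unsigned visited;
  int blocks[MAX_BLOCKS];
  int len;
};

/* Append BB to the path unless it was already visited.  */
bool
path_push (struct path *p, int bb)
{
  if (bb < 0 || bb >= MAX_BLOCKS || (p->visited & (1u << bb)))
    return false;
  p->visited |= 1u << bb;
  p->blocks[p->len++] = bb;
  return true;
}

/* The extra check: the final block out of the path must not already
   be on the path, otherwise the path is circular.  */
bool
path_exit_ok (const struct path *p, int final_bb)
{
  return final_bb >= 0 && final_bb < MAX_BLOCKS
	 && !(p->visited & (1u << final_bb));
}
```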
Aldy Hernandez [Wed, 20 Oct 2021 05:29:59 +0000 (07:29 +0200)]
Attempt to resolve all incoming paths to a PHI.
The code that threads incoming paths to a PHI is duplicating what we
do generically in find_paths_to_names. This shortcoming is actually
one of the reasons we aren't threading all possible paths into a PHI.
For example, we give up after finding one threadable path, but some
PHIs have multiple threadable paths:
// x_5 = PHI <10(4), 20(5), ...>
// if (x_5 > 5)
Addressing this not only fixes the oversight, but simplifies the
PHI handling code, since we can consider the PHI fully resolved upon
return.
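For illustration, source of roughly this shape could give rise to such a PHI; the function name is hypothetical and the exact SSA form depends on the compiler. Both incoming values satisfy the comparison, so each incoming path can be threaded past it.

```c
/* x is 10 on one incoming edge and 20 on the other; in SSA form the
   merge is a PHI, and x > 5 holds along both paths.  */
int
phi_example (int c)
{
  int x;
  if (c)
    x = 10;
  else
    x = 20;
  if (x > 5)
    return 1;
  return 0;
}
```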
Interestingly, for ssa-thread-12.c the main thread everything was
hinging on was unreachable. With this patch, we call
maybe_register_path() earlier. In doing so, the solver realizes
that any path starting with 4->8 is unreachable and can be avoided.
This caused the cascade of threadable paths that depended on this
to no longer happen. Since threadable paths in thread[34] were the only
thing this test was testing, there's no longer anything to test. Neat!
Tested on x86-64 Linux.
gcc/ChangeLog:
* tree-ssa-threadbackward.c (back_threader::resolve_phi):
Attempt to resolve all incoming paths to a PHI.
(back_threader::resolve_def): Always return true for PHIs.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/pr21090.c: Adjust for threading.
* gcc.dg/tree-ssa/ssa-thread-12.c: Removed.
Aldy Hernandez [Wed, 20 Oct 2021 05:29:25 +0000 (07:29 +0200)]
Try to resolve paths in threader without looking further back.
Sometimes we can solve a candidate path without having to recurse
further back. This can mostly happen in fully resolving mode, because
we can ask the ranger what the range on entry to the path is, but
there's no reason this can't always apply. This one-liner removes
the fully-resolving restriction.
I'm tickled pink to see how many things we now get quite early
in the compilation. I actually had to disable jump threading entirely
for a few tests because the early threader was catching things
disturbingly early. Also, as Richi predicted, I saw a lot of pre-VRP
cleanups happening.
I was going to commit this as obvious, but I think the test changes
merit discussion.
We've been playing games with gcc.dg/tree-ssa/ssa-thread-11.c for quite
some time. Every time a threading pass gets smarter, we push the
check further down the pipeline. We've officially run out of dumb
threading passes to disable ;-). In the last year we've gone up from a
handful of threads, to 34 threads with the current combination of
options. I doubt this is testing anything useful anymore, so I've
removed it.
Similarly for gcc.dg/tree-ssa/ssa-dom-thread-4.c. We used to thread 3
jump threads, but they were disallowed because of loop rotation. Then
we started catching more jump threads in VRP2 threading so we tested
there. With this patch though, we triple the number of threads found
from 11 to 31. I believe this test has outlived its usefulness, and
I've removed it. Note that even though we have these outrageous
possibilities for this test, the block copier ultimately chops them
down (23 survive though).
Tested on x86-64 Linux.
gcc/ChangeLog:
* tree-ssa-threadbackward.c (back_threader::find_paths_to_names):
Always try to resolve path without looking back.
* tree-ssa-threadupdate.c (dump_jump_thread): Indicate whether
edge is a back edge.
Kewen Lin [Tue, 26 Oct 2021 02:05:02 +0000 (21:05 -0500)]
vect: Don't update inits for simd_lane_access DRs [PR102789]
As PR102789 shows, when the vectorizer peels for alignment in
prologues, function vect_update_inits_of_drs updates the inits of
some DRs. But as the failing case shows, we shouldn't update the DR
for a simd_lane_access: it has fixed-length storage meant mainly for
the main loop, so the update can push the access out of bounds and
make it touch an unexpected element.
gcc/ChangeLog:
PR tree-optimization/102789
* tree-vect-loop-manip.c (vect_update_inits_of_drs): Do not
update inits of simd_lane_access.
Paul A. Clarke [Mon, 25 Oct 2021 20:17:28 +0000 (15:17 -0500)]
rs6000: Fix missing "externs" in smmintrin.h
Inline functions defined in smmintrin.h need "extern" as part of their
declaration, otherwise instances of those functions are created in the
objects which include them.
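A minimal sketch of the declaration style in question, assuming the attribute set commonly seen in GCC intrinsic headers. `example_add` and `use_example_add` are hypothetical and are not taken from smmintrin.h.

```c
/* With gnu_inline semantics, "extern __inline" provides a definition
   for inlining only; no out-of-line copy of the function is emitted
   into the including object.  Dropping "extern" would emit one.  */
extern __inline int
__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
example_add (int a, int b)
{
  return a + b;
}

/* The call is always inlined, so no external symbol is needed.  */
int
use_example_add (int x)
{
  return example_add (x, 1);
}
```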
Roger Sayle [Mon, 25 Oct 2021 15:16:11 +0000 (16:16 +0100)]
Constant fold/simplify SS_ASHIFT and US_ASHIFT in simplify-rtx.c
This patch adds compile-time evaluation of signed saturating left shift
(SS_ASHIFT) and unsigned saturating left shift (US_ASHIFT) to simplify-rtx's
simplify_const_binary_operation. US_ASHIFT saturates to the maximum
unsigned value on overflow (which occurs when the shift is greater than
the leading zero count), while SS_ASHIFT saturates on overflow to the
maximum signed value for positive arguments, and the minimum signed value
for negative arguments (which occurs when the shift count is greater than
the number of leading redundant sign bits, clrsb). This suggests
some additional simplifications that this patch implements in
simplify_binary_operation_1; us_ashift:HI of 0xffff remains 0xffff
(much like any ashift of 0x0000 remains 0x0000), and ss_ashift:HI of
0x7fff remains 0x7fff, and of 0x8000 remains 0x8000.
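The saturation rules described above can be written as a small reference model for 16-bit values. This is a sketch, not the simplify-rtx.c code: the function names are hypothetical, and the treatment of shift counts of 16 or more (saturate any nonzero value) is an assumption.

```c
#include <stdint.h>

/* Unsigned saturating left shift of a 16-bit value.  */
uint16_t
us_ashift16 (uint16_t x, unsigned n)
{
  if (x == 0)
    return 0;
  if (n >= 16)			/* Assumption: out-of-range counts saturate.  */
    return UINT16_MAX;
  uint32_t wide = (uint32_t) x << n;
  return wide > UINT16_MAX ? UINT16_MAX : (uint16_t) wide;
}

/* Signed saturating left shift of a 16-bit value.  */
int16_t
ss_ashift16 (int16_t x, unsigned n)
{
  if (x == 0)
    return 0;
  if (n >= 16)			/* Assumption: out-of-range counts saturate.  */
    return x < 0 ? INT16_MIN : INT16_MAX;
  /* Multiply instead of shifting so negative values stay well defined;
     the product fits in 32 bits because |x| <= 2^15 and n < 16.  */
  int32_t wide = (int32_t) x * ((int32_t) 1 << n);
  if (wide > INT16_MAX)
    return INT16_MAX;
  if (wide < INT16_MIN)
    return INT16_MIN;
  return (int16_t) wide;
}
```

In this model 0xffff stays 0xffff under the unsigned shift, and 0x7fff and 0x8000 are fixed points of the signed shift for any nonzero count.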
Conveniently the bfin backend provides instructions/built-ins that allow
this functionality to be tested. The two functions below
short stest_sat_max() { return __builtin_bfin_shl_fr1x16(10000,8); }
short stest_sat_min() { return __builtin_bfin_shl_fr1x16(-10000,8); }
2021-10-25 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* simplify-rtx.c (simplify_binary_operation_1) [SS_ASHIFT]: Simplify
shifts of the mode's smin_value and smax_value when the bit count
operand doesn't have side-effects.
[US_ASHIFT]: Likewise, simplify shifts of the mode's umax_value
when the bit count operand doesn't have side-effects.
(simplify_const_binary_operation) [SS_ASHIFT, US_ASHIFT]: Perform
compile-time evaluation of saturating left shifts with constant
arguments.
gcc/testsuite/ChangeLog
* gcc.target/bfin/ssashift-1.c: New test case.
Ed Schonberg [Tue, 14 Sep 2021 00:14:56 +0000 (20:14 -0400)]
[Ada] Spurious error on user-defined literal and operator
gcc/ada/
* sem_ch4.adb (Has_Possible_Literal_Aspects): If analysis of an
operator node fails to find a possible interpretation, and one
of its operands is a literal or a named number, assign to the
node the corresponding class type (Any_Integer, Any_String,
etc).
(Operator_Check): Call it before emitting a type error.
* sem_res.adb (Has_Applicable_User_Defined_Literal): Given a
literal and a type, determine whether the type has a
user_defined aspect that can apply to the literal, and rewrite
the node as call to the corresponding function. Most of the code
was previously in procedure Resolve.
(Try_User_Defined_Literal): Check operands of a predefined
operator that fails to resolve, and apply
Has_Applicable_User_Defined_Literal to literal operands if any,
to find if a conversion will allow the operator to resolve
properly.
(Resolve): Call the above when a literal or an operator with a
literal operand fails to resolve.
Bob Duff [Fri, 22 Oct 2021 16:00:38 +0000 (12:00 -0400)]
[Ada] Follow-on cleanups for Uint fields
gcc/ada/
* freeze.adb (Freeze_Fixed_Point_Type): Remove
previously-inserted test for Uint_0; no longer needed.
* gen_il-gen.ads: Improve comments.
* repinfo.adb (Rep_Value): Use Ubool type for B.
* repinfo.ads (Node_Ref): Use Unegative type.
(Node_Ref_Or_Val): Document that values of this type can be
No_Uint.
* exp_disp.adb (Make_Disp_Requeue_Body): Minor comment fix.
* sem_ch3.adb: Likewise.
* sem_ch8.adb: Likewise.
* sinfo-utils.adb (End_Location): End_Span can never be No_Uint,
so remove the "if No (L)" test.
* uintp.adb (Image_String): Use "for ... of" loop.
* uintp.ads (Unegative): New type for negative integers. We
give it a long name (unlike Unat and Upos) because it is rarely
used.
Bob Duff [Wed, 20 Oct 2021 20:55:38 +0000 (16:55 -0400)]
[Ada] Fix bugs in Base_Type_Only (etc.) fields
gcc/ada/
* gen_il-gen.adb (Put_Seinfo): Generate type
Seinfo.Type_Only_Enum based on type
Gen_IL.Internals.Type_Only_Enum. Automatically generating a copy
of the type will help keep them in sync. (Note that there are
no Ada compiler packages imported into Gen_IL.) Add a Type_Only
field to Field_Descriptor, so this information is available in
the Ada compiler (as opposed to just in the Gen_IL "compiler").
(One_Comp): Add initialization of the Type_Only field of
Field_Descriptor.
* gen_il-internals.ads (Image): Image function for
Type_Only_Enum.
* atree.ads (Node_To_Fetch_From): New function to compute which
node to fetch from, based on the Type_Only aspect.
* atree.adb (Get_Field_Value): Call Node_To_Fetch_From.
* treepr.adb (Print_Entity_Field): Call Node_To_Fetch_From.
(Print_Node_Field): Assert.
* sinfo-utils.adb (Walk_Sinfo_Fields,
Walk_Sinfo_Fields_Pairwise): Asserts.
Steve Baird [Fri, 15 Oct 2021 22:23:34 +0000 (15:23 -0700)]
[Ada] Relax INOX restrictions when casing on composite value.
gcc/ada/
* sem_case.adb (Composite_Case_Ops.Box_Value_Required): A new
function which takes a component type and returns a Boolean.
Returns True for the cases which were formerly forbidden as
components (these checks were formerly performed in the
now-deleted procedure
Check_Composite_Case_Selector.Check_Component_Subtype).
(Composite_Case_Ops.Normalized_Case_Expr_Type): Hoist this
function out of the Array_Case_Ops package because it has been
generalized to also do the analogous thing in the case of a
discriminated type.
(Composite_Case_Ops.Scalar_Part_Count): Return 0 if
Box_Value_Required returns True for the given type/subtype.
(Composite_Case_Ops.Choice_Analysis.Choice_Analysis.Component_Bounds_Info.
Traverse_Discrete_Parts): Return without doing anything if
Box_Value_Required returns True for the given type/subtype.
(Composite_Case_Ops.Choice_Analysis.Parse_Choice.Traverse_Choice):
If Box_Value_Required yields True for a given component type,
then check that the value of that component in a choice
expression is indeed a box (in which case the component is
ignored).
* doc/gnat_rm/implementation_defined_pragmas.rst: Update
documentation.
* gnat_rm.texi: Regenerate.
Bob Duff [Mon, 6 Sep 2021 17:01:04 +0000 (13:01 -0400)]
[Ada] Make Declaration_Node return nondeclarations in fewer cases
gcc/ada/
* einfo-utils.adb (Declaration_Node): Avoid returning the
following node kinds: N_Assignment_Statement, N_Integer_Literal,
N_Procedure_Call_Statement, N_Subtype_Indication, and
N_Type_Conversion. Assert that the result is in N_Is_Decl or
empty.
* gen_il-gen-gen_nodes.adb (N_Is_Decl): Modify to match the
things that Declaration_Node can return.
Piotr Trojanek [Tue, 15 Jun 2021 21:32:51 +0000 (23:32 +0200)]
[Ada] Reference in Unbounded_String is almost never null
gcc/ada/
* libgnat/a-strunb.ads (Unbounded_String): Reference is never
null.
* libgnat/a-strunb.adb (Finalize): Copy reference while it needs
to be deallocated.
Javier Miranda [Sat, 4 Sep 2021 17:11:34 +0000 (13:11 -0400)]
[Ada] Ada 2022: Class-wide types and formal abstract subprograms
gcc/ada/
* sem_ch8.adb (Build_Class_Wide_Wrapper): Previous version split
in two subprograms to factorize its functionality:
Find_Suitable_Candidate, and Build_Class_Wide_Wrapper. These
routines are also placed in the new subprogram
Handle_Instance_With_Class_Wide_Type.
(Handle_Instance_With_Class_Wide_Type): New subprogram that
encapsulates all the code that handles instantiations with
class-wide types.
(Analyze_Subprogram_Renaming): Adjust code to invoke the new
nested subprogram Handle_Instance_With_Class_Wide_Type; adjust
documentation.
Bob Duff [Fri, 10 Sep 2021 15:18:47 +0000 (11:18 -0400)]
[Ada] Renamed_Or_Alias cleanup
gcc/ada/
* einfo-utils.ads, einfo-utils.adb (Alias, Set_Alias,
Renamed_Entity, Set_Renamed_Entity, Renamed_Object,
Set_Renamed_Object): Add assertions that reflect how these are
supposed to be used and what they are supposed to return.
(Renamed_Entity_Or_Object): New getter.
(Set_Renamed_Object_Of_Possibly_Void): Setter that allows N to
be E_Void.
* checks.adb (Ensure_Valid): Use Renamed_Entity_Or_Object
because this is called for both cases.
* exp_dbug.adb (Debug_Renaming_Declaration): Use
Renamed_Entity_Or_Object because this is called for both cases.
Add assertions.
* exp_util.adb (Possible_Bit_Aligned_Component): Likewise.
* freeze.adb (Freeze_All_Ent): Likewise.
* sem_ch5.adb (Within_Function): Likewise.
* exp_attr.adb (Calculate_Header_Size): Call Renamed_Entity
instead of Renamed_Object.
* exp_ch11.adb (Expand_N_Raise_Statement): Likewise.
* repinfo.adb (Find_Declaration): Likewise.
* sem_ch10.adb (Same_Unit, Process_Spec_Clauses,
Analyze_With_Clause, Install_Parents): Likewise.
* sem_ch12.adb (Build_Local_Package, Needs_Body_Instantiated,
Build_Subprogram_Renaming, Check_Formal_Package_Instance,
Check_Generic_Actuals, In_Enclosing_Instance,
Denotes_Formal_Package, Process_Nested_Formal,
Check_Initialized_Types, Map_Formal_Package_Entities,
Restore_Nested_Formal): Likewise.
* sem_ch6.adb (Report_Conflict): Likewise.
* sem_ch8.adb (Analyze_Exception_Renaming,
Analyze_Generic_Renaming, Analyze_Package_Renaming,
Is_Primitive_Operator_In_Use, Declared_In_Actual,
Note_Redundant_Use): Likewise.
* sem_warn.adb (Find_Package_Renaming): Likewise.
* sem_elab.adb (Ultimate_Variable): Call Renamed_Object instead
of Renamed_Entity.
* exp_ch6.adb (Get_Function_Id): Call
Set_Renamed_Object_Of_Possibly_Void, because the defining
identifier is still E_Void at this point.
* sem_util.adb (Function_Call_Or_Allocator_Level): Likewise.
Remove redundant (unreachable) code.
(Is_Object_Renaming, Is_Valid_Renaming): Call Renamed_Object
instead of Renamed_Entity.
(Get_Fullest_View): Call Renamed_Entity instead of
Renamed_Object.
(Copy_Node_With_Replacement): Call
Set_Renamed_Object_Of_Possibly_Void because the defining entity
is sometimes E_Void.
* exp_ch5.adb (Expand_N_Assignment_Statement): Protect a call to
Renamed_Object with Is_Object to avoid assertion failure.
* einfo.ads: Minor comment fixes.
* inline.adb: Minor comment fixes.
* tbuild.ads: Minor comment fixes.
Yannick Moy [Fri, 15 Oct 2021 13:06:34 +0000 (15:06 +0200)]
[Ada] Issue error on invalid use of Ghost inside pragma Predicate
gcc/ada/
* sem_ch13.adb (Freeze_Entity_Checks): Perform same check on
predicate expression inside pragma as inside aspect.
* sem_util.adb (Is_Current_Instance): Recognize possible
occurrence of subtype as current instance inside the pragma
Predicate.
Martin Jambor [Mon, 25 Oct 2021 13:22:06 +0000 (15:22 +0200)]
sra: Fix the fix for PR 102505 (PR 102886)
I was not careful with the fix for PR 102505 and did not craft the
check to satisfy the verifier carefully, which led to PR 102886.
(The verifier has the test structured differently and somewhat
redundantly, so I could not just copy it).
This patch fixes it. I hope it is a quite obvious correction of an
oversight and so will commit it if it survives bootstrap and testing on
x86_64-linux and ppc64le-linux.
Testcase for this bug is gcc.dg/tree-ssa/sra-18.c (but only on
platforms with constant pools). I will backport the two fixes
to the release branches squashed.
gcc/ChangeLog:
2021-10-22 Martin Jambor <mjambor@suse.cz>
PR tree-optimization/102886
* tree-sra.c (totally_scalarize_subtree): Fix the out of
access-condition.
Just like PR 100382, here we have a DCE removing a
null pointer load which is needed still.
In this case, execute_fixup_cfg removes a store (correctly)
and then removes the null load (incorrectly) due to
not checking stmt_unremovable_because_of_non_call_eh_p.
This patch adds the check in the similar way as the patch
to fix PR 100382 did.
gcc/ChangeLog:
* tree-ssa-dce.c (simple_dce_from_worklist):
Check stmt_unremovable_because_of_non_call_eh_p also
before removing the statement.
Richard Biener [Mon, 25 Oct 2021 09:33:10 +0000 (11:33 +0200)]
tree-optimization/102905 - restore re-align load for alignment peeling
Previous refactoring made the possibility of considering re-aligned
loads for unlimited cost model alignment peeling difficult so I
ditched that. Later refactoring made it easily possible again so
the following patch re-instantiates this which should fix the
observed regression on powerpc with altivec.
2021-10-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/102905
* tree-vect-data-refs.c (vect_enhance_data_refs_alignment):
Use vect_supportable_dr_alignment again to determine whether
an access is supported when not aligned.
Richard Biener [Mon, 25 Oct 2021 07:33:15 +0000 (09:33 +0200)]
tree-optimization/102920 - fix PHI VN with undefined args
This fixes a latent issue exposed by now allowing VN_TOP in PHI
arguments. We may only use optimistic equality when merging values on
different edges, not when merging values on the same edge - in particular
we may not choose the undef value on any edge when there's a not undef
value as well.
2021-10-25 Richard Biener <rguenther@suse.de>
PR tree-optimization/102920
* tree-ssa-sccvn.h (expressions_equal_p): Add argument
controlling VN_TOP matching behavior.
* tree-ssa-sccvn.c (expressions_equal_p): Likewise.
(vn_phi_eq): Do not optimistically match VN_TOP.
konglin1 [Tue, 19 Oct 2021 01:35:30 +0000 (09:35 +0800)]
Combine the FADD(A, FMA(B, C, 0)) to FMA(B, C, A) and combine FADD(A, FMUL(B, C)) to FMA(B, C, A).
This patch adds support, under fast-math, for transforming something like
_mm512_add_ph(x1, _mm512_fmadd_pch(a, b, _mm512_setzero_ph())) to
_mm512_fmadd_pch(a, b, x1).
It also supports transforming _mm512_add_ph(x1, _mm512_fmul_pch(a, b))
to _mm512_fmadd_pch(a, b, x1).
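The algebra behind the transform can be sketched on a single complex value in plain C. The `cx` struct and all function names here are hypothetical stand-ins for the vector intrinsics, not their implementation. For inputs that are exactly representable, both forms give the same result; in general the fused form may round differently, which is presumably why the transform is tied to fast-math.

```c
/* Hypothetical scalar stand-in for one complex element.  */
typedef struct { float re, im; } cx;

static cx
cx_add (cx a, cx b)
{
  return (cx){ a.re + b.re, a.im + b.im };
}

static cx
cx_mul (cx a, cx b)
{
  return (cx){ a.re * b.re - a.im * b.im, a.re * b.im + a.im * b.re };
}

/* Complex multiply-add: a * b + c.  */
static cx
cx_fma (cx a, cx b, cx c)
{
  return (cx){ a.re * b.re - a.im * b.im + c.re,
	       a.re * b.im + a.im * b.re + c.im };
}

/* Original form: FADD (x, FMUL (a, b)).  */
cx
add_of_mul (cx x, cx a, cx b)
{
  return cx_add (x, cx_mul (a, b));
}

/* Combined form: FMA (a, b, x).  */
cx
fused_form (cx x, cx a, cx b)
{
  return cx_fma (a, b, x);
}
```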
gcc/ChangeLog:
* config/i386/sse.md (fma_<mode>_fadd_fmul): Add new
define_insn_and_split.
(fma_<mode>_fadd_fcmul): Likewise.
(fma_<complexopname>_<mode>_fma_zero): Likewise.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512fp16-complex-fma.c: New test.
Revise -mdisable-fpregs option and add new -msoft-mult option
The behavior of the -mdisable-fpregs is confusing in that it doesn't
disable the use of the floating-point registers in all situations.
The -msoft-float disables the use of the floating-point registers in
all situations. The Linux kernel only needs to disable use of the
xmpyu instruction to avoid using the floating-point registers.
This change revises the -mdisable-fpregs option to disable the use of
the floating-point registers in all situations. It is now equivalent
to the -msoft-float option. A new -msoft-mult option is added to
disable use of the xmpyu instruction. The libgcc library can be
compiled with the -msoft-mult option to avoid using hardware integer
multiplication.
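To illustrate what multiplying without a hardware multiply instruction can look like, here is a generic shift-and-add routine. It is a sketch only: `soft_mul32` is a hypothetical name, and this is not claimed to be the routine libgcc uses on this target.

```c
#include <stdint.h>

/* Shift-and-add multiplication: add a shifted copy of A for every set
   bit of B.  The result wraps modulo 2^32, like ordinary unsigned
   multiplication.  */
uint32_t
soft_mul32 (uint32_t a, uint32_t b)
{
  uint32_t result = 0;
  while (b != 0)
    {
      if (b & 1)
	result += a;
      a <<= 1;
      b >>= 1;
    }
  return result;
}
```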
2021-10-24 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
* config/pa/pa-d.c (pa_d_handle_target_float_abi): Don't check
TARGET_DISABLE_FPREGS.
* config/pa/pa.c (fix_range): Use MASK_SOFT_FLOAT instead of
MASK_DISABLE_FPREGS.
(hppa_rtx_costs): Don't check TARGET_DISABLE_FPREGS. Adjust
cost of hardware integer multiplication.
(pa_conditional_register_usage): Don't check TARGET_DISABLE_FPREGS.
* config/pa/pa.h (INT14_OK_STRICT): Likewise.
* config/pa/pa.md: Don't check TARGET_DISABLE_FPREGS. Check
TARGET_SOFT_FLOAT in patterns that use xmpyu instruction.
* config/pa/pa.opt (mdisable-fpregs): Change target mask to
SOFT_FLOAT. Revise comment.
(msoft-mult): New option.
This patch cures the testsuite failure of bfin/20090914-3.c, which
currently FAILs on bfin-elf with "(test for excess errors)" due to:
20090914-3.c:3:1: warning: return type defaults to 'int' [-Wimplicit-int]
which is obviously not what this code was intended to test. Fixed by
turning the code into a function returning the final "fract32" result,
as simply specifying an "int" return type for main, results in the
entire function being optimized away, as the result is unused.
2021-10-24 Roger Sayle <roger@nextmovesoftware.com>
gcc/testsuite/ChangeLog
* gcc.target/bfin/20090914-3.c: Tweak test case.
Roger Sayle [Sat, 23 Oct 2021 09:06:06 +0000 (10:06 +0100)]
x86_64: Add insn patterns for V1TI mode logic operations.
On x86_64, V1TI mode holds a 128-bit integer value in a (vector) SSE
register (where regular TI mode uses a pair of 64-bit general purpose
scalar registers). This patch improves the implementation of AND, IOR,
XOR and NOT on these values.
The benefit is demonstrated by a simple test program that applies each
of these operations to V1TI values. With this patch we now generate the
much more efficient:
and: pand %xmm1, %xmm0
ret
ior: por %xmm1, %xmm0
ret
xor: pxor %xmm1, %xmm0
ret
not: pcmpeqd %xmm1, %xmm1
pxor %xmm1, %xmm0
ret
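A minimal sketch of the kind of test program that exercises these
patterns, assuming GCC's generic vector extension on a 64-bit target
that provides __int128. The typedef name, the function names and the
self-check are hypothetical and not taken from the patch:

```c
#include <assert.h>

/* Assumption: a single 128-bit integer carried in a 16-byte vector,
   which GCC represents in V1TI mode on x86_64.  */
typedef unsigned __int128 v1ti __attribute__ ((__vector_size__ (16)));

/* One function per logic operation, so each can be inspected in the
   generated assembly.  */
v1ti v1ti_and (v1ti x, v1ti y) { return x & y; }
v1ti v1ti_ior (v1ti x, v1ti y) { return x | y; }
v1ti v1ti_xor (v1ti x, v1ti y) { return x ^ y; }
v1ti v1ti_not (v1ti x) { return ~x; }

/* Run-time sanity check of the four operations on small values.  */
void
v1ti_self_check (void)
{
  v1ti x = { 0xF0F0 };
  v1ti y = { 0x0FF0 };
  v1ti r;

  r = v1ti_and (x, y);
  assert (r[0] == 0x00F0);
  r = v1ti_ior (x, y);
  assert (r[0] == 0xFFF0);
  r = v1ti_xor (x, y);
  assert (r[0] == 0xFF00);
  r = v1ti_not (x);
  assert (r[0] == ~(unsigned __int128) 0xF0F0);
}
```

Compiling such a file with -O2 -S on x86_64 and reading the assembly is
one way to compare the code generated before and after the patch.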
For my first few attempts at this patch I tried adding V1TI to the
existing VI and VI12_AVX_512F mode iterators, but these then have
dependencies on other iterators (and attributes), and so on until
everything ties itself into a knot, as V1TI mode isn't really a
first-class vector mode on x86_64. Hence I ultimately opted to use
simple stand-alone patterns (as used by the existing TF mode support).
2021-10-23 Roger Sayle <roger@nextmovesoftware.com>
gcc/ChangeLog
* config/i386/sse.md (<any_logic>v1ti3): New define_insn to
implement V1TImode AND, IOR and XOR on TARGET_SSE2 (and above).
(one_cmplv1ti2): New define_expand.
gcc/testsuite/ChangeLog
* gcc.target/i386/sse2-v1ti-logic.c: New test case.
* gcc.target/i386/sse2-v1ti-logic-2.c: New test case.
Stafford Horne [Thu, 21 Oct 2021 13:11:27 +0000 (22:11 +0900)]
or1k: Update FPU to specify detect tininess before rounding
This was not defined in the spec and not consistent in the
implementation, causing inconsistent behavior. After review we have
updated the CPU implementations and proposed that the spec be updated to
specify that FPU tininess checks detect tininess before rounding.
Architecture change draft:
https://openrisc.io/proposals/p18-fpu-tininess
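The difference between the two detection points can be sketched in
software. The following is an illustration only, not code from libgcc:
it models IEEE binary32 with a double holding the exact result, and all
function names are hypothetical. A value just below FLT_MIN that rounds
up to FLT_MIN is tiny before rounding but not after.

```c
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* Round a positive double to 24 significant bits (binary32 precision)
   with an unbounded exponent range, round-to-nearest, ties-to-even.  */
static double
round_to_float_precision (double x)
{
  int e;
  double m = frexp (x, &e);          /* x = m * 2^e, 0.5 <= m < 1 */
  double scaled = ldexp (m, 24);     /* 2^23 <= scaled < 2^24 */
  long long q = (long long) scaled;  /* integer part */
  double frac = scaled - (double) q; /* discarded fraction, exact */
  if (frac > 0.5 || (frac == 0.5 && (q & 1)))
    q++;
  return ldexp ((double) q, e - 24);
}

/* Tininess before rounding: the exact result is nonzero and smaller
   in magnitude than the smallest normal number.  */
bool
tiny_before_rounding (double exact)
{
  double a = exact < 0 ? -exact : exact;
  return a != 0 && a < FLT_MIN;
}

/* Tininess after rounding: the result rounded to the target precision,
   as if the exponent range were unbounded, is still below the smallest
   normal number.  */
bool
tiny_after_rounding (double exact)
{
  double a = exact < 0 ? -exact : exact;
  return a != 0 && round_to_float_precision (a) < FLT_MIN;
}
```

For the exact value 2^-126 - 2^-152, tiny_before_rounding returns true
and tiny_after_rounding returns false, which is the case where the two
conventions disagree.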
libgcc/ChangeLog:
* config/or1k/sfp-machine.h (_FP_TININESS_AFTER_ROUNDING):
Change to 0.
Martin Liska [Fri, 22 Oct 2021 08:12:56 +0000 (10:12 +0200)]
Handle jobserver file descriptors in btest.
PR testsuite/102742
libbacktrace/ChangeLog:
* btest.c (MIN_DESCRIPTOR): New.
(MAX_DESCRIPTOR): Likewise.
(check_available_files): Likewise.
(check_open_files): Check only file descriptors that
were not available at the entry.
(main): Call check_available_files.
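A rough sketch of the approach, assuming POSIX fstat and close. The
function and macro names follow the ChangeLog above, but the bodies,
the descriptor range 3..9 and the return value are assumptions made for
illustration, not the code in btest.c:

```c
#include <sys/stat.h>
#include <unistd.h>

/* Assumption: the range of descriptors to inspect; these values are
   illustrative only.  */
#define MIN_DESCRIPTOR 3
#define MAX_DESCRIPTOR 10

/* fstat result for each descriptor at program entry; 0 means the
   descriptor was already open, for example a jobserver pipe inherited
   from make.  */
static int fstat_at_entry[MAX_DESCRIPTOR];

/* Record which descriptors are already open when the program starts.  */
void
check_available_files (void)
{
  struct stat s;
  for (int fd = MIN_DESCRIPTOR; fd < MAX_DESCRIPTOR; fd++)
    fstat_at_entry[fd] = fstat (fd, &s);
}

/* Close and count descriptors that are open now but were not open at
   entry.  Descriptors inherited from the parent are left alone.  */
int
check_open_files (void)
{
  int leaked = 0;
  for (int fd = MIN_DESCRIPTOR; fd < MAX_DESCRIPTOR; fd++)
    if (fstat_at_entry[fd] != 0 && close (fd) == 0)
      leaked++;
  return leaked;
}
```

With this shape, a test run under make -jN no longer reports the
jobserver descriptors as leaks, because they are already open when
check_available_files takes its snapshot.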
Aldy Hernandez [Tue, 19 Oct 2021 18:57:49 +0000 (20:57 +0200)]
Disregard incoming equivalences to a path when defining a new one.
The equivalence oracle creates a new equiv set at each def point,
killing any incoming equivalences. However, in the path-sensitive
oracle we create brand new equivalences at each PHI:
BB4:
BB8:
x_5 = PHI <y_8(4)>
Here we note that x_5 == y_8 at the end of the path.
The current code is intersecting this new equivalence with previously
known equivalences coming into the path. This is incorrect, as this
is a new definition. This patch kills any known equivalence before we
register a new one.
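The kill-then-register order can be illustrated with a toy model. This
is not GCC's data structure: SSA names are small integers, each
equivalence class is a bitmask, and every identifier except the idea of
a "killing def" is hypothetical.

```c
#include <stdbool.h>

/* Toy model: equiv[n] is a bitmask of the names currently known to be
   equal to name n, always including n itself.  */
enum { MAX_NAMES = 32 };
static unsigned equiv[MAX_NAMES];

/* Start with every name equivalent only to itself.  */
void
init_equivs (void)
{
  for (int i = 0; i < MAX_NAMES; i++)
    equiv[i] = 1u << i;
}

/* Record A == B by merging their equivalence classes.  */
void
register_equiv (int a, int b)
{
  unsigned merged = equiv[a] | equiv[b];
  for (int i = 0; i < MAX_NAMES; i++)
    if (merged & (1u << i))
      equiv[i] = merged;
}

/* NAME gets a new definition: drop it from every class it was in.  */
void
killing_def (int name)
{
  unsigned bit = 1u << name;
  for (int i = 0; i < MAX_NAMES; i++)
    equiv[i] &= ~bit;
  equiv[name] = bit;
}

/* Model of "x = PHI <y>" on a path: kill what was known about X on
   entry to the path, then register the new equivalence.  */
void
register_phi_equiv (int x, int y)
{
  killing_def (x);
  register_equiv (x, y);
}

/* Return true if A is currently known to be equal to B.  */
bool
equiv_p (int a, int b)
{
  return (equiv[a] & (1u << b)) != 0;
}
```

If x_5 == z_3 was known on entry to the path, then after
register_phi_equiv (5, 8) the model reports x_5 == y_8 and no longer
reports x_5 == z_3.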
This hasn't caused problems so far, but upcoming changes to the
pipeline have us threading more aggressively and triggering corner
cases where this causes incorrect code.
I have tested this patch with the usual regstrap cycle. I have also
hacked a compiler comparing the old and new behavior to see if we were
previously threading paths where the decision was made due to invalid
equivalences. Luckily, there were no such paths, but there were 22
paths in a set of .ii files where disregarding incoming relations
allowed us to thread the path. This is a minuscule improvement,
but we moved a handful of threadable paths earlier in the pipeline,
which is always good.
Tested on x86-64 Linux.
Co-authored-by: Andrew MacLeod <amacleod@redhat.com>
gcc/ChangeLog:
* gimple-range-path.cc (path_range_query::compute_phi_relations):
Kill any global relations we may know before registering a new
one.
* value-relation.cc (path_oracle::killing_def): New.
* value-relation.h (path_oracle::killing_def): New.