Andreas Krebbel [Wed, 22 Sep 2021 07:32:21 +0000 (09:32 +0200)]
IBM Z: Fix PR102222
Avoid emitting a strict low part move if the insv target actually
affects the whole target reg.
gcc/ChangeLog:
PR target/102222
* config/s390/s390.c (s390_expand_insv): Emit a normal move if it
is actually a full copy of the source operand into the target.
Don't emit a strict low part move if source and target mode match.
The removal of Cilk Plus support in r8-4956 failed to remove
the streaming out of the bit; instead, just change the value
streamed out to be always false.
By hacking fp_expression_p to always return true, I can see
it reads the wrong fp_expressions value (false) out in wpa.
We cannot use r12 here, it is already in use as the GEP (for sibling
calls).
2021-09-08 Segher Boessenkool <segher@kernel.crashing.org>
PR target/102107
* config/rs6000/rs6000-logue.c (rs6000_emit_epilogue): For ELFv2 use
r11 instead of r12 for restoring CR.
rs6000: Don't use r12 for CR save on ELFv2 (PR102107)
CR is saved and/or restored on some paths where GPR12 is already live
since it has a meaning in the calling convention in the ELFv2 ABI.
It is not completely clear to me that we can always use r11 here, but
it does seem safe, there is checking code (to detect conflicts here),
and it is stage 1. So here goes.
Harald Anlauf [Fri, 17 Sep 2021 19:45:33 +0000 (21:45 +0200)]
Fortran - (large) arrays in the main shall be static
gcc/fortran/ChangeLog:
PR fortran/102366
* trans-decl.c (gfc_finish_var_decl): Disable the warning message
for variables moved from stack to static storage if they are
declared in the main, but allow the move to happen.
gcc/testsuite/ChangeLog:
PR fortran/102366
* gfortran.dg/pr102366.f90: New test.
Eric Botcazou [Tue, 21 Sep 2021 07:25:47 +0000 (09:25 +0200)]
Fix no_fsanitize_address effective target
The implementation of the no_fsanitize_address effective target was copied
from asan-dg.exp without realizing that it does not work outside of this
context (there is a comment explaining why). As a consequence, it always
returns 0, so for example the directive in gnat.dg/asan1.adb:
{ dg-skip-if "no address sanitizer" { no_fsanitize_address } }
does not work. This led some people to add the nonsensical:
GCC11 - Fortran: combined directives - order(concurrent) not on distribute
While OpenMP 5.1 and GCC 12 permit 'order(concurrent)' on distribute,
OpenMP 5.0 and GCC 11 don't. This patch for GCC 11 ensures the clause also
does not end up on 'distribute' when splitting combined directives.
gcc/fortran/ChangeLog:
* trans-openmp.c (gfc_split_omp_clauses): Don't put 'order(concurrent)'
on 'distribute' for combined directives, matching OpenMP 5.0
gcc/testsuite/ChangeLog:
* gfortran.dg/gomp/distribute-order-concurrent.f90: New test.
Harald Anlauf [Thu, 16 Sep 2021 18:12:21 +0000 (20:12 +0200)]
Fortran - fix handling of optional allocatable DT arguments with INTENT(OUT)
gcc/fortran/ChangeLog:
PR fortran/102287
* trans-expr.c (gfc_conv_procedure_call): Wrap deallocation of
allocatable components of optional allocatable derived type
procedure arguments with INTENT(OUT) into a presence check.
gcc/testsuite/ChangeLog:
PR fortran/102287
* gfortran.dg/intent_out_14.f90: New test.
Eric Botcazou [Fri, 17 Sep 2021 08:12:12 +0000 (10:12 +0200)]
Fix PR rtl-optimization/102306
This is a duplication of volatile loads introduced during GCC 9 development
by the 2->2 mechanism of the RTL combiner. There is already substantial
checking for volatile references in can_combine_p but it implicitly assumes
that the combination reduces the number of instructions, which is of course
not the case here. So the fix teaches try_combine to abort the combination
when it is about to make a copy of volatile references to preserve them.
gcc/
PR rtl-optimization/102306
* combine.c (try_combine): Abort the combination if we are about to
duplicate volatile references.
gcc/testsuite/
* gcc.target/sparc/20210917-1.c: New test.
Harald Anlauf [Mon, 13 Sep 2021 17:28:10 +0000 (19:28 +0200)]
Fortran - ensure simplification of bounds of array-valued named constants
gcc/fortran/ChangeLog:
PR fortran/82314
* decl.c (add_init_expr_to_sym): For proper initialization of
array-valued named constants the array bounds need to be
simplified before adding the initializer.
gcc/testsuite/ChangeLog:
PR fortran/82314
* gfortran.dg/pr82314.f90: New test.
Daniel Cederman [Mon, 25 Mar 2019 08:12:17 +0000 (09:12 +0100)]
sparc: Add scheduling information for LEON5
The LEON5 can often dual issue instructions from the same 64-bit aligned
double word if there are no data dependencies. Add scheduling information
to avoid scheduling unpairable instructions back-to-back.
gcc/ChangeLog:
* config/sparc/sparc-opts.h (enum sparc_processor_type): Add LEON5
* config/sparc/sparc.c (struct processor_costs): Add LEON5 costs
(leon5_adjust_cost): Increase cost of store with data dependency
on ALU instruction and FPU anti-dependencies.
(sparc_option_override): Add LEON5 costs
(sparc_adjust_cost): Add LEON5 cost adjustments
* config/sparc/sparc.h: Add LEON5
* config/sparc/sparc.md: Include LEON5 scheduling information
* config/sparc/sparc.opt: Add LEON5
* doc/invoke.texi: Add LEON5
* config/sparc/leon5.md: New file.
Daniel Cederman [Fri, 16 Oct 2020 07:12:30 +0000 (09:12 +0200)]
sparc: Skip all empty assembly statements
This version detects multiple empty assembly statements in a row and also
detects non-memory barrier empty assembly statements (__asm__("")). It
can be used instead of next_active_insn().
gcc/ChangeLog:
* config/sparc/sparc.c (next_active_non_empty_insn): New function
that returns next active non empty assembly instruction.
(sparc_do_work_around_errata): Use new function.
Daniel Cederman [Fri, 25 Sep 2020 11:17:46 +0000 (13:17 +0200)]
sparc: Treat more instructions as load or store in errata workarounds
Check the attribute of the instruction to determine whether it performs a store
or load operation. This more generic approach sees the last instruction
in the GOTdata_op model as a potential load and treats the memory barrier
as a potential store instruction.
gcc/ChangeLog:
* config/sparc/sparc.c (store_insn_p): Add predicate for store
attributes.
(load_insn_p): Add predicate for load attributes.
(sparc_do_work_around_errata): Use new predicates.
Andrew Pinski [Tue, 31 Aug 2021 04:41:14 +0000 (04:41 +0000)]
Fix target/101934: aarch64 memset code creates unaligned stores for -mstrict-align
The problem here is that the aarch64_expand_setmem code did not check
STRICT_ALIGNMENT when creating an overlapping store.
This patch adds that check and the testcase works.
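To make the term "overlapping store" concrete, here is a rough sketch in portable C++. It does not mirror aarch64_expand_setmem; the function name set_small_block and the strict_align parameter are made up for illustration, and memcpy merely stands in for an 8-byte store instruction.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: fill n bytes (8 <= n <= 16) at dst with the byte c.
// Without strict alignment, the tail is covered by a second 8-byte store
// placed at dst + n - 8, which may overlap the first store and may be
// unaligned.  With strict alignment, that store is avoided and the tail is
// written byte by byte instead.
inline void set_small_block(unsigned char *dst, unsigned char c,
                            std::size_t n, bool strict_align)
{
  // Replicate the byte into all eight lanes of a 64-bit pattern.
  std::uint64_t pattern = 0x0101010101010101ull * c;

  std::memcpy(dst, &pattern, 8);            // first 8-byte store at dst
  if (!strict_align)
    std::memcpy(dst + n - 8, &pattern, 8);  // overlapping, possibly unaligned
  else
    for (std::size_t i = 8; i < n; ++i)     // tail without an unaligned store
      dst[i] = c;
}
```

Both paths leave the same bytes in memory; only the kind of stores used differs, which is what matters on a strict-alignment target.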
gcc/ChangeLog:
PR target/101934
* config/aarch64/aarch64.c (aarch64_expand_setmem):
Check STRICT_ALIGNMENT before creating an overlapping
store.
gcc/testsuite/ChangeLog:
PR target/101934
* gcc.target/aarch64/memset-strict-align-1.c: New test.
Jakub Jelinek [Wed, 15 Sep 2021 20:21:17 +0000 (22:21 +0200)]
c++: Fix handling of decls with flexible array members initialized with side-effects [PR88578]
> > Note, if the flexible array member is initialized only with non-constant
> > initializers, we have a worse bug that this patch doesn't solve, the
> > splitting of initializers into constant and dynamic initialization removes
> > the initializer and we don't have just wrong DECL_*SIZE, but nothing is
> > emitted when emitting those vars into assembly either and so the dynamic
> > initialization clobbers other vars that may overlap the variable.
> > I think we need to keep an empty CONSTRUCTOR elt in DECL_INITIAL for the
> > flexible array member in that case.
>
> Makes sense.
So, the following patch fixes that.
The typeck2.c change makes sure we keep those CONSTRUCTORs around (although
they should be empty, because any elts removed earlier either had
side-effects or were non-constant), and the varasm.c change is to avoid
ICEs on those as well as ICEs on other flex array members that had some
initializers without side-effects, but not on the last array element.
The code was already asserting that the (index of the last elt in the
CONSTRUCTOR + 1) times elt size is equal to TYPE_SIZE_UNIT of the local->val
type, which is true for C flex arrays, or for C++ if they don't have any
side-effects or the last elt doesn't have side-effects. This patch changes
that to an assertion that TYPE_SIZE_UNIT is greater than or equal to the
offset of the end of the last element in the CONSTRUCTOR, and uses
TYPE_SIZE_UNIT (int_size_in_bytes) in the code later on.
2021-09-15 Jakub Jelinek <jakub@redhat.com>
PR c++/88578
PR c++/102295
gcc/
* varasm.c (output_constructor_regular_field): Instead of asserting
that the array_size_for_constructor result is equal to the size of
TREE_TYPE (local->val) in bytes, assert that the type size is greater
than or equal to the array_size_for_constructor result and use the
type size as fieldsize.
gcc/cp/
* typeck2.c (split_nonconstant_init_1): Don't throw away empty
initializers of flexible array members if they have non-zero type
size.
gcc/testsuite/
* g++.dg/ext/flexary39.C: New test.
* g++.dg/ext/flexary40.C: New test.
Jakub Jelinek [Tue, 14 Sep 2021 14:56:30 +0000 (16:56 +0200)]
c++: Update DECL_*SIZE for objects with flexible array members with initializers [PR102295]
The C FE has updated DECL_*SIZE for vars which have initializers for flexible
array members for many years, but the C++ FE kept DECL_*SIZE the same as the
type size (i.e. as if there were zero elements in the flexible array
member). This results e.g. in ELF symbol sizes being too small.
Note, if the flexible array member is initialized only with non-constant
initializers, we have a worse bug that this patch doesn't solve, the
splitting of initializers into constant and dynamic initialization removes
the initializer and we don't have just wrong DECL_*SIZE, but nothing is
emitted when emitting those vars into assembly either and so the dynamic
initialization clobbers other vars that may overlap the variable.
I think we need to keep an empty CONSTRUCTOR elt in DECL_INITIAL for the
flexible array member in that case.
2021-09-14 Jakub Jelinek <jakub@redhat.com>
PR c++/102295
* decl.c (layout_var_decl): For aggregates ending with a flexible
array member, add the size of the initializer for that member to
DECL_SIZE and DECL_SIZE_UNIT.
Jakub Jelinek [Tue, 14 Sep 2021 14:55:04 +0000 (16:55 +0200)]
c++: Fix __is_*constructible/assignable for templates [PR102305]
is_xible_helper returns error_mark_node (i.e. false from the traits)
for abstract classes by testing ABSTRACT_CLASS_TYPE_P (to) early.
Unfortunately, as the testcase shows, that doesn't work on class templates
that haven't been instantiated yet, ABSTRACT_CLASS_TYPE_P for them is false
until it is instantiated, which is done when the routine later constructs
a dummy object with that type.
The following patch fixes this by calling complete_type first, so that
ABSTRACT_CLASS_TYPE_P test will work properly, while keeping the handling
of arrays with unknown bounds, or incomplete types where it is done
currently.
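As a small illustration of the trait results one would expect for an abstract class template, consider the sketch below. The names Shape and Base are hypothetical and not taken from the testcase. The sketch deliberately queries std::is_abstract first, which needs a complete type and therefore instantiates Shape<int>, so it also behaves on compilers without this fix.

```cpp
#include <type_traits>

// Hypothetical class template with a pure virtual member: every
// specialization is an abstract class once it has been instantiated.
template <typename T>
struct Shape
{
  virtual ~Shape() = default;
  virtual T area() const = 0;
};

// A non-template abstract class for comparison.
struct Base
{
  virtual ~Base() = default;
  virtual int id() const = 0;
};

// std::is_abstract requires a complete type, so evaluating it instantiates
// Shape<int>.
constexpr bool shape_is_abstract = std::is_abstract<Shape<int>>::value;

// An abstract class cannot be constructed, so the trait should be false for
// both the template specialization and the plain class.
constexpr bool shape_is_constructible =
    std::is_constructible<Shape<int>>::value;
constexpr bool base_is_constructible = std::is_constructible<Base>::value;
```

With the fix described above, the std::is_constructible query is expected to give the same answer even when nothing else has instantiated the specialization beforehand.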
2021-09-14 Jakub Jelinek <jakub@redhat.com>
PR c++/102305
* method.c (is_xible_helper): Call complete_type on to.
Peter Bergner [Wed, 14 Jul 2021 23:27:02 +0000 (18:27 -0500)]
rs6000: Generate an lxvp instead of two adjacent lxv instructions
The MMA build built-ins currently use individual lxv instructions to
load up the registers of a __vector_pair or __vector_quad. If the
memory addresses of the built-in operands refer to adjacent locations,
then we can use an lxvp in some cases to load up two registers at once.
The patch below adds support for checking whether memory addresses are
adjacent and emitting an lxvp instead of two lxv instructions.
2021-07-14 Peter Bergner <bergner@linux.ibm.com>
gcc/
* config/rs6000/rs6000.c (adjacent_mem_locations): Return the lower
addressed memory rtx, if any.
(rs6000_split_multireg_move): Fix code formatting.
Handle MMA build built-ins with operands in adjacent memory locations.
gcc/testsuite/
* gcc.target/powerpc/mma-builtin-9.c: New test.
Eric Botcazou [Tue, 14 Sep 2021 09:33:05 +0000 (11:33 +0200)]
Fix PR ada/101970
This is a regression present on the mainline and 11 branch in the form of an
ICE for an enumeration type with a full signed representation for its size.
gcc/ada/
PR ada/101970
* exp_attr.adb (Expand_N_Attribute_Reference) <Attribute_Enum_Rep>:
Use an unchecked conversion instead of a regular conversion in the
enumeration case and remove Conversion_OK flag in the integer case.
<Attribute_Pos>: Remove superfluous test.
Eric Botcazou [Tue, 14 Sep 2021 07:41:36 +0000 (09:41 +0200)]
Give more informative error message for by-reference types
Recent compilers enforce more strictly the RM C.6(18) clause, which says
that volatile record types are by-reference types. This changes the typical
error message now given in these cases.
gcc/ada/
* gcc-interface/decl.c (gnat_to_gnu_entity) <is_type>: Declare new
constant. Adjust error message issued by validate_size in the case
of by-reference types.
(validate_size): Always use the error strings passed by the caller.
Harald Anlauf [Thu, 9 Sep 2021 19:34:01 +0000 (21:34 +0200)]
Fortran - out of bounds in array constructor with implied do loop
gcc/fortran/ChangeLog:
PR fortran/98490
* trans-expr.c (gfc_conv_substring): Do not generate substring
bounds check for implied do loop index variable before it actually
becomes defined.
gcc/testsuite/ChangeLog:
PR fortran/98490
* gfortran.dg/bounds_check_23.f90: New test.
Ian Lance Taylor [Fri, 10 Sep 2021 18:14:25 +0000 (11:14 -0700)]
compiler: correct condition for calling memclrHasPointers
When compiling append(s, make([]typ, ln)...), where typ has a pointer,
and the append fits within the existing capacity of s, the condition
used to clear out the new elements was reversed.
Jakub Jelinek [Wed, 8 Sep 2021 09:25:31 +0000 (11:25 +0200)]
i386: Fix up @xorsign<mode>3_1 [PR102224]
As the testcase shows, we miscompile @xorsign<mode>3_1 if both input
operands are in the same register, because the splitter overwrites op1
with op1 & mask before using op0.
For dest = xorsign op0, op0 we can actually simplify it from
dest = (op0 & mask) ^ op0 to dest = op0 & ~mask (aka abs).
The expander change is an optimization improvement, if we at expansion
time know it is xorsign op0, op0, we can emit abs right away and get better
code through that.
The @xorsign<mode>3_1 is a fix for the case where xorsign wouldn't be known
to have same operands during expansion, but during RTL optimizations they
would appear. We need to use earlyclobber, we require dest and op1 to be
the same but op0 must be different because we overwrite
op1 first.
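The identity (op0 & mask) ^ op0 == op0 & ~mask used above can be checked on raw bit patterns. The following is only a sketch in portable C++, assuming float is an IEEE 754 binary32 value; all names (kSignMask, xorsign_bits, abs_bits and the helpers) are invented for the illustration and have nothing to do with the i386.md patterns themselves.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Sign-bit mask of an IEEE 754 binary32 value (assumed float format).
constexpr std::uint32_t kSignMask = 0x80000000u;

// xorsign on raw bits: flip the sign of x when y has its sign bit set.
constexpr std::uint32_t xorsign_bits(std::uint32_t x, std::uint32_t y)
{
  return x ^ (y & kSignMask);
}

// abs on raw bits: clear the sign bit.
constexpr std::uint32_t abs_bits(std::uint32_t x)
{
  return x & ~kSignMask;
}

// Move between a float and its bit pattern without aliasing problems.
inline std::uint32_t to_bits(float f)
{
  std::uint32_t u;
  std::memcpy(&u, &f, sizeof u);
  return u;
}

inline float from_bits(std::uint32_t u)
{
  float f;
  std::memcpy(&f, &u, sizeof f);
  return f;
}

// Float-level wrapper around the bit operation.
inline float xorsign(float x, float y)
{
  return from_bits(xorsign_bits(to_bits(x), to_bits(y)));
}
```

When both arguments are the same value, the sign bit is XORed with itself and always ends up clear, which is why xorsign op0, op0 collapses to abs.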
2021-09-08 Jakub Jelinek <jakub@redhat.com>
PR target/102224
* config/i386/i386.md (xorsign<mode>3): If operands[1] is equal to
operands[2], emit abs<mode>2 instead.
(@xorsign<mode>3_1): Add early-clobber for output operand.
* gcc.dg/pr102224.c: New test.
* gcc.target/i386/avx-pr102224.c: New test.
Joseph Myers [Wed, 8 Sep 2021 15:38:18 +0000 (15:38 +0000)]
testsuite: Allow .sdata in more cases in gcc.dg/array-quals-1.c
When testing for Nios II (gcc-testresults shows this for MIPS as
well), failures of gcc.dg/array-quals-1.c appear where a symbol was
found in .sdata rather than one of the expected sections.
FAIL: gcc.dg/array-quals-1.c scan-assembler-symbol-section symbol ^_?a$ (found a) has section ^\\.(const|rodata|srodata)|\\[RO\\] (found .sdata)
FAIL: gcc.dg/array-quals-1.c scan-assembler-symbol-section symbol ^_?b$ (found b) has section ^\\.(const|rodata|srodata)|\\[RO\\] (found .sdata)
FAIL: gcc.dg/array-quals-1.c scan-assembler-symbol-section symbol ^_?c$ (found c) has section ^\\.(const|rodata|srodata)|\\[RO\\] (found .sdata)
FAIL: gcc.dg/array-quals-1.c scan-assembler-symbol-section symbol ^_?d$ (found d) has section ^\\.(const|rodata|srodata)|\\[RO\\] (found .sdata)
Jakub's commit 0b34dbc0a24864b1674bff7a92fa3cf0f1cbcea1 allowed .sdata
for many variables in that test where use of .sdata caused a failure
on powerpc-linux. I'm presuming the choice of which variables had
.sdata allowed was based only on the code generated for powerpc-linux,
not on any reason it would be wrong to allow it for the other
variables; thus, this patch adjusts the test to allow .sdata for some
more variables where that is needed on Nios II (and in one case where
it's not needed on Nios II, but the test results on gcc-testresults
suggest that it is needed on MIPS).
Tested with no regressions with cross to nios2-elf.
* gcc.dg/array-quals-1.c: Allow .sdata section in more cases.
Joseph Myers [Wed, 8 Sep 2021 14:59:41 +0000 (14:59 +0000)]
testsuite: Use explicit -ftree-cselim in tests using -fdump-tree-cselim-details
When testing for Nios II (gcc-testresults shows this for various other
targets as well), tests scanning cselim dumps produce an UNRESOLVED
result because those dumps do not exist.
cselim is enabled conditionally by code in toplev.c:
  if (flag_tree_cselim == AUTODETECT_VALUE)
    {
      if (HAVE_conditional_move)
        flag_tree_cselim = 1;
      else
        flag_tree_cselim = 0;
    }
Add explicit -ftree-cselim to dg-options in the affected tests (as
already used by some other tests of cselim dumps) so that this dump
exists on all architectures.
Tested with no regressions with cross to nios2-elf, where this causes
the tests in question to PASS instead of being UNRESOLVED.
Max Filippov [Tue, 7 Sep 2021 22:40:00 +0000 (15:40 -0700)]
gcc: xtensa: fix PR target/102115
2021-09-07 Takayuki 'January June' Suwa <jjsuwa_sys3175@yahoo.co.jp>
gcc/
PR target/102115
* config/xtensa/xtensa.c (xtensa_emit_move_sequence): Add
'CONST_INT_P (src)' to the condition of the block that tries to
eliminate literal when loading integer constant.
Jakub Jelinek [Tue, 7 Sep 2021 17:33:28 +0000 (19:33 +0200)]
c++: Fix up constexpr evaluation of deleting dtors [PR100495]
We do not save bodies of constexpr clones and instead evaluate the bodies
of the constexpr functions they were cloned from.
I believe that is just fine for constructors because complete vs. base
ctors differ only in classes that have virtual bases and such constructors
aren't constexpr, similarly complete/base destructors.
But as the testcase below shows, for deleting destructors it is not fine,
deleting dtors while marked as clones in fact are just artificial functions
with synthetized body which calls the user destructor and deallocation.
So, either we'd need to evaluate the destructor and afterwards synthetize
and evaluate the deallocation, or we can just save and use the deleting
dtors bodies. The latter seems much easier to me.
2021-09-07 Jakub Jelinek <jakub@redhat.com>
PR c++/100495
* constexpr.c (maybe_save_constexpr_fundef): Save body even for
constexpr deleting dtors.
(cxx_eval_call_expression): Don't use DECL_CLONED_FUNCTION for
deleting dtors.
Richard Biener [Wed, 25 Aug 2021 08:06:01 +0000 (10:06 +0200)]
tree-optimization/102046 - fix SLP build from scalars with patterns
When we swap operands for SLP builds we lose track of where exactly
pattern defs are - but we fail to update the any_pattern member
of the operands info. Do so conservatively.
Richard Biener [Mon, 16 Aug 2021 13:17:08 +0000 (15:17 +0200)]
tree-optimization/101925 - fix VN with reverse storage order
This fixes value-numbering breaking reverse storage order accesses
due to a missed check. It adds a new overload for
reverse_storage_order_for_component_p and sets reversed on the
VN IL ops for component and array accesses accordingly.
It also compares the reversed reference ops flag on reference
lookup.
2021-08-16 Richard Biener <rguenther@suse.de>
PR tree-optimization/101925
* tree-ssa-sccvn.c (copy_reference_ops_from_ref): Set
reverse on COMPONENT_REF and ARRAY_REF according to
what reverse_storage_order_for_component_p does.
(vn_reference_eq): Compare reversed on reference ops.
(reverse_storage_order_for_component_p): New overload.
(vn_reference_lookup_3): Check reverse_storage_order_for_component_p
on the reference looked up.
Richard Biener [Mon, 9 Aug 2021 08:19:10 +0000 (10:19 +0200)]
middle-end/101824 - properly handle volatiles in nested fn lowering
When we build the COMPONENT_REF of a formerly volatile local off
the FRAME decl we have to make sure to mark the COMPONENT_REF
as TREE_THIS_VOLATILE. While the GIMPLE operand scanner looks
at the FIELD_DECL this is not how volatile GENERIC refs work.
2021-08-09 Richard Biener <rguenther@suse.de>
PR middle-end/101824
* tree-nested.c (get_frame_field): Mark the COMPONENT_REF as
volatile in case the variable was.
Harald Anlauf [Mon, 30 Aug 2021 20:41:01 +0000 (22:41 +0200)]
Fortran - correct check for constraint F2008:C628 / F2018:C932
gcc/fortran/ChangeLog:
PR fortran/101349
* resolve.c (resolve_allocate_expr): An unlimited polymorphic
argument to ALLOCATE must be ALLOCATABLE or a POINTER. Fix the
corresponding check.
gcc/testsuite/ChangeLog:
PR fortran/101349
* gfortran.dg/unlimited_polymorphic_33.f90: New test.
2021-09-03 Michael Meissner <meissner@linux.ibm.com>
gcc/testsuite/
PR target/94630
* gcc.target/powerpc/pr70117.c: Specify that we need the long double
type to be IBM 128-bit. Remove the code to use __ibm128.
Backport from master 2021-08-25.
* c-c++-common/dfp/convert-bfp-11.c: Specify that we need the long
double type to be IBM 128-bit. Run the test at -O2 optimization.
Backport from master 2021-08-25.
* lib/target-supports.exp (add_options_for_long_double_ibm128): New
function. Backport from master 2021-08-25.
(check_effective_target_long_double_ibm128): New function.
(add_options_for_long_double_ieee128): New function.
(check_effective_target_long_double_ieee128): New function.
(add_options_for_long_double_64bit): New function.
(check_effective_target_long_double_64bit): New function.
Peter Bergner [Thu, 19 Aug 2021 22:33:29 +0000 (17:33 -0500)]
rs6000: Fix ICE expanding lxvp and stxvp gimple built-ins [PR101849]
PR101849 shows we ICE on a test case when we pass a non __vector_pair *
pointer to the __builtin_vsx_lxvp and __builtin_vsx_stxvp built-ins
that is cast to __vector_pair *. The problem is that when we expand
the built-in, the cast has already been removed from gimple and we are
only given the base pointer. The solution used here (which fixes the ICE)
is to catch this case and convert the pointer to a __vector_pair * pointer
when expanding the built-in.
Marek Polacek [Wed, 1 Sep 2021 20:47:44 +0000 (16:47 -0400)]
c++: Fix ICE with nullptr comparison (GCC 11) [PR101592]
On trunk, PR101592 was fixed by r12-2537, but that change shouldn't be
backported to GCC 11. In the PR Jakub suggested this fix, so here it
is, after the usual testing.
PR c++/101592
gcc/ChangeLog:
* fold-const.c (make_range_step): Return NULL_TREE for NULLPTR_TYPE.
Jakub Jelinek [Wed, 1 Sep 2021 11:30:51 +0000 (13:30 +0200)]
vectorizer: Fix up vectorization using WIDEN_MINUS_EXPR [PR102124]
The following testcase is miscompiled on aarch64-linux at -O3 since the
introduction of WIDEN_MINUS_EXPR.
The problem is if the inner type (half_type) is unsigned and the result
type in which the subtraction is performed (type) has precision more than
twice as large as the inner type's precision.
For other widening operations like WIDEN_{PLUS,MULT}_EXPR, if half_type
is unsigned, the addition/multiplication result in itype is also unsigned
and needs to be zero-extended to type.
But subtraction is special, even when half_type is unsigned, the subtraction
behaves as signed (also regardless of whether the result type is signed or
unsigned), 0xfeU - 0xffU is -1 or 0xffffffffU, not 0x0000ffff.
I think it is better not to use mixed signedness of types in
WIDEN_MINUS_EXPR (have unsigned vector of operands and signed result
vector), so this patch instead adds another cast to make sure we always
sign-extend the result from itype to type if type is wider than itype.
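The arithmetic in this description can be reproduced with a small sketch. It is plain C++ rather than vectorizer code: uint8_t stands in for half_type, uint16_t for itype and uint32_t for type, and the function names are made up for the illustration.

```cpp
#include <cstdint>

// Hypothetical stand-ins: uint8_t plays half_type, uint16_t plays itype,
// and uint32_t plays type (more than twice as wide as half_type).

// Widening subtraction carried out in the 16-bit intermediate type.
constexpr std::uint16_t widen_minus_16(std::uint8_t a, std::uint8_t b)
{
  return static_cast<std::uint16_t>(a - b);
}

// Zero-extending the intermediate result is wrong whenever a < b.
constexpr std::uint32_t zero_extend_result(std::uint8_t a, std::uint8_t b)
{
  return static_cast<std::uint32_t>(widen_minus_16(a, b));
}

// Casting to the signed variant of the intermediate type first turns the
// final conversion into a sign extension.
constexpr std::uint32_t sign_extend_result(std::uint8_t a, std::uint8_t b)
{
  return static_cast<std::uint32_t>(
      static_cast<std::int16_t>(widen_minus_16(a, b)));
}

// Reference: the subtraction done directly in the wide type.
constexpr std::uint32_t reference_result(std::uint8_t a, std::uint8_t b)
{
  return static_cast<std::uint32_t>(a) - static_cast<std::uint32_t>(b);
}
```

For a = 0xfe and b = 0xff the zero-extended path yields 0x0000ffff, while the sign-extended path and the reference both yield 0xffffffff, matching the values quoted above.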
2021-09-01 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/102124
* tree-vect-patterns.c (vect_recog_widen_op_pattern): For ORIG_CODE
MINUS_EXPR, if itype is unsigned with smaller precision than type,
add an extra cast to signed variant of itype to ensure sign-extension.
Thomas Schwinge [Fri, 27 Aug 2021 05:49:35 +0000 (07:49 +0200)]
Fix 'OMP_CLAUSE_TILE' operands handling in 'gcc/tree.c:walk_tree_1'
In r245300 (commit 02889d23ee3b02854dff203dd87b9a25e30b61b4)
"OpenACC tile clause support" that one had changed to three operands,
similar to 'OMP_CLAUSE_COLLAPSE'.
There is no (existing) test case where this seems to matter (likewise
for 'OMP_CLAUSE_COLLAPSE'), but it's good to be consistent.
gcc/
* tree.c (walk_tree_1) <OMP_CLAUSE_TILE>: Handle three operands.
Quoting from https://gcc.gnu.org/pipermail/gcc/2021-July/236716.html:
--------------------------------------------------------------------
It was pointed out to me off-list that config/aarch64/value-unwind.h
is missing the runtime exception. It looks like a few other files
are too; a fuller list is:
Certainly for the aarch64 file this was simply a mistake;
it seems to have been copied from the i386 version, both of which
reference the runtime exception but don't actually include it.
--------------------------------------------------------------------
Similarly, frv-abi.h referenced the exception but didn't include it.
pa64-hpux-lib.h was missing any reference to the exception.
The decision was that this was simply a mistake
[https://gcc.gnu.org/pipermail/gcc/2021-July/236717.html]:
--------------------------------------------------------------------
[…] It generally is
considered a textual omission. The runtime library components of GCC
are intended to be licensed under the runtime exception, which was
granted and approved at the time of introduction.
--------------------------------------------------------------------
and that we should simply change all of the files above
[https://gcc.gnu.org/pipermail/gcc/2021-July/236719.html]:
--------------------------------------------------------------------
Please correct the text in the files. The files in libgcc used in the
GCC runtime are intended to be licensed with the runtime exception and
GCC previously was granted approval for that licensing and purpose.
[…]
The runtime exception explicitly was intended for this purpose and
usage at the time that GCC received approval to apply the exception.
--------------------------------------------------------------------
Haochen Gui [Fri, 4 Jun 2021 06:38:53 +0000 (14:38 +0800)]
rs6000: Disable mode promotion for pseudos
rs6000 has instructions that can do almost everything 32 bit
at least as efficiently as the corresponding 64 bit things. The
mode promotion can be deferred until a wide mode is necessary.
So it helps a lot not to promote the mode for pseudos. SPECint testing
shows that the overall performance improvement (by geomean) is
more than 2% with this patch.
testsuite/gcc.target/powerpc/not-promote-mode.c illustrates how
the patch eliminates the redundant extensions and enables further
optimization by disabling mode promotion for pseudos.
Paul Thomas [Thu, 6 May 2021 13:41:33 +0000 (14:41 +0100)]
Fortran: Assumed and explicit size class arrays [PR46691/99819].
2021-05-06 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran/ChangeLog
PR fortran/46691
PR fortran/99819
* class.c (gfc_build_class_symbol): Remove the error that
disables assumed size class arrays. Class array types that are
not deferred shape or assumed rank are given a unique name and
placed in the procedure namespace.
* trans-array.c (gfc_trans_g77_array): Obtain the data pointer
for class arrays.
(gfc_trans_dummy_array_bias): Suppress the runtime error for
extent violations in explicit shape class arrays because it
always fails.
* trans-expr.c (gfc_conv_procedure_call): Handle assumed size
class actual arguments passed to non-descriptor formal args by
using the data pointer, stored as the symbol's backend decl.
gcc/testsuite/ChangeLog
PR fortran/46691
PR fortran/99819
* gfortran.dg/class_dummy_6.f90: New test.
* gfortran.dg/class_dummy_7.f90: New test.
Harald Anlauf [Tue, 24 Aug 2021 19:07:50 +0000 (21:07 +0200)]
Fortran: fix pointless warning for static variables
gcc/fortran/ChangeLog:
PR fortran/98411
* trans-decl.c (gfc_finish_var_decl): Adjust check to handle
implicit SAVE as well as variables in the main program. Improve
warning message text.
gcc/testsuite/ChangeLog:
PR fortran/98411
* gfortran.dg/pr98411.f90: Adjust testcase options to restrict to
F2008, and verify case of implicit SAVE.