git.ipfire.org Git - thirdparty/gcc.git/log

c++: Fix thinko in enum_min_precision [PR61414]

When backporting the PR61414 fix to 8.4, I've noticed that the caching
of prec is actually broken, as it would fail to actually store the computed
precision into the hash_map's value and so next time we'd think the enum needs
0 bits.
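
A minimal sketch of this class of bug, assuming std::unordered_map as a
stand-in for GCC's hash_map. The names bits_needed, min_precision_buggy,
min_precision_fixed and cache_demo are hypothetical and not taken from
class.c:

```cpp
#include <cassert>
#include <unordered_map>

using Cache = std::unordered_map<unsigned, int>;

// Hypothetical stand-in for the real computation: bits needed for max_value.
static int bits_needed(unsigned max_value) {
  int bits = 0;
  for (; max_value != 0; max_value >>= 1)
    ++bits;
  return bits;
}

// Buggy shape: prec is a copy of the mapped value.
int min_precision_buggy(Cache &cache, unsigned max_value) {
  bool existed = cache.count(max_value) != 0;
  int prec = cache[max_value];        // copy of the slot (0 when new)
  if (!existed)
    prec = bits_needed(max_value);    // only the local copy is updated
  return prec;
}

// Fixed shape: prec is a reference into the map.
int min_precision_fixed(Cache &cache, unsigned max_value) {
  bool existed = cache.count(max_value) != 0;
  int &prec = cache[max_value];       // reference to the cached slot
  if (!existed)
    prec = bits_needed(max_value);    // the cached slot is updated
  return prec;
}

void cache_demo() {
  Cache buggy, fixed;
  assert(min_precision_buggy(buggy, 5) == 3);  // first call computes 3
  assert(min_precision_buggy(buggy, 5) == 0);  // second call reads the stale 0
  assert(min_precision_fixed(fixed, 5) == 3);
  assert(min_precision_fixed(fixed, 5) == 3);  // cached value survives
}
```

The only difference between the two functions is int versus int &, which
mirrors the ChangeLog entry below.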

2020-02-14 Jakub Jelinek <jakub@redhat.com>

PR c++/61414
* class.c (enum_min_precision): Change prec type from int to int &.

* g++.dg/cpp0x/enum39.C: New test.

sel-sched: allow negative insn priority (PR 88879)

PR rtl-optimization/88879
* sel-sched.c (sel_target_adjust_priority): Remove assert.

middle-end/90648 fend off builtin calls with not enough arguments from match

This adds guards to genmatch generated code before accessing call
expression or stmt arguments that might be out of bounds when
the user provided bogus prototypes for what we consider builtins.

2020-02-05 Richard Biener <rguenther@suse.de>

PR middle-end/90648
* genmatch.c (dt_node::gen_kids_1): Emit number of argument
checks before matching calls.

* gcc.dg/pr90648.c: New testcase.

tree-optimization/93381 fix integer offsetting in points-to analysis

We were incorrectly assuming a merge operation is conservative enough
for not explicitly handled operations but we also need to consider
offsetting within fields when field-sensitive analysis applies.

2020-01-22 Richard Biener <rguenther@suse.de>

PR tree-optimization/93381
* tree-ssa-structalias.c (find_func_aliases): Assume offsetting
throughout, handle all conversions the same.

* gcc.dg/torture/pr93381.c: New testcase.

tree-optimization/93439 move clique bookkeeping to OMP expansion

Autopar was doing clique bookkeeping too early when creating destination
functions but then later introducing new cliques via versioning loops.
The following moves the bookkeeping to the actual outlining process.

2020-02-14 Richard Biener <rguenther@suse.de>

Backport from mainline
2020-01-28 Richard Biener <rguenther@suse.de>

PR tree-optimization/93439
* tree-parloops.c (create_loop_fn): Move clique bookkeeping...
* tree-cfg.c (move_sese_region_to_fn): ... here.
(verify_types_in_gimple_reference): Verify used cliques are
tracked.

* gfortran.dg/graphite/pr93439.f90: New testcase.

middle-end/93054 deal with undefs in call gimplification

2020-02-14 Richard Biener <rguenther@suse.de>

Backport from mainline
2020-01-09 Richard Biener <rguenther@suse.de>

PR middle-end/93054
* gimplify.c (gimplify_expr): Deal with NOP definitions.

* gcc.dg/pr93054.c: New testcase.

debug/92763 keep DIEs that might be used in DW_TAG_inlined_subroutine

We were pruning type-local subroutine DIEs if their context is unused
despite us later needing those DIEs as abstract origins for inlines.
The patch makes code already present for -fvar-tracking-assignments
unconditional.

2020-02-14 Richard Biener <rguenther@suse.de>

Backport from mainline
2020-01-20 Richard Biener <rguenther@suse.de>

PR debug/92763
* dwarf2out.c (prune_unused_types): Unconditionally mark
called function DIEs.

* g++.dg/debug/pr92763.C: New testcase.

tree-optimization/92704 fix ifcvt ICE with loops without stores

2020-02-14 Richard Biener <rguenther@suse.de>

Backport from mainline
2019-11-29 Richard Biener <rguenther@suse.de>

PR tree-optimization/92704
* tree-if-conv.c (combine_blocks): Deal with virtual PHIs
in loops performing only loads.

* gcc.dg/torture/pr92704.c: New testcase.

middle-end/92674 delay purging EH edges when folding during inlining

2020-02-14 Richard Biener <rguenther@suse.de>

Backport from mainline
2019-11-27 Richard Biener <rguenther@suse.de>

PR middle-end/92674
* tree-inline.c (expand_call_inline): Delay purging EH/abnormal
edges and instead record blocks in bitmap.
(gimple_expand_calls_inline): Adjust.
(fold_marked_statements): Delay EH cleanup until all folding is
done.
(optimize_inline_calls): Do EH/abnormal cleanup for calls after
inlining finished.

Intrinsic macros for vpshr* and vpshl* lack a closing parenthesis, which causes failures at -O0.

2020-02-14 Hongtao Liu <hongtao.liu@intel.com>

gcc/
PR target/93724
* config/i386/avx512vbmi2intrin.h
(_mm512_shrdi_epi16, _mm512_mask_shrdi_epi16,
_mm512_maskz_shrdi_epi16, _mm512_shrdi_epi32,
_mm512_mask_shrdi_epi32, _mm512_maskz_shrdi_epi32,
_mm512_shrdi_epi64, _mm512_mask_shrdi_epi64,
_mm512_maskz_shrdi_epi64, _mm512_shldi_epi16,
_mm512_mask_shldi_epi16, _mm512_maskz_shldi_epi16,
_mm512_shldi_epi32, _mm512_mask_shldi_epi32,
_mm512_maskz_shldi_epi32, _mm512_shldi_epi64,
_mm512_mask_shldi_epi64, _mm512_maskz_shldi_epi64): Fix typo
of lacking a closing parenthesis.
* config/i386/avx512vbmi2vlintrin.h
(_mm256_shrdi_epi16, _mm256_mask_shrdi_epi16,
_mm256_maskz_shrdi_epi16, _mm256_shrdi_epi32,
_mm256_mask_shrdi_epi32, _mm256_maskz_shrdi_epi32,
_mm256_shrdi_epi64, _mm256_mask_shrdi_epi64,
_mm256_maskz_shrdi_epi64, _mm256_shldi_epi16,
_mm256_mask_shldi_epi16, _mm256_maskz_shldi_epi16,
_mm256_shldi_epi32, _mm256_mask_shldi_epi32,
_mm256_maskz_shldi_epi32, _mm256_shldi_epi64,
_mm256_mask_shldi_epi64, _mm256_maskz_shldi_epi64,
_mm_shrdi_epi16, _mm_mask_shrdi_epi16,
_mm_maskz_shrdi_epi16, _mm_shrdi_epi32,
_mm_mask_shrdi_epi32, _mm_maskz_shrdi_epi32,
_mm_shrdi_epi64, _mm_mask_shrdi_epi64,
_mm_maskz_shrdi_epi64, _mm_shldi_epi16,
_mm_mask_shldi_epi16, _mm_maskz_shldi_epi16,
_mm_shldi_epi32, _mm_mask_shldi_epi32,
_mm_maskz_shldi_epi32, _mm_shldi_epi64,
_mm_mask_shldi_epi64, _mm_maskz_shldi_epi64): Ditto.

gcc/testsuite/
* gcc.target/i386/avx512vbmi2-vpshld-1.c: New test.
* gcc.target/i386/avx512vbmi2-vpshrd-1.c: Ditto.
* gcc.target/i386/sse-12.c: Add -mavx512vbmi2.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-14.c: Add -mavx512vbmi2 and tests.
* gcc.target/i386/sse-22.c: Ditto.

Daily bump.

c: Fix ICE with cast to VLA [PR93576]

The following testcase ICEs, because the PR84305 changes try to evaluate
the size earlier.  If size has side-effects, that is desirable, and the
side-effects will actually be wrapped in a SAVE_EXPR.  The problem on this
testcase is that there are no side-effects, and c_fully_fold doesn't fold
those COMPOUND_EXPRs to constant, and while before gimplification we unshare
trees found in the expressions, the unsharing doesn't involve TYPE_SIZE etc.
of used types.  Gimplification is destructive though, so when we gimplify
the two nested COMPOUND_EXPRs and then try to gimplify it the second time
for the TYPE_SIZEs, we ICE.
Now, we could use unshare_expr in what we push to *expr, SAVE_EXPRs and
their operands in there aren't unshared, but I really don't see a point of
evaluating expressions that don't have side-effects before, so instead
this just pushes there expressions that do have side-effects.

2020-02-13  Jakub Jelinek  <jakub@redhat.com>

PR c/93576
* c-decl.c (grokdeclarator): If this_size_varies, only push size into
*expr if it has side effects.

* gcc.dg/pr93576.c: New test.

i386: Fix up _mm*_mask_popcnt_epi* [PR93696]

As mentioned in the PR and as
https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=_mask_popcnt_epi
also documents, _mm*_popcnt_epi* intrinsics are consistent with all other
unary AVX512* intrinsics regarding arguments, i.e. the
_mm*_whatever has just single argument (called a in the docs, and __A in the
GCC headers),
_mm*_mask_whatever has 3 arguments (called src, k, a in the docs and
__W, __U, __A in GCC headers) and
_mm*_maskz_whatever 2 arguments (called k, a in the docs and __U, __A in GCC
headers). Unfortunately, whoever implemented the _mm*_popcnt_epi*
intrinsics got it wrong for the _mm*_mask_popcnt_epi* ones, calling the
args __A, __U, __B and not passing them in the canonical order to the
builtins, making it API incompatible with ICC as well as clang (tested on
godbolt's clang 7/8/9/trunk and ICC 19.0.{0,1}, older clang/ICC don't
understand those, so it isn't that it used to be broken even in other
compilers and got changed afterwards).
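
A scalar sketch of the canonical argument order, assuming 8 lanes of 8 bits
purely for illustration. Vec8, popcount8, mask_popcnt and maskz_popcnt are
hypothetical names, not the GCC headers or builtins:

```cpp
#include <cstdint>

struct Vec8 { std::uint8_t lane[8]; };

// Count the set bits of one 8-bit lane.
constexpr std::uint8_t popcount8(std::uint8_t v) {
  std::uint8_t n = 0;
  while (v != 0) {
    n += v & 1u;
    v >>= 1;
  }
  return n;
}

// Merge-masking form, arguments in the documented (src, k, a) order:
// lanes whose mask bit is set get popcount(a), the others keep src.
constexpr Vec8 mask_popcnt(Vec8 src, std::uint8_t k, Vec8 a) {
  Vec8 r = src;
  for (int i = 0; i < 8; ++i)
    if ((k >> i) & 1)
      r.lane[i] = popcount8(a.lane[i]);
  return r;
}

// Zero-masking form, arguments in the documented (k, a) order:
// lanes whose mask bit is clear become 0.
constexpr Vec8 maskz_popcnt(std::uint8_t k, Vec8 a) {
  return mask_popcnt(Vec8{}, k, a);
}
```

Swapping src and a in a call silently changes which vector is counted and
which is passed through, which is why the argument order matters for
source compatibility with other compilers.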

2020-02-13 Jakub Jelinek <jakub@redhat.com>

PR target/93696
* config/i386/avx512bitalgintrin.h (_mm512_mask_popcnt_epi8,
_mm512_mask_popcnt_epi16, _mm256_mask_popcnt_epi8,
_mm256_mask_popcnt_epi16, _mm_mask_popcnt_epi8,
_mm_mask_popcnt_epi16): Rename __B argument to __A and __A to __W,
pass __A to the builtin followed by __W instead of __A followed by
__B.
* config/i386/avx512vpopcntdqintrin.h (_mm512_mask_popcnt_epi32,
_mm512_mask_popcnt_epi64): Likewise.
* config/i386/avx512vpopcntdqvlintrin.h (_mm_mask_popcnt_epi32,
_mm256_mask_popcnt_epi32, _mm_mask_popcnt_epi64,
_mm256_mask_popcnt_epi64): Likewise.

* gcc.target/i386/pr93696-1.c: New test.
* gcc.target/i386/pr93696-2.c: New test.
* gcc.target/i386/avx512bitalg-vpopcntw-1.c (TEST): Fix argument order
of _mm*_mask_popcnt_*.
* gcc.target/i386/avx512vpopcntdq-vpopcntq-1.c (TEST): Likewise.
* gcc.target/i386/avx512vpopcntdq-vpopcntd-1.c (TEST): Likewise.
* gcc.target/i386/avx512bitalg-vpopcntb-1.c (TEST): Likewise.
* gcc.target/i386/avx512bitalg-vpopcntb.c (foo): Likewise.
* gcc.target/i386/avx512bitalg-vpopcntbvl.c (foo): Likewise.
* gcc.target/i386/avx512vpopcntdq-vpopcntd.c (foo): Likewise.
* gcc.target/i386/avx512bitalg-vpopcntwvl.c (foo): Likewise.
* gcc.target/i386/avx512bitalg-vpopcntw.c (foo): Likewise.
* gcc.target/i386/avx512vpopcntdq-vpopcntq.c (foo): Likewise.

i386: Fix k*shift* intrinsics [PR93673]

As mentioned in the PR, the intrinsics allow counts from 0 to 255, but
we actually reject values from 128 to 255.  That is because QImode
CONST_INTs can be only -128 to 127.  Fixed by using const_0_to_255_operand
and dropping the modes for the operands with those predicates
(the IL actually contains the CONST_INT which has VOIDmode).
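
A simplified model of why counts 128 to 255 were rejected, assuming that an
operand with an 8-bit mode is only accepted when the constant equals its own
sign-extension from 8 bits. The function names are hypothetical and do not
correspond to the real predicates:

```cpp
// Sign-extend the low 8 bits of value, the canonical form of an
// 8-bit (QImode-like) constant.
constexpr int sign_extend_8(int value) {
  return ((value & 0xff) ^ 0x80) - 0x80;
}

// Model of a check on an operand that carries an 8-bit mode:
// the constant must already be in sign-extended form.
constexpr bool accepted_with_8bit_mode(int count) {
  return sign_extend_8(count) == count;
}

// Model of a modeless 0..255 range check.
constexpr bool accepted_0_to_255(int count) {
  return count >= 0 && count <= 255;
}
```

In this model 127 passes both checks, while 128 and 255 pass only the
modeless range check.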

2020-02-13  Jakub Jelinek  <jakub@redhat.com>

PR target/93673
* config/i386/sse.md (k<code><mode>): Drop mode from last operand and
use const_0_to_255_operand predicate instead of immediate_operand.
(avx512dq_fpclass<mode><mask_scalar_merge_name>,
avx512dq_vmfpclass<mode><mask_scalar_merge_name>,
vgf2p8affineinvqb_<mode><mask_name>,
vgf2p8affineqb_<mode><mask_name>): Drop mode from
const_0_to_255_operand predicated operands.

* gcc.target/i386/avx512f-pr93673.c: New test.
* gcc.target/i386/avx512dq-pr93673.c: New test.
* gcc.target/i386/avx512bw-pr93673.c: New test.

i386: Fix up vec_extract_lo* patterns [PR93670]

The VEXTRACT* insns have way too many different CPUID feature flags (ATT
syntax)
vextractf128       $imm, %ymm, %xmm/mem         AVX
vextracti128       $imm, %ymm, %xmm/mem         AVX2
vextract{f,i}32x4  $imm, %ymm, %xmm/mem {k}{z}  AVX512VL+AVX512F
vextract{f,i}32x4  $imm, %zmm, %xmm/mem {k}{z}  AVX512F
vextract{f,i}64x2  $imm, %ymm, %xmm/mem {k}{z}  AVX512VL+AVX512DQ
vextract{f,i}64x2  $imm, %zmm, %xmm/mem {k}{z}  AVX512DQ
vextract{f,i}32x8  $imm, %zmm, %ymm/mem {k}{z}  AVX512DQ
vextract{f,i}64x4  $imm, %zmm, %ymm/mem {k}{z}  AVX512F

As the testcase shows and the patch too, we didn't get it right in all
cases.
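
The table above can be written down as a small lookup, which makes the
per-pattern conditions discussed below easier to cross-check. This is only a
sketch: the flag constants, the Extract enum and required_isa are
hypothetical names, not GCC's TARGET_* macros:

```cpp
// Feature flags as bits (hypothetical encoding).
constexpr unsigned AVX      = 1u << 0;
constexpr unsigned AVX2     = 1u << 1;
constexpr unsigned AVX512F  = 1u << 2;
constexpr unsigned AVX512VL = 1u << 3;
constexpr unsigned AVX512DQ = 1u << 4;

enum class Extract { F128, I128, X32x4, X64x2, X32x8, X64x4 };

// Required feature flags for an extract insn reading a source vector of
// source_bits bits, per the table above; 0 means "not in the table".
constexpr unsigned required_isa(Extract kind, int source_bits) {
  switch (kind) {
  case Extract::F128:
    return source_bits == 256 ? AVX : 0u;
  case Extract::I128:
    return source_bits == 256 ? AVX2 : 0u;
  case Extract::X32x4:
    return source_bits == 256 ? (AVX512VL | AVX512F)
         : source_bits == 512 ? AVX512F : 0u;
  case Extract::X64x2:
    return source_bits == 256 ? (AVX512VL | AVX512DQ)
         : source_bits == 512 ? AVX512DQ : 0u;
  case Extract::X32x8:
    return source_bits == 512 ? AVX512DQ : 0u;
  case Extract::X64x4:
    return source_bits == 512 ? AVX512F : 0u;
  }
  return 0u;
}
```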

The first hunk is about avx512vl_vextractf128v8s[if] incorrectly
requiring TARGET_AVX512DQ.  The corresponding insn is the first
vextract{f,i}32x4 above, so it requires VL+F, and the builtins have it
correct (TARGET_AVX512VL implies TARGET_AVX512F):
BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vextractf128v8sf, "__builtin_ia32_extractf32x4_256_mask", IX86_BUILTIN_EXTRACTF32X4_256, UNKNOWN, (int) V4SF_FTYPE_V8SF_INT_V4SF_UQI)
BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vextractf128v8si, "__builtin_ia32_extracti32x4_256_mask", IX86_BUILTIN_EXTRACTI32X4_256, UNKNOWN, (int) V4SI_FTYPE_V8SI_INT_V4SI_UQI)
We only need TARGET_AVX512DQ for avx512vl_vextractf128v4d[if].

The second hunk is about vec_extract_lo_v16s[if]{,_mask}.  These are using
the vextract{f,i}32x8 insns (AVX512DQ above), but we weren't requiring that,
but instead incorrectly && 1 for non-masked and && (64 == 64 && TARGET_AVX512VL)
for masked insns.  This is extraction from ZMM, so it doesn't need VL for
anything.  The hunk actually only requires TARGET_AVX512DQ when the insn
is masked, if it is not masked, when TARGET_AVX512DQ isn't available we can
use vextract{f,i}64x4 instead which is available already in TARGET_AVX512F
and does the same thing, extracts the low 256 bits from 512 bits vector
(often we split it into just nothing, but there are some special cases like
when using xmm16+ when we can't without AVX512VL).

The last hunk is about vec_extract_lo_v8s[if]{,_mask}.  The non-_mask
suffixed ones are ok already and just split into nothing (lowpart subreg).
The masked ones were incorrectly requiring TARGET_AVX512VL and
TARGET_AVX512DQ, when we only need TARGET_AVX512VL.

2020-02-12  Jakub Jelinek  <jakub@redhat.com>

PR target/93670
* config/i386/sse.md (VI48F_256_DQ): New mode iterator.
(avx512vl_vextractf128<mode>): Use it instead of VI48F_256.  Remove
TARGET_AVX512DQ from condition.
(vec_extract_lo_<mode><mask_name>): Use <mask_avx512dq_condition>
instead of <mask_mode512bit_condition> in condition.  If
TARGET_AVX512DQ is false, emit vextract*64x4 instead of
vextract*32x8.
(vec_extract_lo_<mode><mask_name>): Drop <mask_avx512dq_condition>
from condition.

* gcc.target/i386/avx512vl-pr93670.c: New test.

i386: Fix -mavx -mno-avx2 ICE with VEC_COND_EXPR [PR93637]

As mentioned in the PR, for -mavx -mno-avx2 the backend does support
vcondv4div4df and vcondv8siv8sf optabs (while generally 32-byte vectors
aren't much supported in that case, it is performed using
vandps/vandnps/vorps). The problem is that after the last generic vector
lowering (where the VEC_COND_EXPR still compares two V4DF vectors and
has two V4DI last operands and V4DI result and so is considered ok) fre4
folds the condition into constant, at which point the middle-end during
expansion will try vcond_mask_optab and fall back to trying to expand it
as the constant vector < 0 vcondv4div4di, but neither of them is supported
for -mavx -mno-avx2 and thus we ICE.

So, the options I see are either what the following patch does, also support
vcond_mask_v4div4di and vcond_mask_v4siv4si already for TARGET_AVX, or
require for vcondv4div4df and vcondv8siv8sf TARGET_AVX2 rather than current
TARGET_AVX.

2020-02-10 Jakub Jelinek <jakub@redhat.com>

PR target/93637
* config/i386/sse.md (VI_256_AVX2): New mode iterator.
(vcond_mask_<mode><sseintvecmodelower>): Use it instead of VI_256.
Change condition from TARGET_AVX2 to TARGET_AVX.

* gcc.target/i386/avx-pr93637.c: New test.

i386: Make xmm16-xmm31 call used even in ms ABI [PR65782]

On Tue, Feb 04, 2020 at 11:16:06AM +0100, Uros Bizjak wrote:
> I guess that Comment #9 patch from the PR should be trivially correct,
> but although it looks obvious, I don't want to propose the patch since
> I have no means of testing it.

I don't have means of testing it either.
https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019
is quite explicit that [xyz]mm16-31 are call clobbered and only xmm6-15 (low
128-bits only) are call preserved.

We are talking e.g. about
/* { dg-options "-O2 -mabi=ms -mavx512vl" } */

typedef double V __attribute__((vector_size (16)));
void foo (void);
V bar (void);
void baz (V);
void
qux (void)
{
  V c;
  {
    register V a __asm ("xmm18");
    V b = bar ();
    asm ("" : "=x" (a) : "0" (b));
    c = a;
  }
  foo ();
  {
    register V d __asm ("xmm18");
    V e;
    d = c;
    asm ("" : "=x" (e) : "0" (d));
    baz (e);
  }
}
where according to the MSDN doc gcc incorrectly holds the c value
in xmm18 register across the foo call; if foo is compiled by some Microsoft
compiler (or LLVM), then it could clobber %xmm18.
If all xmm18 occurrences are changed to say xmm15, then it is valid to hold
the 128-bit value across the foo call (though, surprisingly, LLVM saves it
into stack anyway).
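
The rule from the Microsoft documentation quoted above can be stated as a
one-line predicate. This is a sketch with a hypothetical function name; it
covers only the low 128 bits of the XMM registers:

```cpp
// True if the low 128 bits of xmm<regno> must be preserved across a call
// under the Microsoft x64 convention: only xmm6-xmm15 qualify, while
// xmm0-xmm5 and xmm16-xmm31 may be clobbered by the callee.
constexpr bool ms_abi_xmm_call_preserved(int regno) {
  return regno >= 6 && regno <= 15;
}
```

With this predicate xmm18 is not preserved, so a value cannot stay there
across the foo call, while xmm15 is.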

The other parts are I guess mainly about SEH.  Consider e.g.
void
foo (void)
{
  register double x __asm ("xmm14");
  register double y __asm ("xmm18");
  asm ("" : "=x" (x));
  asm ("" : "=v" (y));
  x += y;
  y += x;
  asm ("" : : "x" (x));
  asm ("" : : "v" (y));
}
looking at cross-compiler output, with -O2 -mavx512f this emits
	.file	"abcdeq.c"
	.text
	.align 16
	.globl	foo
	.def	foo;	.scl	2;	.type	32;	.endef
	.seh_proc	foo
foo:
	subq	$40, %rsp
	.seh_stackalloc	40
	vmovaps	%xmm14, (%rsp)
	.seh_savexmm	%xmm14, 0
	vmovaps	%xmm18, 16(%rsp)
	.seh_savexmm	%xmm18, 16
	.seh_endprologue
	vaddsd	%xmm18, %xmm14, %xmm14
	vaddsd	%xmm18, %xmm14, %xmm18
	vmovaps	(%rsp), %xmm14
	vmovaps	16(%rsp), %xmm18
	addq	$40, %rsp
	ret
	.seh_endproc
	.ident	"GCC: (GNU) 10.0.1 20200207 (experimental)"
Does whatever assembler mingw64 uses even assemble this (I mean the
.seh_savexmm %xmm18, 16 could be problematic)?
I can find e.g.
https://stackoverflow.com/questions/43152633/invalid-register-for-seh-savexmm-in-cygwin/43210527
which then links to
https://gcc.gnu.org/PR65782

2020-02-08  Uroš Bizjak  <ubizjak@gmail.com>
    Jakub Jelinek  <jakub@redhat.com>

PR target/65782
* config/i386/i386.h (CALL_USED_REGISTERS): Make
xmm16-xmm31 call-used even in 64-bit ms-abi.

* gcc.target/i386/pr65782.c: New test.

Co-authored-by: Uroš Bizjak <ubizjak@gmail.com>

openmp: Fix handling of non-addressable shared scalars in parallel nested inside of target [PR93515]

As the following testcase shows, we need to consider even target to be a construct
that forces not to use copy in/out for shared on parallel inside of the target.
E.g. for parallel nested inside another parallel or host teams, we already avoid
copy in/out and we need to treat target the same.

2020-02-06 Jakub Jelinek <jakub@redhat.com>

PR libgomp/93515
* omp-low.c (use_pointer_for_field): For nested constructs, also
look for map clauses on target construct.
(scan_omp_1_stmt) <case GIMPLE_OMP_TARGET>: Bump temporarily
taskreg_nesting_level.

* testsuite/libgomp.c-c++-common/pr93515.c: New test.

openmp: Notice reduction decl in outer contexts after adding it to shared [PR93515]

If we call omp_add_variable, a following omp_notice_variable will already find
the variable on that construct and not go through outer constructs; the following patch fixes that.
Note, this still doesn't follow OpenMP 5.0 semantics on target combined with other
constructs with reduction/lastprivate/linear clauses, will handle that for GCC11.

2020-02-06 Jakub Jelinek <jakub@redhat.com>

PR libgomp/93515
* gimplify.c (gimplify_scan_omp_clauses) <do_notice>: If adding
shared clause, call omp_notice_variable on outer context if any.

c++: Mark __builtin_convertvector operand as read [PR93557]

In C++ we weren't calling mark_exp_read on the __builtin_convertvector first
argument. I guess it could misbehave even with lambda implicit captures.

Fixed by calling decay_conversion on the argument, we use the argument as
rvalue so we want the standard lvalue to rvalue conversions, but as the
argument must be a vector type, e.g. integral promotions aren't really
needed.

2020-02-05 Jakub Jelinek <jakub@redhat.com>

PR c++/93557
* semantics.c (cp_build_vec_convert): Call decay_conversion on arg
prior to passing it to c_build_vec_convert.

* c-c++-common/Wunused-var-17.c: New test.

openmp: Avoid ICEs with declare simd; declare simd inbranch [PR93555]

The testcases ICE because when processing the declare simd inbranch,
we don't create the i == 0 clone as it already exists, which means
clone_info->nargs is not adjusted, but we then rely on it being adjusted
when trying other clones.

2020-02-05 Jakub Jelinek <jakub@redhat.com>

PR middle-end/93555
* omp-simd-clone.c (expand_simd_clones): If simd_clone_mangle or
simd_clone_create failed when i == 0, adjust clone->nargs by
clone->inbranch.

* c-c++-common/gomp/pr93555-1.c: New test.
* c-c++-common/gomp/pr93555-2.c: New test.
* gfortran.dg/gomp/pr93555.f90: New test.

combine: Punt on out of range rotate counts [PR93505]

What happens on this testcase is with the out of bounds rotate we get:
Trying 13 -> 16:
   13: r129:SI=r132:DI#0<-<0x20
      REG_DEAD r132:DI
   16: r123:DI=r129:SI<0
      REG_DEAD r129:SI
Successfully matched this instruction:
(set (reg/v:DI 123 [ <retval> ])
    (const_int 0 [0]))
during combine.  So, perhaps we could also change simplify-rtx.c to punt
if it is out of bounds rather than trying to optimize anything.
Or, but probably GCC11 material, if we decide that ROTATE/ROTATERT doesn't
have out of bounds counts or introduce targetm.rotate_truncation_mask,
we should truncate the argument instead of punting.
Punting is better for backports though.
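
The two strategies mentioned above, punting versus truncating the count, can
be contrasted on a plain 32-bit rotate. This is a sketch with hypothetical
names, not the code in combine.c or simplify-rtx.c:

```cpp
#include <cstdint>

struct RotateResult {
  bool ok;               // false means "punt, do not optimize"
  std::uint32_t value;   // meaningful only when ok is true
};

// Rotate left, refusing to produce a result for out-of-range counts.
constexpr RotateResult rotate_left_or_punt(std::uint32_t x, unsigned count) {
  return count >= 32 ? RotateResult{false, 0u}
       : count == 0  ? RotateResult{true, x}
                     : RotateResult{true, (x << count) | (x >> (32 - count))};
}

// Alternative: truncate the count to the mode width instead of punting.
constexpr std::uint32_t rotate_left_truncating(std::uint32_t x, unsigned count) {
  return rotate_left_or_punt(x, count & 31u).value;
}
```

Punting leaves the original expression alone, so it cannot introduce a wrong
constant such as the (const_int 0) seen in the combine dump above.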

2020-01-30  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/93505
* combine.c (simplify_comparison) <case ROTATE>: Punt on out of range
rotate counts.

* gcc.c-torture/compile/pr93505.c: New test.

openmp: c++: Consider typeinfo decls to be predetermined shared [PR91118]

If the typeinfo decls appear in OpenMP default(none) regions, as we no longer
predetermine const with no mutable members, they are diagnosed as errors,
but it isn't something the users can actually provide explicit sharing for in
the clauses.

2020-01-29 Jakub Jelinek <jakub@redhat.com>

PR c++/91118
* cp-gimplify.c (cxx_omp_predetermined_sharing): Return
OMP_CLAUSE_DEFAULT_SHARED for typeinfo decls.

* g++.dg/gomp/pr91118-1.C: New test.
* g++.dg/gomp/pr91118-2.C: New test.

openmp: Handle rest of EXEC_OACC_* in oacc_code_to_statement [PR93463]

As the testcase shows, some EXEC_OACC_* codes weren't handled in
oacc_code_to_statement. Fixed thusly.

2020-01-29 Jakub Jelinek <jakub@redhat.com>

PR fortran/93463
* openmp.c (oacc_code_to_statement): Handle
EXEC_OACC_{ROUTINE,UPDATE,WAIT,CACHE,{ENTER,EXIT}_DATA,DECLARE}.

* gfortran.dg/goacc/pr93463.f90: New test.

i386: Fix ix86_fold_builtin shift folding [PR93418]

The following testcase is miscompiled, because the variable shift left
operand, { -1, -1, -1, -1 } is represented as a VECTOR_CST with
VECTOR_CST_NPATTERNS 1 and VECTOR_CST_NELTS_PER_PATTERN 1, so when
we call builder.new_unary_operation, builder.encoded_nelts () will be just 1
and thus we encode the resulting vector as if all the elements were the
same.
For non-masked is_vshift, we could perhaps call builder.new_binary_operation
(TREE_TYPE (args[0]), args[0], args[1], false), but then there are masked
shifts, for non-is_vshift we could perhaps call it too but with args[2]
instead of args[1], but there is no builder.new_ternary_operation.
All this stuff is primarily for aarch64 anyway, on x86 we don't have any
variable length vectors, and it is not a big deal to compute all elements
and just let builder.finalize () find the most efficient VECTOR_CST
representation of the vector. So, instead of doing too much, this just
keeps using new_unary_operation only if only one VECTOR_CST is involved
(i.e. non-masked shift by constant) and for the rest just compute all elts.
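
A small illustration of why a single-element encoding is not enough here,
using plain arrays rather than GCC's tree_vector_builder. V4 and the function
names are hypothetical; counts of 32 or more are assumed to yield 0, as the
x86 variable shifts do:

```cpp
#include <cstdint>

struct V4 { std::uint32_t e[4]; };

// Element-wise variable left shift: every element is computed.
constexpr V4 shift_left_each(V4 value, V4 count) {
  V4 r{};
  for (int i = 0; i < 4; ++i)
    r.e[i] = count.e[i] < 32 ? value.e[i] << count.e[i] : 0u;
  return r;
}

// Models encoding the result from the first element only, as happens
// when the builder believes one element describes the whole vector.
constexpr V4 shift_left_first_only(V4 value, V4 count) {
  return V4{{shift_left_each(value, count).e[0],
             shift_left_each(value, count).e[0],
             shift_left_each(value, count).e[0],
             shift_left_each(value, count).e[0]}};
}

constexpr bool same(V4 a, V4 b) {
  for (int i = 0; i < 4; ++i)
    if (a.e[i] != b.e[i])
      return false;
  return true;
}
```

Shifting four identical elements by the counts 0, 1, 2, 3 gives four
different results, so replicating the first element is wrong even though the
left operand is a splat.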

2020-01-28 Jakub Jelinek <jakub@redhat.com>

PR target/93418
* config/i386/i386.c (ix86_fold_builtin) <do_shift>: If mask is not
-1 or is_vshift is true, use new_vector with number of elts npatterns
rather than new_unary_operation.

* gcc.target/i386/avx2-pr93418.c: New test.

postreload: Fix up postreload combine [PR93402]

The following testcase is miscompiled, because the postreload pass changes:
-(insn 14 13 23 2 (parallel [
-            (set (reg:DI 1 dx [94])
-                (plus:DI (reg:DI 1 dx [95])
-                    (reg:DI 5 di [92])))
-            (clobber (reg:CC 17 flags))
-        ]) "pr93402.c":8:30 186 {*adddi_1}
-     (expr_list:REG_EQUAL (plus:DI (reg:DI 5 di [92])
-            (const_int 111111111111 [0x19debd01c7]))
-        (nil)))
-(insn 23 14 25 2 (set (reg:SI 0 ax)
+(insn 23 13 25 2 (set (reg:SI 0 ax)
         (const_int 0 [0])) "pr93402.c":10:1 67 {*movsi_internal}
      (nil))
(insn 25 23 26 2 (use (reg:SI 0 ax)) "pr93402.c":10:1 -1
      (nil))
-(insn 26 25 35 2 (use (reg:DI 1 dx)) "pr93402.c":10:1 -1
+(insn 26 25 35 2 (use (plus:DI (reg:DI 1 dx [95])
+            (reg:DI 5 di [92]))) "pr93402.c":10:1 -1
      (nil))
A USE insn is not a normal insn and verify_changes called from
apply_change_group is happy about any changes into it.
The following patch avoids this optimization if we were to change
the USE operand (this routine only changes a reg into (plus reg reg2)).

2020-01-23  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/93402
* postreload.c (reload_combine_recognize_pattern): Don't try to adjust
USE insns.

* gcc.c-torture/execute/pr93402.c: New test.

Daily bump.

middle-end: Fix logical shift truncation (PR rtl-optimization/91838) (gcc-9 backport)

This fixes a fall-out from a patch I had submitted two years ago which started
allowing simplify-rtx to fold logical right shifts by offsets a followed by b
into >> (a + b).

However this can generate inefficient code when the resulting shift count ends
up being the same as the size of the shift mode.  This will create some
undefined behavior on most platforms.

This patch changes the code to truncate to 0 if the shift amount goes out of
range.  Before my older patch this used to happen in combine when it saw the
two shifts.  However since we combine them here combine never gets a chance to
truncate them.

The issue mostly affects GCC 8 and 9 since on 10 the back-end knows how to deal
with this shift constant but it's better to do the right thing in simplify-rtx.

Note that this doesn't take care of the arithmetic shift, where you could replace
the constant with MODE_BITS (mode) - 1, but that's not a regression so punting it.
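
A sketch of the truncation on a 32-bit logical right shift, assuming each
individual count is already in range (0 to 31). The function names are
hypothetical and this is not the simplify-rtx.c code:

```cpp
#include <cstdint>

// Reference semantics: two separate logical right shifts.
constexpr std::uint32_t lshr_twice(std::uint32_t x, unsigned a, unsigned b) {
  return (x >> a) >> b;
}

// Combined form x >> (a + b); when the summed count reaches the width of
// the mode the result is truncated to 0 instead of emitting an
// out-of-range shift.
constexpr std::uint32_t lshr_combined(std::uint32_t x, unsigned a, unsigned b) {
  return a + b >= 32 ? 0u : x >> (a + b);
}
```

For in-range sums both functions agree, and for a + b of 32 or more the
combined form yields the same 0 that the two separate shifts produce.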

gcc/ChangeLog:

Backport from mainline
2020-01-31  Tamar Christina  <tamar.christina@arm.com>

PR rtl-optimization/91838
* simplify-rtx.c (simplify_binary_operation_1): Update LSHIFTRT case
to truncate if allowed or reject combination.

gcc/testsuite/ChangeLog:

Backport from mainline
2020-01-31  Tamar Christina  <tamar.christina@arm.com>
    Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/91838
* g++.dg/opt/pr91838.C: New test.

Daily bump.

i386: Properly pop restore token in signal frame

Linux CET kernel places a restore token on shadow stack for signal
handler to enhance security.  The restore token is 8 bytes and aligned
to 8 bytes.  It is usually transparent to user programs since kernel
will pop the restore token when signal handler returns.  But when an
exception is thrown from a signal handler, now we need to pop the
restore token from shadow stack.  For x86-64, we just need to treat
the signal frame as normal frame.  For i386, we need to search for
the restore token to check if the original shadow stack is 8 byte
aligned.  If the original shadow stack is 8 byte aligned, we just
need to pop 2 slots, one restore token, from shadow stack.  Otherwise,
we need to pop 3 slots, one restore token + 4 byte padding, from
shadow stack.
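
The i386 slot arithmetic above, as a sketch. It assumes 4-byte shadow stack
slots on i386 and that the original shadow stack pointer is already known;
the constants and slots_to_pop_i386 are hypothetical names, not the contents
of shadow-stack-unwind.h:

```cpp
#include <cstdint>

constexpr unsigned kSlotBytes = 4;     // one i386 shadow stack slot
constexpr unsigned kTokenBytes = 8;    // restore token size
constexpr unsigned kPaddingBytes = 4;  // padding when the stack was misaligned

// Number of shadow stack slots to pop for a signal frame on i386:
// the token alone if the original shadow stack was 8-byte aligned,
// otherwise the token plus 4 bytes of padding.
constexpr unsigned slots_to_pop_i386(std::uintptr_t original_ssp) {
  return original_ssp % 8 == 0
             ? kTokenBytes / kSlotBytes
             : (kTokenBytes + kPaddingBytes) / kSlotBytes;
}
```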

This patch also includes 2 tests, one has a restore token with 4 byte
padding and one without.

Tested on Linux/x86-64 CET machine with and without -m32.

libgcc/

Backport from mainline
PR libgcc/85334
* config/i386/shadow-stack-unwind.h (_Unwind_Frames_Increment):
New.

gcc/testsuite/

Backport from mainline
PR libgcc/85334
* g++.target/i386/pr85334-1.C: New test.
* g++.target/i386/pr85334-2.C: Likewise.

(cherry picked from commit bf6465d0461234ccd45ae34d5e2375a0bee0081d)

Daily bump.

x86-64: Pass aggregates with only float/double in GPRs for MS_ABI

MS_ABI requires passing aggregates with only float/double in integer
registers as shown in the output from MSVC v19.10 at:

https://godbolt.org/z/2NPygd

This patch fixed:

FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=54 -Wno-unused-variable -Wno-unused-parameter -Wno-unused-but-set-variable -Wno-uninitialized -O0 -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=54 -Wno-unused-variable -Wno-unused-parameter -Wno-unused-but-set-variable -Wno-uninitialized -O2 -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=55 -Wno-unused-variable -Wno-unused-parameter -Wno-unused-but-set-variable -Wno-uninitialized -O0 -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=55 -Wno-unused-variable -Wno-unused-parameter -Wno-unused-but-set-variable -Wno-uninitialized -O2 -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=56 -Wno-unused-variable -Wno-unused-parameter -Wno-unused-but-set-variable -Wno-uninitialized -O0 -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=56 -Wno-unused-variable -Wno-unused-parameter -Wno-unused-but-set-variable -Wno-uninitialized -O2 -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test

in libffi testsuite.

gcc/

Backport from mainline
PR target/85667
* config/i386/i386.c (function_arg_ms_64): Add a type argument.
Don't return aggregates with only SFmode and DFmode in SSE
register.
(ix86_function_arg): Pass type to function_arg_ms_64.

gcc/testsuite/

Backport from mainline
PR target/85667
* gcc.target/i386/pr85667-10.c: New test.
* gcc.target/i386/pr85667-7.c: Likewise.
* gcc.target/i386/pr85667-8.c: Likewise.
* gcc.target/i386/pr85667-9.c: Likewise.

(cherry picked from commit ea5ca698dca15dc86b823661ac357a30b49dd0f6)

Daily bump.

[OpenMP] Add missing parameters to omp_lib documentation (PR fortran/93541)

        Backported from mainline
        2020-02-03  Tobias Burnus  <tobias@codesourcery.com>

        PR fortran/93541
        * intrinsic.texi (OpenMP Modules OMP_LIB and OMP_LIB_KINDS):
        Add undocumented parameters from omp_lib.f90.in.

[Fortran] Disable front-end optimization for OpenACC atomic (PR93462)

Backported from mainline
2020-01-31 Tobias Burnus <tobias@codesourcery.com>

PR fortran/93462
* frontend-passes.c (gfc_code_walker): For EXEC_OACC_ATOMIC, set
in_omp_atomic to true to prevent front-end optimization.

PR fortran/93462
* gfortran.dg/goacc/atomic-1.f90: New.

[Fortran] PR93309 – permit repeated 'implicit none(external)'

Backported from mainline
2020-01-21 Tobias Burnus <tobias@codesourcery.com>

PR fortran/93309
* interface.c (gfc_procedure_use): Also check parent namespace for
'implicit none (external)'.
* symbol.c (gfc_get_namespace): Don't set has_implicit_none_export
to parent namespace's setting.

Backported from mainline
2020-01-21 Tobias Burnus <tobias@codesourcery.com>

PR fortran/93309
* gfortran.dg/external_implicit_none_2.f90: New.

Daily bump.

Fix ICE in pa_elf_select_rtx_section.

2020-01-30 John David Anglin <danglin@gcc.gnu.org>

* config/pa/pa.c (pa_elf_select_rtx_section): Place function pointers
without a DECL in .data.rel.ro.local.

RISC-V: Disallow regrename if the TO register was never used before for interrupt functions

gcc/ChangeLog

PR target/93304
* config/riscv/riscv-protos.h (riscv_hard_regno_rename_ok): New.
* config/riscv/riscv.c (riscv_hard_regno_rename_ok): New.
* config/riscv/riscv.h (HARD_REGNO_RENAME_OK): Defined.

gcc/testsuite/ChangeLog

PR target/93304
* gcc.target/riscv/pr93304.c: New test.

c++: Drop alignas restriction for stack variables.

Since expand_stack_vars and such know how to deal with variables aligned
beyond MAX_SUPPORTED_STACK_ALIGNMENT, we shouldn't reject alignas of large
alignments. And if we don't do that, there's no point in having
check_cxx_fundamental_alignment_constraints at all, since
check_user_alignment already enforces MAX_OFILE_ALIGNMENT.

PR c++/89357
* c-attribs.c (check_cxx_fundamental_alignment_constraints): Remove.

Daily bump.

[AArch64] PR92424: Fix -fpatchable-function-entry=N,M with BTI

This is a workaround that emits a BTI after the function label if that
is followed by a patch area. We try to remove the BTI that follows the
patch area (this may fail e.g. if the first instruction is a PACIASP).

So before this commit -fpatchable-function-entry=3,1 with bti generates

    .section __patchable_function_entries
    .8byte .LPFE
    .text
  .LPFE:
    nop
  foo:
    nop
    nop
    bti c // or paciasp
    ...

and after this commit

    .section __patchable_function_entries
    .8byte .LPFE
    .text
  .LPFE:
    nop
  foo:
    bti c
    nop
    nop
    // may be paciasp
    ...

and with -fpatchable-function-entry=1 (M=0) the code now is

  foo:
    bti c
    .section __patchable_function_entries
    .8byte .LPFE
    .text
  .LPFE:
    nop
    // may be paciasp
    ...

There is a new bti insn in the middle of the patchable area users need
to be aware of unless M=0 (patch area is after the new bti) or M=N
(patch area is before the label, no new bti). Note: bti is not added to
all functions consistently (it can be turned off per function using a
target attribute or the compiler may detect that the function is never
called indirectly), so if bti is inserted in the middle of a patch area
then user code needs to deal with detecting it.

Tested on aarch64-none-linux-gnu.

gcc/ChangeLog:

PR target/92424
* config/aarch64/aarch64.c (aarch64_declare_function_name): Set
cfun->machine->label_is_assembled.
(aarch64_print_patchable_function_entry): New.
(TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY): Define.
* config/aarch64/aarch64.h (struct machine_function): New field,
label_is_assembled.

gcc/testsuite/ChangeLog:

PR target/92424
* gcc.target/aarch64/pr92424-2.c: New test.
* gcc.target/aarch64/pr92424-3.c: New test.

Daily bump.

c++: Allow template rvalue-ref conv to bind to lvalue ref.

When I implemented the [over.match.ref] rule that a reference conversion
function needs to match l/rvalue of the target reference type it changed our
handling of this testcase. It seems to me that our current behavior is what
the standard says, but it doesn't seem desirable, and all the other
compilers have our old behavior. So let's limit the change to non-templates
until there's some clarification from the committee.

PR c++/90546
* call.c (build_user_type_conversion_1): Allow a template conversion
returning an rvalue reference to bind directly to an lvalue.

c++: Function declared with typedef with eh-specification.

We just need to handle the exception specification like other properties of
a function typedef.

PR c++/90731
* decl.c (grokdeclarator): Propagate eh spec from typedef.
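
A small illustration of the PR c++/90731 case, assuming C++17 or later
(where noexcept is part of the function type). The names NoThrowFn and
notify are hypothetical and this is not claimed to be the committed
testcase.

```cpp
// A function type alias that carries an exception specification.
using NoThrowFn = void() noexcept;

// Declares "void notify() noexcept;" through the alias; the eh spec
// has to be propagated from the typedef for the lines below to agree.
NoThrowFn notify;

static_assert(noexcept(notify()), "eh spec should come from the alias");

// A matching definition with an explicit noexcept.
void notify() noexcept {}
```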

c++: Fix array of char typedef in template (PR90966).

Since Martin Sebor's patch for PR 71625 to change braced array initializers
to STRING_CST in some cases, we need to be ready for STRING_CST with types
that are changed by tsubst. fold_convert doesn't know how to deal with
STRING_CST, which is reasonable; we really shouldn't expect it to here. So
let's handle STRING_CST separately.

PR c++/90966
* pt.c (tsubst_copy) [STRING_CST]: Don't use fold_convert.

c++: Fix ICE with lambda in member operator (PR93279)

Here the problem was that we were remembering the lookup in template scope,
and then trying to reuse that lookup in the instantiation without
substituting into it at all. The simplest solution is to not try to
remember a lookup that finds a class-scope declaration, as in that case
doing the normal lookup again at instantiation time will always find the
right declarations.

PR c++/93279 - ICE with lambda in member operator.
* name-lookup.c (maybe_save_operator_binding): Don't remember
class-scope bindings.

Daily bump.

c++: Bogus error using namespace alias [PR91826]

My changes to is_nested_namespace broke is_ancestor's use where a namespace
alias might be passed in. This changes is_ancestor to look through the alias.

PR c++/91826
* name-lookup.c (is_ancestor): Allow CHILD to be a namespace alias.
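
The kind of code affected by PR c++/91826 can be sketched as follows: a
class declared in a namespace is later defined through a namespace alias.
The names impl, api, Widget and widget_value are hypothetical.

```cpp
namespace impl { class Widget; }   // forward declaration in the real namespace
namespace api = impl;              // namespace alias

// Defining the class through the alias names the same entity as
// impl::Widget, so this should be accepted without a bogus error.
class api::Widget {
public:
    int value() const { return 42; }
};

int widget_value() { return impl::Widget{}.value(); }
```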

[AArch64] Fix shrinkwrapping interactions with atomics (PR92692)

The separate shrinkwrapping pass may insert stores in the middle
of atomics loops which can cause issues on some implementations.
Avoid this by delaying splitting atomics patterns until after
prolog/epilog generation.

gcc/
PR target/92692
* config/aarch64/aarch64.c (aarch64_split_compare_and_swap):
Add assert to ensure prolog has been emitted.
(aarch64_split_atomic_op): Likewise.
* config/aarch64/atomics.md (aarch64_compare_and_swap<mode>):
Use epilogue_completed rather than reload_completed.
(aarch64_atomic_exchange<mode>): Likewise.
(aarch64_atomic_<atomic_optab><mode>): Likewise.
(atomic_nand<mode>): Likewise.
(aarch64_atomic_fetch_<atomic_optab><mode>): Likewise.
(atomic_fetch_nand<mode>): Likewise.
(aarch64_atomic_<atomic_optab>_fetch<mode>): Likewise.
(atomic_nand_fetch<mode>): Likewise.

(cherry picked from commit e5e07b68187b9aa334519746c45b8cffc5eb7e5c)

Daily bump.

testsuite: xfail gcc.target/i386/pr91298-?.c on Solaris/x86 with as

The new gcc.target/i386/pr91298-?.c testcases FAIL on Solaris/x86 with the
native assembler:

FAIL: gcc.target/i386/pr91298-1.c (test for excess errors)

Excess errors:
Assembler: pr91298-1.c
        "/var/tmp//ccE6r3xb.s", line 5 : Syntax error
        Near line: "    .globl  $quux"
        "/var/tmp//ccE6r3xb.s", line 6 : Syntax error
        Near line: "    .type   $quux, @function"
        "/var/tmp//ccE6r3xb.s", line 7 : Syntax error
        Near line: "$quux:"
        "/var/tmp//ccE6r3xb.s", line 15 : Syntax error
        Near line: "    .size   $quux, .-$quux"
        "/var/tmp//ccE6r3xb.s", line 24 : Syntax error
        Near line: "    movl    $($a), %eax"
        "/var/tmp//ccE6r3xb.s", line 38 : Syntax error
        Near line: "    leal    ($a)(,%eax,4), %eax"
        "/var/tmp//ccE6r3xb.s", line 51 : Syntax error
        Near line: "    movl    ($a), %eax"
        "/var/tmp//ccE6r3xb.s", line 63 : Syntax error
        Near line: "    movl    ($a)+16, %eax"
        "/var/tmp//ccE6r3xb.s", line 97 : Syntax error
        Near line: "    movl    $($quux), %eax"
        "/var/tmp//ccE6r3xb.s", line 101 : Syntax error
        Near line: "    .globl  $a"
        "/var/tmp//ccE6r3xb.s", line 104 : Syntax error
        Near line: "    .type   $a, @object"
        "/var/tmp//ccE6r3xb.s", line 105 : Syntax error
        Near line: "    .size   $a, 72"
        "/var/tmp//ccE6r3xb.s", line 106 : Syntax error
        Near line: "$a:"
        "/var/tmp//ccE6r3xb.s", line 228 : Syntax error
        Near line: "    .long   ($a)"

FAIL: gcc.target/i386/pr91298-2.c (test for excess errors)

It only allows letters, digits, '_' and '.' in identifiers:
https://docs.oracle.com/cd/E37838_01/html/E61064/eqbsx.html#XALRMeoqjw

For lack of an effective-target keyword matching -fdollars-in-identifiers,
this patch fixes this by xfailing them on *-*-solaris2.* && !gas.

Tested on i386-pc-solaris2.11 with as and gas and x86_64-pc-linux-gnu.

* gcc.target/i386/pr91298-1.c: xfail on Solaris/x86 with native
assembler.
* gcc.target/i386/pr91298-2.c: Likewise.

Daily bump.

c++: Unshare expressions from constexpr cache.

Another place we need to unshare cached expressions.

PR c++/92852 - ICE with generic lambda and reference var.
* constexpr.c (maybe_constant_value): Likewise.

libstdc++: Simplify makefile rule for largefile-config.h (PR91947)

The previous rule could leave an incomplete file if the build was
interrupted, which would then not be remade if make was run again.

This makes the rule more robust by writing to a temporary file and only
moving it into place as the final step. It also simplifies the rule so
that only the essential macro definitions are written to the file, not
the explanatory comments and commented out #undef lines.

Also, the macro for enabling LFS on Mac OS X 10.5 is now set
unconditionally, which is a bug fix from upstream autoconf.

Backport from mainline
2020-01-23 Jonathan Wakely <jwakely@redhat.com>

PR libstdc++/91947
* include/Makefile.am (${host_builddir}/largefile-config.h): Simplify
rule.
* include/Makefile.in: Regenerate.
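
The write-to-a-temporary-then-move idea used by the new makefile rule can be
sketched in C++ (swapped in here for the actual make/shell rule, which is not
reproduced). The function name write_file_atomically and the ".tmp" suffix
are assumptions made for illustration only.

```cpp
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>
#include <system_error>

// Writes `contents` to a temporary sibling of `target`, then renames it into
// place as the final step, so an interrupted run never leaves a partial
// `target` behind. Returns false on any I/O failure.
bool write_file_atomically(const std::filesystem::path& target,
                           const std::string& contents) {
    std::filesystem::path tmp = target;
    tmp += ".tmp";  // hypothetical temporary name
    {
        std::ofstream out(tmp, std::ios::trunc);
        if (!out) return false;
        out << contents;
        out.flush();
        if (!out) return false;
    }  // stream closed here, before the rename
    std::error_code ec;
    std::filesystem::rename(tmp, target, ec);
    return !ec;
}
```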

libstdc++: Fix recent documentation changes

Backport from mainline
2020-01-20 Jonathan Wakely <jwakely@redhat.com>

* doc/xml/faq.xml: Fix grammar.
* doc/xml/manual/appendix_contributing.xml: Improve instructions.
* doc/xml/manual/spine.xml: Update copyright years.
* doc/html/*: Regenerate.

Daily bump.

Cherry-pick 15 bugfixes from mainline

r10-6140-gd80f0a8dc9c2e5886bb79bddee2674e1d3f9d105
r10-6137-gc892d8f58f6fed46c343bdb6dd4d365f08f801b8
r10-6136-g44a9d801a7080d39658754ad603536da6cff2cd0
r10-6135-ga38979d9d7a4ab08336436052704028c56187618
r10-6118-gbd0a3e244d94ad4a5e41f01ebf285f0861cb4a03
r10-6104-g51e010b5f75c1fff06425a72702c1bf82a3ab053
r10-6041-gc60a18f8056facdcf370ce0e5f51550c9df5b539
r10-5954-gfbbc4c24fd7ba87e0c47cd965ae624afba6fa375
r10-5897-g91df4397a1404df65de6de23426294c50ab88bd2
r10-5829-ga0ab54de0ec3e0d48b2a681f7f78fe14bc4099eb
r10-5723-g5a6e28b5bae7a236b35994d0f64fd902a574872c
r10-5712-g4ea5d54b3c7175de045589f994fc94ed7e59d80d
r10-5697-g2c8297996a7ab3496c5d2f798cdbe4cab749468e
r10-5650-g7cd268ad6a6f71877744539d17ed53e752774bfa
r10-5618-g6c7b84305a5e686644ee64bfd2d415f3f43fa85b

aarch64: Fix aarch64_expand_subvti constant handling [PR93335]

The two patterns that call aarch64_expand_subvti ensure that {low,high}_in1
is a register, while {low,high}_in2 can be a register or immediate.
subdi3_compare1_imm uses the aarch64_plus_immediate predicate for its last
two operands (the value and negated value), but aarch64_expand_subvti calls
it whenever low_in2 is a CONST_INT, which leads to ICEs during vregs pass,
as the emitted insn is not recognized as valid subdi3_compare1_imm.
The following patch fixes that by only using subdi3_compare1_imm if it is ok
to do so, and otherwise force the constant into register and use the
non-immediate version - subdi3_compare1.
Furthermore, previously the code was calling force_reg on high_in2 only if
low_in2 is CONST_INT, on the (reasonable) assumption that high_in2 can be
non-REG only if low_in2 is a CONST_INT. With the above changes, however,
high_in2 might be a CONST_INT even in the else branch, and force_reg doesn't
do anything if the operand is already a REG, so this patch calls it
unconditionally.

2020-01-22 Jakub Jelinek <jakub@redhat.com>

PR target/93335
* config/aarch64/aarch64.c (aarch64_expand_subvti): Only use
gen_subdi3_compare1_imm if low_in2 satisfies aarch64_plus_immediate
predicate, not whenever it is CONST_INT. Otherwise, force_reg it.
Call force_reg on high_in2 unconditionally.

* gcc.c-torture/compile/pr93335.c: New test.

i386: Fix up -fdollars-in-identifiers with identifiers starting with $ in -masm=att [PR91298]

In AT&T syntax leading $ is special, so if we have identifiers that start
with dollar, we usually fail to assemble it (or assemble incorrectly).
As mentioned in the PR, what works is wrapping the identifiers inside of
parens, like:
movl $($a), %eax
leaq ($a)(,%rdi,4), %rax
movl ($a)(%rip), %eax
movl ($a)+16(%rip), %eax
.globl $a
.type $a, @object
.size $a, 72
$a:
.string "$a"
.quad ($a)
(this is x86_64 -fno-pic -O2). In some places ($a) is not accepted,
like as .globl operand, in .type, .size, so the patch overrides
ASM_OUTPUT_SYMBOL_REF rather than e.g. ASM_OUTPUT_LABELREF.
I didn't want to duplicate what assemble_name is doing (following
transparent aliases), so I split assemble_name into two parts; merely
looking at the first character of a name before calling assemble_name
wouldn't be good enough, as a transparent alias could lead from a name
not starting with $ to one starting with it and vice versa.

2020-01-22 Jakub Jelinek <jakub@redhat.com>

PR target/91298
* output.h (assemble_name_resolve): Declare.
* varasm.c (assemble_name_resolve): New function.
(assemble_name): Use it.
* config/i386/i386.h (ASM_OUTPUT_SYMBOL_REF): Define.

* gcc.target/i386/pr91298-1.c: New test.
* gcc.target/i386/pr91298-2.c: New test.

openmp: Fix up !$omp target parallel handling

The PR93329 fix revealed we ICE on !$omp target parallel, this change fixes
that.

2020-01-22 Jakub Jelinek <jakub@redhat.com>

* parse.c (parse_omp_structured_block): Handle ST_OMP_TARGET_PARALLEL.
* trans-openmp.c (gfc_trans_omp_target)
<case EXEC_OMP_TARGET_PARALLEL>: Call pushlevel first.

* gfortran.dg/gomp/target-parallel1.f90: New test.
* gfortran.dg/goacc/pr93329.f90: Enable commented out target parallel
test.

openmp: Teach omp_code_to_statement about rest of OpenMP statements

The omp_code_to_statement function added with the initial OpenACC support
only handled a small subset of the OpenMP statements, leading to an ICE if
any other OpenMP directive appeared inside an OpenACC directive.

2020-01-22 Jakub Jelinek <jakub@redhat.com>

PR fortran/93329
* openmp.c (omp_code_to_statement): Handle remaining EXEC_OMP_*
cases.

* gfortran.dg/goacc/pr93329.f90: New test.

riscv: Fix up riscv_rtx_costs for RTL checking (PR target/93333)

As mentioned in the PR, during combine rtx_costs can be called sometimes
even on RTL that has not been validated yet and so can contain even operands
that aren't valid in any instruction.

2020-01-21 Jakub Jelinek <jakub@redhat.com>

PR target/93333
* config/riscv/riscv.c (riscv_rtx_costs) <case ZERO_EXTRACT>: Verify
the last two operands are CONST_INT_P before using them as such.

* gcc.c-torture/compile/pr93333.c: New test.

powerpc: Fix ICE with fp conditional move (PR target/93073)

The following testcase ICEs, because for TFmode the particular subtraction
pattern (*subtf3) is not enabled with the given options. Using
expand_simple_binop instead of emitting the subtraction by hand just moves
the ICE one insn later, NEG of ABS is not then recognized, etc., but
ultimately the problem is that when rs6000_emit_cmove is called for floating
point operand mode (and earlier condition ensures that in that case
compare_mode is also floating point), the expander makes sure the
operand mode is SFDF, but for the comparison mode nothing checks it, yet
there is just one *fsel* pattern with 2 separate SFDF iterators.

The following patch fixes it by giving up if compare_mode is not SFmode or
DFmode.

2020-01-21 Jakub Jelinek <jakub@redhat.com>

PR target/93073
* config/rs6000/rs6000.c (rs6000_emit_cmove): If using fsel, punt for
compare_mode other than SFmode or DFmode.

* gcc.target/powerpc/pr93073.c: New test.

c++: Fix deprecated attribute handling on templates (PR c++/93228)

As the following testcase shows, when the deprecated attribute is on a
template, we'd never print its message (if any), because the attribute is not
present on the TEMPLATE_DECL with which warn_deprecated_use is called,
but on its DECL_TEMPLATE_RESULT or its type.

2020-01-17 Jakub Jelinek <jakub@redhat.com>

PR c++/93228
* parser.c (cp_parser_template_name): Look up deprecated attribute
in DECL_TEMPLATE_RESULT or its type's attributes.

* g++.dg/cpp1y/attr-deprecated-3.C: New test.
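
A sketch of the situation in PR c++/93228: the deprecated attribute, with a
message, sits on a class template, and naming a specialization is expected
to produce a -Wdeprecated-declarations warning that includes the message.
The names OldBox and read_old_box and the message text are hypothetical;
this is not the committed testcase.

```cpp
// The attribute is attached to the template's class, not to a particular
// specialization.
template <typename T>
struct [[deprecated("use NewBox instead")]] OldBox {
    T value;
};

// Using OldBox<int> here should be diagnosed as deprecated, with the
// message shown; the code still compiles and runs.
int read_old_box() {
    OldBox<int> b{7};
    return b.value;
}
```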

i386: Fix wrong-code x86 issue with avx512{f,vl} fma PR93009

As mentioned in the PR, the following testcase is miscompiled with avx512vl.
The reason is that the fma *_bcst_1 define_insns have two alternatives:
"=v,v" "0,v" "v,0" "m,m" and use the same
vfmadd213* %3<avx512bcst>, %2, %0<sd_mask_op4>
pattern.  If the first alternative is chosen, everything is ok, but if the
second alternative is chosen, %2 and %0 are the same register, so instead
of doing dest=dest*another+membcst we do dest=dest*dest+membcst.
Now, to fix this, either we'd need separate:
  "vfmadd213<ssemodesuffix>\t{%3<avx512bcst>, %2, %0<sd_mask_op4>|%0<sd_mask_op4>, %2, %3<avx512bcst>}
   vfmadd213<ssemodesuffix>\t{%3<avx512bcst>, %1, %0<sd_mask_op4>|%0<sd_mask_op4>, %1, %3<avx512bcst>}"
where for the second alternative, we'd just use %1 instead of %2, but
what I think is actually cleaner is just use a single alternative and
make the two multiplication operands commutative, which they really are.

2020-01-15  Jakub Jelinek  <jakub@redhat.com>

PR target/93009
* config/i386/sse.md
(*<sd_mask_codefor>fma_fmadd_<mode><sd_maskz_name>_bcst_1,
*<sd_mask_codefor>fma_fmsub_<mode><sd_maskz_name>_bcst_1,
*<sd_mask_codefor>fma_fnmadd_<mode><sd_maskz_name>_bcst_1,
*<sd_mask_codefor>fma_fnmsub_<mode><sd_maskz_name>_bcst_1): Use
just a single alternative instead of two, make operands 1 and 2
commutative.

* gcc.target/i386/avx512vl-pr93009.c: New test.
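
The wrong-code effect described for PR target/93009 can be illustrated with
scalar arithmetic, using std::fma as a stand-in for the vector instruction.
The function names and operand values are assumptions for illustration; no
AVX-512 code is involved.

```cpp
#include <cmath>

// Intended operation: dest = dest * another + membcst.
double fma_intended(double dest, double another, double membcst) {
    return std::fma(dest, another, membcst);
}

// What the second alternative effectively computed when %2 and %0 named the
// same register: dest = dest * dest + membcst.
double fma_miscompiled(double dest, double /*another*/, double membcst) {
    return std::fma(dest, dest, membcst);
}
```

With dest=2, another=3 and membcst=1 the intended result is 7 while the
miscompiled form yields 5. Swapping the two multiplication operands of
fma_intended gives the same value, which is why marking them commutative is
safe.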

re PR libgomp/93219 (unused return value in affinity-fmt.c)

PR libgomp/93219
* libgomp.h (gomp_print_string): Change return type from void to int.
* affinity-fmt.c (gomp_print_string): Likewise. Return true if
not all characters have been written.

re PR inline-asm/93202 ([RISCV] ICE when using inline asm 'h' operand modifier)

PR inline-asm/93202
* config/riscv/riscv.c (riscv_print_operand_reloc): Use
output_operand_lossage instead of gcc_unreachable.
* doc/md.texi (riscv f constraint): Fix typo.

* gcc.target/riscv/pr93202.c: New test.

re PR rtl-optimization/93088 (Compile time hog on gcc/testsuite/gcc.target/i386/pr56348.c w/ -O3 -funroll-loops -fno-tree-dominator-opts -fno-tree-vrp)

PR rtl-optimization/93088
* loop-iv.c (find_single_def_src): Punt after looking through
128 reg copies for regs with single definitions. Move definitions
to first uses.

* gcc.target/i386/pr93088.c: New test.
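
The "punt after 128 reg copies" idea can be sketched as a bounded walk over
a copy chain. This is a hypothetical model, not the code in loop-iv.c: the
map copy_of, the function find_source_bounded and the -1 "gave up" result
are all assumptions made for illustration.

```cpp
#include <unordered_map>

// copy_of[r] == s models "register r has a single definition, a copy of s".
// Follows the chain starting at `reg` and returns the final source, or -1
// once 128 copies have been looked through without reaching the end.
int find_source_bounded(const std::unordered_map<int, int>& copy_of, int reg) {
    constexpr int kMaxCopies = 128;  // limit named in the ChangeLog entry
    for (int steps = 0; steps < kMaxCopies; ++steps) {
        auto it = copy_of.find(reg);
        if (it == copy_of.end())
            return reg;        // not a copy: this is the source
        reg = it->second;      // look through one more copy
    }
    return -1;                 // too many copies (or a cycle): give up
}
```

The bound keeps the cost of each query constant even for very long or
cyclic chains, at the price of missing sources that lie further away.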

re PR ipa/93087 (Bogus `-Wsuggest-attribute=cold` on function already marked as `__attribute__((cold))`)

PR ipa/93087
* predict.c (compute_function_frequency): Don't call
warn_function_cold on functions that already have cold attribute.

* c-c++-common/cold-1.c: New test.

re PR libgomp/93065 (libgomp: destructor missing to delete goacc_cleanup_key)

PR libgomp/93065
* oacc-init.c (goacc_runtime_deinitialize): New function.

re PR c++/92438 (Function declaration parsed incorrectly with `-std=c++1z`)

PR c++/92438
* parser.c (cp_parser_constructor_declarator_p): If open paren
is followed by RID_ATTRIBUTE, skip over the attribute tokens and
try to parse type specifier.

* g++.dg/ext/attrib61.C: New test.

re PR c++/92992 (Side-effects dropped when decltype(nullptr) typed expression is passed to ellipsis)

PR c++/92992
* call.c (convert_arg_to_ellipsis): For decltype(nullptr) arguments
that have side-effects use cp_build_compound_expr.

* g++.dg/cpp0x/nullptr45.C: New test.
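
A minimal sketch of the PR c++/92992 scenario: an expression of type
decltype(nullptr) with a side effect is passed through an ellipsis, and the
side effect must still happen. The names call_count, next_null, take_varargs
and calls_after_passing_to_ellipsis are hypothetical; this is not claimed to
be the committed testcase.

```cpp
#include <cstddef>

int call_count = 0;  // records the side effect

// Returns a null pointer constant of type std::nullptr_t, with a side effect.
std::nullptr_t next_null() {
    ++call_count;
    return nullptr;
}

// A variadic function; the extra arguments are deliberately ignored.
int take_varargs(int n, ...) { return n; }

// Returns how many times next_null ran when its result was passed to "...".
int calls_after_passing_to_ellipsis() {
    call_count = 0;
    take_varargs(1, next_null());
    return call_count;
}
```

The counter should read 1 after the call; the bug dropped the evaluation of
next_null() altogether.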

Fix ICE with cast of division by zero (PR c/93348).

Bug 93348 reports an ICE on certain cases of casts of expressions that
may appear only in unevaluated parts of integer constant expressions,
arising from the generation of nested C_MAYBE_CONST_EXPRs. This patch
fixes it by adding a call to remove_c_maybe_const_expr in the
integer-operands case, as is done in other similar cases.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.

PR c/93348
gcc/c:
* c-typeck.c (build_c_cast): Call remove_c_maybe_const_expr on
argument with integer operands.

gcc/testsuite:
* gcc.c-torture/compile/pr93348-1.c: New test.

(cherry picked from commit ac68e287fc2e939ae6b45ba7ff04e493982b7f62)

Daily bump.

Bug 93234 - INQUIRE on pre-assigned files of ROUND and SIGN properties fails

2020-01-21 Jerry DeLisle <jvdelisle@gcc.gnu.org>

Backport from mainline
PR libfortran/93234
* io/unit.c (set_internal_unit): Set round and sign flags
correctly.

* gfortran.dg/inquire_pre.f90: New test.

PR c++/91476 - anon-namespace reference temp clash between TUs.

* call.c (make_temporary_var_for_ref_to_temp): Clear TREE_PUBLIC
if DECL is in the anonymous namespace.

Daily bump.

Update GCC zh_TW.po.

* zh_TW.po: Update.

[PATCH] PR Fortran/93263 Correct test case

Should have checked for the existence of a non-static integer
using scan-tree-dump instead of scan-tree-dump-not. A cut-and-paste
error.

PR middle-end/93246 - missing alias subsets

Starting with the introduction of TYPE_TYPELESS_STORAGE, the situation
of having an alias-set zero aggregate field became more common, which
prevents recording alias-sets of fields of said aggregate as subsets
of the outer aggregate. component_uses_parent_alias_set_from in the
past fended off some of the issues with that, but the alias oracle's
use of the alias set of the base of an access path never appropriately
handled it.

The following makes it so that alias-sets of fields of alias-set zero
aggregate fields are still recorded as subset of the container.

2020-01-14 Richard Biener <rguenther@suse.de>

PR middle-end/93246
* alias.c (record_component_aliases): Take superset to record
into, recurse for alias-set zero fields.
(record_component_aliases): New overload wrapping around the above.

* g++.dg/torture/pr93246.C: New testcase.

Backport f48c6014133c8989702458f9082e34ba6dd326d4

Backport from mainline
2020-01-16 Martin Liska <mliska@suse.cz>

* lto-partition.c (lto_balanced_map): Remember
best_noreorder_pos and then restore to it
when we revert.

Clean up references to Subversion in documentation sources.

Clean up references to SVN in the GCC docs, redirecting to Git
documentation as appropriate.

Where references to "the source code repository" rather than a
specific VCS make sense, I have used them. You might, after
all, change VCSes again someday.

I have modified neither generated HTML files nor maintainer scripts.
These changes should be complete with respect to the documentation tree.

2020-01-19  Eric S. Raymond <esr@thyrsus.com>
    Sandra Loosemore  <sandra@codesourcery.com>

Partial backport from mainline:

2020-01-19  Eric S. Raymond <esr@thyrsus.com>

gcc/
* doc/contribute.texi: Update for SVN -> Git transition.
* doc/install.texi: Likewise.

libstdc++-v3
* doc/xml/faq.xml: Update for SVN -> Git transition.
* doc/xml/manual/appendix_contributing.xml: Likewise.

Daily bump.

PR c++/92531 - ICE with noexcept(lambda).

This was failing because uses_template_parms didn't recognize LAMBDA_EXPR as
a kind of expression. Instead of trying to enumerate all the different
varieties of expression and then aborting if what's left isn't
error_mark_node, let's handle error_mark_node and then assume anything else
is an expression.

* pt.c (uses_template_parms): Don't try to enumerate all the
expression cases.

PR c++/93286 - ICE with __is_constructible and variadic template.

Here we had been recursing in tsubst_copy_and_build if type2 was a TREE_LIST
because that function knew how to deal with pack expansions, and tsubst
didn't. But tsubst_copy_and_build expects to be dealing with expressions,
so we crash when trying to convert_from_reference a type.

* pt.c (tsubst) [TREE_LIST]: Handle pack expansion.
(tsubst_copy_and_build) [TRAIT_EXPR]: Always use tsubst for type2.
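
The construct behind PR c++/93286 is a trait expression whose second operand
is a pack expansion inside a variadic template. A hedged sketch, assuming the
GCC/Clang built-in __is_constructible; the name CanMakeInt is hypothetical
and this is not the committed testcase.

```cpp
// type2 of the trait is the pack expansion Args..., which has to be
// substituted as a list of types, not as expressions.
template <typename... Args>
struct CanMakeInt {
    static constexpr bool value = __is_constructible(int, Args...);
};

static_assert(CanMakeInt<>::value, "int() is well-formed");
static_assert(CanMakeInt<double>::value, "int(double) is well-formed");
static_assert(!CanMakeInt<int, int>::value, "int(int, int) is ill-formed");
```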

Fortran: PR93263 -fno-automatic and RECURSIVE

The use of -fno-automatic should not affect the save attribute of a
recursive procedure. The first test case checks unsaved variables
and the second checks saved variables.