git.ipfire.org Git - thirdparty/gcc.git/log

libstdc++: Use using instead of typedef in ops-common.h

libstdc++-v3/ChangeLog:

* src/filesystem/ops-common.h (stat_type): Use using.

Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
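
A minimal sketch of the change in spelling, assuming a stand-in structure: demo_stat, stat_type_old and stat_type_new are hypothetical names, and the real stat_type in ops-common.h aliases a system type rather than this one. It only illustrates that the two spellings declare the same alias.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the platform stat structure.
struct demo_stat { long size; };

typedef demo_stat stat_type_old;   // typedef spelling
using stat_type_new = demo_stat;   // alias-declaration spelling

// Both spellings name exactly the same type.
static_assert(std::is_same<stat_type_old, stat_type_new>::value,
              "typedef and using declare the same alias");
```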

libstdc++: Fix error handling in filesystem::equivalent [PR113250]

This patch makes std::filesystem::equivalent correctly throw an exception
when either path does not exist, as per [fs.op.equivalent]/4.
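
The effect of switching from && to || can be sketched with a toy model of the error decision. This is not the libstdc++ source: equivalent_error_old and equivalent_error_new are hypothetical helpers that take the two "exists" results as plain booleans.

```cpp
#include <cassert>
#include <system_error>

// Toy model only. exists1/exists2 say whether each path names an
// existing file.

// Pre-patch shape: an error is reported only when both paths are missing.
inline std::error_code
equivalent_error_old(bool exists1, bool exists2)
{
  if (!exists1 && !exists2)
    return std::make_error_code(std::errc::no_such_file_or_directory);
  return {};
}

// Post-patch shape: an error is reported when either path is missing.
inline std::error_code
equivalent_error_new(bool exists1, bool exists2)
{
  if (!exists1 || !exists2)
    return std::make_error_code(std::errc::no_such_file_or_directory);
  return {};
}
```

The two versions only differ when exactly one of the paths exists.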

PR libstdc++/113250

libstdc++-v3/ChangeLog:

* src/c++17/fs_ops.cc (fs::equivalent): Use || instead of &&.
* src/filesystem/ops.cc (fs::equivalent): Likewise.
* testsuite/27_io/filesystem/operations/equivalent.cc: Handle
error codes.
* testsuite/experimental/filesystem/operations/equivalent.cc:
Likewise.

Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>

LoongArch: Implement option save/restore

LTO option streaming and target attributes both require per-function
target configuration, which is achieved via option save/restore.

We implement TARGET_OPTION_{SAVE,RESTORE} to switch the la_target
context in addition to other automatically maintained option states
(via the "Save" option property in the .opt files).

Tested on loongarch64-linux-gnu without regression.

PR target/113233

gcc/ChangeLog:

* config/loongarch/genopts/loongarch.opt.in: Mark options with
the "Save" property.
* config/loongarch/loongarch.opt: Same.
* config/loongarch/loongarch-opts.cc: Refresh -mcmodel= state
according to la_target.
* config/loongarch/loongarch.cc: Implement TARGET_OPTION_{SAVE,
RESTORE} for the la_target structure; Rename option conditions
to have the same "la_" prefix.
* config/loongarch/loongarch.h: Same.

LOOP-UNROLL: Leverage HAS_SIGNED_ZERO for var expansion

insert_var_expansion_initialization depends on HONOR_SIGNED_ZEROS
to initialize the unrolling variables to +0.0f instead of -0.0f
when signed zeros are not honored. However, we should always keep
the -0.0f here because:

* The -0.0f is always the correct initial value.
* We need to support targets that always honor signed zeros.

Thus, we need to use MODE_HAS_SIGNED_ZEROS instead of
HONOR_SIGNED_ZEROS when initializing. The target/backend can then
decide whether to honor no-signed-zeros or not.
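
Why -0.0 is the safe initial value can be sketched with a hand-expanded sum. This is only an illustration under IEEE 754 round-to-nearest arithmetic; sum_expanded is a hypothetical helper, not code from loop-unroll.cc.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Sum v[0..n) with two partial accumulators, roughly as variable
// expansion would.  acc0 keeps the original start value; acc1 is the
// extra accumulator whose initial value is in question.
inline double sum_expanded(double start, double extra_init,
                           const double* v, std::size_t n)
{
  double acc0 = start;
  double acc1 = extra_init;
  std::size_t i = 0;
  for (; i + 1 < n; i += 2)
    {
      acc0 += v[i];
      acc1 += v[i + 1];
    }
  if (i < n)
    acc0 += v[i];
  return acc0 + acc1;   // combine the partial sums
}
```

With every input equal to -0.0, starting the extra accumulator at -0.0 keeps the sign of the scalar result, while starting it at +0.0 flips the result to +0.0.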

We also removed the testcase pr30957-1.c, as whether its return value
is positive or negative is undefined behavior.

The following tests pass with this patch:

* The riscv regression tests.
* The aarch64 regression tests.
* The x86 bootstrap and regression tests.

gcc/ChangeLog:

* loop-unroll.cc (insert_var_expansion_initialization): Leverage
MODE_HAS_SIGNED_ZEROS for expansion variable initialization.

gcc/testsuite/ChangeLog:

* gcc.dg/pr30957-1.c: Remove.

Signed-off-by: Pan Li <pan2.li@intel.com>

aarch64: Fix dwarf2cfi ICEs due to recent CFI note changes [PR113077]

In r14-6604-gd7ee988c491cde43d04fe25f2b3dbad9d85ded45 we changed the CFI notes
attached to callee saves (in aarch64_save_callee_saves).  That patch changed
the ldp/stp representation to use unspecs instead of PARALLEL moves.  This meant
that we needed to attach CFI notes to all frame-related pair saves such that
dwarf2cfi could still emit the appropriate CFI (it cannot interpret the unspecs
directly).  The patch also attached REG_CFA_OFFSET notes to individual saves so
that the ldp/stp pass could easily preserve them when forming stps.

In that change I chose to use REG_CFA_OFFSET, but as the PR shows, that
choice was problematic in that REG_CFA_OFFSET requires the attached
store to be expressed in terms of the current CFA register at all times.
This means that even scheduling of frame-related insns can break this
invariant, leading to ICEs in dwarf2cfi.

The old behaviour (before that change) allowed dwarf2cfi to interpret the RTL
directly for sp-relative saves.  This change restores that behaviour by using
REG_FRAME_RELATED_EXPR instead of REG_CFA_OFFSET.  REG_FRAME_RELATED_EXPR
effectively just gives a different pattern for dwarf2cfi to look at instead of
the main insn pattern.  That allows us to attach the old-style PARALLEL move
representation in a REG_FRAME_RELATED_EXPR note and means we are free to always
express the save addresses in terms of the stack pointer.

Since the ldp/stp fusion pass can combine frame-related stores, this patch also
updates it to preserve REG_FRAME_RELATED_EXPR notes, and additionally gives it
the ability to synthesize those notes when combining sp-relative saves into an
stp (the latter always needs a note due to the unspec representation, the former
does not).

gcc/ChangeLog:

PR target/113077
* config/aarch64/aarch64-ldp-fusion.cc (filter_notes): Add
fr_expr param to extract REG_FRAME_RELATED_EXPR notes.
(combine_reg_notes): Handle REG_FRAME_RELATED_EXPR notes, and
synthesize these if needed.  Update caller ...
(ldp_bb_info::fuse_pair): ... here.
(ldp_bb_info::try_fuse_pair): Punt if either insn has writeback
and either insn is frame-related.
(find_trailing_add): Punt on frame-related insns.
* config/aarch64/aarch64.cc (aarch64_save_callee_saves): Use
REG_FRAME_RELATED_EXPR instead of REG_CFA_OFFSET.

gcc/testsuite/ChangeLog:

PR target/113077
* gcc.target/aarch64/pr113077.c: New test.

MIPS: Add ATTRIBUTE_UNUSED to mips_start_function_definition

Fix build warning:
mips.cc: warning: unused parameter 'decl'.

gcc
* config/mips/mips.cc (mips_start_function_definition):
Add ATTRIBUTE_UNUSED.

tree-optimization/111003 - new testcase

Testcase for fixed PR.

PR tree-optimization/111003
gcc/testsuite/
* gcc.dg/tree-ssa/pr111003.c: New testcase.

middle-end/112740 - vector boolean CTOR expansion issue

The optimization to expand uniform boolean vectors by sign-extension
works only for dense masks but it failed to check that.

PR middle-end/112740
* expr.cc (store_constructor): Check the integer vector
mask has a single bit per element before using sign-extension
to expand a uniform vector.

* gcc.dg/pr112740.c: New testcase.

RISC-V: VLA preempts VLS on unknown NITERS loop

This patch fixes the known issues on SLP cases:

ble a2,zero,.L11
addiw t1,a2,-1
li a5,15
bleu t1,a5,.L9
srliw a7,t1,4
slli a7,a7,7
lui t3,%hi(.LANCHOR0)
lui a6,%hi(.LANCHOR0+128)
addi t3,t3,%lo(.LANCHOR0)
li a4,128
addi a6,a6,%lo(.LANCHOR0+128)
add a7,a7,a0
addi a3,a1,37
mv a5,a0
vsetvli zero,a4,e8,m8,ta,ma
vle8.v v24,0(t3)
vle8.v v16,0(a6)
.L4:
li a6,128
vle8.v v0,0(a3)
vrgather.vv v8,v0,v24
vadd.vv v8,v8,v16
vse8.v v8,0(a5)
add a5,a5,a6
add a3,a3,a6
bne a5,a7,.L4
andi a5,t1,-16
mv t1,a5
.L3:
subw a2,a2,a5
li a4,1
beq a2,a4,.L5
slli a5,a5,32
srli a5,a5,32
addiw a2,a2,-1
slli a5,a5,3
csrr a4,vlenb
slli a6,a2,32
addi t3,a5,37
srli a3,a6,29
slli a4,a4,2
add t3,a1,t3
add a5,a0,a5
mv t5,a3
bgtu a3,a4,.L14
.L6:
li a4,50790400
addi a4,a4,1541
li a6,67633152
addi a6,a6,513
slli a4,a4,32
add a4,a4,a6
vsetvli t4,zero,e64,m4,ta,ma
vmv.v.x v16,a4
vsetvli a6,zero,e16,m8,ta,ma
vid.v v8
vsetvli zero,t5,e8,m4,ta,ma
vle8.v v20,0(t3)
vsetvli a6,zero,e16,m8,ta,ma
csrr a7,vlenb
vand.vi v8,v8,-8
vsetvli zero,zero,e8,m4,ta,ma
slli a4,a7,2
vrgatherei16.vv v4,v20,v8
vadd.vv v4,v4,v16
vsetvli zero,t5,e8,m4,ta,ma
vse8.v v4,0(a5)
bgtu a3,a4,.L15
.L7:
addw t1,a2,t1
.L5:
slliw a5,t1,3
add a1,a1,a5
lui a4,%hi(.LC2)
add a0,a0,a5
lbu a3,37(a1)
addi a5,a4,%lo(.LC2)
vsetivli zero,8,e8,mf2,ta,ma
vmv.v.x v1,a3
vle8.v v2,0(a5)
vadd.vv v1,v1,v2
vse8.v v1,0(a0)
.L11:
ret
.L15:
sub a3,a3,a4
bleu a3,a4,.L8
mv a3,a4
.L8:
li a7,50790400
csrr a4,vlenb
slli a4,a4,2
addi a7,a7,1541
li t4,67633152
add t3,t3,a4
vsetvli zero,a3,e8,m4,ta,ma
slli a7,a7,32
addi t4,t4,513
vle8.v v20,0(t3)
add a4,a5,a4
add a7,a7,t4
vsetvli a5,zero,e64,m4,ta,ma
vmv.v.x v16,a7
vsetvli a6,zero,e16,m8,ta,ma
vid.v v8
vand.vi v8,v8,-8
vsetvli zero,zero,e8,m4,ta,ma
vrgatherei16.vv v4,v20,v8
vadd.vv v4,v4,v16
vsetvli zero,a3,e8,m4,ta,ma
vse8.v v4,0(a4)
j .L7
.L14:
mv t5,a4
j .L6
.L9:
li a5,0
li t1,0
j .L3

The vectorized codegen is quite inefficient because we choose a VLS mode to vectorize the loop body
and a VLA mode for the epilogue.

cost.c:6:21: note:  ***** Choosing vector mode V128QI
cost.c:6:21: note:  ***** Choosing epilogue vector mode RVVM4QI

As we know, RVV has both VLA modes and VLS modes. VLA modes support partial vectors, whereas
VLS modes support full vectors.  The goal of adding VLS modes is to improve the codegen of known-NITERS
loops and SLP code.

If NITERS is unknown (that is, i < n where n is unknown), we will always have partial-vector vectorization,
either in the loop body or in the epilogue. In this case, it is always more efficient to apply VLA partial
vectorization to the loop body, which then needs no epilogue.
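
The partial-vector loop shape can be sketched in scalar C++. This is a model only: add_one_partial is a hypothetical helper, vl stands in for the hardware vector length, and the inner loop stands in for one vector operation. It mirrors the min(remaining, vl) step that shows up as minu followed by vsetvli in the code after this patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Add 1 to every element, processing up to vl elements per iteration.
// vl must be non-zero.  Returns the number of loop iterations.
inline std::size_t add_one_partial(std::vector<int>& a, std::size_t vl)
{
  std::size_t iterations = 0;
  for (std::size_t i = 0; i < a.size(); i += vl, ++iterations)
    {
      // Active length for this iteration: min(remaining, vl).
      const std::size_t active = std::min(a.size() - i, vl);
      for (std::size_t j = 0; j < active; ++j)
        a[i + j] += 1;
    }
  return iterations;
}
```

Because the last iteration simply runs with a shorter active length, no separate epilogue loop is needed.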

After this patch:

f:
ble a2,zero,.L7
li a5,1
beq a2,a5,.L5
li a6,50790400
addi a6,a6,1541
li a4,67633152
addi a4,a4,513
csrr a5,vlenb
addiw a2,a2,-1
slli a6,a6,32
add a6,a6,a4
slli a5,a5,2
slli a4,a2,32
vsetvli t1,zero,e64,m4,ta,ma
srli a3,a4,29
neg t4,a5
addi a7,a1,37
mv a4,a0
vmv.v.x v12,a6
vsetvli t3,zero,e16,m8,ta,ma
vid.v v16
vand.vi v16,v16,-8
.L4:
minu a6,a3,a5
vsetvli zero,a6,e8,m4,ta,ma
vle8.v v8,0(a7)
vsetvli t3,zero,e8,m4,ta,ma
mv t1,a3
vrgatherei16.vv v4,v8,v16
vsetvli zero,a6,e8,m4,ta,ma
vadd.vv v4,v4,v12
vse8.v v4,0(a4)
add a7,a7,a5
add a4,a4,a5
add a3,a3,t4
bgtu t1,a5,.L4
.L3:
slliw a2,a2,3
add a1,a1,a2
lui a5,%hi(.LC0)
lbu a4,37(a1)
add a0,a0,a2
addi a5,a5,%lo(.LC0)
vsetivli zero,8,e8,mf2,ta,ma
vmv.v.x v1,a4
vle8.v v2,0(a5)
vadd.vv v1,v1,v2
vse8.v v1,0(a0)
.L7:
ret

Tested on both RV32 and RV64 with no regressions. OK for trunk?

gcc/ChangeLog:

* config/riscv/riscv-vector-costs.cc (costs::better_main_loop_than_p): VLA
preempt VLS on unknown NITERS loop.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/partial/slp-1.c: Remove xfail.
* gcc.target/riscv/rvv/autovec/partial/slp-16.c: Ditto.
* gcc.target/riscv/rvv/autovec/partial/slp-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/partial/slp-5.c: Ditto.

libstdc++: Optimize std::is_compound compilation performance

This patch optimizes the compilation performance of std::is_compound.

libstdc++-v3/ChangeLog:

* include/std/type_traits (is_compound): Do not use __not_.
(is_compound_v): Use is_fundamental_v instead.

Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>
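
A rough sketch of the idea, assuming the trait is expressed directly as the negation of is_fundamental instead of going through a helper template. demo_is_compound and demo_is_compound_v are hypothetical names, not the libstdc++ definitions.

```cpp
#include <cassert>
#include <type_traits>

// A compound type is any type that is not fundamental.
template<typename T>
struct demo_is_compound
  : std::integral_constant<bool, !std::is_fundamental<T>::value>
{ };

// Variable template computed from the same negation (C++14).
template<typename T>
constexpr bool demo_is_compound_v = !std::is_fundamental<T>::value;

static_assert(demo_is_compound_v<int*>, "pointers are compound");
static_assert(!demo_is_compound_v<int>, "int is fundamental");
```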

Add -mevex512 into invoke.texi

Hi Richard,

It seems that I sent out an outdated patch. This patch is the one
I meant to send.

Thx,
Haochen

gcc/ChangeLog:

* doc/invoke.texi: Add -mevex512.

LoongArch: Optimize some of the sign-extension instructions generated during bitwise operations.

There are two mode iterators defined in the loongarch.md:
(define_mode_iterator GPR [SI (DI "TARGET_64BIT")])
and
(define_mode_iterator X [(SI "!TARGET_64BIT") (DI "TARGET_64BIT")])
Replace the mode in the bit arithmetic from GPR to X.

Since the bitwise instructions do not distinguish between operand widths
(64-bit, 32-bit, etc.), a sign extension is necessary when the bitwise
operation is narrower than 64 bits.
The original definition would have generated a lot of redundant
sign-extension instructions. This patch removes them, following the
implementation in RISC-V.
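
One way to see why such extensions are redundant: a 64-bit bitwise operation on two sign-extended 32-bit inputs already yields a sign-extended result. The sketch below checks that property in plain C++; sext, is_sign_extended, and64, xor64 and nor64 are hypothetical helpers, not LoongArch patterns.

```cpp
#include <cassert>
#include <cstdint>

// Widen a 32-bit value to 64 bits by sign extension.
constexpr std::int64_t sext(std::int32_t x)
{ return static_cast<std::int64_t>(x); }

// True when a 64-bit value equals the sign extension of its low
// 32 bits, i.e. it fits in the signed 32-bit range.
constexpr bool is_sign_extended(std::int64_t x)
{ return x >= INT32_MIN && x <= INT32_MAX; }

// 64-bit bitwise operations applied to sign-extended 32-bit inputs.
constexpr std::int64_t and64(std::int32_t a, std::int32_t b)
{ return sext(a) & sext(b); }
constexpr std::int64_t xor64(std::int32_t a, std::int32_t b)
{ return sext(a) ^ sext(b); }
constexpr std::int64_t nor64(std::int32_t a, std::int32_t b)
{ return ~(sext(a) | sext(b)); }
```

In each case the upper 32 bits of the result are copies of bit 31, so no further extension is required before using the value as a 32-bit quantity.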

With this patch, SPEC2017 500.perlbench performance improves by 1.8%.

gcc/ChangeLog:

* config/loongarch/loongarch.md (one_cmpl<mode>2): Replace GPR with X.
(*nor<mode>3): Likewise.
(nor<mode>3): Likewise.
(*negsi2_extended): New template.
(*<optab>si3_internal): Likewise.
(*one_cmplsi2_internal): Likewise.
(*norsi3_internal): Likewise.
(*<optab>nsi_internal): Likewise.
(bytepick_w_<bytepick_imm>_extend): Modify this template according to the
modified bit operation to make the optimization work.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/sign-extend-bitwise.c: New test.

Optimize A < B ? A : B to MIN_EXPR.

Similarly, optimize A < B ? B : A to MAX_EXPR.
The frontend already has code to optimize such patterns, but it fails to
handle the testcase in the PR because the pattern is only exposed at the
gimple level when folding backend builtins.

pr95906 can now be optimized to MAX_EXPR, as the comment in the
testcase asks.

// FIXME: this should further optimize to a MAX_EXPR
typedef signed char v16i8 __attribute__((vector_size(16)));
v16i8 f(v16i8 a, v16i8 b)
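
The equivalence behind the new pattern can be sketched lane by lane. This uses a plain std::array as a stand-in for the vector type, so it is an illustration of the identity rather than the testcase itself; demo_v16i8, select_lt and lane_min are hypothetical names.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Plain-array stand-in for a 16-lane signed char vector.
using demo_v16i8 = std::array<signed char, 16>;

// Lane-wise A < B ? A : B, the shape the pattern recognizes.
inline demo_v16i8 select_lt(const demo_v16i8& a, const demo_v16i8& b)
{
  demo_v16i8 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = a[i] < b[i] ? a[i] : b[i];
  return r;
}

// Lane-wise minimum, the form the select is rewritten to.
inline demo_v16i8 lane_min(const demo_v16i8& a, const demo_v16i8& b)
{
  demo_v16i8 r{};
  for (std::size_t i = 0; i < r.size(); ++i)
    r[i] = std::min(a[i], b[i]);
  return r;
}
```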

gcc/ChangeLog:

PR target/104401
* match.pd (VEC_COND_EXPR: A < B ? A : B -> MIN_EXPR): New pattern match.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr104401.c: New test.
* gcc.dg/tree-ssa/pr95906.c: Adjust testcase.

PR modula2/112946 set expression type checking

This patch adds type checking for binary set operators.
It also checks the IN operator and improves the := type checking.

gcc/m2/ChangeLog:

PR modula2/112946
* gm2-compiler/M2GenGCC.mod (IsExpressionCompatible): Import.
(ExpressionTypeCompatible): Import.
(CodeStatement): Remove op1, op2, op3 parameters from CodeSetOr,
CodeSetAnd, CodeSetSymmetricDifference, CodeSetLogicalDifference.
(checkArrayElements): Rename op1 to des and op3 to expr.
Use despos and exprpos instead of CurrentQuadToken.
(checkRecordTypes): Rename op1 to des and op2 to expr.
Use virtpos instead of CurrentQuadToken.
(checkIncorrectMeta): Ditto.
(checkBecomes): Rename op1 to des and op3 to expr.
Use virtpos instead of CurrentQuadToken.
(NoWalkProcedure): New procedure stub.
(CheckBinaryExpressionTypes): New procedure function.
(CheckElementSetTypes): New procedure function.
(CodeBinarySet): Re-write.
(FoldBinarySet): Re-write.
(CodeSetOr): Remove parameters op1, op2 and op3.
(CodeSetAnd): Ditto.
(CodeSetLogicalDifference): Ditto.
(CodeSetSymmetricDifference): Ditto.
(CodeIfIn): Call CheckBinaryExpressionTypes and
CheckElementSetTypes.
* gm2-compiler/M2Quads.mod (BuildRotateFunction): Correct
parameters to MakeVirtualTok to reflect parameter block
passed to Rotate.

gcc/testsuite/ChangeLog:

PR modula2/112946
* gm2/pim/fail/badbecomes.mod: New test.
* gm2/pim/fail/badexpression.mod: New test.
* gm2/pim/fail/badexpression2.mod: New test.
* gm2/pim/fail/badifin.mod: New test.
* gm2/pim/pass/goodifin.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

config: delete unused CYG_AC_PATH_LIBERTY macro

Nothing uses this, so delete it to avoid confusion.

config/ChangeLog:

* acinclude.m4 (CYG_AC_PATH_LIBERTY): Delete.

Daily bump.

libstdc++: Use _GLIBCXX_USE_BUILTIN_TRAIT for _Nth_type

Since _Nth_type has a fallback native implementation, use
_GLIBCXX_USE_BUILTIN_TRAIT when checking for __type_pack_element
so that we can easily toggle which implementation to use.
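
A sketch of the toggle idea, under stated assumptions: DEMO_USE_BUILTIN and DEMO_NO_BUILTINS are hypothetical macros that only imitate the role of _GLIBCXX_USE_BUILTIN_TRAIT, and nth_type is a hypothetical name. The builtin branch is compiled only when the compiler reports __type_pack_element; otherwise a recursive fallback is used.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Enable the builtin path only if the compiler advertises it and the
// (hypothetical) opt-out macro DEMO_NO_BUILTINS is not defined.
#if defined(__has_builtin) && !defined(DEMO_NO_BUILTINS)
# if __has_builtin(__type_pack_element)
#  define DEMO_USE_BUILTIN 1
# endif
#endif

#ifdef DEMO_USE_BUILTIN
template<std::size_t N, typename... Ts>
struct nth_type
{ using type = __type_pack_element<N, Ts...>; };
#else
// Fallback: peel one type off the pack per recursion step.
template<std::size_t N, typename... Ts>
struct nth_type;

template<typename T, typename... Rest>
struct nth_type<0, T, Rest...>
{ using type = T; };

template<std::size_t N, typename T, typename... Rest>
struct nth_type<N, T, Rest...> : nth_type<N - 1, Rest...>
{ };
#endif
```

Defining DEMO_NO_BUILTINS before the block forces the fallback, which is the kind of easy switching the commit message describes.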

libstdc++-v3/ChangeLog:

* include/bits/utility.h (_Nth_type): Use
_GLIBCXX_USE_BUILTIN_TRAIT instead of __has_builtin.

RISC-V: Switch RVV cost model.

This patch is a preparatory patch for the following cost model tweak.

Since we don't have a vector cost model in the default tune info (rocket),
we set the generic cost model as the default.

The reason we want to switch to the generic vector cost model is that the default
cost model generates inferior codegen for various benchmarks.

For example, in PR113247 we have a performance bug where we end up with an over 70%
performance drop in SHA256. Currently, no matter how we adapt the cost model,
we are not able to fix the performance bug since we always use the default cost model.

Also, tweak the generic cost model back to the default cost model since we have some FAILs in
the current tests.

After this patch, we (Robin and I) can work on cost model tuning together to improve performance
in various benchmarks.

Tested on both RV32 and RV64. OK for trunk?

gcc/ChangeLog:

* config/riscv/riscv.cc (get_common_costs): Switch RVV cost model.
(get_vector_costs): Ditto.
(riscv_builtin_vectorization_cost): Ditto.

RISC-V: Minor tweak dynamic cost model

v2 update: Robustify tests.

While working on the cost model, I noticed one case where the dynamic LMUL cost model doesn't work well.

Before this patch:

foo:
        lui     a4,%hi(.LANCHOR0)
        li      a0,1953
        li      a1,63
        addi    a4,a4,%lo(.LANCHOR0)
        li      a3,64
        vsetvli a2,zero,e32,mf2,ta,ma
        vmv.v.x v5,a0
        vmv.v.x v4,a1
        vid.v   v3
.L2:
        vsetvli a5,a3,e32,mf2,ta,ma
        vadd.vi v2,v3,1
        vadd.vv v1,v3,v5
        mv      a2,a5
        vmacc.vv        v1,v2,v4
        slli    a1,a5,2
        vse32.v v1,0(a4)
        sub     a3,a3,a5
        add     a4,a4,a1
        vsetvli a5,zero,e32,mf2,ta,ma
        vmv.v.x v1,a2
        vadd.vv v3,v3,v1
        bne     a3,zero,.L2
        li      a0,0
        ret

Unexpected: it uses a scalable vector with LMUL = MF2, which wastes computation resources.

Ideally, we should use LMUL = M8 VLS modes.

The root cause is that the dynamic LMUL heuristic dominates the VLS heuristic.
Adapt the cost model heuristic.

After this patch:

foo:
lui a4,%hi(.LANCHOR0)
addi a4,a4,%lo(.LANCHOR0)
li a3,4096
li a5,32
li a1,2016
addi a2,a4,128
addiw a3,a3,-32
vsetvli zero,a5,e32,m8,ta,ma
li a0,0
vid.v v8
vsll.vi v8,v8,6
vadd.vx v16,v8,a1
vadd.vx v8,v8,a3
vse32.v v16,0(a4)
vse32.v v8,0(a2)
ret

Tested on both RV32/RV64 with no regressions.

OK for trunk?

gcc/ChangeLog:

* config/riscv/riscv-vector-costs.cc (costs::better_main_loop_than_p): Minor tweak.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-10.c: Fix test.
* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-11.c: Ditto.
* gcc.dg/vect/costmodel/riscv/rvv/vla_vs_vls-12.c: Ditto.

libgccjit: Fix GGC segfault when using -flto

gcc/ChangeLog:
PR jit/111396
* ipa-fnsummary.cc (ipa_fnsummary_cc_finalize): Call
ipa_free_size_summary.
* ipa-icf.cc (ipa_icf_cc_finalize): New function.
* ipa-profile.cc (ipa_profile_cc_finalize): New function.
* ipa-prop.cc (ipa_prop_cc_finalize): New function.
* ipa-prop.h (ipa_prop_cc_finalize): New function.
* ipa-sra.cc (ipa_sra_cc_finalize): New function.
* ipa-utils.h (ipa_profile_cc_finalize, ipa_icf_cc_finalize,
ipa_sra_cc_finalize): New functions.
* toplev.cc (toplev::finalize): Call ipa_icf_cc_finalize,
ipa_prop_cc_finalize, ipa_profile_cc_finalize and
ipa_sra_cc_finalize.
Include ipa-utils.h.

gcc/testsuite/ChangeLog:
PR jit/111396
* jit.dg/all-non-failing-tests.h: Add note about test-ggc-bugfix.
* jit.dg/test-ggc-bugfix.c: New test.

RISC-V: T-HEAD: Add support for the XTheadInt ISA extension

The XTheadInt ISA extension provides the following instructions
to accelerate interrupt processing:
* th.ipush
* th.ipop

Ref:
https://github.com/T-head-Semi/thead-extension-spec/releases/download/2.3.0/xthead-2023-11-10-2.3.0.pdf

gcc/ChangeLog:

* config/riscv/riscv-protos.h (th_int_get_mask): New prototype.
(th_int_get_save_adjustment): Likewise.
(th_int_adjust_cfi_prologue): Likewise.
* config/riscv/riscv.cc (BITSET_P): Moved away from here.
(TH_INT_INTERRUPT): New macro.
(riscv_expand_prologue): Add the processing of XTheadInt.
(riscv_expand_epilogue): Likewise.
* config/riscv/riscv.h (BITSET_P): Moved to here.
* config/riscv/riscv.md: New unspec.
* config/riscv/thead.cc (th_int_get_mask): New function.
(th_int_get_save_adjustment): Likewise.
(th_int_adjust_cfi_prologue): Likewise.
* config/riscv/thead.md (th_int_push): New pattern.
(th_int_pop): New pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/xtheadint-push-pop.c: New test.

middle-end: Don't apply copysign optimization if target does not implement optab [PR112468]

Currently GCC does not treat IFN_COPYSIGN the same as the copysign tree expr.
The latter has a libcall fallback and the IFN can only do optabs.

Because of this, the change I made to optimize copysign only works if the
target has implemented the optab, but it should work for those that have the
libcall too.

More annoyingly, if a target has vector versions of ABS and NEG but not COPYSIGN
then the change made them lose vectorization.

The proper fix for this is to treat the IFN the same as the tree EXPR and to
enhance expand_COPYSIGN to also support vector calls.

I have such a patch for GCC 15 but it's quite big and too invasive for stage-4.
As such this is a minimal fix, just don't apply the transformation and leave
targets which don't have the optab unoptimized.

The target list for check_effective_target_ifn_copysign was obtained by grepping for
copysign and looking at the optab.

gcc/ChangeLog:

PR tree-optimization/112468
* doc/sourcebuild.texi: Document ifn_copysign.
* match.pd: Only apply transformation if target supports the IFN.

gcc/testsuite/ChangeLog:

PR tree-optimization/112468
* gcc.dg/fold-copysign-1.c: Modify tests based on if target supports
IFN_COPYSIGN.
* gcc.dg/pr55152-2.c: Likewise.
* gcc.dg/tree-ssa/abs-4.c: Likewise.
* gcc.dg/tree-ssa/backprop-6.c: Likewise.
* gcc.dg/tree-ssa/copy-sign-2.c: Likewise.
* gcc.dg/tree-ssa/mult-abs-2.c: Likewise.
* lib/target-supports.exp (check_effective_target_ifn_copysign): New.

reassoc vs uninitialized variable [PR112581]

Like r14-2293-g11350734240dba and r14-2289-gb083203f053f16,
reassociation can combine across a few basic blocks, and one of the uses
can be an uninitialized variable; going from a conditional
use to an unconditional use can then cause wrong code.
This uses maybe_undef_p like other passes where this can happen.

Note that if-to-switch uses the function (init_range_entry) provided
by reassociation, so we need to call mark_ssa_maybe_undefs there;
otherwise we assume almost all SSA names are uninitialized.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

PR tree-optimization/112581
* gimple-if-to-switch.cc (pass_if_to_switch::execute): Call
mark_ssa_maybe_undefs.
* tree-ssa-reassoc.cc (can_reassociate_op_p): Uninitialized
variables can not be reassociated.
(init_range_entry): Check for uninitialized variables too.
(init_reassoc): Call mark_ssa_maybe_undefs.

gcc/testsuite/ChangeLog:

PR tree-optimization/112581
* gcc.c-torture/execute/pr112581-1.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

RISC-V/testsuite: Fix comment termination in pr105314.c

Add terminating `/' character missing from one of the test harness
command clauses in pr105314.c. This causes no issue with compilation
owing to another comment immediately following, but would cause a:

pr105314.c:3:1: warning: "/*" within comment [-Wcomment]

message if warnings were enabled.

gcc/testsuite/
* gcc.target/riscv/pr105314.c: Fix comment termination.

RISC-V: Also handle sign extension in branch costing

Complement commit c1e8cb3d9f94 ("RISC-V: Rework branch costing model for
if-conversion") and also handle extraneous sign extend operations that
are sometimes produced by `noce_try_cmove_arith' instead of zero extend
operations, making branch costing consistent. It is unclear what the
condition is for the middle end to choose between the zero extend and
sign extend operation, but the test case included uses sign extension
with 64-bit targets, preventing if-conversion from triggering across all
the architectural variants.

There are further anomalies revealed by the test case, specifically the
exceedingly high branch cost of 6 required for the `-mmovcc' variant
despite that the final branchless sequence only uses 4 instructions, the
missed conversion at -O1 for 32-bit targets even though code is machine
word size agnostic, and the missed conversion at -Os and -Oz for 32-bit
Zicond targets even though the branchless sequence would be shorter than
the branched one. These will have to be handled separately.

gcc/
* config/riscv/riscv.cc (riscv_noce_conversion_profitable_p):
Also handle sign extension.

gcc/testsuite/
* gcc.target/riscv/cset-sext-sfb.c: New test.
* gcc.target/riscv/cset-sext-thead.c: New test.
* gcc.target/riscv/cset-sext-ventana.c: New test.
* gcc.target/riscv/cset-sext-zicond.c: New test.
* gcc.target/riscv/cset-sext.c: New test.

testsuite: Add testcase for already fixed PR [PR112734]

This test was already fixed by r14-6051 aka PR112770 fix.

2024-01-10 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/112734
* gcc.dg/bitint-64.c: New test.

aarch64: Make ldp/stp pass off by default

As discussed on IRC, this makes the aarch64 ldp/stp pass off by default. This
should stabilize the trunk and give some time to address the P1 regressions.

gcc/ChangeLog:

* config/aarch64/aarch64.opt (-mearly-ldp-fusion): Set default
to 0.
(-mlate-ldp-fusion): Likewise.

middle-end: correctly identify the edge taken when condition is true. [PR113287]

The vectorizer needs to know during early break vectorization whether the edge
that will be taken if the condition is true stays or leaves the loop.

This is because the code assumes that if you take the true branch you exit the
loop. If you don't exit the loop it has to generate a different condition.

Basically it uses this information to decide whether it's generating a
"any element" or an "all element" check.

Bootstrapped and regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu
and no issues with --enable-lto --with-build-config=bootstrap-O3
--enable-checking=release,yes,rtl,extra.
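
The "any element" versus "all element" distinction described above can be sketched on a small array of per-lane condition results. This is one reading of the description, not vectorizer code; lanes, exit_when_true_edge_exits and stay_when_true_edge_stays are hypothetical names.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Per-lane results of the scalar break condition for one vector
// iteration.
using lanes = std::array<bool, 4>;

// True edge leaves the loop: the vector loop must exit as soon as the
// condition holds in any lane.
inline bool exit_when_true_edge_exits(const lanes& cond)
{
  return std::any_of(cond.begin(), cond.end(),
                     [](bool c) { return c; });
}

// True edge stays in the loop: the vector loop may continue only if
// the condition holds in all lanes.
inline bool stay_when_true_edge_stays(const lanes& cond)
{
  return std::all_of(cond.begin(), cond.end(),
                     [](bool c) { return c; });
}
```

Picking the wrong edge as the "true" one would swap these two checks, which is the kind of misidentification the patch guards against.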

gcc/ChangeLog:

PR tree-optimization/113287
* tree-vect-stmts.cc (vectorizable_early_exit): Check the flags on edge
instead of using BRANCH_EDGE to determine true edge.

gcc/testsuite/ChangeLog:

PR tree-optimization/113287
* gcc.dg/vect/vect-early-break_100-pr113287.c: New test.
* gcc.dg/vect/vect-early-break_99-pr113287.c: New test.

tree-optimization/113078 - conditional subtraction reduction vectorization

When if-conversion was changed to use .COND_ADD/SUB for conditional
reduction it was forgotten to update reduction path handling to
canonicalize .COND_SUB to .COND_ADD for vectorizable_reduction
similar to what we do for MINUS_EXPR. The following adds this
and testcases exercising this at runtime and looking for the
appropriate masked subtraction in the vectorized code on x86.
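
The canonicalization rests on a simple identity: conditionally subtracting a value is the same as conditionally adding its negation. A scalar sketch with integers, where reduce_cond_sub and reduce_cond_add are hypothetical helpers and not the vectorizer's representation:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Conditional subtraction reduction: sum -= a[i] where c[i] is set.
// a and c are assumed to have the same length.
inline long reduce_cond_sub(const std::vector<int>& a,
                            const std::vector<bool>& c)
{
  long sum = 0;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (c[i])
      sum -= a[i];
  return sum;
}

// Same reduction written as a conditional addition of the negated
// operand.
inline long reduce_cond_add(const std::vector<int>& a,
                            const std::vector<bool>& c)
{
  long sum = 0;
  for (std::size_t i = 0; i < a.size(); ++i)
    if (c[i])
      sum += -static_cast<long>(a[i]);
  return sum;
}
```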

PR tree-optimization/113078
* tree-vect-loop.cc (check_reduction_path): Canonicalize
.COND_SUB to .COND_ADD.

* gcc.dg/vect/vect-reduc-cond-sub.c: New testcase.
* gcc.target/i386/vect-pr113078.c: Likewise.

c++ frontend: initialize ivdep value

Should control enter the switch from one of the cases other than
the IVDEP one then the variable remains uninitialized.

This fixes it by initializing it to false.
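
The shape of the problem can be sketched as follows. pragma_kind and ivdep_requested are hypothetical names and the switch is much simpler than the one in cp_parser_pragma; the point is only that a variable assigned in one case needs an initializer to be defined on the other paths.

```cpp
#include <cassert>

// Hypothetical pragma kinds for illustration.
enum class pragma_kind { ivdep, unroll, novector };

inline bool ivdep_requested(pragma_kind k)
{
  bool ivdep = false;   // initialized up front, as the patch does
  switch (k)
    {
    case pragma_kind::ivdep:
      ivdep = true;
      break;
    default:
      break;            // other cases never assign ivdep
    }
  return ivdep;
}
```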

gcc/cp/ChangeLog:

* parser.cc (cp_parser_pragma): Initialize to false.

gcc-urlifier: handle option prefixes such as '-fno-'

Given e.g. this misspelled option (omitting the trailing 's'):
$ LANG=C ./xgcc -B. -fno-inline-small-function
xgcc: error: unrecognized command-line option '-fno-inline-small-function'; did you mean '-fno-inline-small-functions'?

we weren't providing a documentation URL for the suggestion.

The issue is the URLification code uses find_opt, which doesn't consider
the various '-fno-' prefixes.

This patch adds a way to find the pertinent prefix remapping and uses it
when determining URLs.
With this patch, the suggestion '-fno-inline-small-functions' now gets a
documentation link (to that of '-finline-small-functions').
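
The remapping idea can be sketched with a single hard-coded prefix. strip_fno_prefix is a hypothetical helper and handles only "-fno-"; it is not the interface of get_option_prefix_remapping.

```cpp
#include <cassert>
#include <string>

// Map a negated "-fno-foo" spelling to the positive "-ffoo" spelling
// so that a documentation lookup keyed on the positive form can
// succeed.  Other options are returned unchanged.
inline std::string strip_fno_prefix(const std::string& opt)
{
  const std::string neg = "-fno-";
  if (opt.compare(0, neg.size(), neg) == 0)
    return "-f" + opt.substr(neg.size());
  return opt;
}
```

For example, strip_fno_prefix applied to "-fno-inline-small-functions" yields "-finline-small-functions", the option whose documentation the suggestion links to.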

gcc/ChangeLog:
* gcc-urlifier.cc (gcc_urlifier::get_url_suffix_for_option):
Handle prefix mappings before calling find_opt.
(selftest::gcc_urlifier_cc_tests): Add example of urlifying a
"-fno-"-prefixed command-line option.
* opts-common.cc (get_option_prefix_remapping): New.
* opts.h (get_option_prefix_remapping): New decl.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

pretty-print: support urlification in phase 3

TL;DR: for the case when the user misspells a command-line option
and we suggest one, with this patch we now provide a documentation URL
for the suggestion.

In r14-5118-gc5db4d8ba5f3de I added a mechanism to automatically add
URLs to quoted strings in diagnostics, and in r14-6920-g9e49746da303b8
through r14-6923-g4ded42c2c5a5c9 wired this up so that any time
we mention a command-line option in a diagnostic message in quotes,
the user gets a URL to the HTML documentation for that option.

However this only worked for quoted strings that were fully within
a single "chunk" within the pretty-printer implementation, such as:

* "%<-foption%>" (handled in phase 1)
* "%qs", "-foption" (handled in phase 2)

but not where the quoted string straddled multiple chunks, in
particular for this important case in gcc.cc:

  error ("unrecognized command-line option %<-%s%>;"
" did you mean %<-%s%>?",
switches[i].part1, hint);

e.g. for:
$ LANG=C ./xgcc -B. -finling-small-functions
xgcc: error: unrecognized command-line option '-finling-small-functions'; did you mean '-finline-small-functions'?

which within pp_format becomes these chunks:

* chunk 0: "unrecognized command-line option `-"
* chunk 1: switches[i].part1  (e.g. "finling-small-functions")
* chunk 2: "'; did you mean `-"
* chunk 3: hint (e.g. "finline-small-functions")
* chunk 4: "'?"

where the first quoted run is in chunks 0-2 and the second in
chunks 2-4.

Hence we were not attempting to provide a URL for the two quoted runs,
and, in particular not for the hint.
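
Finding quoted runs that straddle chunks can be sketched by scanning the chunks in order while remembering whether a quote is open. This is a simplification that keys on the ` and ' characters in the text; quoted_runs is a hypothetical helper, not the mechanism in pretty-print.cc.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Collect the text between each opening ` and the next closing ',
// even when the run spans several chunks.
inline std::vector<std::string>
quoted_runs(const std::vector<std::string>& chunks)
{
  std::vector<std::string> runs;
  std::string current;
  bool in_quote = false;
  for (const std::string& chunk : chunks)
    for (char ch : chunk)
      {
        if (!in_quote && ch == '`')
          {
            in_quote = true;      // run starts after the quote
            current.clear();
          }
        else if (in_quote && ch == '\'')
          {
            in_quote = false;     // run ends before the quote
            runs.push_back(current);
          }
        else if (in_quote)
          current += ch;
      }
  return runs;
}
```

Fed the five chunks listed above, this yields the two option names as complete strings, which is what a urlifier needs in order to look them up.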

This patch refactors the urlification mechanism in pretty-print.cc so
that it checks for quoted runs that appear in phase 3 (as well as in
phases 1 and 2, as before).  With this, the quoted text runs
"-finling-small-functions" and "-finline-small-functions" are passed
to the urlifier, which successfully finds a documentation URL for
the latter.

As before, the urlification code is only run if the URL escapes are
enabled, and only for messages from diagnostic.cc (error, warn, inform,
etc), not for all pretty_printer usage.

gcc/ChangeLog:
* diagnostic.cc (diagnostic_context::report_diagnostic): Pass
m_urlifier to pp_output_formatted_text.
* pretty-print.cc: Add #define of INCLUDE_VECTOR.
(obstack_append_string): New overload, taking a length.
(urlify_quoted_string): Pass in an obstack ptr, rather than using
that of the pp's buffer.  Generalize to handle trailing text in
the buffer beyond the run of quoted text.
(class quoting_info): New.
(on_begin_quote): New.
(on_end_quote): New.
(pp_format): Refactor phase 1 and phase 2 quoting support, moving
it to calls to on_begin_quote and on_end_quote.
(struct auto_obstack): New.
(quoting_info::handle_phase_3): New.
(pp_output_formatted_text): Add urlifier param.  Use it if there
is deferred urlification.  Delete m_quotes.
(selftest::pp_printf_with_urlifier): Pass urlifier to
pp_output_formatted_text.
(selftest::test_urlification): Update results for the existing
case of quoted text straddling chunks; add more such test cases.
* pretty-print.h (class quoting_info): New forward decl.
(chunk_info::m_quotes): New field.
(pp_output_formatted_text): Add optional urlifier param.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

pretty-print: add selftest coverage for numbered args

No functional change intended.

gcc/ChangeLog:
* pretty-print.cc (selftest::test_pp_format): Add selftest
coverage for numbered args.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

OpenMP: Fix g++.dg/gomp/bad-array-section-10.C for C++23 and up

This patch adjusts diagnostic output for C++23 and above for the test
case mentioned in the commit title.

2024-01-10 Julian Brown <julian@codesourcery.com>

gcc/testsuite/
* g++.dg/gomp/bad-array-section-10.C: Adjust diagnostics for C++23 and
up.

OpenMP: Fix new lvalue-parsing map/to/from tests for 32-bit targets

This patch fixes several tests introduced by the commit
r14-7033-g1413af02d62182 for 32-bit targets.

2024-01-10 Julian Brown <julian@codesourcery.com>

gcc/testsuite/
* g++.dg/gomp/array-section-1.C: Fix scan output for 32-bit target.
* g++.dg/gomp/array-section-2.C: Likewise.
* g++.dg/gomp/bad-array-section-4.C: Adjust error output for 32-bit
target.

middle-end: Fix dominators updates when peeling with multiple exits [PR113144]

When we peel at_exit we are moving the new loop at the exit of the previous
loop.  This means that the blocks outside the loop that the previous loop used to
dominate are no longer being dominated by it.

The new dominators however are hard to predict since if the loop has multiple
exits and all the exits are an "early" one then we always execute the scalar
loop.  In this case the scalar loop can completely dominate the new loop.

If we later have skip_vector then there's an additional skip edge added that
might change the dominators.

The previous patch would force an update of all blocks reachable from the new
exits.  This one updates *only* blocks that we know the scalar exits dominated.

For the examples this reduces the blocks to update from 18 to 3.

gcc/ChangeLog:

PR tree-optimization/113144
PR tree-optimization/113145
* tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
Update all BB that the original exits dominated.

gcc/testsuite/ChangeLog:

PR tree-optimization/113144
PR tree-optimization/113145
* gcc.dg/vect/vect-early-break_94-pr113144.c: New test.

testsuite: Fix PR number [PR113297]

2024-01-10 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/113297
* gcc.dg/bitint-63.c: Fix PR number.

libgomp: Fix up FLOCK fallback handling [PR113192]

My earlier change broke Solaris testing, because @FLOCK@ isn't substituted
just into libgomp/Makefile where it worked, but also the
testsuite/libgomp-site-extra.exp file where Make variables aren't present
and can't be substituted.

The following patch instead computes the absolute srcdir path and uses it
for FLOCK.

2024-01-10 Jakub Jelinek <jakub@redhat.com>

PR libgomp/113192
* configure.ac (FLOCK): Use $libgomp_abs_srcdir/testsuite/flock
instead of \$(abs_top_srcdir)/testsuite/flock.
* configure: Regenerated.

Fix debug info for enumeration types with reverse Scalar_Storage_Order

This implements the support of DW_AT_endianity for enumeration types because
they are scalar and therefore, reverse Scalar_Storage_Order is supported for
them, but only when the -gstrict-dwarf switch is not passed because this is
an extension.

There is an associated GDB patch to be submitted to grok the new DWARF.

gcc/
* dwarf2out.cc (modified_type_die): Extend the support of reverse
storage order to enumeration types if -gstrict-dwarf is not passed.
(gen_enumeration_type_die): Add REVERSE parameter and generate the
DIE immediately after the existing one if it is true.
(gen_tagged_type_die): Add REVERSE parameter and pass it in the
call to gen_enumeration_type_die.
(gen_type_die_with_usage): Add REVERSE parameter and pass it in the
first recursive call as well as the call to gen_tagged_type_die.
(gen_type_die): Add REVERSE parameter and pass it in the call to
gen_type_die_with_usage.

LoongArch: testsuite: Add loongarch support to slp-21.c.

The function of this test is to check that the compiler supports vectorization
using SLP and vec_{load/store/*}_lanes. However, LoongArch does not support
vec_{load/store/*}_lanes, unlike aarch64, which has the corresponding "st4/ld4"
instructions.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/slp-21.c: Add loongarch.

LoongArch: testsuite: Fixed a bug that added a target check error.

After the code is committed in r14-6948, GCC regression testing on some
architectures will produce the following error:

"error executing dg-final: unknown effective target keyword `loongarch*-*-*'"

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Removed an issue with "target keyword"
checking errors on LoongArch architecture.

sra: Partial fix for BITINT_TYPEs [PR113120]

As changed in other parts of the compiler, using
build_nonstandard_integer_type is not appropriate for arbitrary precisions,
especially if the precision comes from a BITINT_TYPE or something based on
that, because build_nonstandard_integer_type relies on some integral mode
being supported that can support the precision.

The following patch uses build_bitint_type instead for BITINT_TYPE
precisions.

Note, it would be good if we were able to punt on the optimization
(but this code doesn't seem to be able to punt, so it needs to be done
somewhere earlier) at least in cases where building it would be invalid.
E.g. right now BITINT_TYPE can support precisions up to 65535 (inclusive),
but 65536 will not work anymore (we can't have > 16-bit TYPE_PRECISION).
I've tried to replace 513 with 65532 in the testcase and it didn't ICE,
so maybe it ran into some other SRA limit.

2024-01-10 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/113120
* tree-sra.cc (analyze_access_subtree): For BITINT_TYPE
with root->size TYPE_PRECISION don't build anything new.
Otherwise, if root->type is a BITINT_TYPE, use build_bitint_type
rather than build_nonstandard_integer_type.

* gcc.dg/bitint-63.c: New test.

i386: [APX] Document inline asm behavior and new switch for APX

For APX, the inline asm behavior was not mentioned in any document
before. Add description for it.

gcc/ChangeLog:

* config/i386/i386.opt: Adjust document.
* doc/invoke.texi: Add description for
-mapx-inline-asm-use-gpr32.

RISC-V: Refine unsigned avg_floor/avg_ceil

This patch is inspired by LLVM patches:
https://github.com/llvm/llvm-project/pull/76550
https://github.com/llvm/llvm-project/pull/77473

Use vaaddu for AVG vectorization.

Before this patch:

        vsetivli        zero,8,e8,mf2,ta,ma
        vle8.v  v3,0(a1)
        vle8.v  v2,0(a2)
        vwaddu.vv        v1,v3,v2
        vsetvli zero,zero,e16,m1,ta,ma
        vadd.vi v1,v1,1
        vsetvli zero,zero,e8,mf2,ta,ma
        vnsrl.wi        v1,v1,1
        vse8.v  v1,0(a0)
        ret

After this patch:

        vsetivli        zero,8,e8,mf2,ta,ma
        csrwi   vxrm,0
        vle8.v  v1,0(a1)
        vle8.v  v2,0(a2)
        vaaddu.vv       v1,v1,v2
        vse8.v  v1,0(a0)
        ret
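
For reference, the scalar semantics that the two sequences above vectorize can
be sketched in C++ as follows. This is only an illustration, not code from the
patch: the function names uavg_floor, uavg_ceil and uavg_ceil_loop are
hypothetical. Reading the "before" assembly, the sum is widened to 16 bits,
1 is added, and the result is shifted right by one and narrowed, which is the
rounding-up (ceil) average.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: unsigned average rounded down. The sum is formed in a
// wider type so that it cannot wrap around at 8 bits.
inline std::uint8_t uavg_floor(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint8_t>((static_cast<std::uint16_t>(a) + b) >> 1);
}

// Hypothetical helper: unsigned average rounded up, i.e. (a + b + 1) >> 1.
inline std::uint8_t uavg_ceil(std::uint8_t a, std::uint8_t b) {
  return static_cast<std::uint8_t>((static_cast<std::uint16_t>(a) + b + 1u) >> 1);
}

// A loop of the shape an auto-vectorizer may recognize as an averaging idiom.
inline void uavg_ceil_loop(std::uint8_t* dst, const std::uint8_t* a,
                           const std::uint8_t* b, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = uavg_ceil(a[i], b[i]);
}
```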

Note on signed averaging addition

Based on the rvv spec, there is also a variant for signed averaging addition called vaadd.
But AFAIU, no matter in which rounding mode, we cannot achieve the semantic of signed averaging addition through vaadd.
Thus this patch only introduces vaaddu.

More details in:
https://github.com/riscv/riscv-v-spec/issues/935
https://github.com/riscv/riscv-v-spec/issues/934

Tested on both RV32 and RV64 with no regressions.

Ok for trunk ?

gcc/ChangeLog:

* config/riscv/autovec.md (<u>avg<v_double_trunc>3_floor): Remove.
(avg<v_double_trunc>3_floor): New pattern.
(<u>avg<v_double_trunc>3_ceil): Remove.
(avg<v_double_trunc>3_ceil): New pattern.
(uavg<mode>3_floor): Ditto.
(uavg<mode>3_ceil): Ditto.
* config/riscv/riscv-protos.h (enum insn_flags): Add for average addition.
(enum insn_type): Ditto.
* config/riscv/riscv-v.cc: Ditto.
* config/riscv/vector-iterators.md (ashiftrt): Remove.
(ASHIFTRT): Ditto.
* config/riscv/vector.md: Add VLS modes.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vls/avg-1.c: Adapt test.
* gcc.target/riscv/rvv/autovec/vls/avg-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/avg-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/avg-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/avg-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/avg-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv32gcv.c: Ditto.
* gcc.target/riscv/rvv/autovec/widen/vec-avg-rv64gcv.c: Ditto.

testsuite, rs6000: Adjust pcrel-sibcall-1.c with noipa [PR112751]

As PR112751 shows, commit r14-5628 caused pcrel-sibcall-1.c
to fail as it enables ipa-vrp which makes return values of
functions {x,y,xx} as known and propagated. This patch is
to adjust it with noipa to make it not fragile.

PR testsuite/112751

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pcrel-sibcall-1.c: Replace noinline as noipa.

rs6000: Eliminate zext fed by vclzlsbb [PR111480]

As PR111480 shows, commit r14-4079 only optimizes the case
of vctzlsbb but not for the similar vclzlsbb. This patch
is to consider vclzlsbb as well and avoid the failure on
the reported test case. It also simplifies the patterns
with iterator and attribute.

PR target/111480

gcc/ChangeLog:

* config/rs6000/vsx.md (VCZLSBB): New int iterator.
(vczlsbb_char): New int attribute.
(vclzlsbb_<mode>, vctzlsbb_<mode>): Merge to ...
(vc<vczlsbb_char>zlsbb_<mode>): ... this.
(*vctzlsbb_zext_<mode>): Rename to ...
(*vc<vczlsbb_char>zlsbb_zext_<mode>): ... this, and extend it to
cover vclzlsbb.

rs6000: Make copysign (x, -1) back to -abs (x) for IEEE128 float [PR112606]

I noticed that commit r14-6192 can't help PR112606 #c3 as
it only takes care of SF/DF but TF/KF can still suffer the
issue. Similar to commit r14-6192, this patch is to take
care of copysign<mode>3 with IEEE128 as well.
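
The identity behind the title can be checked with a small C++ sketch. It uses
double rather than an IEEE128 type purely for portability, and the function
names are hypothetical; it only shows that copysign (x, -1) and -abs (x) agree,
including for signed zeros.

```cpp
#include <cmath>

// Hypothetical name: the form the source code uses.
inline double copysign_minus_one(double x) {
  return std::copysign(x, -1.0);
}

// Hypothetical name: the equivalent negated-absolute-value form.
inline double neg_abs(double x) {
  return -std::fabs(x);
}
```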

PR target/112606

gcc/ChangeLog:

* config/rs6000/rs6000.md (copysign<mode>3 IEEE128): Change predicate
of the last argument from altivec_register_operand to any_operand. If
operands[2] is CONST_DOUBLE, emit abs or neg abs depending on its sign
otherwise if it doesn't satisfy altivec_register_operand, force it to
REG using copy_to_mode_reg.

strub: Only unbias stack point for SPARC_STACK_BOUNDARY_HACK [PR113100]

As PR113100 shows, the unbiasing introduced by r14-6737 can
cause the scrubbing to overrun and clobber some critical data
on the stack, such as the saved TOC base, and consequently cause a segfault.

By checking PR112917, IMHO we should keep this unbiasing
guarded under SPARC_STACK_BOUNDARY_HACK (TARGET_ARCH64 &&
TARGET_STACK_BIAS), similar to some existing code special
treating SPARC stack bias.

PR middle-end/113100

gcc/ChangeLog:

* builtins.cc (expand_builtin_stack_address): Guard stack point
adjustment with SPARC_STACK_BOUNDARY_HACK.

LoongArch: Simplify -mexplicit-reloc definitions

Since we do not need printing or manual parsing of this option,
(whether in the driver or for target attributes to be supported later)
it can be handled in the .opt file framework.

gcc/ChangeLog:

* config/loongarch/genopts/loongarch-strings: Remove explicit-reloc
argument string definitions.
* config/loongarch/loongarch-str.h: Same.
* config/loongarch/genopts/loongarch.opt.in: Mark -m[no-]explicit-relocs
as aliases to -mexplicit-relocs={always,none}
* config/loongarch/loongarch.opt: Regenerate.
* config/loongarch/loongarch.cc: Same.

LoongArch: Use enums for constants

Target feature constants from loongarch-def.h are currently defined as macros.
Switch to enums so that they are easier to inspect in the debugger.

gcc/ChangeLog:

* config/loongarch/loongarch-def.h: Define constants with
enums instead of Macros.

LoongArch: Rename ISA_BASE_LA64V100 to ISA_BASE_LA64

LoongArch ISA manual v1.10 suggests that software should not depend on
the ISA version number for marking processor features. The ISA version
number is now defined as a collective name of individual ISA evolutions.
Since there is an independent ISA evolution mask now, we can drop the
version information from the base ISA.

gcc/ChangeLog:

* config/loongarch/genopts/loongarch-strings: Rename.
* config/loongarch/genopts/loongarch.opt.in: Same.
* config/loongarch/loongarch-cpu.cc: Same.
* config/loongarch/loongarch-def.cc: Same.
* config/loongarch/loongarch-def.h: Same.
* config/loongarch/loongarch-opts.cc: Same.
* config/loongarch/loongarch-opts.h: Same.
* config/loongarch/loongarch-str.h: Same.
* config/loongarch/loongarch.opt: Same.

LoongArch: Handle ISA evolution switches along with other options

gcc/ChangeLog:

* config/loongarch/genopts/genstr.sh: Prepend the isa_evolution
variable with the common la_ prefix.
* config/loongarch/genopts/loongarch.opt.in: Mark ISA evolution
flags as saved using TargetVariable.
* config/loongarch/loongarch.opt: Same.
* config/loongarch/loongarch-def.h: Define evolution_set to
mark changes to the -march default.
* config/loongarch/loongarch-driver.cc: Same.
* config/loongarch/loongarch-opts.cc: Same.
* config/loongarch/loongarch-opts.h: Define and use ISA evolution
conditions around the la_target structure.
* config/loongarch/loongarch.cc: Same.
* config/loongarch/loongarch.md: Same.
* config/loongarch/loongarch-builtins.cc: Same.
* config/loongarch/loongarch-c.cc: Same.
* config/loongarch/lasx.md: Same.
* config/loongarch/lsx.md: Same.
* config/loongarch/sync.md: Same.

RISC-V: Robustify dynamic lmul test

While working on refining the cost model, I noticed this test will generate unexpected
scalar xor instructions if we don't tune the cost model carefully.

Add more assembler checks to avoid future regressions.

Committed.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul2-7.c: Add assembler-not check.

Daily bump.

libstdc++: Fix Unicode property detection functions

Fix some copy & pasted logic in __is_extended_pictographic. This
function should yield false for the values before the first edge, not
true. Also add a missing boundary condition check in __incb_property.
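
As a rough illustration of the "values before the first edge" case, consider a
sorted table of edges at which a boolean property toggles. This is a sketch
under assumptions, not the libstdc++ implementation: the table contents, its
encoding and the name has_property are all hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Hypothetical edge table: the property turns on at 0x10, off at 0x20,
// on again at 0x30 and off at 0x31.
constexpr std::array<std::uint32_t, 4> edges{{0x10, 0x20, 0x30, 0x31}};

// Count the edges that are <= c. An odd count means the property is on.
// For any value before the first edge the count is zero, so the result
// is false.
inline bool has_property(std::uint32_t c) {
  auto it = std::upper_bound(edges.begin(), edges.end(), c);
  auto crossed = it - edges.begin();
  return crossed % 2 == 1;
}
```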

Also fix an off-by-one error in _Utf_iterator::operator++() that would
make dereferencing a past-the-end iterator undefined (where the intended
design is that the iterator is always incrementable and dereferenceable,
for better memory safety).

Also simplify the grapheme view iterator, which still contained some
remnants of an earlier design I was experimenting with.

Slightly tweak the gen_libstdcxx_unicode_data.py script so that the
_Gcb_property enumerators are in the order we encounter them in the data
file, instead of sorting them alphabetically. Start with the "Other"
property at value 0, because that's the default property for anything
not in the file. This makes no practical difference, but seems cleaner.
It causes the values in the __gcb_edges table to change, so can only be
done now before anybody is using this code yet. The enumerator values
and table entries become ABI artefacts for the function using them.

contrib/ChangeLog:

* unicode/gen_libstdcxx_unicode_data.py: Print out Gcb_property
enumerators in the order they're seen, not alphabetical order.

libstdc++-v3/ChangeLog:

* include/bits/unicode-data.h: Regenerate.
* include/bits/unicode.h (_Utf_iterator::operator++()): Fix off
by one error.
(__incb_property): Add missing check for values before the
first edge.
(__is_extended_pictographic): Invert return values to fix
copy&pasted logic.
(_Grapheme_cluster_view::_Iterator): Remove second iterator
member and find end of cluster lazily.
* testsuite/ext/unicode/grapheme_view.cc: New test.
* testsuite/ext/unicode/properties.cc: New test.
* testsuite/ext/unicode/view.cc: New test.

Fix spurious match in extract_symvers

Tighten the regex to find the start of the .dynsym symtab in the readelf
output to avoid matching the section symbol in the normal symtab.
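
The idea can be sketched in C++ with std::regex. This is an illustration only:
the patterns and function names below are assumptions and are not taken from
extract_symvers.in, and the sample lines in the comments merely follow the
general shape of readelf output.

```cpp
#include <regex>
#include <string>

// Loose match: also fires on a section symbol line such as
//   "     5: 0000000000000318     0 SECTION LOCAL  DEFAULT    5 .dynsym"
inline bool mentions_dynsym(const std::string& line) {
  static const std::regex loose(R"(\.dynsym)");
  return std::regex_search(line, loose);
}

// Tight match: requires a final colon, so it only fires on a header line
// such as "Symbol table '.dynsym' contains 12 entries:".
inline bool is_dynsym_header(const std::string& line) {
  static const std::regex tight(R"(\.dynsym.*:$)");
  return std::regex_search(line, tight);
}
```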

libstdc++-v3:
* scripts/extract_symvers.in: Require final colon to only match
.dynsym in the header of the dynamic symtab.

c++: adjust accessor fixits for explicit object parm

In a couple of places in the xobj patch I noticed that is_this_parameter
probably wanted to change to is_object_parameter; this implements that and
does the additional adjustments needed to make the accessor fixits handle
xobj parms.

gcc/cp/ChangeLog:

* semantics.cc (is_object_parameter): New.
* cp-tree.h (is_object_parameter): Declare.
* call.cc (maybe_warn_class_memaccess): Use it.
* search.cc (field_access_p): Use it.
(class_of_object_parm): New.
(field_accessor_p): Adjust for explicit object parms.

gcc/testsuite/ChangeLog:

* g++.dg/torture/accessor-fixits-9-xobj.C: New test.

c++: explicit object cleanups

The FIXME in xobj_iobj_parameters_correspond was due to expecting
TYPE_MAIN_VARIANT to be the same for all equivalent types, which is not the
case. And I adjusted some comments that I disagree with; the iobj parameter
adjustment only applies to overload resolution, we can handle that in
cand_parms_match (and I have WIP for that).

gcc/cp/ChangeLog:

* call.cc (build_over_call): Refactor handle_arg lambda.
* class.cc (xobj_iobj_parameters_correspond): Fix FIXME.
* method.cc (defaulted_late_check): Adjust comments.

c++: P0847R7 (deducing this) - CWG2586 [PR102609]

This adds support for defaulted comparison operators and copy/move
assignment operators, as well as allowing user defined xobj copy/move
assignment operators. It turns out defaulted comparison operators already
worked though, so this just adds a test for them. Defaulted assignment
operators were not so nice and required a bit of a hack. Should work fine
though!

The diagnostics leave something to be desired, and there are some things
that could be improved with more extensive design changes. There are a few
notes left indicating where I think we could make improvements.

Aside from some small bugs, with this commit xobj member functions should be
feature complete.

PR c++/102609

gcc/cp/ChangeLog:

PR c++/102609
C++23 P0847R7 (deducing this) - CWG2586.
* decl.cc (copy_fn_p): Accept xobj copy assignment functions.
(move_signature_fn_p): Accept xobj move assignment functions.
* method.cc (do_build_copy_assign): Handle defaulted xobj member
functions.
(defaulted_late_check): Comment.
(defaultable_fn_check): Comment.

gcc/testsuite/ChangeLog:

PR c++/102609
C++23 P0847R7 (deducing this) - CWG2586.
* g++.dg/cpp23/explicit-obj-basic6.C: New test.
* g++.dg/cpp23/explicit-obj-default1.C: New test.
* g++.dg/cpp23/explicit-obj-default2.C: New test.

Signed-off-by: Waffl3x <waffl3x@protonmail.com>

c++: P0847R7 (deducing this) - xobj lambdas. [PR102609]

This implements support for xobj lambdas.  There are extensive tests
included, but not exhaustive.  Dependent lambdas should work and have been
tested lightly, but we need more exhaustive tests for them.
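
A minimal sketch of what an xobj lambda allows, assuming a compiler that
defines __cpp_explicit_this_parameter: the lambda can name itself through its
explicit object parameter and so recurse directly. The function name factorial
is hypothetical, and the fallback branch is only there so the sketch also
builds where the feature is unavailable.

```cpp
// Hypothetical example function; not taken from the testsuite.
inline int factorial(int n) {
#if defined(__cpp_explicit_this_parameter) && __cpp_explicit_this_parameter >= 202110L
  // xobj lambda: "self" is the closure object itself.
  auto fact = [](this auto const& self, int k) -> int {
    return k <= 1 ? 1 : k * self(k - 1);
  };
  return fact(n);
#else
  // Fallback without deducing this: pass the closure to itself explicitly.
  auto fact = [](auto const& self, int k) -> int {
    return k <= 1 ? 1 : k * self(self, k - 1);
  };
  return fact(fact, n);
#endif
}
```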

PR c++/102609

gcc/cp/ChangeLog:

PR c++/102609
C++23 P0847R7 (deducing this) - xobj lambdas.
* lambda.cc (build_capture_proxy): Don't fold direct object types.
* parser.cc (cp_parser_lambda_declarator_opt): Handle xobj lambdas,
diagnostics.  Comments also updated.
* pt.cc (tsubst_function_decl): Handle xobj lambdas.  Check object
type of xobj lambda call operator, diagnose incorrect types.
(tsubst_lambda_expr): Update comment.
* semantics.cc (finish_decltype_type): Also consider by-value object
parameter qualifications.

gcc/testsuite/ChangeLog:

PR c++/102609
C++23 P0847R7 (deducing this) - xobj lambdas.
* g++.dg/cpp23/explicit-obj-diagnostics8.C: New test.
* g++.dg/cpp23/explicit-obj-lambda1.C: New test.
* g++.dg/cpp23/explicit-obj-lambda10.C: New test.
* g++.dg/cpp23/explicit-obj-lambda11.C: New test.
* g++.dg/cpp23/explicit-obj-lambda12.C: New test.
* g++.dg/cpp23/explicit-obj-lambda13.C: New test.
* g++.dg/cpp23/explicit-obj-lambda2.C: New test.
* g++.dg/cpp23/explicit-obj-lambda3.C: New test.
* g++.dg/cpp23/explicit-obj-lambda4.C: New test.
* g++.dg/cpp23/explicit-obj-lambda5.C: New test.
* g++.dg/cpp23/explicit-obj-lambda6.C: New test.
* g++.dg/cpp23/explicit-obj-lambda7.C: New test.
* g++.dg/cpp23/explicit-obj-lambda8.C: New test.
* g++.dg/cpp23/explicit-obj-lambda9.C: New test.

Signed-off-by: Waffl3x <waffl3x@protonmail.com>

c++: P0847R7 (deducing this) - diagnostics. [PR102609]

Diagnostics for xobj member functions. Also includes some diagnostics for
xobj lambdas which are not implemented here. CWG2554 is also implemented
here, we explicitly error when an xobj member function overrides a virtual
function.

PR c++/102609

gcc/c-family/ChangeLog:

PR c++/102609
C++23 P0847R7 (deducing this) - diagnostics.
* c-cppbuiltin.cc (c_cpp_builtins): Define
__cpp_explicit_this_parameter=202110L feature test macro.

gcc/cp/ChangeLog:

PR c++/102609
C++23 P0847R7 (deducing this) - diagnostics.
* class.cc (resolve_address_of_overloaded_function): Diagnostics.
* cp-tree.h (TFF_XOBJ_FUNC): Define.
* decl.cc (grokfndecl): Diagnostics.
(grokdeclarator): Diagnostics.
* error.cc (dump_aggr_type): Pass TFF_XOBJ_FUNC.
(dump_lambda_function): Formatting for xobj lambda.
(dump_function_decl): Pass TFF_XOBJ_FUNC.
(dump_parameters): Formatting for xobj member functions.
(function_category): Formatting for xobj member functions.
* parser.cc (cp_parser_decl_specifier_seq): Diagnostics.
(cp_parser_parameter_declaration): Diagnostics.
* search.cc (look_for_overrides_here): Make xobj member functions
override.
(look_for_overrides_r): Reject an overriding xobj member function
and diagnose it.
* semantics.cc (finish_this_expr): Diagnostics.
* typeck.cc (cp_build_addr_expr_1): Diagnostics.

gcc/testsuite/ChangeLog:

PR c++/102609
C++23 P0847R7 (deducing this) - diagnostics.
* g++.dg/cpp23/feat-cxx2b.C: Test existence and value of
__cpp_explicit_this_parameter feature test macro.
* g++.dg/cpp26/feat-cxx26.C: Likewise.
* g++.dg/cpp23/explicit-obj-cxx-dialect-A.C: New test.
* g++.dg/cpp23/explicit-obj-cxx-dialect-B.C: New test.
* g++.dg/cpp23/explicit-obj-cxx-dialect-C.C: New test.
* g++.dg/cpp23/explicit-obj-cxx-dialect-D.C: New test.
* g++.dg/cpp23/explicit-obj-cxx-dialect-E.C: New test.
* g++.dg/cpp23/explicit-obj-diagnostics1.C: New test.
* g++.dg/cpp23/explicit-obj-diagnostics2.C: New test.
* g++.dg/cpp23/explicit-obj-diagnostics3.C: New test.
* g++.dg/cpp23/explicit-obj-diagnostics4.C: New test.
* g++.dg/cpp23/explicit-obj-diagnostics5.C: New test.
* g++.dg/cpp23/explicit-obj-diagnostics6.C: New test.
* g++.dg/cpp23/explicit-obj-diagnostics7.C: New test.

Signed-off-by: Waffl3x <waffl3x@protonmail.com>

c++: P0847R7 (deducing this) - initial functionality. [PR102609]

This implements the initial functionality for P0847R7.  CWG2789 is
implemented, but instead of "same type" for the object parameters we take
correspondence into account instead.  Without this alteration, the behavior
here would be slightly different than the behavior with constrained member
function templates, which I believe would be undesirable.

There are a few outstanding issues related to xobj member functions
overloading iobj member functions that are introduced by using declarations.
Unfortunately, fixing this will be a little more involved and will have to
be pushed back until later.

Most diagnostics have been split out into another patch to improve its
clarity and allow all the strictly functional changes to be more
distinct. Explicit object lambdas and CWG2586 are addressed in a follow up
patch.
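
For readers unfamiliar with the feature, an explicit object (xobj) member
function looks roughly like the sketch below. The Counter type and its members
are hypothetical, and the code is guarded by the feature test macro so that it
falls back to a conventional const/non-const overload pair (iobj member
functions) where the feature is not available.

```cpp
// Hypothetical type used only for illustration.
struct Counter {
  int value = 0;

#if defined(__cpp_explicit_this_parameter) && __cpp_explicit_this_parameter >= 202110L
  // One xobj member function template covers both const and non-const
  // objects: Self is deduced as Counter or const Counter.
  template <typename Self>
  auto& get(this Self& self) { return self.value; }
#else
  // Equivalent pair of implicit object member functions.
  int& get() { return value; }
  const int& get() const { return value; }
#endif
};
```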

PR c++/102609

gcc/cp/ChangeLog:

PR c++/102609
C++23 P0847R7 (deducing this) - initial functionality.
* class.cc (xobj_iobj_parameters_correspond): New function, checks
for corresponding object parameters between xobj and iobj member
functions.
(add_method): Handle object parameters of xobj member functions, use
xobj_iobj_parameters_correspond.
* call.cc (build_over_call): Refactor, handle xobj member functions.
(cand_parms_match): Handle object parameters of xobj and iobj member
functions, use xobj_iobj_parameters_correspond.
* cp-tree.h (enum cp_decl_spec): Add ds_this, add comments.
* decl.cc (grokfndecl): Add xobj_func_p parameter.  For xobj member
functions, set xobj_flag, don't set static_function flag.
(grokdeclarator): Handle xobj member functions, tell grokfndecl.
(grok_op_properties): Don't error for xobj operators.
* parser.cc (cp_parser_decl_specifier_seq): Handle this specifier.
(cp_parser_parameter_declaration): Set default argument to
"this_identifier" for xobj parameters.
(set_and_check_decl_spec_loc): Add "this", add comments.
* tree.cc (build_min_non_dep_op_overload): Handle xobj operators.
* typeck.cc (cp_build_addr_expr_1): Handle address-of xobj member
functions.

gcc/testsuite/ChangeLog:

PR c++/102609
C++23 P0847R7 (deducing this) - initial functionality.
* g++.dg/cpp23/explicit-obj-basic1.C: New test.
* g++.dg/cpp23/explicit-obj-basic2.C: New test.
* g++.dg/cpp23/explicit-obj-basic3.C: New test.
* g++.dg/cpp23/explicit-obj-basic4.C: New test.
* g++.dg/cpp23/explicit-obj-basic5.C: New test.
* g++.dg/cpp23/explicit-obj-by-value1.C: New test.
* g++.dg/cpp23/explicit-obj-by-value2.C: New test.
* g++.dg/cpp23/explicit-obj-by-value3.C: New test.
* g++.dg/cpp23/explicit-obj-by-value4.C: New test.
* g++.dg/cpp23/explicit-obj-constraints.C: New test.
* g++.dg/cpp23/explicit-obj-constraints2.C: New test.
* g++.dg/cpp23/explicit-obj-ops-mem-arrow.C: New test.
* g++.dg/cpp23/explicit-obj-ops-mem-assignment.C: New test.
* g++.dg/cpp23/explicit-obj-ops-mem-call.C: New test.
* g++.dg/cpp23/explicit-obj-ops-mem-subscript.C: New test.
* g++.dg/cpp23/explicit-obj-ops-non-mem-dep.C: New test.
* g++.dg/cpp23/explicit-obj-ops-non-mem-non-dep.C: New test.
* g++.dg/cpp23/explicit-obj-ops-non-mem.h: New test.
* g++.dg/cpp23/explicit-obj-ops-requires-mem.C: New test.
* g++.dg/cpp23/explicit-obj-ops-requires-non-mem.C: New test.
* g++.dg/cpp23/explicit-obj-redecl.C: New test.
* g++.dg/cpp23/explicit-obj-redecl2.C: New test.
* g++.dg/cpp23/explicit-obj-redecl3.C: New test.
* g++.dg/cpp23/explicit-obj-redecl4.C: New test.

Signed-off-by: Waffl3x <waffl3x@protonmail.com>

c++: P0847R7 (deducing this) - prerequisite changes. [PR102609]

Adds the xobj_flag member to lang_decl_fn and a corresponding member access
macro and predicate to support the addition of explicit object member
functions. Additionally, since explicit object member functions are also
non-static member functions, we need to change uses of
DECL_NONSTATIC_MEMBER_FUNCTION_P to clarify whether they intend to include
or exclude them.

PR c++/102609

gcc/cp/ChangeLog:

* cp-tree.h (struct lang_decl_fn): New data member.
(DECL_NONSTATIC_MEMBER_FUNCTION_P): Poison.
(DECL_IOBJ_MEMBER_FUNCTION_P): Define.
(DECL_FUNCTION_XOBJ_FLAG): Define.
(DECL_XOBJ_MEMBER_FUNCTION_P): Define.
(DECL_OBJECT_MEMBER_FUNCTION_P): Define.
(DECL_FUNCTION_MEMBER_P): Don't use
DECL_NONSTATIC_MEMBER_FUNCTION_P.
(DECL_CONST_MEMFUNC_P): Likewise.
(DECL_VOLATILE_MEMFUNC_P): Likewise.
(DECL_NONSTATIC_MEMBER_P): Likewise.
* module.cc (trees_out::lang_decl_bools): Handle xobj_flag.
(trees_in::lang_decl_bools): Handle xobj_flag.
* call.cc (build_this_conversion)
(add_function_candidate)
(add_template_candidate_real)
(add_candidates)
(maybe_warn_class_memaccess)
(cand_parms_match)
(joust)
(do_warn_dangling_reference)
* class.cc (finalize_literal_type_property)
(finish_struct)
(resolve_address_of_overloaded_function)
* constexpr.cc (is_valid_constexpr_fn)
(cxx_bind_parameters_in_call)
* contracts.cc (build_contract_condition_function)
* cp-objcp-common.cc (cp_decl_dwarf_attribute)
* cxx-pretty-print.cc (cxx_pretty_printer::postfix_expression)
(cxx_pretty_printer::declaration_specifiers)
(cxx_pretty_printer::direct_declarator)
* decl.cc (cp_finish_decl)
(grok_special_member_properties)
(start_preparsed_function)
(record_key_method_defined)
* decl2.cc (cp_handle_deprecated_or_unavailable)
* init.cc (find_uninit_fields_r)
(build_offset_ref)
* lambda.cc (lambda_expr_this_capture)
(maybe_generic_this_capture)
(nonlambda_method_basetype)
* mangle.cc (write_nested_name)
* method.cc (early_check_defaulted_comparison)
(skip_artificial_parms_for)
(num_artificial_parms_for)
* pt.cc (is_specialization_of_friend)
(determine_specialization)
(copy_default_args_to_explicit_spec)
(check_explicit_specialization)
(tsubst_contract_attribute)
(check_non_deducible_conversions)
(more_specialized_fn)
(maybe_instantiate_noexcept)
(register_parameter_specializations)
(value_dependent_expression_p)
* search.cc (shared_member_p)
(lookup_member)
(field_access_p)
* semantics.cc (finish_omp_declare_simd_methods)
* tree.cc (lvalue_kind)
* typeck.cc (invalid_nonstatic_memfn_p): Don't use
DECL_NONSTATIC_MEMBER_FUNCTION_P.

libcc1/ChangeLog:

* libcp1plugin.cc (plugin_pragma_push_user_expression): Don't use
DECL_NONSTATIC_MEMBER_FUNCTION_P.

Signed-off-by: Waffl3x <waffl3x@protonmail.com>
Co-authored-by: Jason Merrill <jason@redhat.com>

[committed] Adding missing prototype for __clzhi2 to xstormy port

xstormy16 has failed since the c99 transition due to a missing prototype for
__clzhi2 in the implementation of stormy16_count_leading_zeros.

This fixes the missing prototype. Pushed to the trunk.

include/
* longlong.h (__stormy16_count_leading_zeros): Add prototype for
__clzhi2.

libstdc++: Simplify some chrono formatters

I don't remember exactly why I made these bits of code reserve space in
a COW string and append to it, rather than just use the string returned
from std::format (which will undergo copy elision). The _Str_sink type
used by std::format means the string only performs a single allocation
for the formatted output, and the returned string's reference count will
be one, so won't reallocate when indexing into it. We can remove these
non-optimizations.

libstdc++-v3/ChangeLog:

* include/bits/chrono_io.h (__formatter_chrono::_M_F): Simplify
handling of string returned from std::format.
(__formatter_chrono::_M_R_T): Likewise.

[committed] Fix minor bug in epiphany port

So I consider this port dead as it semi-randomly fails in reload due to
unrelated changes earlier in the gimple and RTL pipelines.  Regardless Richard
S's late-combine work did show a very obvious error in the port that we should
go ahead and fix as long as the port is in-tree.

The epiphany add-with-immediate instruction allows an 11 bit signed immediate.
That gives the instruction an immediate range of -1024..1023.

The port actually allowed -8192..8191 due to the uber-weird constraint
definition.  I've simplified the constraint to match the hardware documentation
I was able to find.  That was enough to get the epiphany port to build
libgcc/newlib with Richard S's late-combine work.
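
The arithmetic behind those ranges: an N-bit signed immediate covers
-2^(N-1) .. 2^(N-1)-1. The following sketch checks this for the two ranges
mentioned above; the helper name fits_signed is hypothetical, and the 14-bit
figure is just what -8192..8191 works out to.

```cpp
#include <cstdint>

// Hypothetical helper: does value fit in a signed immediate of the given
// width? Assumes 1 <= bits <= 63.
constexpr bool fits_signed(std::int64_t value, unsigned bits) {
  return value >= -(std::int64_t{1} << (bits - 1)) &&
         value <= (std::int64_t{1} << (bits - 1)) - 1;
}

// 11-bit signed immediate: -1024..1023.
static_assert(fits_signed(-1024, 11) && fits_signed(1023, 11), "in range");
static_assert(!fits_signed(-1025, 11) && !fits_signed(1024, 11), "out of range");

// -8192..8191 is the range of a 14-bit signed value, which is too wide for
// an 11-bit field.
static_assert(fits_signed(-8192, 14) && fits_signed(8191, 14), "14-bit range");
static_assert(!fits_signed(8191, 11), "does not fit in 11 bits");
```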

The testsuite is so flakey on that port (due to the reload failures) that my
tester doesn't run it.  So no comparisons are available.

gcc/
* config/epiphany/constraints.md (Car): Allow -1024..1023, no more,
no less.

[committed] Fix minor bug on mn103 port

Richard Sandiford debugged a failure on the mn103 port with his late-combine
patches down to the subdi3 pattern not specifying the isa on alternatives which
required newer variants of the chip family.

This patch adds the missing isa attribute and the port now works with his
late-combine patch. I'm pushing this to the trunk on his behalf.

gcc/
* config/mn10300/mn10300.md (subdi3_degenerate): Add isa attribute.

middle-end: removed unused variable in vectorizable_live_operation_1

It looks like the previous patch had an unused variable.
It's odd that my bootstrap didn't catch it (I'm assuming
-Werror is still on for O3 bootstraps) but this fixes it.

gcc/ChangeLog:

* tree-vect-loop.cc (vectorizable_live_operation_1): Drop unused
restart_loop.
(vectorizable_live_operation): Likewise.

SECURITY.txt: Drop "exploitable" in reference to hardening issues

The "exploitable vulnerability" may lead to a misunderstanding that
missed hardening issues are considered vulnerabilities, just that
they're not exploitable. This is not true, since while hardening bugs
may be security-relevant, the absence of hardening does not make a
program any more vulnerable to exploits than without.

Drop the "exploitable" word to make it clear that missed hardening is
not considered a vulnerability.

Signed-off-by: Siddhesh Poyarekar <siddhesh@gotplt.org>
ChangeLog:

* SECURITY.txt: Drop "exploitable" in the hardening section.

Pass GUILE down to subdirectories

When I enable cgen rebuilding in the binutils-gdb tree, the default is
to run cgen using 'guile'. However, on my host, guile is guile 2.2,
which doesn't work for me -- I have to use guile3.0.

This patch arranges to pass "GUILE" down to subdirectories, so I can
use 'make GUILE=guile3.0'.

* Makefile.in: Rebuild.
* Makefile.tpl (BASE_EXPORTS): Add GUILE.
(GUILE): New variable.
* Makefile.def (flags_to_pass): Add GUILE.

c-family: copy attribute diagnostic fixes [PR113262]

The copy attribute is allowed on decls as well as types and even has
checks whether decl (set to *node) is DECL_P or TYPE_P, but for diagnostics
unconditionally uses DECL_SOURCE_LOCATION (decl), which obviously only works
if it applies to a decl.

2024-01-09 Jakub Jelinek <jakub@redhat.com>

PR c/113262
* c-attribs.cc (handle_copy_attribute): Don't use
DECL_SOURCE_LOCATION (decl) if decl is not DECL_P, use input_location
instead. Formatting fixes.

* gcc.dg/pr113262.c: New test.

middle-end: check if target can do extract first for early breaks [PR113199]

I was generating the vector reverse mask without checking if the target
actually supported such an operation.

This patch changes it so that if the bitstart is 0, BIT_FIELD_REF is used
instead to extract the first element, since this is supported by all targets.

This is good for now since masks always come from whilelo. But in the future
when masks can come from other sources we will need the old code back.

gcc/ChangeLog:

PR tree-optimization/113199
* tree-vect-loop.cc (vectorizable_live_operation_1): Use
BIT_FIELD_REF.

gcc/testsuite/ChangeLog:

PR tree-optimization/113199
* gcc.target/gcn/pr113199.c: New test.

PR modula2/112920 cc1gm2 hangs in the type resolver

This patch contains a fix to gcc/m2/gm2-compiler/M2GCCDeclare.mod.
The fix introduces a group of sets which can be compared.  The resolver
will loop until there is no change in all sets within the group.
Since symbols migrate from set to set without ever looping this
will never hang.  Previously only the number of elements in a set
was compared, which resulted in an infinite spin.
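
The idea of looping until no set in the group changes can be sketched in C as
follows. This is a simplified illustration, not the Modula-2 code from
M2GCCDeclare.mod: the three sets, the bitmask representation, the dependency
table and the rules in resolve_pass are all assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

enum { TODO, PARTIAL, FULL, NSETS };

/* A group is a fixed collection of sets; each set is a bitmask of symbols. */
typedef struct { uint32_t set[NSETS]; } group_t;

/* Two groups are equal only if every set has identical contents. */
static bool equal_group(const group_t *a, const group_t *b)
{
    return memcmp(a->set, b->set, sizeof a->set) == 0;
}

/* One pass with hypothetical rules: a PARTIAL symbol whose dependencies
   are all FULL becomes FULL; a TODO symbol becomes PARTIAL.  Symbols only
   ever move forward, never back. */
static void resolve_pass(group_t *g, const uint32_t *deps, int nsyms)
{
    for (int i = 0; i < nsyms; i++) {
        uint32_t bit = UINT32_C(1) << i;
        if ((g->set[PARTIAL] & bit) && (deps[i] & ~g->set[FULL]) == 0) {
            g->set[PARTIAL] &= ~bit;
            g->set[FULL] |= bit;
        } else if (g->set[TODO] & bit) {
            g->set[TODO] &= ~bit;
            g->set[PARTIAL] |= bit;
        }
    }
}

/* Repeat passes until the whole group is unchanged; returns the number of
   passes executed, including the final pass that changed nothing. */
static int resolve(group_t *g, const uint32_t *deps, int nsyms)
{
    assert(nsyms >= 0 && nsyms <= 32);
    int passes = 0;
    group_t before;
    do {
        before = *g;              /* snapshot the whole group */
        resolve_pass(g, deps, nsyms);
        passes++;
    } while (!equal_group(&before, g));
    return passes;
}
```

Because the termination test compares set contents across the whole group, a
symbol that can never be resolved (for example one that depends on itself)
simply stays where it is and the loop still stops.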

gcc/m2/ChangeLog:

PR modula2/112920
* gm2-compiler/M2GCCDeclare.mod (Group): New declaration.
Import MakeSubrange, MakeConstVar, MakeConstLit and DivTrunc.
(FreeGroup): New declaration.
(GlobalGroup): New declaration.
(ToBeSolvedByQuads): Remove.
(NilTypedArrays): Remove.
(PartiallyDeclared): Remove.
(HeldByAlignment): Remove.
(FinishedAlignment): Remove.
(ToDoList): Remove.
(DebugSet): Re-format.
(DebugNumber): Re-format.
(DebugSetNumbers): Reference sets using GlobalGroup.
(AddSymToWatch): Re-format.
(WatchIncludeList): Reference sets using GlobalGroup.
(WatchRemoveList): Reference sets using GlobalGroup.
(NewGroup): New procedure.
(DisposeGroup): New procedure.
(InitGroup): New procedure.
(KillGroup): New procedure.
(DupGroup): New procedure.
(EqualGroup): New procedure.
(LookupSet): New procedure.
(CanDeclareTypePartially): Reference sets using GlobalGroup.
(CompletelyResolved): Reference sets using GlobalGroup.
(IsNilTypedArrays): Reference sets using GlobalGroup.
(IsFullyDeclared): Reference sets using GlobalGroup.
(IsPartiallyDeclared): Reference sets using GlobalGroup.
(IsPartiallyOrFullyDeclared): Reference sets using GlobalGroup.
(DeclareTypeConstFully): Reference sets using GlobalGroup.
(bodyl): Remove.
(Body): Use bodyt and to lookup the required set.
(ForeachTryDeclare): Remove parameter l.  Lookup set instead.
(DeclareOutstandingTypes): Add new rules setarraynul and setfully.
Reference sets using GlobalGroup.
(ActivateWatch): New procedure.
(DeclareTypesConstantsProceduresInRange): Re-written to check
group change.
(DeclareTypesConstantsProcedures): Re-written to check
group change.
(DeclareBoolean): Reference sets using GlobalGroup.
(DeclarePackedBoolean): Ditto.
(DeclareDefaultConstants): Ditto.
(FreeGroup): Initialized.
(GlobalGroup): Ditto.
* gm2-compiler/Sets.def (EqualSet): New procedure function.
Remove export qualified list of identifiers.
* gm2-compiler/Sets.mod (EqualSet): New procedure function.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

arm: Update early-break tests to accept thumb output too.

The tests I recently added for early break fail in thumb mode
because in thumb mode `cbz/cbnz` exist and so the cmp+branch
is fused. This updates the testcases to accept either output.

gcc/testsuite/ChangeLog:

* gcc.target/arm/vect-early-break-cbranch.c: Accept thumb output.

ada: Fix bogus Constraint_Error on allocator for access to array of access type

This occurs because the access element type is not its own TYPE_CANONICAL,
which creates a discrepancy between the aliasing support code, which deals
with types directly, and the middle-end which looks at TYPE_CANONICAL only.

gcc/ada/

* gcc-interface/decl.cc (gnat_to_gnu_entity) <E_Array_Type>: Use the
TYPE_CANONICAL of types when it comes to aliasing.
* gcc-interface/utils.cc (relate_alias_sets): Likewise.

ada: Preliminary cleanup in aliasing support code

This declares an explicit temporary for the fields of the fat pointer type
in gnat_to_gnu_entity and removes the GNU_ prefix of the parameters of the
relate_alias_sets routine for the sake of brevity. No functional changes.

gcc/ada/

* gcc-interface/decl.cc (gnat_to_gnu_entity) <E_Array_Type>: Use a
separate FLD local variable to hold the first field of the fat
pointer type being built.
* gcc-interface/gigi.h (relate_alias_sets): Remove GNU_ prefix on
the first two parameters.
* gcc-interface/utils.cc (relate_alias_sets): Likewise and adjust.

ada: Do not count comparison of addresses as a modification

In some extended code we generate comparisons between
the Addresses of some variables. This causes those
variables to be considered modified, whereas in this
particular scenario the variables are just referenced.

gcc/ada/

* sem_attr.adb: Avoid marking a use of the Address attribute
as a modification of its prefix.

ada: Minor change replacing "not Present" tests with "No" tests

Fixing two places flagged by gnatcheck to use "No" instead of "not Present".

gcc/ada/

* exp_aggr.adb (Expand_Container_Aggregate): Change "not Present"
tests to tests using "No" (in two places).

ada: Allow passing private types to generic formal incomplete types

It is legal to pass a private type, or a type with a component whose
type is private, as a generic actual type if the formal is a generic
formal incomplete type. This patch fixes a bug in which the compiler
would give an error in some such cases.

Also misc cleanup.

gcc/ada/

* sem_ch12.adb (Instantiate_Type): Make the relevant error message
conditional upon "Ekind (A_Gen_T) /= E_Incomplete_Type". Misc
cleanup.

ada: Excess elements created for indexed aggregates with iterator_specifications

In the case of an indexed aggregate of a container type with both Add_Unnamed
and New_Indexed specified in the Aggregate aspect of the type (such as for
the Vector type in Ada.Containers.Vectors), in cases where a component
association is given by an iterator_specification, the compiler could end
up generating a call to the New_Indexed operation rather than the Empty
operation. For example, in the case of a Vector type, this could result
in allocating a container of the size of the defaulted Capacity formal of
the New_Vector function (with uninitialized components), and elements added
in the aggregate would append to that preallocated Vector. The compiler is
corrected so that the Empty function is called to initialize the implicit
aggregate object, rather than the New_Indexed function.

gcc/ada/

* exp_aggr.adb (Expand_Container_Aggregate): Add code to determine
whether the aggregate is an indexed aggregate, setting a flag
(Is_Indexed_Aggregate), which is tested to have proper separation
of treatment for the Add_Unnamed (for positional aggregates) and
New_Indexed (for indexed aggregates) cases. In the code generating
associations for indexed
aggregates, remove the code for Expressions cases entirely, since
the code for indexed aggregates is governed by the presence of
Component_Associations, and add an assertion that Expressions must
be Empty. Also, exclude empty aggregates from entering that code.

ada: Remove unused runtime entity

The compiler has not generated direct attachments for a long time.

gcc/ada/

* rtsfind.ads (RE_Id): Remove RE_Attach.
(RE_Unit_Table): Likewise.
* libgnat/s-finmas.ads (Attach): Delete.
* libgnat/s-finmas.adb (Attach): Likewise.

ada: Fix limited_with in Check_Scil; allow for <> in pp of aggregate

Check_Scil failed due to not handling a type that came from a package that was
mentioned in a limited-with clause. Also, an aggregate with an uninitialized
component was not being pretty-printed properly.

gcc/ada/

* pprint.adb (List_Name): Check for "Box_Present" when displaying
a list, and emit "<>" if it returns True.
* sem_scil.adb (Check_SCIL_Node): Handle case when the type of a
parameter is from a package that was mentioned in a limited with
clause, and make no further checks, since this check routine does
not have all the logic to check such a usage.

ada: Fix internal error on class-wide allocator inside if-expression

The problem is that the freeze node for the class-wide subtype built for the
expression of the allocator escapes from the dependent expression instead of
being stored in its list of actions.

gcc/ada/

* freeze.adb (Freeze_Expression.Has_Decl_In_List): Deal specifically
with itypes that are class-wide subtypes.

ada: Add __atomic_store_n binding to System.Atomic_Primitives

This is modeled on the existing binding for __atomic_load_n.
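
For reference, the two GCC builtins that these bindings wrap can be used from
C as in the sketch below. The variable and function names (shared_flag,
publish, observe) and the choice of release/acquire memory orders are
assumptions for the example and say nothing about how the Ada run-time calls
them.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t shared_flag;

/* Atomically store a 32-bit value with release ordering. */
static void publish(uint32_t value)
{
    __atomic_store_n(&shared_flag, value, __ATOMIC_RELEASE);
}

/* Atomically load the 32-bit value with acquire ordering. */
static uint32_t observe(void)
{
    return __atomic_load_n(&shared_flag, __ATOMIC_ACQUIRE);
}

/* Single-threaded sanity check: a store followed by a load sees the value. */
static void check_roundtrip(void)
{
    publish(42);
    assert(observe() == 42);
}
```

The _n suffix means the builtin is generic over 1, 2, 4 and 8 byte integer
objects, which matches the 8, 16, 32 and 64 bit instantiations listed below.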

gcc/ada/

* libgnat/s-atopri.ads (Atomic_Store): New generic procedure.
(Atomic_Store_8): New instantiated procedure.
(Atomic_Store_16): Likewise.
(Atomic_Store_32): Likewise.
(Atomic_Store_64): Likewise.
* libgnat/s-atopri__32.ads (Atomic_Store): New generic procedure.
(Atomic_Store_8): New instantiated procedure.
(Atomic_Store_16): Likewise.
(Atomic_Store_32): Likewise.
* gcc-interface/decl.cc (gnat_to_gnu_subprog_type): Implement the
support for __atomic_store_n and __sync_bool_compare_and_swap_n.
* gcc-interface/gigi.h (list_second): New inline function.

ada: Cannot requeue to a procedure implemented by an entry

Add missing support for RM 9.5.4(5.6/4): the target of a requeue
statement may be a procedure when its name denotes a renaming of
an entry.

gcc/ada/

* sem_ch6.adb (Analyze_Subprogram_Specification): Do not replace
the type of the formals with its corresponding record in
init-procs.
* sem_ch9.adb (Analyze_Requeue): Add missing support to requeue to
a procedure that denotes a renaming of an entry.

ada: Remove side effects depending on the context of subtype declaration

In GNATprove mode the removal of side effects is only needed in certain
syntactic contexts, which include subtype declarations. Now this removal
is limited to genuine subtype declarations and not to itypes coming from
expressions where side effects are not expected.

gcc/ada/

* exp_util.adb (Possible_Side_Effect_In_SPARK): Refine handling of
itype declarations.

ada: More aggressive inlining of subprogram calls in GNATprove mode

Previously if a subprogram call could not be inlined in GNATprove mode,
then all subsequent calls to the same subprogram were not inlined
either (because a failed attempt to inline clears flag Is_Inlined_Always
and we tested this flag when attempting to inline subsequent calls).

Now a failure in inlining of a particular call does not prevent inlining
of subsequent calls to the same subprogram, except when inlining failed
because the subprogram was detected to be recursive (which clears the
Is_Inlined flag that we now examine).

This change allows more checks to be proved and reduces interactions
between inlining and SPARK legality checks.

gcc/ada/

* sem_ch6.adb (Analyze_Subprogram_Specification): Set Is_Inlined
flag by default in GNATprove mode.
* sem_res.adb (Resolve_Call): Only look at flag which is cleared
when inlined subprogram is detected to be recursive.

ada: Remove dead detection of recursive inlined subprograms

Inlining of subprogram calls happens in routine Expand_Inlined_Call
which calls Establish_Actual_Mapping_For_Inlined_Call. Both routines
had detection of recursive calls. The detection in the second routine
was dead code.

gcc/ada/

* inline.adb (Establish_Actual_Mapping_For_Inlined_Call):
Remove detection of recursive calls.

ada: Remove dead code for GNATprove inlining

Removed code was dead because it could only be executed when
Back_End_Inlining is True and that flag is always false in
GNATprove_Mode.

gcc/ada/

* inline.adb (Cannot_Inline): Cleanup use of 'Length; remove
dead code.

ada: Fix uses of not Present

Fix style violation reported by GNATcheck.

gcc/ada/

* sem_aggr.adb (Resolve_Container_Aggregate): Use "No".
* sem_ch8.adb (Find_Direct_Name): Likewise.

ada: Fix bug in Sem_Util.Enclosing_Declaration

Fix Sem_Util.Enclosing_Declaration to not return an N_Subprogram_Specification
node. Remove code in various places that was formerly needed to cope with this
misbehavior.

gcc/ada/

* sem_util.adb (Enclosing_Declaration): Instead of returning a
subprogram specification node, return its parent (which is
presumably a subprogram declaration).
* contracts.adb (Insert_Stable_Property_Check): Remove code
formerly needed to compensate for incorrect behavior of
Sem_Util.Enclosing_Declaration.
* exp_attr.adb (In_Available_Context): Remove code formerly needed
to compensate for incorrect behavior of
Sem_Util.Enclosing_Declaration.
* sem_ch8.adb (Is_Actual_Subp_Of_Inst): Remove code formerly
needed to compensate for incorrect behavior of
Sem_Util.Enclosing_Declaration.

ada: Error compiling Ada 2022 object renaming with no subtype mark

In some cases the compiler would crash or generate spurious errors
compiling a legal object renaming declaration that lacks a subtype mark.
In addition to fixing the immediate problem, change Atree.Copy_Slots
so that attempts to modify either the Empty or the Error nodes
(e.g., by passing one of them as the target in a call to Rewrite)
are ineffective. Cope with the consequences of this.

gcc/ada/

* sem_ch8.adb (Check_Constrained_Object): Before updating the
subtype mark of an object renaming declaration by calling Rewrite,
first check whether the destination of the Rewrite call exists.
* atree.adb (Copy_Slots): Return without performing any updates if
Destination equals Empty or Error, or if Source equals Empty. Any
of those conditions indicates an error case.
* sem_ch12.adb (Analyze_Formal_Derived_Type): Avoid cascading
errors.
* sem_ch3.adb (Analyze_Number_Declaration): In an error case, do
not pass Error as destination in a call to Rewrite.
(Find_Type_Of_Subtype_Indic): In an error case, do not pass Error
or Empty as destination in a call to Rewrite.

ada: Fix precondition in Interfaces.C.Strings

The preconditions of both Update procedures in Interfaces.C.Strings were
incorrect. This patch fixes them.

gcc/ada/

* libgnat/i-cstrin.ads (Update): Fix precondition.

ada: Remove unreachable code in Resolve_Extension_Aggregate

The only functions using the BIP protocol are now those returning a limited
type: Is_Build_In_Place_Result_Type => Is_Inherently_Limited_Type.

gcc/ada/

* sem_aggr.adb (Resolve_Extension_Aggregate): Remove the unreachable
call to Transform_BIP_Assignment as well as the procedure.

ada: Avoid xref on out params of TSS

For an actual passed as an 'in out' parameter of a type support
subprogram such as deep finalize, do not count it as a read
reference of the actual. Clearly these should not count.
Furthermore, counting them causes different warnings in -gnatc
mode compared to normal mode, because the calls only exist in
normal mode, which would disable the warnings. Such warnings now
occur in both modes, instead of just with -gnatc.

gcc/ada/

* lib-xref.adb (Generate_Reference): Do not count it as a read
reference if we're calling a TSS.

ada: Document new SPARK aspect and pragma Always_Terminates

Add description of a recently added SPARK contract.

gcc/ada/

* doc/gnat_rm/implementation_defined_aspects.rst,
doc/gnat_rm/implementation_defined_pragmas.rst: Add sections for
Always_Terminates.
* gnat-style.texi: Regenerate.
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.

aarch64: Fix up GC of aarch64_simd_types [PR113270]

The r14-6524 changes created aarch64-builtins.h header and moved
struct aarch64_simd_type_info definition in there.
Unfortunately, the new header wasn't added to target_gtfiles, so the
trees and const char * pointer elements in the aarch64_simd_types
array aren't marked as GC roots anymore.  That breaks e.g. PCH, when
the array elements then can refer to ggc_freed memory instead of the expected
types, but also any other GC collection could free them and further uses would
not work correctly.

Unfortunately, just adding the new header to target_gtfiles doesn't fix this,
because non-static variable definitions marked with GTY(()) aren't considered
by gengtype, it looks in those cases for an extern GTY(()) declaration, and
there was none - the aarch64-builtins.h header contains an extern declaration
without GTY(()).  Adding GTY(()) to that extern declaration doesn't work, because
then gengtype attempts to emit the aarch64_simd_types GC roots in gtype-desc.cc
but the corresponding header isn't included there.

So, the patch instead adds another extern declaration in aarch64-builtins.cc
right before the actual definition, which makes sure the GC roots are registered
correctly in gt-aarch64-builtins.h (where we want them).

2024-01-09  Jakub Jelinek  <jakub@redhat.com>

PR target/113270
* config.gcc (aarch64*-*-*): Add aarch64-builtins.h to target_gtfiles.
* config/aarch64/aarch64-builtins.cc (aarch64_simd_types): Add extern
GTY(()) declaration before the definition, drop GTY(()) from the
definition.

tree-optimization/113026 - fix vector epilogue maximum iter bound

The late amendment with a limit based on VF was redundant and wrong
for peeled early exits. The following moves the adjustment done
when we don't have a skip edge down to the place where the already
existing VF based max iter check is done and removes the amendment.

PR tree-optimization/113026
* tree-vect-loop-manip.cc (vect_do_peeling): Remove
redundant and wrong niter bound setting. Move niter
bound adjustment down.

Fix outdated comment

gcc/ada/
PR ada/78207
* libgnat/g-regexp.ads: Fix outdated comment.

frontend: don't ice with pragma NOVECTOR if loop has no condition [PR113267]

In C you can have loops without a condition. The original version of the patch
rejected the use of #pragma GCC novector on such loops; however, during review
it was changed not to do this, on the grounds that we didn't want to give a
compile error in such cases.

However, because annotations seem to only be allowed on conditions (unless
I'm mistaken?), the attached example ICEs because there's no condition.

This will have it ignore the pragma instead of ICEing. I don't know if this is
the best solution, but as far as I can tell we can't attach the annotation to
anything else.
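
A loop of the kind described might look like the following sketch. It is not
the pr113267.c testcase; the function first_negative is hypothetical and only
shows a for statement with an empty condition preceded by the pragma.
Compilers that do not know the pragma are expected to ignore it, possibly
with a warning.

```c
#include <assert.h>

/* Return the index of the first negative element, or n if there is none. */
static int first_negative(const int *a, int n)
{
    assert(n >= 0);
    int i = 0;
#pragma GCC novector
    for (;;) {                /* no condition to carry the annotation */
        if (i >= n || a[i] < 0)
            break;
        i++;
    }
    return i;
}
```

With the fix, the pragma has no effect on such a loop and the function
compiles as if it were absent.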

gcc/c/ChangeLog:

PR c/113267
* c-parser.cc (c_parser_for_statement): Skip the pragma if no cond.

gcc/testsuite/ChangeLog:

PR c/113267
* gcc.dg/pr113267.c: New test.