git.ipfire.org Git - thirdparty/gcc.git/log

libiberty: Fix error return value in pex_unix_exec_child [PR113957].

r14-5310-g879cf9ff45d940 introduced some new handling for spawning sub
processes. The return value from the generic exec_child is examined
and needs to be < 0 to signal an error. However, the unix flavour of
this routine is returning the PID value set from the posix_spawn{p}.

This latter value is undefined per the manual pages for both Darwin
and Linux, and it seems Darwin, at least, sets the value to some
usually positive number (presumably the PID that would have been used
if the fork had succeeded).

The fix proposed here is to set the pid = -1 in the relevant error
paths.

PR other/113957

libiberty/ChangeLog:

* pex-unix.c (pex_unix_exec_child): Set pid = -1 in the error
paths, since that is used to signal an erroneous outcome for
the routine.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>

aarch64: Register rng builtins with uint64_t pointers.

Currently, these are registered as unsigned_intDI_type_node which is not
necessarily the same type definition as uint64_t. On platforms where these
differ that causes fails in consuming the arm_acle.h header.

gcc/ChangeLog:

* config/aarch64/aarch64-builtins.cc (aarch64_init_rng_builtins):
Register these builtins with a pointer to uint64_t rather than unsigned
DI mode.

Update cpplib es.po

* es.po: Update.

Update .po files

gcc/po/
* be.po, da.po, de.po, el.po, es.po, fi.po, fr.po, hr.po, id.po,
ja.po, nl.po, ru.po, sr.po, sv.po, tr.po, uk.po, vi.po, zh_CN.po,
zh_TW.po: Update.

libcpp/po/
* be.po, ca.po, da.po, de.po, el.po, eo.po, es.po, fi.po, fr.po,
id.po, ja.po, ka.po, nl.po, pt_BR.po, ro.po, ru.po, sr.po, sv.po,
tr.po, uk.po, vi.po, zh_CN.po, zh_TW.po: Update.

GCN: Conditionalize 'define_expand "reduc_<fexpander>_scal_<mode>"' on '!TARGET_RDNA2_PLUS' [PR113615]

On top of commit c7ec7bd1c6590cf4eed267feab490288e0b8d691
"amdgcn: add -march=gfx1030 EXPERIMENTAL" conditionalizing
'define_expand "reduc_<reduc_op>_scal_<mode>"' on
'!TARGET_RDNA2' (later: '!TARGET_RDNA2_PLUS'), we then did similar in
commit 7cc2262ec9a410dc56d1c1c6b950c922e14f621d
"gcn/gcn-valu.md: Disable fold_left_plus for TARGET_RDNA2_PLUS [PR113615]"
to conditionalize 'define_expand "fold_left_plus_<mode>"' on
'!TARGET_RDNA2_PLUS', but I found we also need to conditionalize the related
'define_expand "reduc_<fexpander>_scal_<mode>"' on '!TARGET_RDNA2_PLUS', to
avoid ICEs like:

 [...]/gcc.dg/vect/pr108608.c: In function 'foo':
 [...]/gcc.dg/vect/pr108608.c:9:1: error: unrecognizable insn:
 (insn 34 33 35 2 (set (reg:V64DF 723)
 (unspec:V64DF [
 (reg:V64DF 690 [ vect_m_11.20 ])
 (const_int 1 [0x1])
 ] UNSPEC_MOV_DPP_SHR)) -1
 (nil))
 during RTL pass: vregs

Similar for 'gcc.dg/vect/vect-fmax-2.c', 'gcc.dg/vect/vect-fmin-2.c', and
'UNSPEC_SMAX_DPP_SHR' for 'gcc.dg/vect/vect-fmax-1.c', and
'UNSPEC_SMIN_DPP_SHR' for 'gcc.dg/vect/vect-fmin-1.c', when running 'vect.exp'
for 'check-gcc-c'.

PR target/113615
gcc/
* config/gcn/gcn-valu.md (define_expand "reduc_<fexpander>_scal_<mode>"):
Conditionalize on '!TARGET_RDNA2_PLUS'.
* config/gcn/gcn.cc (gcn_expand_dpp_shr_insn)
(gcn_expand_reduc_scalar):
'gcc_checking_assert (!TARGET_RDNA2_PLUS);'.

GCN: Restore lost '__gfx90a__' target CPU definition

Also, add some safeguards for the future.

Fix-up for commit 52a2c659ae6c21f84b6acce0afcb9b93b9dc71a0
"GCN: Add pre-initial support for gfx1100".

gcc/
* config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS): Restore lost
'__gfx90a__' target CPU definition. Add some safeguards for the future.

c++: compound-requirement partial substitution [PR113966]

When partially substituting a requires-expr, we don't want to perform
any additional checks beyond the substitution itself so as to minimize
checking requirements out of order. So don't check the return-type-req
of a compound-requirement during partial substitution. And don't check
the noexcept condition either since we can't do that on templated trees.

PR c++/113966

gcc/cp/ChangeLog:

* constraint.cc (tsubst_compound_requirement): Don't check
the noexcept condition or the return-type-requirement when
partially substituting.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-friend17.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

Fix testism where __seg_gs was being used for all targets

Replaced uses of __seg_gs with the MACRO SEG defined in the testcase to pick
(if any) the right __seg_{gs,fs} keyword based on target.

gcc/testsuite/ChangeLog:

* gcc.dg/bitint-86.c (__seg_gs): Replace with SEG MACRO.

rtl-optimization/54052 - RTL SSA PHI insertion compile-time hog

The following tries to address the PHI insertion compile-time hog in
RTL fwprop observed with the PR54052 testcase where the loop computing
the "unfiltered" set of variables possibly needing PHI nodes for each
block exhibits quadratic compile-time and memory-use.

It does so by pruning the local DEFs with LR_OUT of the block, removing
regs that can never be LR_IN (defined by this block) in the dominance
frontier.

PR rtl-optimization/54052
* rtl-ssa/blocks.cc (function_info::place_phis): Filter
local defs by LR_OUT.

PR modula2/113889 Incorrect constant string value if declared in a definition module

This patch fixes a bug exposed when a constant string is declared in a
definition module and imported by a program module. The bug fix
was to defer the string assignment and concatenation until quadruples
were generated. The conststring symbol has a known field which
must be checked prior to retrieving the string contents.

gcc/m2/ChangeLog:

PR modula2/113889
* gm2-compiler/M2ALU.mod (StringFitsArray): Add tokeno parameter
to GetStringLength.
(InitialiseArrayOfCharWithString): Add tokeno parameter to
GetStringLength.
(CheckGetCharFromString): Add tokeno parameter to GetStringLength.
* gm2-compiler/M2Const.mod (constResolveViaMeta): Replace
PutConstString with PutConstStringKnown.
* gm2-compiler/M2GCCDeclare.mod (DeclareCharConstant): Add tokenno
parameter and add assert. Use tokenno to generate location.
(DeclareStringConstant): Add tokenno and add asserts.
Add tokenno parameter to calls to GetStringLength.
(PromoteToString): Add assert and add tokenno parameter to
GetStringLength.
(PromoteToCString): Add assert and add tokenno parameter to
GetStringLength.
(DeclareConstString): New procedure function.
(TryDeclareConst): Remove size local variable.
Check IsConstStringKnown.
Call DeclareConstString.
(PrintString): New procedure.
(PrintVerboseFromList): Call PrintString.
(CheckResolveSubrange): Check IsConstStringKnown before creating
subrange for char or issuing an error.
* gm2-compiler/M2GenGCC.mod (ResolveConstantExpressions): Add
StringLengthOp, StringConvertM2nulOp, StringConvertCnulOp case
clauses.
(FindSize): Add assert IsConstStringKnown.
(StringToChar): New variable tokenno.
Add tokenno parameter to GetStringLength.
(FoldStringLength): New procedure.
(FoldStringConvertM2nul): New procedure.
(FoldStringConvertCnul): New procedure.
(CodeAddr): Add tokenno parameter.
Replace CurrentQuadToken with tokenno.
Add tokenno parameter to GetStringLength.
(PrepareCopyString): Rewrite.
(IsConstStrKnown): New procedure function.
(FoldAdd): Detect conststring op2 and op3 which are known and
concat. Place result into op1.
(FoldStandardFunction): Pass tokenno as a parameter to
GetStringLength.
(CodeXIndr): Rewrite comment.
Rename op1 to left, op3 to right.
Pass rightpos to GetStringLength.
* gm2-compiler/M2Quads.def (QuadrupleOp): Add
StringConvertCnulOp, StringConvertM2nulOp and StringLengthOp.
* gm2-compiler/M2Quads.mod (import): Remove MakeConstLitString.
Add CopyConstString and PutConstStringKnown.
(IsInitialisingConst): Add StringConvertCnulOp,
StringConvertM2nulOp and StringLengthOp.
(callRequestDependant): Replace MakeConstLitString with
MakeConstString.
(DeferMakeConstStringCnul): New procedure function.
(DeferMakeConstStringM2nul): New procedure function.
(CheckParameter): Add early return if the string const is unknown.
(DescribeType): Add token parameter to GetStringLength.
Check for IsConstStringKnown.
(ManipulateParameters): Use DeferMakeConstStringCnul and
DeferMakeConstStringM2nul.
(MakeLengthConst): Remove and replace with...
(DeferMakeLengthConst): ... this.
(doBuildBinaryOp): Create ConstString and set it to contents
unknown.
Check IsConstStringKnown before generating error message.
(WriteQuad): Add StringConvertCnulOp, StringConvertM2nulOp and
StringLengthOp.
(WriteOperator): Add StringConvertCnulOp, StringConvertM2nulOp and
StringLengthOp.
* gm2-compiler/M2SymInit.mod (CheckReadBeforeInitQuad): Add
StringConvertCnulOp, StringConvertM2nulOp and StringLengthOp.
* gm2-compiler/NameKey.mod (LengthKey): Allow NulName to return 0.
* gm2-compiler/P2SymBuild.mod (BuildString): Replace
MakeConstLitString with MakeConstString.
(DetermineType): Replace PutConstString with PutConstStringKnown.
* gm2-compiler/SymbolTable.def (MakeConstVar): Tidy up comment.
(MakeConstLitString): Remove.
(MakeConstString): New procedure function.
(MakeConstStringCnul): New procedure function.
(MakeConstStringM2nul): New procedure function.
(PutConstStringKnown): New procedure.
(CopyConstString): New procedure.
(IsConstStringKnown): New procedure function.
(IsConstStringM2): New procedure function.
(IsConstStringC): New procedure function.
(IsConstStringM2nul): New procedure function.
(IsConstStringCnul): New procedure function.
(GetStringLength): Add token parameter.
(PutConstString): Remove.
(GetConstStringM2): Remove.
(GetConstStringC): Remove.
(GetConstStringM2nul): Remove.
(GetConstStringCnul): Remove.
(MakeConstStringC): Remove.
* gm2-compiler/SymbolTable.mod (SymConstString): Remove
M2Variant, NulM2Variant, CVariant, NulCVariant.
Add Known.
(CheckAnonymous): Replace $$ with __anon.
(IsNameAnonymous): Replace $$ with __anon.
(MakeConstVar): Detect whether the name is nul and treat as
a temporary constant.
(MakeConstLitString): Remove.
(BackFillString): Remove.
(InitConstString): Rewrite.
(GetConstStringM2): Remove.
(GetConstStringC): Remove.
(GetConstStringContent): New procedure function.
(GetConstStringM2nul): Remove.
(GetConstStringCnul): Remove.
(MakeConstStringCnul): Rewrite.
(MakeConstStringM2nul): Rewrite.
(MakeConstStringC): Remove.
(MakeConstString): Rewrite.
(PutConstStringKnown): New procedure.
(CopyConstString): New procedure.
(PutConstString): Remove.
(IsConstStringKnown): New procedure function.
(IsConstStringM2): New procedure function.
(IsConstStringC): Rewrite.
(IsConstStringM2nul): Rewrite.
(IsConstStringCnul): Rewrite.
(GetConstStringKind): New procedure function.
(GetString): Check Known.
(GetStringLength): Add token parameter and check Known.

gcc/testsuite/ChangeLog:

PR modula2/113889
* gm2/pim/run/pass/pim-run-pass.exp: Add filter for
constdef.mod.
* gm2/extensions/run/pass/callingc2.mod: New test.
* gm2/extensions/run/pass/callingc3.mod: New test.
* gm2/extensions/run/pass/callingc4.mod: New test.
* gm2/extensions/run/pass/callingc5.mod: New test.
* gm2/extensions/run/pass/callingc6.mod: New test.
* gm2/extensions/run/pass/callingc7.mod: New test.
* gm2/extensions/run/pass/callingc8.mod: New test.
* gm2/extensions/run/pass/fixedarray.mod: New test.
* gm2/extensions/run/pass/fixedarray2.mod: New test.
* gm2/pim/run/pass/constdef.def: New test.
* gm2/pim/run/pass/constdef.mod: New test.
* gm2/pim/run/pass/testimportconst.mod: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

d: Add UTF BOM tests to gdc.dg testsuite

Some of these are part of the upstream DMD `gdc.test' testsuite, but
they had been omitted because they get mangled by the lib/gdc-utils.exp
helpers when parsing and staging the tests. Translate them over to the
gdc.dg testsuite instead.

gcc/testsuite/ChangeLog:

* gdc.dg/bom_UTF16BE.d: New test.
* gdc.dg/bom_UTF16LE.d: New test.
* gdc.dg/bom_UTF32BE.d: New test.
* gdc.dg/bom_UTF32LE.d: New test.
* gdc.dg/bom_UTF8.d: New test.
* gdc.dg/bom_characters.d: New test.
* gdc.dg/bom_error_UTF8.d: New test.
* gdc.dg/bom_infer_UTF16BE.d: New test.
* gdc.dg/bom_infer_UTF16LE.d: New test.
* gdc.dg/bom_infer_UTF32BE.d: New test.
* gdc.dg/bom_infer_UTF32LE.d: New test.
* gdc.dg/bom_infer_UTF8.d: New test.

match.pd: Fix ICE on BIT_INSERT_EXPR of BIT_FIELD_REF folding [PR113967]

The following testcase ICEs, because BIT_FIELD_REF's position is not
multiple of the vector element's bit size and the code uses exact_div
to divide those 2 values.

For BIT_INSERT_EXPR, the tree-cfg.cc verification verifies the position
is a multiple of the inserted bit size when inserting into vectors,
but for BIT_FIELD_REF the position can be arbitrary if within the range.

The following patch fixes that.

2024-02-19 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/113967
* match.pd (bit_insert @0 (BIT_FIELD_REF @1 ..) ..): Require
in condition that @rpos is multiple of vector element size.

* gcc.dg/pr113967.c: New test.

RISC-V: Suppress the vsetvl fusion for conflict successors

Update in v2: Add dump information.

This patch fixes the following ineffective vsetvl insertion:

void f (int32_t * restrict in, int32_t * restrict out, size_t n, size_t cond, size_t cond2)
{
 for (size_t i = 0; i < n; i++)
 {
 if (i == cond) {
 vint8mf8_t v = *(vint8mf8_t*)(in + i + 100);
 *(vint8mf8_t*)(out + i + 100) = v;
 } else if (i == cond2) {
 vfloat32mf2_t v = *(vfloat32mf2_t*)(in + i + 200);
 *(vfloat32mf2_t*)(out + i + 200) = v;
 } else if (i == (cond2 - 1)) {
 vuint16mf2_t v = *(vuint16mf2_t*)(in + i + 300);
 *(vuint16mf2_t*)(out + i + 300) = v;
 } else {
 vint8mf4_t v = *(vint8mf4_t*)(in + i + 400);
 *(vint8mf4_t*)(out + i + 400) = v;
 }
 }
}

Before this patch:

f:
.LFB0:
 .cfi_startproc
 beq a2,zero,.L12
 addi a7,a0,400
 addi a6,a1,400
 addi a0,a0,1600
 addi a1,a1,1600
 li a5,0
 addi t6,a4,-1
 vsetvli t3,zero,e8,mf8,ta,ma ---> ineffective uplift
.L7:
 beq a3,a5,.L15
 beq a4,a5,.L16
 beq t6,a5,.L17
 vsetvli t1,zero,e8,mf4,ta,ma
 vle8.v v1,0(a0)
 vse8.v v1,0(a1)
 vsetvli t3,zero,e8,mf8,ta,ma
.L4:
 addi a5,a5,1
 addi a7,a7,4
 addi a6,a6,4
 addi a0,a0,4
 addi a1,a1,4
 bne a2,a5,.L7
.L12:
 ret
.L15:
 vle8.v v1,0(a7)
 vse8.v v1,0(a6)
 j .L4
.L17:
 vsetvli t1,zero,e8,mf4,ta,ma
 addi t5,a0,-400
 addi t4,a1,-400
 vle16.v v1,0(t5)
 vse16.v v1,0(t4)
 vsetvli t3,zero,e8,mf8,ta,ma
 j .L4
.L16:
 addi t5,a0,-800
 addi t4,a1,-800
 vle32.v v1,0(t5)
 vse32.v v1,0(t4)
 j .L4

It's obvious that we are hoisting the e8mf8 vsetvl to the top. It's ineffective since e8mf8 comes from
low probability block which is if (i == cond).

For this case, we disable such fusion.

After this patch:

f:
beq a2,zero,.L12
addi a7,a0,400
addi a6,a1,400
addi a0,a0,1600
addi a1,a1,1600
li a5,0
addi t6,a4,-1
.L7:
beq a3,a5,.L15
beq a4,a5,.L16
beq t6,a5,.L17
vsetvli t1,zero,e8,mf4,ta,ma
vle8.v v1,0(a0)
vse8.v v1,0(a1)
.L4:
addi a5,a5,1
addi a7,a7,4
addi a6,a6,4
addi a0,a0,4
addi a1,a1,4
bne a2,a5,.L7
.L12:
ret
.L15:
vsetvli t3,zero,e8,mf8,ta,ma
vle8.v v1,0(a7)
vse8.v v1,0(a6)
j .L4
.L17:
addi t5,a0,-400
addi t4,a1,-400
vsetvli t1,zero,e8,mf4,ta,ma
vle16.v v1,0(t5)
vse16.v v1,0(t4)
j .L4
.L16:
addi t5,a0,-800
addi t4,a1,-800
vsetvli t3,zero,e32,mf2,ta,ma
vle32.v v1,0(t5)
vse32.v v1,0(t4)
j .L4

Tested on both RV32/RV64 no regression. Ok for trunk ?

PR target/113696

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pre_vsetvl::earliest_fuse_vsetvl_info):
Suppress vsetvl fusion.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/pr113696.c: New test.

Daily bump.

x86-64: Generate push2/pop2 only if the incoming stack is 16-byte aligned

Since push2/pop2 requires 16-byte stack alignment, don't generate them
if the incoming stack isn't 16-byte aligned.

gcc/

PR target/113912
* config/i386/i386.cc (ix86_can_use_push2pop2): New.
(ix86_pro_and_epilogue_can_use_push2pop2): Use it.
(ix86_emit_save_regs): Don't generate push2 if
ix86_can_use_push2pop2 return false.
(ix86_expand_epilogue): Don't generate pop2 if
ix86_can_use_push2pop2 return false.

gcc/testsuite/

PR target/113912
* gcc.target/i386/apx-push2pop2-2.c: New test.

AVR: Improve documentation for -mmcu=.

gcc/
* doc/invoke.texi (AVR Options) <-mmcu>: Remove "Atmel".
Note on complete device support.

AVR: Add examples for ISR macro to interrupt attribute doc.

gcc/
* doc/extend.texi (AVR Function Attributes): Fuse description
of "signal" and "interrupt" attribute. Link pseudo instruction.

testsuite: Mark non-optimized variants as expensive

When not optimized for speed, the test for PR112344 takes several
seconds to execute on native x86_64, and 15 minutes on PRU target
simulator. Thus mark those variants as expensive. The -O2 variant
which originally triggered the PR is not expensive, hence it is
still run by default.

PR middle-end/112344

gcc/testsuite/ChangeLog:

* gcc.dg/torture/pr112344.c: Run non-optimized variants only
if expensive tests are allowed.

Signed-off-by: Dimitar Dimitrov <dimitar@dinux.eu>

LoongArch: Remove redundant symbol type conversions in larchintrin.h.

gcc/ChangeLog:

* config/loongarch/larchintrin.h (__movgr2fcsr): Remove redundant
symbol type conversions.
(__cacop_d): Likewise.
(__cpucfg): Likewise.
(__asrtle_d): Likewise.
(__asrtgt_d): Likewise.
(__lddir_d): Likewise.
(__ldpte_d): Likewise.
(__crc_w_b_w): Likewise.
(__crc_w_h_w): Likewise.
(__crc_w_w_w): Likewise.
(__crc_w_d_w): Likewise.
(__crcc_w_b_w): Likewise.
(__crcc_w_h_w): Likewise.
(__crcc_w_w_w): Likewise.
(__crcc_w_d_w): Likewise.
(__csrrd_w): Likewise.
(__csrwr_w): Likewise.
(__csrxchg_w): Likewise.
(__csrrd_d): Likewise.
(__csrwr_d): Likewise.
(__csrxchg_d): Likewise.
(__iocsrrd_b): Likewise.
(__iocsrrd_h): Likewise.
(__iocsrrd_w): Likewise.
(__iocsrrd_d): Likewise.
(__iocsrwr_b): Likewise.
(__iocsrwr_h): Likewise.
(__iocsrwr_w): Likewise.
(__iocsrwr_d): Likewise.
(__frecipe_s): Likewise.
(__frecipe_d): Likewise.
(__frsqrte_s): Likewise.
(__frsqrte_d): Likewise.

LoongArch: Fix wrong return value type of __iocsrrd_h.

gcc/ChangeLog:

* config/loongarch/larchintrin.h (__iocsrrd_h): Modify the
function return value type to unsigned short.

Daily bump.

d: Merge dmd, druntime 9471b25db9, phobos 547886846.

D front-end changes:

- Import dmd v2.107.1-rc.1.

D runtime changes:

- Import druntime v2.107.1-rc.1.

Phobos changes:

- Import phobos v2.107.1-rc.1.

gcc/d/ChangeLog:

* dmd/MERGE: Merge upstream dmd 9471b25db9.
* dmd/VERSION: Bump version to v2.107.1-rc.1.
* Make-lang.in (D_FRONTEND_OBJS): Add d/cxxfrontend.o.
* d-attribs.cc (build_attributes): Update for new front-end interface.
* d-builtins.cc (build_frontend_type): Likewise.
(strip_type_modifiers): Likewise.
(covariant_with_builtin_type_p): Likewise.
* d-codegen.cc (declaration_type): Likewise.
(parameter_type): Likewise.
(build_array_struct_comparison): Likewise.
(void_okay_p): Likewise.
* d-convert.cc (convert_expr): Likewise.
(check_valist_conversion): Likewise.
* d-lang.cc (d_generate_ddoc_file): Likewise.
(d_parse_file): Likewise.
* d-target.cc (TargetCPP::toMangle): Likewise.
(TargetCPP::typeInfoMangle): Likewise.
(TargetCPP::thunkMangle): Likewise.
(TargetCPP::parameterType): Likewise.
* decl.cc (d_mangle_decl): Likewise.
(DeclVisitor::visit): Likewise.
(DeclVisitor::visit (CAsmDeclaration *)): New method.
(get_symbol_decl): Update for new front-end interface.
(layout_class_initializer): Likewise.
* expr.cc (ExprVisitor::visit): Likewise.
* intrinsics.cc (maybe_set_intrinsic): Likewise.
(expand_intrinsic_rotate): Likewise.
* modules.cc (layout_moduleinfo_fields): Likewise.
(layout_moduleinfo): Likewise.
* runtime.cc (get_libcall_type): Likewise.
* typeinfo.cc (make_frontend_typeinfo): Likewise.
(TypeInfoVisitor::visit): Likewise.
(create_typeinfo): Likewise.
* types.cc (same_type_p): Likewise.
(build_ctype): Likewise.

libphobos/ChangeLog:

* libdruntime/MERGE: Merge upstream druntime 9471b25db9.
* src/MERGE: Merge upstream phobos 547886846.

libgfortran: [PR105473] Fix checks for decimal='comma'.

PR libfortran/105473

libgfortran/ChangeLog:

* io/list_read.c (eat_separator): Reject comma as a
seprator when it is being used as a decimal point.
(parse_real): Reject a '.' when is should be a comma.
(read_real): Likewise.
* io/read.c (read_f): Add more checks for ',' and '.'
conditions.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr105473.f90: New test.

fortran: gfc_trans_subcomponent_assign fixes [PR113503]

The r14-870 changes broke xtb package tests (reduced testcase is the first
one below) and caused ICEs on a test derived from that (the second one).
For the
 x = T(u = trim (us(1)))
statement, before that change gfortran used to emit weird code with
2 trim calls:
 _gfortran_string_trim (&len.2, (void * *) &pstr.1, 20, &us[0]);
 if (len.2 > 0)
 {
 __builtin_free ((void *) pstr.1);
 }
 D.4275 = len.2;
 t.0.u = (character(kind=1)[1:0] *) __builtin_malloc (MAX_EXPR <(sizetype) D.4275, 1>);
 t.0._u_length = D.4275;
 _gfortran_string_trim (&len.4, (void * *) &pstr.3, 20, &us[0]);
 (void) __builtin_memcpy ((void *) t.0.u, (void *) pstr.3, (unsigned long) NON_LVALUE_EXPR <len.4>);
 if (len.4 > 0)
 {
 __builtin_free ((void *) pstr.3);
 }
That worked at runtime, though it is wasteful.
That commit changed it to:
 slen.3 = len.2;
 t.0.u = (character(kind=1)[1:0] *) __builtin_malloc (MAX_EXPR <(sizetype) slen.3, 1>);
 t.0._u_length = slen.3;
 _gfortran_string_trim (&len.2, (void * *) &pstr.1, 20, &us[0]);
 (void) __builtin_memcpy ((void *) t.0.u, (void *) pstr.1, (unsigned long) NON_LVALUE_EXPR <len.2>);
 if (len.2 > 0)
 {
 __builtin_free ((void *) pstr.1);
 }
which results in -Wuninitialized warning later on and if one is unlucky and
the uninitialized len.2 variable is smaller than the trimmed length, it
results in heap overflow and often crashes later on.
The bug above is clear, len.2 is only initialized in the
_gfortran_string_trim (&len.2, (void * *) &pstr.1, 20, &us[0]);
call, but used before that. Now, the
 slen.3 = len.2;
 t.0.u = (character(kind=1)[1:0] *) __builtin_malloc (MAX_EXPR <(sizetype) slen.3, 1>);
 t.0._u_length = slen.3;
statements come from the alloc_scalar_allocatable_subcomponent call,
while
 _gfortran_string_trim (&len.2, (void * *) &pstr.1, 20, &us[0]);
from the gfc_conv_expr (&se, expr); call which is done before the
alloc_scalar_allocatable_subcomponent call, but is only appended later on
with gfc_add_block_to_block (&block, &se.pre);
Now, obviously the alloc_scalar_allocatable_subcomponent emitted statements
can depend on the se.pre sequence statements which can compute variables
used by alloc_scalar_allocatable_subcomponent like the length.
On the other side, I think the se.pre sequence really shouldn't depend
on the changes done by alloc_scalar_allocatable_subcomponent, that is
initializing the FIELD_DECLs of the destination allocatable subcomponent
only, the gfc_conv_expr statements are already created, so all they could
in theory depend above is on t.0.u or t.0._u_length, but I believe if the
rhs dependened on the lhs content (which is allocated by those statements
but really uninitialized), it would need to be discovered by the dependency
analysis and forced into a temporary.
So, in order to fix the first testcase, the second hunk of the patch just
emits the se.pre block before the alloc_scalar_allocatable_subcomponent
changes rather than after it.

The second problem is an ICE on the second testcase. expr in the caller
(expr2 inside of alloc_scalar_allocatable_subcomponent) has
expr2->ts.u.cl->backend_decl already set, INTEGER_CST 20, but
alloc_scalar_allocatable_subcomponent overwrites it to a new VAR_DECL
which it assigns a value to before the malloc. That can work if the only
places the expr2->ts is ever used are in the same local block or its
subblocks (and only if it is dominated by the code emitted by
alloc_scalar_allocatable_subcomponent, so e.g. not if that call is inside
of a conditional code and use later unconditional), but doesn't work
if expr2->ts is used before that block or after it. So, the exact ICE is
because of:
 slen.1 = 20;
 static character(kind=1) us[1][1:20] = {"foo "};
 x.u = 0B;
 x._u_length = 0;
 {
 struct t t.0;
 struct t D.4308;

 {
 integer(kind=8) slen.1;

 slen.1 = 20;
 t.0.u = (character(kind=1)[1:0] *) __builtin_malloc (MAX_EXPR <(sizetype) slen.1, 1>);
 t.0._u_length = slen.1;
 (void) __builtin_memcpy ((void *) t.0.u, (void *) &us[0], 20);
 }
where the first slen.1 = 20; is emitted because it sees us has a VAR_DECL
ts.u.cl->backend_decl and so it wants to initialize it to the actual length.
This is invalid GENERIC, because the slen.1 variable is only declared inside
of a {} later on and so uses outside of it are wrong. Similarly wrong would
be if it is used later on. E.g. in the same testcase if it has
 type(T) :: x, y
 x = T(u = us(1))
 y%u = us(1)
then there is
 {
 integer(kind=8) slen.1;

 slen.1 = 20;
 t.0.u = (character(kind=1)[1:0] *) __builtin_malloc (MAX_EXPR <(sizetype) slen.1, 1>);
 t.0._u_length = slen.1;
 (void) __builtin_memcpy ((void *) t.0.u, (void *) &us[0], 20);
 }
...
 if (y.u != 0B) goto L.1;
 y.u = (character(kind=1)[1:0] *) __builtin_malloc (MAX_EXPR <(sizetype) slen.1, 1>);
i.e. another use of slen.1, this time after slen.1 got out of scope.

I really don't understand why the code modifies
expr2->ts.u.cl->backend_decl, expr2 isn't used there anywhere except for
expr2->ts.u.cl->backend_decl expressions, so hacks like save the previous
value, overwrite it temporarily over some call that will use expr2 and
restore afterwards aren't needed - there are no such calls, so the
following patch fixes it just by not messing up with
expr2->ts.u.cl->backend_decl, only set it to size variable and overwrite
that with a temporary if needed.

2024-02-17 Jakub Jelinek <jakub@redhat.com>

PR fortran/113503
* trans-expr.cc (alloc_scalar_allocatable_subcomponent): Don't
overwrite expr2->ts.u.cl->backend_decl, instead set size to
expr2->ts.u.cl->backend_decl first and use size instead of
expr2->ts.u.cl->backend_decl.
(gfc_trans_subcomponent_assign): Emit se.pre into block
before calling alloc_scalar_allocatable_subcomponent instead of
after it.

* gfortran.dg/pr113503_1.f90: New test.
* gfortran.dg/pr113503_2.f90: New test.

libgfortran: Fix namelist read.

PR libfortran/107068

libgfortran/ChangeLog:

* io/list_read.c (read_logical): When looking for a possible
variable name, check for left paren, indicating a possible
array reference.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr107068.f90: New test.

c++: wrong looser excep spec for dep noexcept [PR113158]

Here we find ourselves in maybe_check_overriding_exception_spec in
a template context where we can't instantiate a dependent noexcept.
That's OK, but we have to defer the checking otherwise we give wrong
errors.

PR c++/113158

gcc/cp/ChangeLog:

* search.cc (maybe_check_overriding_exception_spec): Defer checking
when a noexcept couldn't be instantiated & evaluated to false/true.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/noexcept83.C: New test.

libstdc++: [_GLIBCXX_DEBUG] Fix std::__niter_base behavior

std::__niter_base is used in _GLIBCXX_DEBUG mode to remove _Safe_iterator<>
wrapper on random access iterators. But doing so it should also preserve original
behavior to remove __normal_iterator wrapper.

libstdc++-v3/ChangeLog:

* include/bits/stl_algobase.h (std::__niter_base): Redefine the overload
definitions for __gnu_debug::_Safe_iterator.
* include/debug/safe_iterator.tcc (std::__niter_base): Adapt declarations.

Fortran: deferred length of character variables shall not get lost [PR113911]

PR fortran/113911

gcc/fortran/ChangeLog:

* trans-array.cc (gfc_trans_deferred_array): Do not clobber
deferred length for a character variable passed as dummy argument.

gcc/testsuite/ChangeLog:

* gfortran.dg/allocatable_length_2.f90: New test.
* gfortran.dg/bind_c_optional-2.f90: Enable deferred-length test.

testsuite: Fix up lra effective target

Given the recent discussions on IRC started with Andrew P. mentioning that
an asm goto outputs test should have { target lra } and the lra effective
target in GCC 11/12 only returning 0 for PA and in 13/14 for PA/AVR, while
we clearly have 14 other targets which don't support LRA and a couple of
further ones which have an -mlra/-mno-lra switch (whatever default they
have), seems to me the effective target is quite broken.

The following patch rewrites it, such that it has a fast path for heavily
used targets which are for years known to use only LRA (just an
optimization) plus determines whether it is a LRA target or reload target
by scanning the -fdump-rtl-reload-details dump on an empty function,
LRA has quite a few always emitted messages in that case while reload has
none of those.

Tested on x86_64-linux and cross to s390x-linux, for the latter with both
make check-gcc RUNTESTFLAGS='--target_board=unix/-mno-lra dg.exp=pr107385.c'
where the test is now UNSUPPORTED and
make check-gcc RUNTESTFLAGS='--target_board=unix/-mlra dg.exp=pr107385.c'
where it fails because I don't have libc around.

There is one special case, NVPTX, which is a TARGET_NO_REGISTER_ALLOCATION
target. I think claiming for it that it is a lra target is strange (even
though it effectively returns true for targetm.lra_p ()), unsure if it
supports asm goto with outputs or not, if it does and we want to test it,
perhaps we should introduce asm_goto_outputs effective target and use
lra || nvptx-*-* for that?

2024-02-17 Jakub Jelinek <jakub@redhat.com>

* lib/target-supports.exp (check_effective_target_lra): Rewrite
to list some heavily used always LRA targets and otherwise check the
-fdump-rtl-reload-details dump for messages specific to LRA.

Daily bump.

libgcc: fix Win32 CV abnormal spurious wakeups in timed wait [PR113850]

Fix a typo in __gthr_win32_abs_to_rel_time that caused it to return a
relative time in seconds instead of milliseconds. As a consequence,
__gthr_win32_cond_timedwait called SleepConditionVariableCS with a
1000x shorter timeout; this caused ~1000x more spurious wakeups in
CV timed waits such as std::condition_variable::wait_for or wait_until,
resulting generally in much higher CPU usage.

This can be demonstrated by this sample program:

```

int main() {
 std::condition_variable cv;
 std::mutex mx;
 bool pass = false;

 auto thread_fn = [&](bool timed) {
 int wakeups = 0;
 using sc = std::chrono::system_clock;
 auto before = sc::now();
 std::unique_lock<std::mutex> ml(mx);
 if (timed) {
 cv.wait_for(ml, std::chrono::seconds(2), [&]{
 ++wakeups;
 return pass;
 });
 } else {
 cv.wait(ml, [&]{
 ++wakeups;
 return pass;
 });
 }
 printf("pass: %d; wakeups: %d; elapsed: %d ms\n", pass, wakeups,
 int((sc::now() - before) / std::chrono::milliseconds(1)));
 pass = false;
 };

 {
 // timed wait, let expire
 std::thread t(thread_fn, true);
 t.join();
 }

 {
 // timed wait, wake up explicitly after 1 second
 std::thread t(thread_fn, true);
 std::this_thread::sleep_for(std::chrono::seconds(1));
 {
 std::unique_lock<std::mutex> ml(mx);
 pass = true;
 }
 cv.notify_all();
 t.join();
 }

 {
 // non-timed wait, wake up explicitly after 1 second
 std::thread t(thread_fn, false);
 std::this_thread::sleep_for(std::chrono::seconds(1));
 {
 std::unique_lock<std::mutex> ml(mx);
 pass = true;
 }
 cv.notify_all();
 t.join();
 }
 return 0;
}
```

On builds based on non-affected threading models (e.g. POSIX on Linux,
or winpthreads or MCF on Win32) the output is something like
```
pass: 0; wakeups: 2; elapsed: 2000 ms
pass: 1; wakeups: 2; elapsed: 991 ms
pass: 1; wakeups: 2; elapsed: 996 ms
```

while with the Win32 threading model we get
```
pass: 0; wakeups: 1418; elapsed: 2000 ms
pass: 1; wakeups: 479; elapsed: 988 ms
pass: 1; wakeups: 2; elapsed: 992 ms
```
(notice the huge number of wakeups in the timed wait cases only).

This commit fixes the conversion, adjusting the final division by
NSEC100_PER_SEC to use NSEC100_PER_MSEC instead (already defined in the
file and not used in any other place, so probably just a typo).

libgcc/ChangeLog:

PR libgcc/113850
* config/i386/gthr-win32-cond.c (__gthr_win32_abs_to_rel_time):
fix absolute timespec to relative milliseconds count
conversion (it incorrectly returned seconds instead of
milliseconds); this avoids spurious wakeups in
__gthr_win32_cond_timedwait

Add -Wstrict-aliasing to vector-struct-1.C testcase

As noticed by Marek Polacek in https://gcc.gnu.org/pipermail/gcc-patches/2024-February/645836.html,
this testcase was not failing before without -Wstrict-aliasing so let's add that option.

Committed as obvious after testing to make sure the test is now testing with `-Wstrict-aliasing` and `-flto`.

gcc/testsuite/ChangeLog:

* g++.dg/torture/vector-struct-1.C: Add -Wstrict-aliasing.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

Regenerate .pot files

gcc/po/
* gcc.pot: Regenerate.

libcpp/po/
* cpplib.pot: Regenerate.

c++: wrong looser exception spec with deleted fn

I noticed we don't implement the "unless the overriding function is
defined as deleted" wording added to [except.spec] via CWG 1351.

DR 1351

gcc/cp/ChangeLog:

* search.cc (maybe_check_overriding_exception_spec): Don't error about
a looser exception specification if the overrider is deleted.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/noexcept82.C: New test.

libstdc++: Fix FAIL: 26_numerics/random/pr60037-neg.cc again [PR113961]

PR libstdc++/87744
PR libstdc++/113961

libstdc++-v3/ChangeLog:

* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error
line number.

c++: Add testcase for this PR [PR97990]

This testcase was fixed by r14-5934-gf26d68d5d128c8 but we should add
one to make sure it does not regress again.

Committed as obvious after a quick test on the testcase.

PR c++/97990

gcc/testsuite/ChangeLog:

* g++.dg/torture/vector-struct-1.C: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

testsuite: Add support for scanning assembly with comparitor

There is currently no support for matching at least x lines of assembly
(only scan-assembler-times). This patch would allow setting upper or lower
bounds.

Use case: using different scheduler descriptions and/or cost models will change
assembler output. Testing common functionality across tunes would require a
separate testcase per tune since each assembly output would be different. If we
know a base number of lines should appear across all tunes (i.e. testing return
values: we expect at minimum n stores into register x), we can lower-bound the
test to search for scan-assembler-bound {RE for storing into register x} >= n.
This avoids artificially inflating the scan-assembler-times expected count due
to the assembler choosing to perform extra stores into register x (using it as
a temporary register).

The testcase would be more robust to cpu/tune changes at the cost of not being
as granular towards specific cpu tuning.

gcc/ChangeLog:

* doc/sourcebuild.texi: add scan-assembler-bound

gcc/testsuite/ChangeLog:

* lib/scanasm.exp: add scan-assembler-bound

Signed-off-by: Edwin Lu <ewlu@rivosinc.com>

c++: add fixed testcase [PR111682]

Fixed by the PR113612 fix r14-8960-g19ac327de421fe.

PR c++/111682

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/var-templ86.C: New test.

c++: implicit move with throw [PR113853]

Here we have

 template<class T>
 auto is_throwable(T t) -> decltype(throw t, true) { ... }

where we didn't properly mark 't' as IMPLICIT_RVALUE_P, which caused
the wrong overload to have been chosen. Jason figured out it's because
we don't correctly implement [expr.prim.id.unqual]#4.2, which post-P2266
says that an id-expression is move-eligible if

"the id-expression (possibly parenthesized) is the operand of
a throw-expression, and names an implicitly movable entity that belongs
to a scope that does not contain the compound-statement of the innermost
lambda-expression, try-block, or function-try-block (if any) whose
compound-statement or ctor-initializer contains the throw-expression."

I worked out that it's trying to say that given

 struct X {
 X();
 X(const X&);
 X(X&&) = delete;
 };

the following should fail: the scope of the throw is an sk_try, and it's
also x's scope S, and S "does not contain the compound-statement of the
*try-block" so x is move-eligible, so we move, so we fail.

 void f ()
 try {
 X x;
 throw x; // use of deleted function
 } catch (...) {
 }

Whereas here:

 void g (X x)
 try {
 throw x;
 } catch (...) {
 }

the throw is again in an sk_try, but x's scope is an sk_function_parms
which *does* contain the {} of the *try-block, so x is not move-eligible,
so we don't move, so we use X(const X&), and the code is fine.

The current code also doesn't seem to handle

 void h (X x) {
 void z (decltype(throw x, true));
 }

where there's no enclosing lambda or sk_try so we should move.

I'm not doing anything about lambdas because we shouldn't reach the
code at the end of the function: the DECL_HAS_VALUE_EXPR_P check
shouldn't let us go further.

PR c++/113789
PR c++/113853

gcc/cp/ChangeLog:

* typeck.cc (treat_lvalue_as_rvalue_p): Update code to better
reflect [expr.prim.id.unqual]#4.2.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/sfinae69.C: Remove dg-bogus.
* g++.dg/cpp0x/sfinae70.C: New test.
* g++.dg/cpp0x/sfinae71.C: New test.
* g++.dg/cpp0x/sfinae72.C: New test.
* g++.dg/cpp2a/implicit-move4.C: New test.

c++: Diagnose this specifier on template parameters [PR113929]

For template parameters, the optional this specifier is in the grammar
template-parameter-list -> template-parameter -> parameter-declaration,
just [dcl.fct/6] says that it is only valid in parameter-list of certain
functions. So, unlike the case of decl-specifier-seq used in non-terminals
other than parameter-declaration, I think it is better not to fix this
by
 cp_parser_decl_specifier_seq (parser,
- flags | CP_PARSER_FLAGS_PARAMETER,
+ flags | (template_parameter_p ? 0
+ : CP_PARSER_FLAGS_PARAMETER),
 &decl_specifiers,
 &declares_class_or_enum);
which would be pretending it isn't in the grammar, but by diagnosing it
separately, which is what the following patch does.

2024-02-16 Jakub Jelinek <jakub@redhat.com>

PR c++/113929
* parser.cc (cp_parser_parameter_declaration): Diagnose this specifier
on template parameter declaration.

* g++.dg/parse/pr113929.C: New test.

gdbhooks: regex syntax error

Recent python complains about this pattern with
SyntaxWarning: invalid escape sequence '\s'
because \s in a regular string just means 's'; for it to mean whitespace,
you need \\ or for the pattern to be a raw string.

Curiously, break-on-pass completion works for me either with or without this
change, but at least this avoids the warning.

gcc/ChangeLog:

* gdbhooks.py: Fix regex syntax.

c++/modules: stream TREE_UNAVAILABLE and LAMBDA_EXPR_REGEN_INFO

gcc/cp/ChangeLog:

* module.cc (trees_out::core_bools): Stream TREE_UNAVAILABLE.
(trees_in::core_bools): Likewise.
(trees_out::core_vals): Stream LAMBDA_EXPR_REGEN_INFO.
(trees_in::core_vals): Likewise.

Reviewed-by: Jason Merrill <jason@redhat.com>

libsanitizer: Intercept __makecontext_v2 on Solaris/SPARC [PR113785]

c-c++-common/asan/swapcontext-test-1.c FAILs on Solaris/SPARC:

FAIL: c-c++-common/asan/swapcontext-test-1.c -O0 execution test
FAIL: c-c++-common/asan/swapcontext-test-1.c -O1 execution test
FAIL: c-c++-common/asan/swapcontext-test-1.c -O2 execution test
FAIL: c-c++-common/asan/swapcontext-test-1.c -O2 -flto execution test
FAIL: c-c++-common/asan/swapcontext-test-1.c -O2 -flto -flto-partition=none
execution test
FAIL: c-c++-common/asan/swapcontext-test-1.c -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions execution test
FAIL: c-c++-common/asan/swapcontext-test-1.c -O3 -g execution test
FAIL: c-c++-common/asan/swapcontext-test-1.c -Os execution test

As detailed in PR sanitizer/113785, this happens because an ABI change
in Solaris 10/SPARC caused the external symbol for makecontext to be
changed to __makecontext_v2, which isn't intercepted.

The following patch, submitted upstream at
https://github.com/llvm/llvm-project/pull/81588, fixes that.

Tested on sparc-sun-solaris2.11 and i386-pc-solaris2.11.

2024-02-16 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>

libsanitizer:
PR sanitizer/113785
* asan/asan_interceptors.cpp: Cherry-pick llvm-project revision
8c2033719a843a1880427a5e8caa5563248bce78.

tree-optimization/113895 - consistency check fails in copy_reference_ops_from_ref

The following addresses consistency check fails in copy_reference_ops_from_ref
when we are handling out-of-bound array accesses (it's almost impossible
to identically mimic the get_ref_base_and_extent behavior).  It also
addresses the case where an out-of-bound constant offset computes to a
-1 off which is the special value for "unknown".  This patch basically
turns off verification in those cases.

PR tree-optimization/113895
* tree-ssa-sccvn.cc (copy_reference_ops_from_ref): Disable
consistency checking when there are out-of-bound array
accesses.  Allow -1 off when from an array reference with
constant index.

* gcc.dg/torture/pr113895-2.c: New testcase.
* gcc.dg/torture/pr113895-3.c: Likewise.
* gcc.dg/torture/pr113895-4.c: Likewise.

libstdc++: Fix FAIL: 26_numerics/random/pr60037-neg.cc [PR113931]

PR libstdc++/87744
PR libstdc++/113931

libstdc++-v3/ChangeLog:

* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error
line number.

libstdc++: Improve docs for debug mode backtraces

The configure option is no longer necessary.

libstdc++-v3/ChangeLog:

* doc/xml/manual/debug_mode.xml: Update docs for backtraces.
* doc/html/manual/debug_mode_using.html: Regenerate.

libstdc++: Fix spelling of <envar> elements in manual

libstdc++-v3/ChangeLog:

* doc/xml/manual/test.xml: Fix spelling of <envar> elements.
* doc/html/manual/test.html: Regenerate.

RISC-V: Fix *sge_<X:mode><GPR:mode> pattern

*sge_<X:mode><GPR:mode> pattern has referenced operand[2] which is
invalid...it should just use `slti` rather than `slti%i2`.

gcc/ChangeLog:

PR target/106543
* config/riscv/riscv.md (*sge_<X:mode><GPR:mode>): Fix asm
pattern.

testsuite: Require lto-plugin support in gcc.dg/lto/modref-3 etc. [PR98237]

gcc.dg/lto/modref-3 etc. FAIL on Solaris with the native linker:

FAIL: gcc-dg-lto-modref-3-01.exe scan-wpa-ipa-dump modref "parm 1 flags: no_direct_clobber no_direct_escape"
FAIL: gcc-dg-lto-modref-4-01.exe scan-wpa-ipa-dump modref "parm 1 flags: no_direct_clobber no_direct_escape"
FAIL: gcc.dg/lto/modref-3 c_lto_modref-3_0.o-c_lto_modref-3_1.o execute -O2 -flto-partition=max -fdump-ipa-modref -fno-ipa-sra -fno-ipa-cp -flto
FAIL: gcc.dg/lto/modref-4 c_lto_modref-4_0.o-c_lto_modref-4_1.o execute -O2 -flto-partition=max -fdump-ipa-modref -fno-ipa-sra -flto

The issue is that the tests require the linker plugin, which isn't
available with Solaris ld. Thus, it also FAILs when gcc is configured
with --disable-lto-plugin.

This patch thus declares the requirement. As it turns out, there's an
undocumented dg-require-linker-plugin already, but I introduce and use
the corresponding effective-target keyword and document both.

Given that the effective-target form is more flexible, I'm tempted to
remove dg-require-* with an empty arg as already mentioned in
sourcebuild.texi. That is not this patch, however.

Tested on i386-pc-solaris2.11 with ld and gld.

2024-02-14 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
PR ipa/98237
* lib/target-supports.exp (is-effective-target): Handle
linker_plugin.
* gcc.dg/lto/modref-3_0.c: Require linker_plugin support.
* gcc.dg/lto/modref-4_0.c: Likewise.

gcc:
* doc/sourcebuild.texi (Effective-Target Keywords, Other
attribugs): Document linker_plugin.
(Require Support): Document dg-require-linker-plugin.

RISC-V: Add new option -march=help to print all supported extensions

The output of -march=help is like below:

```
All available -march extensions for RISC-V:
 Name Version
 i 2.0, 2.1
 e 2.0
 m 2.0
 a 2.0, 2.1
 f 2.0, 2.2
 d 2.0, 2.2
...
```

Also support -print-supported-extensions and --print-supported-extensions for
clang compatibility.

gcc/ChangeLog:

PR target/109349

* common/config/riscv/riscv-common.cc (riscv_arch_help): New.
* config/riscv/riscv-protos.h (RISCV_MAJOR_VERSION_BASE): New.
(RISCV_MINOR_VERSION_BASE): Ditto.
(RISCV_REVISION_VERSION_BASE): Ditto.
* config/riscv/riscv-c.cc (riscv_ext_version_value): Use enum
rather than magic number.
* config/riscv/riscv.h (riscv_arch_help): New.
(EXTRA_SPEC_FUNCTIONS): Add riscv_arch_help.
(DRIVER_SELF_SPECS): Handle -march=help, -print-supported-extensions and
--print-supported-extensions.
* config/riscv/riscv.opt (march=help): New.
(print-supported-extensions): New.
(-print-supported-extensions): New.
* doc/invoke.texi (RISC-V Options): Document -march=help.

Reviewed-by: Christoph Müllner <christoph.muellner@vrull.eu>

Arm: Fix incorrect tailcall-generation for indirect calls [PR113780]

This patch fixes a bug that causes indirect calls in PAC-enabled functions
to be tailcalled incorrectly when all argument registers R0-R3 are used.

2024-02-07 Tejas Belagod <tejas.belagod@arm.com>

PR target/113780
* config/arm/arm.cc (arm_function_ok_for_sibcall): Don't allow tailcalls
for indirect calls with 4 or more arguments in pac-enabled functions.

* lib/target-supports.exp (v8_1m_main_pacbti): Add __ARM_FEATURE_PAUTH.
* gcc.target/arm/pac-sibcall.c: New.

Daily bump.

libgomp: Update documentation for indirect calls in target regions

Support for indirect calls to procedures/functions in offloaded target
regions is now available for C, C++ and Fortran.

2024-02-15 Kwok Cheung Yeung <kcyeung@baylibre.com>

libgomp/
* libgomp.texi (OpenMP 5.1): Mark indirect call support as fully
implemented.

openmp, fortran: Add Fortran support for indirect clause on the declare target directive

2024-02-15 Kwok Cheung Yeung <kcyeung@baylibre.com>

gcc/fortran/
* dump-parse-tree.cc (show_attr): Handle omp_declare_target_indirect
attribute.
* f95-lang.cc (gfc_gnu_attributes): Add entry for 'omp declare
target indirect'.
* gfortran.h (symbol_attribute): Add omp_declare_target_indirect
field.
(struct gfc_omp_clauses): Add indirect field.
* openmp.cc (omp_mask2): Add OMP_CLAUSE_INDIRECT.
(gfc_match_omp_clauses): Match indirect clause.
(OMP_DECLARE_TARGET_CLAUSES): Add OMP_CLAUSE_INDIRECT.
(gfc_match_omp_declare_target): Check omp_device_type and apply
omp_declare_target_indirect attribute to symbol if indirect clause
active. Show warning if there are only device_type and/or indirect
clauses on the directive.
* trans-decl.cc (add_attributes_to_decl): Add 'omp declare target
indirect' attribute if symbol has indirect attribute set.

gcc/testsuite/
* gfortran.dg/gomp/declare-target-4.f90 (f1): Update expected warning.
* gfortran.dg/gomp/declare-target-indirect-1.f90: New.
* gfortran.dg/gomp/declare-target-indirect-2.f90: New.

libgomp/
* testsuite/libgomp.fortran/declare-target-indirect-1.f90: New.
* testsuite/libgomp.fortran/declare-target-indirect-2.f90: New.
* testsuite/libgomp.fortran/declare-target-indirect-3.f90: New.

analyzer: remove offset_region size overloads [PR111266]

PR analyzer/111266 reports a missing -Wanalyzer-out-of-bounds when
accessing relative to a concrete byte offset.

Root cause is that offset_region::get_{byte,bit}_size_sval were
attempting to compute the size that's valid to access, rather than the
size of the access attempt.

Fixed by removing these vfunc overrides from offset_region as the
base class implementation does the right thing.

gcc/analyzer/ChangeLog:
PR analyzer/111266
* region.cc (offset_region::get_byte_size_sval): Delete.
(offset_region::get_bit_size_sval): Delete.
* region.h (region::get_byte_size): Add comment clarifying that
this relates to the size of the access, rather than the size
that's valid to access.
(region::get_bit_size): Likewise.
(region::get_byte_size_sval): Likewise.
(region::get_bit_size_sval): Likewise.
(offset_region::get_byte_size_sval): Delete.
(offset_region::get_bit_size_sval): Delete.

gcc/testsuite/ChangeLog:
PR analyzer/111266
* c-c++-common/analyzer/out-of-bounds-pr111266.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

testsuite: Require lra effective target for pr107385.c

Old reload doesn't support asm goto with output operands.
We have lra effective target (though, strangely it returns
0 just for 2 targets out of at least 16 targets with no LRA support),
so this patch uses it, similarly how it is done in other asm goto
tests with output operands.

2024-02-15 Jakub Jelinek <jakub@redhat.com>

PR middle-end/107385
* gcc.dg/pr107385.c: Require lra effective target.

aarch64: Fix undefined code in vect_ctz_1.c

The testcase gcc.target/aarch64/vect_ctz_1.c fails execution when running
with -march=armv9-a due to the testcase calls __builtin_ctz with a value of 0.
The testcase should not depend on undefined behavior of __builtin_ctz. So this
changes it to use the g form with the 2nd argument of 32. Now the execution part
of the testcase work. It still has a scan-assembler failure which should be fixed
seperately.

Tested on aarch64-linux-gnu.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/vect_ctz_1.c (TEST): Use g form of the builtin and pass 32
as the value expected at 0.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

testsuite: Define _POSIX_SOURCE for tests [PR113278]

As the tests assume that fileno() is visible (only part of POSIX),
define the guard to ensure that it's visible. Currently, glibc appears
to always have this defined in C++, newlib does not.

Without this patch, fails like this can be seen:

Testing analyzer/fileno-1.c, -std=c++98
.../fileno-1.c: In function 'int test_pass_through(FILE*)':
.../fileno-1.c:5:10: error: 'fileno' was not declared in this scope
FAIL: c-c++-common/analyzer/fileno-1.c -std=c++98 (test for excess errors)

Patch has been verified on Linux.

gcc/testsuite/ChangeLog:
PR testsuite/113278
* c-c++-common/analyzer/fileno-1.c: Define _POSIX_SOURCE.
* c-c++-common/analyzer/flex-with-call-summaries.c: Same.
* c-c++-common/analyzer/flex-without-call-summaries.c: Same.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

bpf: fix zero_extendqidi2 ldx template

Commit 77d0f9ec3809b4d2e32c36069b6b9239d301c030 inadvertently changed
the normal asm dialect instruction template for zero_extendqidi2 from
ldxb to ldxh. Fix that.

gcc/

* config/bpf/bpf.md (zero_extendqidi2): Correct asm template to
use ldxb instead of ldxh.

testsuite: Add testcase for already fixed PR [PR107385]

This testcase has been fixed by the PR113921 fix, but unlike testcase
in there this one is not target specific.

2024-02-15 Jakub Jelinek <jakub@redhat.com>

PR middle-end/107385
* gcc.dg/pr107385.c: New test.

expand: Fix handling of asm goto outputs vs. PHI argument adjustments [PR113921]

The Linux kernel and the following testcase distilled from it is
miscompiled, because tree-outof-ssa.cc (eliminate_phi) emits some
fixups on some of the edges (but doesn't commit edge insertions).
Later expand_asm_stmt emits further instructions on the same edge.
Now the problem is that expand_asm_stmt uses insert_insn_on_edge
to add its own fixups, but that function appends to the existing
sequence on the edge if any. And the bug triggers when the
fixup sequence emitted by eliminate_phi uses a pseudo which the
fixup sequence emitted by expand_asm_stmt later on sets.
So, we end up with
 (set (reg A) (asm_operands ...))
and on one of the edges queued sequence
 (set (reg C) (reg B)) // added by eliminate_phi
 (set (reg B) (reg A)) // added by expand_asm_stmt
That is wrong, what we emit by expand_asm_stmt needs to be as close
to the asm_operands as possible (they aren't known until expand_asm_stmt
is called, the PHI fixup code assumes it is reg B which holds the right
value) and the PHI adjustments need to be done after it.

So, the following patch introduces a prepend_insn_to_edge function and
uses it from expand_asm_stmt, so that we queue
 (set (reg B) (reg A)) // added by expand_asm_stmt
 (set (reg C) (reg B)) // added by eliminate_phi
instead and so the value from the asm_operands output propagates correctly
to the PHI result.

2024-02-15 Jakub Jelinek <jakub@redhat.com>

PR middle-end/113921
* cfgrtl.h (prepend_insn_to_edge): New declaration.
* cfgrtl.cc (insert_insn_on_edge): Clarify behavior in function
comment.
(prepend_insn_to_edge): New function.
* cfgexpand.cc (expand_asm_stmt): Use prepend_insn_to_edge instead of
insert_insn_on_edge.

* gcc.target/i386/pr113921.c: New test.

tree-optimization/111156 - properly dissolve SLP only groups

The following fixes the omission of failing to look at pattern
stmts when we need to dissolve SLP only groups.

PR tree-optimization/111156
* tree-vect-loop.cc (vect_dissolve_slp_only_groups): Look
at the pattern stmt if any.

arm: testuite: Missing optimization pattern for rev16 with thumb1

This patch marks a rev16 test as XFAIL for architectures having only
Thumb1 support.  The generated code is functionally correct, but the
optimization is disabled when -mthumb is equivalent to Thumb1.  Fixing
the root issue would requires changes that are not suitable for GCC14
stage 4.  More information at
https://linaro.atlassian.net/browse/GNU-1141

gcc/testsuite/ChangeLog:

* gcc.target/arm/rev16_2.c: XFAIL when compiled with Thumb1.

AVR: target 113927 - Simple code triggers stack frame for Reduced Tiny.

The -mmcu=avrtiny cores have no ADIW and SBIW instructions. This was
implemented by clearing all regs out of regclass ADDW_REGS so that
constraint "w" never matched. This corrupted the subset relations of
the register classes as they appear in enum reg_class.

This patch keeps ADDW_REGS like for all other cores, i.e. it contains
R24...R31. Instead of tests like test_hard_reg_class (ADDW_REGS, *)
the code now uses avr_adiw_reg_p (*). And all insns with constraint "w"
get "isa" insn attribute value of "adiw".

Plus, a new built-in macro __AVR_HAVE_ADIW__ is provided, which is more
specific than __AVR_TINY__.

gcc/
PR target/113927
* config/avr/avr.h (AVR_HAVE_ADIW): New macro.
* config/avr/avr-protos.h (avr_adiw_reg_p): New proto.
* config/avr/avr.cc (avr_adiw_reg_p): New function.
(avr_conditional_register_usage) [AVR_TINY]: Don't clear ADDW_REGS.
Replace test_hard_reg_class (ADDW_REGS, ...) with calls to
* config/avr/avr.md: Same.
(attr "isa") <tiny, no_tiny>: Remove.
<adiw, no_adiw>: Add.
(define_insn, define_insn_and_split): When an alternative has
constraint "w", then set attribute "isa" to "adiw".
* config/avr/avr-c.cc (avr_cpu_cpp_builtins) [AVR_HAVE_ADIW]:
Built-in define __AVR_HAVE_ADIW__.
* doc/invoke.texi (AVR Options): Document it.

amdgcn: Disallow unsupported permute on RDNA devices

The RDNA architecture has limited support for permute operations. This should
allow use of the permutations that do work, and fall back to linear code for
other cases.

gcc/ChangeLog:

* config/gcn/gcn-valu.md
(vec_extract<V_MOV:mode><V_MOV_ALT:mode>): Add conditions for RDNA.
* config/gcn/gcn.cc (gcn_vectorize_vec_perm_const): Check permutation
details are supported on RDNA devices.

gccrs: Avoid *.bak suffixed tests - use dg-skip-if instead

On Fri, Feb 09, 2024 at 11:03:38AM +0100, Jakub Jelinek wrote:
> On Wed, Feb 07, 2024 at 12:43:59PM +0100, arthur.cohen@embecosm.com wrote:
> > This patch introduces one regression because generics are getting better
> > understood over time. The code here used to apply generics with the same
> > symbol from previous segments which was a bit of a hack with out limited
> > inference variable support. The regression looks like it will be related
> > to another issue which needs to default integer inference variables much
> > more aggresivly to default integer.
> >
> > Fixes #2723
> > * rust/compile/issue-1773.rs: Moved to...
> > * rust/compile/issue-1773.rs.bak: ...here.
>
> Please don't use such suffixes in the testsuite.
> Either delete the testcase, or xfail it somehow until the bug is fixed.

To be precise, I have scripts to look for backup files in the tree (*~,
*.bak, *.orig, *.rej etc.) and this stands in the way several times a day.

Here is a fix for that in patch form, tested on x86_64-linux with
make check-rust RUNTESTFLAGS='compile.exp=issue-1773.rs'

2024-02-15 Jakub Jelinek <jakub@redhat.com>

* rust/compile/issue-1773.rs.bak: Rename to ...
* rust/compile/issue-1773.rs: ... this. Add dg-skip-if directive.

doc: Add documentation of which operand matches the mode of the standard pattern name [PR113508]

In some of the standard pattern names, it is not obvious which mode is being used in the pattern
name. Is it operand 0, 1, or 2? Is it the wider mode or the narrower mode?
This fixes that so there is no confusion by adding a sentence to some of them.

Built the documentation to make sure that it builds.

gcc/ChangeLog:

PR middle-end/113508
* doc/md.texi (sdot_prod@var{m}, udot_prod@var{m},
usdot_prod@var{m}, ssad@var{m}, usad@var{m}, widen_usum@var{m}3,
smulhs@var{m}3, umulhs@var{m}3, smulhrs@var{m}3, umulhrs@var{m}3):
Add sentence about what the mode m is.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

doc: Fix some standard named pattern documentation modes

Currently these use `@var{m3}` but the 3 here is a literal 3
and not part of the mode itself so it should not be inside
the var. Fixed as such.

Built the documentation to make sure it looks correct now.

gcc/ChangeLog:

* doc/md.texi (widen_ssum, widen_usum, smulhs, umulhs,
smulhrs, umulhrs, sdiv_pow2): Move the 3 outside of the
var.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

Do not record dependences from debug stmts in tail merging

The following avoids recording BB dependences for debug stmt uses.

* tree-ssa-tail-merge.cc (same_succ_hash): Skip debug
stmts.

libstdc++: Remove redundant zeroing in std::bitset::operator>>= [PR113806]

The unused bits in the high word are already zero before this operation.
Shifting the used bits to the right cannot affect the unused bits, so we
don't need to sanitize them.

libstdc++-v3/ChangeLog:

PR libstdc++/113806
* include/std/bitset (bitset::operator>>=): Remove redundant
call to _M_do_sanitize.

libstdc++: Use memset to optimize std::bitset::set() [PR113807]

As pointed out in the PR we already do this for reset().

libstdc++-v3/ChangeLog:

PR libstdc++/113807
* include/std/bitset (bitset::set()): Use memset instead of a
loop over the individual words.

libstdc++: Use unsigned division in std::rotate [PR113811]

Signed 64-bit division is much slower than unsigned, so cast the n and
k values to unsigned before doing n %= k. We know this is safe because
neither value can be negative.

libstdc++-v3/ChangeLog:

PR libstdc++/113811
* include/bits/stl_algo.h (__rotate): Use unsigned values for
division.

libstdc++: Avoid aliasing violation in std::valarray [PR99117]

The call to __valarray_copy constructs an _Array object to refer to
this->_M_data but that means that accesses to this->_M_data are through
a restrict-qualified pointer. This leads to undefined behaviour when
copying from an _Expr object that actually aliases this->_M_data.

Replace the call to __valarray_copy with a plain loop. I think this
removes the only use of that overload of __valarray_copy, so it could
probably be removed. I haven't done that here.

libstdc++-v3/ChangeLog:

PR libstdc++/99117
* include/std/valarray (valarray::operator=(const _Expr&)):
Use loop to copy instead of __valarray_copy with _Array.
* testsuite/26_numerics/valarray/99117.cc: New test.

libstdc++: Update tzdata to 2024a

Import the new 2024a tzdata.zi file. The leapseconds file was also
updated to have a new expiry (no new leap seconds were added).

libstdc++-v3/ChangeLog:

* src/c++20/tzdata.zi: Import new file from 2024a release.
* src/c++20/tzdb.cc (tzdb_list::_Node::_S_read_leap_seconds)
Update expiry date for leap seconds list.

libstdc++: Use 128-bit arithmetic for std::linear_congruential_engine [PR87744]

For 32-bit targets without __int128 we need to implement the LCG
transition function by hand using 64-bit types.

We can also slightly simplify the __mod function by using if-constexpr
unconditionally, disabling -Wc++17-extensions warnings with diagnostic
pragmas.

libstdc++-v3/ChangeLog:

PR libstdc++/87744
* include/bits/random.h [!__SIZEOF_INT128__] (_Select_uint_least_t):
Define specialization for 64-bit generators with
non-power-of-two modulus and large constants.
(__mod): Use if constexpr unconditionally.
* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error
line number.
* testsuite/26_numerics/random/linear_congruential_engine/87744.cc:
New test.

testsuite: Fix guality/ipa-sra-1.c to work with return IPA-VRP

The test guality/ipa-sra-1.c stopped working after
r14-5628-g53ba8d669550d3 because the variable from which the values of
removed parameters could be calculated is also removed with it. Fixed
with this patch which stops a function from returning a constant.

I have also noticed that the XFAILed test passes at -O0 -O1 and -Og on
all (three) targets I have tried, not just aarch64, so I extended the
xfail exception accordingly.

gcc/testsuite/ChangeLog:

2024-02-14 Martin Jambor <mjambor@suse.cz>

* gcc.dg/guality/ipa-sra-1.c (get_val1): Move up in the file.
(get_val2): Likewise.
(bar): Do not return a constant. Extend xfail exception for all
targets.

Skip gnat.dg/div_zero.adb on RISC-V

Like AArch64 and POWER, RISC-V does not support trap on zero divide.

gcc/testsuite/
* gnat.dg/div_zero.adb: Skip on RISC-V.

lower-bitint: Ensure we don't get coalescing ICEs for (ab) SSA_NAMEs used in mul/div/mod [PR113567]

The build_bitint_stmt_ssa_conflicts hook has a special case for
multiplication, division and modulo, where to ensure there is no overlap
between lhs and rhs1/rhs2 arrays we make the lhs conflict with the
operands.
On the following testcase, we have
 # a_1(ab) = PHI <a_2(D)(0), a_3(ab)(3)>
lab:
 a_3(ab) = a_1(ab) % 3;
before lowering and this special case causes a_3(ab) and a_1(ab) to
conflict, but the PHI requires them not to conflict, so we ICE because we
can't find some partitioning that will work.

The following patch fixes this by special casing such statements before
the partitioning, force the inputs of the multiplication/division which
have large/huge _BitInt (ab) lhs into new non-(ab) SSA_NAMEs initialized
right before the multiplication/division. This allows the partitioning
to work then, as it has the possibility to use a different partition for
the */% operands.

2024-02-15 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/113567
* gimple-lower-bitint.cc (gimple_lower_bitint): For large/huge
_BitInt multiplication, division or modulo with
SSA_NAME_OCCURS_IN_ABNORMAL_PHI lhs and at least one of rhs1 and rhs2
force the affected inputs into a new SSA_NAME.

* gcc.dg/bitint-90.c: New test.

[libiberty] remove TBAA violation in iterative_hash, improve code-gen

The following removes the TBAA violation present in iterative_hash.
As we eventually LTO that it's important to fix. This also improves
code generation for the >= 12 bytes loop by using | to compose the
4 byte words as at least GCC 7 and up can recognize that pattern
and perform a 4 byte load while the variant with a + is not
recognized (not on trunk either), I think we have an enhancement bug
for this somewhere.

Given we reliably merge and the bogus "optimized" path might be
only relevant for archs that cannot do misaligned loads efficiently
I've chosen to keep a specialization for aligned accesses.

libiberty/
* hashtab.c (iterative_hash): Remove TBAA violating handling
of aligned little-endian case in favor of just keeping the
aligned case special-cased. Use | for composing a larger word.

Daily bump.

Fortran: namelist-object-name renaming.

PR fortran/105847

gcc/fortran/ChangeLog:

* trans-io.cc (transfer_namelist_element): When building the
namelist object name, if the use rename attribute is set, use
the local name specified in the use statement.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr105847.f90: New test.

testsuite: Fix a couple of x86 issues in gcc.dg/vect testsuite

A compile-time test can use -march=skylake-avx512 for all x86 targets,
but a runtime test needs to check avx512f effective target if the
instructions can be assembled.

The runtime test also needs to check if the target machine supports
instruction set we have been compiled for. The testsuite uses check_vect
infrastructure, but handling of AVX512F+ ISAs was missing there.

Add detection of __AVX512F__ and __AVX512VL__, which is enough to handle
all currently mentioned target processors in the gcc.dg/vect testsuite.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr113576.c (dg-additional-options):
Use -march=skylake-avx512 for avx512f effective target.
* gcc.dg/vect/pr98308.c (dg-additional-options):
Use -march=skylake-avx512 for all x86 targets.
* gcc.dg/vect/tree-vect.h (check_vect): Handle __AVX512F__
and __AVX512VL__.

x86: Support x32 and IBT in heap trampoline

Add x32 and IBT support to x86 heap trampoline implementation with a
testcase.

2024-02-13 Jakub Jelinek <jakub@redhat.com>
 H.J. Lu <hjl.tools@gmail.com>

libgcc/

PR target/113855
* config/i386/heap-trampoline.c (trampoline_insns): Add IBT
support and pad to the multiple of 4 bytes. Use movabsq
instead of movabs in comments. Add -mx32 variant.

gcc/testsuite/

PR target/113855
* gcc.dg/heap-trampoline-1.c: New test.
* lib/target-supports.exp (check_effective_target_heap_trampoline):
New.

i386: psrlq is not used for PERM<a,{0},1,2,3,4> [PR113871]

Introduce vec_shl_<mode> and vec_shr_<mode> expanders to improve

'*a = __builtin_shufflevector(*a, (vect64){0}, 1, 2, 3, 4);'

and
'*a = __builtin_shufflevector((vect64){0}, *a, 3, 4, 5, 6);'

shuffles. The generated code improves from:

movzwl 6(%rdi), %eax
movzwl 4(%rdi), %edx
salq $16, %rax
orq %rdx, %rax
movzwl 2(%rdi), %edx
salq $16, %rax
orq %rdx, %rax
movq %rax, (%rdi)

to:
movq (%rdi), %xmm0
psrlq $16, %xmm0
movq %xmm0, (%rdi)

and to:
movq (%rdi), %xmm0
psllq $16, %xmm0
movq %xmm0, (%rdi)

in the second case.

The patch handles 32-bit vectors as well and improves generated code from:

movd (%rdi), %xmm0
pxor %xmm1, %xmm1
punpcklwd %xmm1, %xmm0
pshuflw $230, %xmm0, %xmm0
movd %xmm0, (%rdi)

to:
movd (%rdi), %xmm0
psrld $16, %xmm0
movd %xmm0, (%rdi)

and to:
movd (%rdi), %xmm0
pslld $16, %xmm0
movd %xmm0, (%rdi)

PR target/113871

gcc/ChangeLog:

* config/i386/mmx.md (V248FI): New mode iterator.
(V24FI_32): DItto.
(vec_shl_<V248FI:mode>): New expander.
(vec_shl_<V24FI_32:mode>): Ditto.
(vec_shr_<V248FI:mode>): Ditto.
(vec_shr_<V24FI_32:mode>): Ditto.
* config/i386/sse.md (vec_shl_<V_128:mode>): Simplify expander.
(vec_shr_<V248FI:mode>): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr113871-1a.c: New test.
* gcc.target/i386/pr113871-1b.c: New test.
* gcc.target/i386/pr113871-2a.c: New test.
* gcc.target/i386/pr113871-2b.c: New test.
* gcc.target/i386/pr113871-3a.c: New test.
* gcc.target/i386/pr113871-3b.c: New test.
* gcc.target/i386/pr113871-4a.c: New test.

PR other/113336: Fix libatomic testsuite regressions on ARM.

This patch is a revised version of the fix for PR other/113336.
Bootstrapping GCC on arm-linux-gnueabihf with --with-arch=armv6 currently
has a large number of FAILs in libatomic (regressions since last time I
attempted this). The failure mode is related to IFUNC handling with the
file tas_8_2_.o containing an unresolved reference to the function
libat_test_and_set_1_i2.

The following one line change, to build tas_1_2_.o when building tas_8_2_.o,
resolves the problem for me and restores the libatomic testsuite to 44
expected passes and 5 unsupported tests [from 22 unexpected failures
and 22 unresolved testcases].
`

2024-02-14 Roger Sayle <roger@nextmovesoftware.com>
 Victor Do Nascimento <victor.donascimento@arm.com>

libatomic/ChangeLog
PR other/113336
* Makefile.am: Build tas_1_2_.o on ARCH_ARM_LINUX
* Makefile.in: Regenerate.

c++: Defer emitting inline variables [PR113708]

Inline variables are vague-linkage, and may or may not need to be
emitted in any TU that they are part of, similarly to e.g. template
instantiations.

Currently 'import_export_decl' assumes that inline variables have
already been emitted when it comes to end-of-TU processing, and so
crashes when importing non-trivially-initialised variables from a
module, as they have not yet been finalised.

This patch fixes this by ensuring that inline variables are always
deferred till end-of-TU processing, unifying the behaviour for module
and non-module code.

PR c++/113708

gcc/cp/ChangeLog:

* decl.cc (make_rtl_for_nonlocal_decl): Defer inline variables.
* decl2.cc (import_export_decl): Support inline variables.

gcc/testsuite/ChangeLog:

* g++.dg/debug/dwarf2/inline-var-1.C: Reference 'a' to ensure it
is emitted.
* g++.dg/debug/dwarf2/inline-var-3.C: Likewise.
* g++.dg/modules/init-7_a.H: New test.
* g++.dg/modules/init-7_b.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>

aarch64/testsuite: Remove dg-excess-errors from c-c++-common/gomp/pr63328.c and gcc.dg/gomp/pr87895-2.c [PR113861]

These now pass after r14-6416-gf5fc001a84a7db so let's remove the dg-excess-errors from them.

Committed as obvious after a test for aarch64-linux-gnu.

gcc/testsuite/ChangeLog:

PR testsuite/113861
* c-c++-common/gomp/pr63328.c: Remove dg-excess-errors.
* gcc.dg/gomp/pr87895-2.c: Likewise.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

Fix ICE in loop splitting with -fno-guess-branch-probability

PR tree-optimization/111054

gcc/ChangeLog:

* tree-ssa-loop-split.cc (split_loop): Check for profile being present.

gcc/testsuite/ChangeLog:

* gcc.c-torture/compile/pr111054.c: New test.

middle-end: inspect all exits for additional annotations for loop.

Attaching a pragma to a loop which has a complex condition often gets the pragma
dropped. e.g.

#pragma GCC novector
 while (i < N && parse_tables_n--)

before lowering this is represented as:

if (ANNOTATE_EXPR ) ...

But after lowering the condition is broken appart and attached to the final
component of the expression:

 if (parse_tables_n.2_2 != 0) goto <D.4456>; else goto <D.4453>;
 <D.4456>:
 iftmp.1D.4452 = 1;
 goto <D.4454>;
 <D.4453>:
 iftmp.1D.4452 = 0;
 <D.4454>:
 D.4451 = .ANNOTATE (iftmp.1D.4452, 2, 0);
 if (D.4451 != 0) goto <D.4442>; else goto <D.4440>;
 <D.4440>:

and it's never heard from again because during replace_loop_annotate we only
inspect the loop header and latch for annotations.

Since annotations were supposed to apply to the loop as a whole this fixes it
by checking the loop exit src blocks for annotations instead.

gcc/ChangeLog:

* tree-cfg.cc (replace_loop_annotate): Inspect loop edges for annotations.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-novect_gcond.c: New test.

Fortran: Implement read_x for UTF-8 encoded files.

PR fortran/99210

libgfortran/ChangeLog:

* io/read.c (read_x): If UTF-8 encoding is enabled, use
read_utf8 to move one character over in the read buffer.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr99210.f90: New test.

coreutils-sum-pr108666.c: fix spurious LLP64 warnings

Fixes the following warnings on x86_64-w64-mingw32:
coreutils-sum-pr108666.c:17:1: warning: conflicting types for built-in function ‘memcpy’; expected ‘void *(void *, const void *, long long unsigned int)’ [-Wbuiltin-declaration-mismatch]
 17 | memcpy(void* __restrict __dest, const void* __restrict __src, size_t __n)
 | ^~~~~~

coreutils-sum-pr108666.c:25:1: warning: conflicting types for built-in function ‘malloc’; expected ‘void *(long long unsigned int)’ [-Wbuiltin-declaration-mismatch]
 25 | malloc(size_t __size) __attribute__((__nothrow__, __leaf__))
 | ^~~~~~

gcc/testsuite:

* c-c++-common/analyzer/coreutils-sum-pr108666.c: Use
__SIZE_TYPE__ instead of long unsigned int for size_t
definition.

Signed-off-by: Jonathan Yong <10walls@gmail.com>

c++: synthesized_method_walk context independence [PR113908]

In the second testcase below, during ahead of time checking of the
non-dependent new-expr we synthesize B's copy ctor, which we expect to
get defined as deleted since A's copy ctor is inaccessible. But during
access checking thereof, enforce_access incorrectly decides to defer it
since we're in a template context according to current_template_parms
(before r14-557 it checked processing_template_decl which got cleared
from implicitly_declare_fn), which leads to the access check leaking out
to the template context that triggered the synthesization, and B's copy
ctor getting declared as non-deleted.

This patch fixes this by using maybe_push_to_top_level to clear the
context (including current_template_parms) before proceeding with the
synthesization. We could do this from implicitly_declare_fn, but it's
better to do it more generally from synthesized_method_walk for sake of
its other callers.

This turns out to fix PR113332 as well: there the lambda context
triggering synthesization was causing maybe_dummy_object to misbehave,
but now synthesization is sufficiently context-independent.

PR c++/113908
PR c++/113332

gcc/cp/ChangeLog:

* method.cc (synthesized_method_walk): Use maybe_push_to_top_level.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-nsdmi11.C: New test.
* g++.dg/template/non-dependent31.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>

tree-optimization/113910 - huge compile time during PTA

For the testcase in PR113910 we spend a lot of time in PTA comparing
bitmaps for looking up equivalence class members. This points to
the very weak bitmap_hash function which effectively hashes set
and a subset of not set bits.

The major problem with it is that it simply truncates the
BITMAP_WORD sized intermediate hash to hashval_t which is
unsigned int, effectively not hashing half of the bits.

This reduces the compile-time for the testcase from tens of minutes
to 42 seconds and PTA time from 99% to 46%.

PR tree-optimization/113910
* bitmap.cc (bitmap_hash): Mix the full element "hash" to
the hashval_t hash.

testsuite: gdc: Require ucn in gdc.test/runnable/mangle.d etc. [PR104739]

gdc.test/runnable/mangle.d and two other tests come out UNRESOLVED on
Solaris with the native assembler:

UNRESOLVED: gdc.test/runnable/mangle.d compilation failed to produce executable
UNRESOLVED: gdc.test/runnable/mangle.d -shared-libphobos compilation failed
to produce executable
UNRESOLVED: gdc.test/runnable/testmodule.d compilation failed to produce
executable
UNRESOLVED: gdc.test/runnable/testmodule.d -shared-libphobos compilation
failed to produce executable
UNRESOLVED: gdc.test/runnable/ufcs.d compilation failed to produce executable
UNRESOLVED: gdc.test/runnable/ufcs.d -shared-libphobos compilation failed
to produce executable

Assembler: mangle.d
 "/var/tmp//cci9q2Sc.s", line 115 : Syntax error
 Near line: " movzbl test_эльфийские_письмена_9, %eax"
 "/var/tmp//cci9q2Sc.s", line 115 : Syntax error
 Near line: " movzbl test_эльфийские_письмена_9, %eax"
 "/var/tmp//cci9q2Sc.s", line 115 : Syntax error
 Near line: " movzbl test_эльфийские_письмена_9, %eax"
 "/var/tmp//cci9q2Sc.s", line 115 : Syntax error
 Near line: " movzbl test_эльфийские_письмена_9, %eax"
 "/var/tmp//cci9q2Sc.s", line 115 : Syntax error
[...]

since /bin/as lacks UCN support.

Iain recently added UNICODE_NAMES: annotations to the affected tests and
those recently were imported into trunk.

This patch handles the DejaGnu side of things, adding

{ dg-require-effective-target ucn }

to those tests on the fly.

Tested on i386-pc-solaris2.11, sparc-sun-solaris2.11 (as and gas each),
and x86_64-pc-linux-gnu.

2024-02-03 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
PR d/104739
* lib/gdc-utils.exp (gdc-convert-test) <UNICODE_NAMES>: Require
ucn support.

vect/testsuite: Fix vect-simd-clone-1[02].c when dg-do default is compile [PR113899]

The vect testsuite will chose the dg-do default based on if it knows the
running target does not support running with the vector extensions enabled
(for easy of testing). The problem is when it is decided the default is compile
instead of run, dg-additional-sources does not work. So the fix is to set
dg-do on these two testcases to run explicitly.

Tested on x86_64 with a hack to check_vect_support_and_set_flags to set the dg-default
to compile.

gcc/testsuite/ChangeLog:

PR testsuite/113899
* gcc.dg/vect/vect-simd-clone-10.c: Add `dg-do run`
* gcc.dg/vect/vect-simd-clone-12.c: Likewise.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

testsuite: Add %[zt][diox] tests to gcc.dg/format/

On Mon, Feb 12, 2024 at 04:10:33PM +0000, Joseph Myers wrote:
> Please also add some tests of format checking for these modifiers in
> gcc.dg/format/gcc_*.c.

The following patch does that.

Haven't added tests for bad type (but I think we don't have them in
c99-printf* either) for these because it is hard to figure out what
type from {,unsigned }{int,long,long long} size_t/ptrdiff_t certainly
is not, guess one could do that with preprocessor conditionals, e.g.
comparing __PTRDIFF_MAX__ with __INT_MAX__, __LONG_MAX__ and
__LONG_LONG_MAX__ and pick up the one which is different; but we'd need
to find out corresponding effective targets for the expected diagnostics.

2024-02-14 Jakub Jelinek <jakub@redhat.com>

* gcc.dg/format/gcc_diag-1.c (foo): Add tests for z and t modifiers.
* gcc.dg/format/gcc_gfc-1.c (foo): Add tests for ll, z and t modifiers.

pretty-print: Fix up ptrdiff handling for %tu, %to, %tx

This is IMHO more of a theoretical case, I believe my current code
doesn't handle %tu or %to or %tx correctly if ptrdiff_t is not one of
int, long int or long long int. For size_t and %zd or %zi I'm
using va_arg (whatever, ssize_t) and hope that ssize_t is the signed
type corresponding to size_t which C99 talks about.
For ptrdiff_t there is no type for unsigned type corresponding to
ptrdiff_t and I'm not aware of a portable way on the host to get
such a type (the fmt tests use hacks like
#define signed /* Type might or might not have explicit 'signed'. */
typedef unsigned __PTRDIFF_TYPE__ unsigned_ptrdiff_t;
#undef signed
but that only works with compilers which predefine __PTRDIFF_TYPE__),
std::make_unsigned<ptrdiff_t> I believe only works reliably if
ptrdiff_t is one of char, signed char, short, int, long or long long,
but won't work e.g. for __int20__ or whatever other weird ptrdiff_t
the host might have.

The following patch assumes host is two's complement (I think we
rely on it pretty much everywhere anyway) and prints unsigned type
corresponding to ptrdiff_t as unsigned long long printing of
ptrdiff_t value masked with 2ULL * PTRDIFF_MAX + 1. So the only
actual limitation is that it doesn't support ptrdiff_t wider than
long long int.

2024-02-14 Jakub Jelinek <jakub@redhat.com>

* pretty-print.cc (PTRDIFF_MAX): Define if not yet defined.
(pp_integer_with_precision): For unsigned ptrdiff_t printing
with u, o or x print ptrdiff_t argument converted to
unsigned long long and masked with 2ULL * PTRDIFF_MAX + 1.

* error.cc (error_print): For u printing of ptrdiff_t,
print ptrdiff_t argument converted to unsigned long long and
masked with 2ULL * PTRDIFF_MAX + 1.

middle-end/113576 - zero padding of vector bools when expanding compares

The following zeros paddings of vector bools when expanding compares
and the mode used for the compare is an integer mode. In that case
targets cannot distinguish between a 4 element and 8 element vector
compare (both get to the QImode compare optab) so we have to do the
job in the middle-end.

PR middle-end/113576
* expr.cc (do_store_flag): For vector bool compares of vectors
with padding zero that.
* dojump.cc (do_compare_and_jump): Likewise.

c++: Fix error recovery when redeclaring enum in different module [PR99573]

This ensures that with modules enabled, redeclaring an enum in the wrong
module or with the wrong underlying type no longer ICEs.

The patch also rearranges the order of the checks a little because I
think it's probably more important to note that you can't redeclare the
enum all before complaining about mismatched underlying types etc.

As a drive by this patch also adds some missing diagnostic groups, and
rewords the module redeclaration error message to more closely match the
wording used in other places this check is done.

PR c++/99573

gcc/cp/ChangeLog:

* decl.cc (start_enum): Reorder check for redeclaring in module.
Add missing auto_diagnostic_groups.

gcc/testsuite/ChangeLog:

* g++.dg/modules/enum-12.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

testsuite: i386: Skip gcc.target/i386/pr113689-1.c etc. on Solaris [PR113909]

gcc.target/i386/pr113689-[1-3].c FAIL on 64-bit Solaris/x86:

FAIL: gcc.target/i386/pr113689-1.c (test for excess errors)
UNRESOLVED: gcc.target/i386/pr113689-1.c compilation failed to produce executable
FAIL: gcc.target/i386/pr113689-2.c (test for excess errors)
UNRESOLVED: gcc.target/i386/pr113689-2.c compilation failed to produce executable
FAIL: gcc.target/i386/pr113689-3.c (test for excess errors)
UNRESOLVED: gcc.target/i386/pr113689-3.c compilation failed to produce executable

with

Excess errors:
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/pr113689-1.c:43:1: sorry, unimplemented: no register available for profiling '-mcmodel=large'

Excess errors:
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/pr113689-2.c:26:1: sorry, unimplemented: profiling '-mcmodel=large' with PIC is not supported

Excess errors:
/vol/gcc/src/hg/master/local/gcc/testsuite/gcc.target/i386/pr113689-3.c:15:1: sorry, unimplemented: profiling '-mcmodel=large' with PIC is not supported

This happens because i386/sol2.h doesn't define NO_PROFILE_COUNTERS.

So this patch just skips the tests on Solaris.

Tested on i386-pc-solaris2.11.

2024-02-13 Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
PR target/113909
* gcc.target/i386/pr113689-1.c: Skip on Solaris.
* gcc.target/i386/pr113689-2.c: Likewise.
* gcc.target/i386/pr113689-3.c: Likewise.