thirdparty/gcc.git
18 months ago x86: Don't save callee-saved registers in noreturn functions
H.J. Lu [Tue, 23 Jan 2024 14:59:51 +0000 (06:59 -0800)] 
x86: Don't save callee-saved registers in noreturn functions

There is no need to save callee-saved registers in noreturn functions
if they neither throw nor support exceptions.  We can treat them the same
as functions with no_callee_saved_registers attribute.

Adjust stack-check-17.c for noreturn function which no longer saves any
registers.
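
As a rough illustration of the rule described above, the following C sketch models the decision as a pure function.  It is not the code in ix86_set_func_type: struct func_traits, its fields and classify_call_saved are hypothetical names invented for this example.  Only the two enumerator names are taken from the ChangeLog of the no_callee_saved_registers patch.

```c
#include <stdbool.h>

/* Two of the register-handling kinds named in the ChangeLog.  */
enum call_saved_registers_type
{
  TYPE_DEFAULT_CALL_SAVED_REGISTERS,
  TYPE_NO_CALLEE_SAVED_REGISTERS
};

/* Hypothetical summary of the properties the rule looks at.  */
struct func_traits
{
  bool is_noreturn;        /* declared noreturn */
  bool is_nothrow;         /* declared nothrow */
  bool exceptions_enabled; /* false when built with -fno-exceptions */
};

/* A noreturn function never returns to its caller, so nobody can
   observe its callee-saved registers afterwards, unless an exception
   may unwind through it.  */
enum call_saved_registers_type
classify_call_saved (struct func_traits f)
{
  if (f.is_noreturn && (f.is_nothrow || !f.exceptions_enabled))
    return TYPE_NO_CALLEE_SAVED_REGISTERS;
  return TYPE_DEFAULT_CALL_SAVED_REGISTERS;
}
```

Under this model a noreturn function that may throw, in a translation unit with exceptions enabled, keeps the default handling.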

With this change, __libc_start_main in glibc 2.39, which is a noreturn
function, is changed from

__libc_start_main:
        endbr64
        push   %r15
        push   %r14
        mov    %rcx,%r14
        push   %r13
        push   %r12
        push   %rbp
        mov    %esi,%ebp
        push   %rbx
        mov    %rdx,%rbx
        sub    $0x28,%rsp
        mov    %rdi,(%rsp)
        mov    %fs:0x28,%rax
        mov    %rax,0x18(%rsp)
        xor    %eax,%eax
        test   %r9,%r9

to

__libc_start_main:
        endbr64
        sub    $0x28,%rsp
        mov    %esi,%ebp
        mov    %rdx,%rbx
        mov    %rcx,%r14
        mov    %rdi,(%rsp)
        mov    %fs:0x28,%rax
        mov    %rax,0x18(%rsp)
        xor    %eax,%eax
        test   %r9,%r9

In Linux kernel 6.7.0 on x86-64, do_exit is changed from

do_exit:
        endbr64
        call   <do_exit+0x9>
        push   %r15
        push   %r14
        push   %r13
        push   %r12
        mov    %rdi,%r12
        push   %rbp
        push   %rbx
        mov    %gs:0x0,%rbx
        sub    $0x28,%rsp
        mov    %gs:0x28,%rax
        mov    %rax,0x20(%rsp)
        xor    %eax,%eax
        call   *0x0(%rip)        # <do_exit+0x39>
        test   $0x2,%ah
        je     <do_exit+0x8d3>

to

do_exit:
        endbr64
        call   <do_exit+0x9>
        sub    $0x28,%rsp
        mov    %rdi,%r12
        mov    %gs:0x28,%rax
        mov    %rax,0x20(%rsp)
        xor    %eax,%eax
        mov    %gs:0x0,%rbx
        call   *0x0(%rip)        # <do_exit+0x2f>
        test   $0x2,%ah
        je     <do_exit+0x8c9>

I compared GCC master branch bootstrap and test times on a slow machine
with 6.6 Linux kernels compiled with the original GCC 13 and the GCC 13
with the backported patch.  The performance data isn't precise since the
measurements were done on different days with different GCC sources under
different 6.6 kernel versions.

GCC master branch build time in seconds:

before                after                  improvement
30043.75user          30013.16user           0%
1274.85system         1243.72system          2.4%

GCC master branch test time in seconds (new tests added):

before                after                  improvement
216035.90user         216547.51user          0%
27365.51system        26658.54system         2.6%
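
The improvement column can be recomputed from the before and after times.  The helper below is only a sketch of that arithmetic; improvement_pct is a hypothetical name and is not part of the patch.

```c
/* Relative improvement in percent.  A positive result means the
   "after" time is smaller than the "before" time.  */
double
improvement_pct (double before, double after)
{
  return (before - after) / before * 100.0;
}
```

For the system-time rows above this gives roughly 2.4 and 2.6.  For the user-time row of the test run it gives a small negative value, about -0.2, which the table rounds to zero.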

gcc/

PR target/38534
* config/i386/i386-options.cc (ix86_set_func_type): Don't
save and restore callee saved registers for a noreturn function
with nothrow or compiled with -fno-exceptions.

gcc/testsuite/

PR target/38534
* gcc.target/i386/pr38534-1.c: New file.
* gcc.target/i386/pr38534-2.c: Likewise.
* gcc.target/i386/pr38534-3.c: Likewise.
* gcc.target/i386/pr38534-4.c: Likewise.
* gcc.target/i386/stack-check-17.c: Updated.

18 months ago x86: Add no_callee_saved_registers function attribute
H.J. Lu [Tue, 23 Jan 2024 14:59:50 +0000 (06:59 -0800)] 
x86: Add no_callee_saved_registers function attribute

When an interrupt handler is implemented by an assembly stub which does:

1. Save all registers.
2. Call a C function.
3. Restore all registers.
4. Return from interrupt.

it is completely unnecessary to save and restore any registers in the C
function called by the assembly stub, even if they would normally be
callee-saved.

Add no_callee_saved_registers function attribute, which is complementary
to no_caller_saved_registers function attribute, to mark a function which
doesn't have any callee-saved registers.  Such a function won't save and
restore any registers.  Classify function call-saved register handling
type with:

1. Default call-saved registers.
2. No caller-saved registers with no_caller_saved_registers attribute.
3. No callee-saved registers with no_callee_saved_registers attribute.
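
A minimal sketch of how a C function called from such an assembly stub might be declared.  handle_irq, irq_total and irq_total_seen are hypothetical names.  The attribute needs a GCC that includes this patch and an x86 target, so the sketch falls back to a plain function elsewhere.

```c
/* Use the attribute only where the compiler knows it.  */
#if defined(__has_attribute)
# if __has_attribute(no_callee_saved_registers)
#  define NO_CALLEE_SAVED __attribute__((no_callee_saved_registers))
# endif
#endif
#ifndef NO_CALLEE_SAVED
# define NO_CALLEE_SAVED
#endif

/* Hypothetical state updated by the handler.  */
static volatile unsigned long irq_total;

/* Body of an interrupt handler.  The assembly stub is assumed to have
   saved every register already, so this function need not save any.  */
NO_CALLEE_SAVED void
handle_irq (unsigned long vector)
{
  irq_total += vector;
}

/* Accessor so that callers can observe the effect of the handler.  */
unsigned long
irq_total_seen (void)
{
  return irq_total;
}
```

An ordinary C function may still call handle_irq directly; as the commit message explains, the caller is then responsible for preserving the registers it needs across the call.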

Disallow sibcall if callee is a no_callee_saved_registers function
and caller isn't a no_callee_saved_registers function.  Otherwise,
callee-saved registers won't be preserved.

After a no_callee_saved_registers function is called, all registers may
be clobbered.  If the calling function isn't a no_callee_saved_registers
function, we need to preserve all registers which aren't used by function
calls.
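
The sibcall restriction can be sketched as a small predicate.  This is a simplified model and not the code in ix86_function_ok_for_sibcall, which checks many other conditions.  sibcall_allowed is a hypothetical name, and the enumerator TYPE_NO_CALLER_SAVED_REGISTERS is assumed by analogy with the two names that appear in the ChangeLog.

```c
#include <stdbool.h>

/* The three handling kinds listed above.  */
enum call_saved_registers_type
{
  TYPE_DEFAULT_CALL_SAVED_REGISTERS,
  TYPE_NO_CALLER_SAVED_REGISTERS, /* assumed name */
  TYPE_NO_CALLEE_SAVED_REGISTERS
};

/* A sibcall jumps to the callee without returning to the caller, so
   the caller gets no chance to restore registers the callee clobbered.
   That is only safe if the caller itself promises nothing about
   callee-saved registers.  */
bool
sibcall_allowed (enum call_saved_registers_type caller,
                 enum call_saved_registers_type callee)
{
  if (callee == TYPE_NO_CALLEE_SAVED_REGISTERS
      && caller != TYPE_NO_CALLEE_SAVED_REGISTERS)
    return false;
  return true;
}
```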

gcc/

PR target/103503
PR target/113312
* config/i386/i386-expand.cc (ix86_expand_call): Replace
no_caller_saved_registers check with call_saved_registers check.
Clobber all registers that are not used by the callee with
no_callee_saved_registers attribute.
* config/i386/i386-options.cc (ix86_set_func_type): Set
call_saved_registers to TYPE_NO_CALLEE_SAVED_REGISTERS for
noreturn function.  Disallow no_callee_saved_registers with
interrupt or no_caller_saved_registers attributes together.
(ix86_set_current_function): Replace no_caller_saved_registers
check with call_saved_registers check.
(ix86_handle_no_caller_saved_registers_attribute): Renamed to ...
(ix86_handle_call_saved_registers_attribute): This.
(ix86_gnu_attributes): Add
ix86_handle_call_saved_registers_attribute.
* config/i386/i386.cc (ix86_conditional_register_usage): Replace
no_caller_saved_registers check with call_saved_registers check.
(ix86_function_ok_for_sibcall): Don't allow callee with
no_callee_saved_registers attribute when the calling function
has callee-saved registers.
(ix86_comp_type_attributes): Also check
no_callee_saved_registers.
(ix86_epilogue_uses): Replace no_caller_saved_registers check
with call_saved_registers check.
(ix86_hard_regno_scratch_ok): Likewise.
(ix86_save_reg): Replace no_caller_saved_registers check with
call_saved_registers check.  Don't save any registers for
TYPE_NO_CALLEE_SAVED_REGISTERS.  Save all registers with
TYPE_DEFAULT_CALL_SAVED_REGISTERS if function with
no_callee_saved_registers attribute is called.
(find_drap_reg): Replace no_caller_saved_registers check with
call_saved_registers check.
* config/i386/i386.h (call_saved_registers_type): New enum.
(machine_function): Replace no_caller_saved_registers with
call_saved_registers.
* doc/extend.texi: Document no_callee_saved_registers attribute.

gcc/testsuite/

PR target/103503
PR target/113312
* gcc.dg/torture/no-callee-saved-run-1a.c: New file.
* gcc.dg/torture/no-callee-saved-run-1b.c: Likewise.
* gcc.target/i386/no-callee-saved-1.c: Likewise.
* gcc.target/i386/no-callee-saved-2.c: Likewise.
* gcc.target/i386/no-callee-saved-3.c: Likewise.
* gcc.target/i386/no-callee-saved-4.c: Likewise.
* gcc.target/i386/no-callee-saved-5.c: Likewise.
* gcc.target/i386/no-callee-saved-6.c: Likewise.
* gcc.target/i386/no-callee-saved-7.c: Likewise.
* gcc.target/i386/no-callee-saved-8.c: Likewise.
* gcc.target/i386/no-callee-saved-9.c: Likewise.
* gcc.target/i386/no-callee-saved-10.c: Likewise.
* gcc.target/i386/no-callee-saved-11.c: Likewise.
* gcc.target/i386/no-callee-saved-12.c: Likewise.
* gcc.target/i386/no-callee-saved-13.c: Likewise.
* gcc.target/i386/no-callee-saved-14.c: Likewise.
* gcc.target/i386/no-callee-saved-15.c: Likewise.
* gcc.target/i386/no-callee-saved-16.c: Likewise.
* gcc.target/i386/no-callee-saved-17.c: Likewise.
* gcc.target/i386/no-callee-saved-18.c: Likewise.

18 months ago lower-bitint: Avoid sign-extending cast to unsigned types feeding div/mod/float ...
Jakub Jelinek [Sat, 27 Jan 2024 12:06:55 +0000 (13:06 +0100)] 
lower-bitint: Avoid sign-extending cast to unsigned types feeding div/mod/float [PR113614]

The following testcase is miscompiled, because some narrower value
is sign-extended to wider unsigned _BitInt used as division operand.
handle_operand_addr for that case returns the narrower value and
precision -prec_of_narrower_value.  That works fine for multiplication
(at least, normal multiplication, but we don't merge casts with
.MUL_OVERFLOW or the ubsan multiplication right now), because the
result is the same whether we treat the arguments as signed or unsigned.
But it is completely wrong for division/modulo or conversions to
floating-point: if we pass negative prec for an input operand of a libgcc
handler, the handler treats it as a negative number, not as an unsigned one
sign-extended from something smaller (and it doesn't know to what precision
it has been extended).

So, the following patch fixes it by making sure we don't merge such
sign-extensions to unsigned _BitInt type with division, modulo or
conversions to floating point.
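
The difference between multiplication and division can be seen with ordinary fixed-width integers.  The sketch below uses int8_t and uint32_t as stand-ins for the narrower value and the wider unsigned _BitInt; all function names are hypothetical and nothing here is taken from gimple-lower-bitint.cc.

```c
#include <stdint.h>

/* Sign-extend a narrow signed value into a wider unsigned type.  */
uint32_t
widen_signed (int8_t v)
{
  return (uint32_t) (int32_t) v;
}

/* Multiplication: the low 32 bits agree whether the operands are
   viewed as sign-extended unsigned values or as signed values.  */
uint32_t
mul_as_unsigned (int8_t a, int8_t b)
{
  return widen_signed (a) * widen_signed (b);
}

uint32_t
mul_as_signed (int8_t a, int8_t b)
{
  return (uint32_t) ((int32_t) a * (int32_t) b);
}

/* Division: the two views give different results for negative inputs.
   The divisor must be nonzero.  */
uint32_t
div_as_unsigned (int8_t a, int8_t b)
{
  return widen_signed (a) / widen_signed (b);
}

uint32_t
div_as_signed (int8_t a, int8_t b)
{
  return (uint32_t) ((int32_t) a / (int32_t) b);
}
```

With a = -2 and b = 3 both multiplications yield the same bit pattern, while the unsigned division yields 1431655764 and the signed one yields 0.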

2024-01-27  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/113614
* gimple-lower-bitint.cc (gimple_lower_bitint): Don't merge
widening casts from signed to unsigned types with TRUNC_DIV_EXPR,
TRUNC_MOD_EXPR or FLOAT_EXPR uses.

* gcc.dg/torture/bitint-54.c: New test.

18 months ago lower-bitint: Fix up VIEW_CONVERT_EXPR handling in lower_mergeable_stmt [PR113568]
Jakub Jelinek [Sat, 27 Jan 2024 12:06:17 +0000 (13:06 +0100)] 
lower-bitint: Fix up VIEW_CONVERT_EXPR handling in lower_mergeable_stmt [PR113568]

We generally allow merging mergeable stmts with some final cast (but not
further casts or mergeable operations after the cast).  As some casts
are handled conditionally, if (idx < cst) handle_operand (idx); else if
(idx == cst) handle_operand (cst); else ..., we must make sure that e.g. the
mergeable PLUS_EXPR/MINUS_EXPR/NEGATE_EXPR never appear in handle_operand
called from such casts, because it ICEs on invalid SSA_NAME form (that part
could be fixable by adding further PHIs) but also because we'd need to
correctly propagate the overflow flags from the if to else if.
So, instead lower_mergeable_stmt handles an outermost widening cast (or
widening cast feeding outermost store) specially.
The problem was similar to PR113408, that VIEW_CONVERT_EXPR tree is
present in the gimple_assign_rhs1 while it is not for NOP_EXPR/CONVERT_EXPR,
so the checks whether the outermost cast should be handled didn't handle
the VCE case and so handle_plus_minus was called from the conditional
handle_cast.

2024-01-27  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/113568
* gimple-lower-bitint.cc (bitint_large_huge::lower_mergeable_stmt):
For VIEW_CONVERT_EXPR use first operand of rhs1 instead of rhs1
in the widening extension checks.

* gcc.dg/bitint-78.c: New test.

18 months ago lower-bitint: Add debugging dump of SSA_NAME -> decl mappings
Jakub Jelinek [Sat, 27 Jan 2024 12:05:30 +0000 (13:05 +0100)] 
lower-bitint: Add debugging dump of SSA_NAME -> decl mappings

While the SSA coalescing performed by lower bitint prints some information
if -fdump-tree-bitintlower-details, it is really hard to read and doesn't
contain the most important information which one looks for when debugging
bitint lowering issues, namely what VAR_DECLs (or PARM_DECLs/RESULT_DECLs)
each SSA_NAME in large_huge.m_names bitmap maps to.

So, the following patch adds dumping of that, so that we know that say
_3 -> bitint.3
_8 -> bitint.7
_16 -> bitint.7
etc.

2024-01-27  Jakub Jelinek  <jakub@redhat.com>

* gimple-lower-bitint.cc (gimple_lower_bitint): For
TDF_DETAILS dump mapping of SSA_NAMEs to decls.

18 months ago c-family: Fix ICE with large column number after restoring a PCH [PR105608]
Lewis Hyatt [Tue, 5 Dec 2023 16:33:39 +0000 (11:33 -0500)] 
c-family: Fix ICE with large column number after restoring a PCH [PR105608]

Users are allowed to define macros prior to restoring a precompiled header
file, as long as those macros are not defined (or are defined identically)
in the PCH.  However, the PCH restoration process destroys all the macro
definitions, so libcpp has to record them before restoring the PCH and then
redefine them afterward.

This process does not currently assign great locations to the macros after
redefining them. Some work is needed to also remember the original locations
and get the line_maps instance in the right state (since, like all other
data structures, the line_maps instance is also reset after restoring a PCH).
The new testcase line-map-3.C contains XFAILed examples where the locations
are wrong.

This patch addresses a more pressing issue, which is that we ICE in some
cases since GCC 11, hitting an assert in line-maps.cc. It happens if the
first line encountered after the PCH restore requires an LC_RENAME map, such
as will happen if the line is sufficiently long.  This is much easier to
fix, since we just need to call linemap_line_start before asking libcpp to
redefine the stored macros, instead of afterward, to avoid the unexpected
need for an LC_RENAME before an LC_ENTER has been seen.

gcc/c-family/ChangeLog:

PR preprocessor/105608
* c-pch.cc (c_common_read_pch): Start a new line map before asking
libcpp to restore macros defined prior to reading the PCH, instead
of afterward.

gcc/testsuite/ChangeLog:

PR preprocessor/105608
* g++.dg/pch/line-map-1.C: New test.
* g++.dg/pch/line-map-1.Hs: New test.
* g++.dg/pch/line-map-2.C: New test.
* g++.dg/pch/line-map-2.Hs: New test.
* g++.dg/pch/line-map-3.C: New test.
* g++.dg/pch/line-map-3.Hs: New test.

18 months ago Daily bump.
GCC Administrator [Sat, 27 Jan 2024 00:18:16 +0000 (00:18 +0000)] 
Daily bump.

18 months ago c/c++: Tweak warning for 'always_inline function might not be inlinable'
Hans-Peter Nilsson [Fri, 26 Jan 2024 23:55:01 +0000 (00:55 +0100)] 
c/c++: Tweak warning for 'always_inline function might not be inlinable'

When you're not regularly exposed to this warning, it is
easy to be misled by its wording, believing that there's
something else in the function that stops it from being
inlined, something other than the lack of also being
*declared* inline.  Also, clang does not warn.

It's just a warning: without the inline directive, there has
to be a secondary reason for the function to be inlined,
other than the always_inline attribute, a reason that may be
in effect despite the warning.

Whenever the text is quoted in inline-related bugzilla
entries, there seems to often have been an initial step of
confusion that has to be cleared, for example in PR55830.
A file in the powerpc-specific parts of the test-suite,
gcc.target/powerpc/vec-extract-v16qiu-v2.h, has a comment
and seems to be another example, and I testify as the
first-hand third "experience".  The wording has been the
same since the warning was added.

Let's just tweak the wording, adding the cause, so that the
reason for the warning is clearer.  This hopefully stops the
user from immediately asking "'Might'?  Because why?"  and
then going off looking at the function body - or grepping
the gcc source or documentation, or entering a bug-report
subsequently closed as resolved/invalid.

Since the message is only appended with additional
information, no test-case actually required adjustment.
I still changed them, so the message is covered.
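
A small sketch of the situation the warning is about.  add_one and add_two are hypothetical functions; the first carries the attribute without being declared inline, which is the case GCC diagnoses under -Wattributes, while the second is declared inline as well.

```c
/* always_inline without 'inline': this is the case the warning
   targets, because nothing but the attribute asks for inlining.  */
__attribute__((always_inline)) int
add_one (int x)
{
  return x + 1;
}

/* Declared inline as well as always_inline: the warning does not
   apply here.  */
static inline __attribute__((always_inline)) int
add_two (int x)
{
  return x + 2;
}
```

Both functions behave identically at run time; only the diagnostic differs.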

gcc:
* cgraphunit.cc (process_function_and_variable_attributes): Tweak
the warning for an attribute-always_inline without inline declaration.

gcc/testsuite:
* g++.dg/Wattributes-3.C: Adjust expected warning.
* gcc.dg/fail_always_inline.c: Ditto.

18 months ago c++: Stream additional fields for DECL_STRUCT_FUNCTION [PR113580]
Nathaniel Shead [Fri, 26 Jan 2024 05:55:52 +0000 (16:55 +1100)] 
c++: Stream additional fields for DECL_STRUCT_FUNCTION [PR113580]

Currently the DECL_STRUCT_FUNCTION for a declaration is always
reconstructed from scratch. This causes issues though, as some fields
used by other parts of the compiler (in this case, specifically
'function_{start,end}_locus') are then not correctly initialised. This
patch makes sure that these fields are also read and written.

PR c++/113580

gcc/cp/ChangeLog:

* module.cc (struct post_process_data): Create.
(trees_in::post_decls): Use.
(trees_in::post_process): Return entire vector at once.
Change overload to take post_process_data instead of tree.
(trees_out::write_function_def): Write needed flags from
DECL_STRUCT_FUNCTION.
(trees_in::read_function_def): Read them and pass to
post_process.
(module_state::read_cluster): Write flags into cfun.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr113580_a.C: New test.
* g++.dg/modules/pr113580_b.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>

18 months ago RISC-V/testsuite: Add RTL cset-sext.c testcase variants
Maciej W. Rozycki [Fri, 26 Jan 2024 21:47:40 +0000 (21:47 +0000)] 
RISC-V/testsuite: Add RTL cset-sext.c testcase variants

Add RTL tests, for RV64 and RV32 where appropriate, corresponding to the
existing cset-sext.c tests.  They have been produced from RTL code as at
the entry of the "ce1" pass for the respective cset-sext.c tests built
at -O3.

gcc/testsuite/
* gcc.target/riscv/cset-sext-rtl.c: New file.
* gcc.target/riscv/cset-sext-rtl32.c: New file.
* gcc.target/riscv/cset-sext-sfb-rtl.c: New file.
* gcc.target/riscv/cset-sext-sfb-rtl32.c: New file.
* gcc.target/riscv/cset-sext-thead-rtl.c: New file.
* gcc.target/riscv/cset-sext-ventana-rtl.c: New file.
* gcc.target/riscv/cset-sext-zicond-rtl.c: New file.
* gcc.target/riscv/cset-sext-zicond-rtl32.c: New file.

18 months ago RISC-V/testsuite: Add RTL pr105314.c testcase variants
Maciej W. Rozycki [Fri, 26 Jan 2024 21:47:40 +0000 (21:47 +0000)] 
RISC-V/testsuite: Add RTL pr105314.c testcase variants

Add a pair of RTL tests, for RV64 and RV32 respectively, corresponding
to the existing pr105314.c test.  They have been produced from RTL code
as at the entry of the "ce1" pass for pr105314.c compiled at -O3.

gcc/testsuite/
* gcc.target/riscv/pr105314-rtl.c: New file.
* gcc.target/riscv/pr105314-rtl32.c: New file.

18 months ago RISC-V/testsuite: Also verify if-conversion runs for pr105314.c
Maciej W. Rozycki [Fri, 26 Jan 2024 21:47:40 +0000 (21:47 +0000)] 
RISC-V/testsuite: Also verify if-conversion runs for pr105314.c

Verify that if-conversion succeeded through noce_try_store_flag_mask, as
per PR rtl-optimization/105314, tightening the test case and making it
explicit.

gcc/testsuite/
* gcc.target/riscv/pr105314.c: Scan the RTL "ce1" pass too.

18 months ago RISC-V/testsuite: Widen coverage for pr105314.c
Maciej W. Rozycki [Fri, 26 Jan 2024 21:47:40 +0000 (21:47 +0000)] 
RISC-V/testsuite: Widen coverage for pr105314.c

The optimization levels pr105314.c is iterated over are needlessly
overridden with "-O2", limiting the coverage of the test case to that
level, perhaps with additional options the original optimization level
has been supplied with.  We could prevent the extra iterations other
than "-O2" from being run, but the transformation made by if-conversion
is also expected to happen at other optimization levels, so include them
all, and also make sure no reverse-condition branch appears in output,
moving the `dg-final' command to the bottom, as with most test cases.

gcc/testsuite/
* gcc.target/riscv/pr105314.c: Replace `dg-options' command with
`dg-skip-if'.  Also reject "bne" with `dg-final'.

18 months ago genopinit: Split init_all_optabs [PR113575].
Robin Dapp [Wed, 24 Jan 2024 16:28:31 +0000 (17:28 +0100)] 
genopinit: Split init_all_optabs [PR113575].

init_all_optabs initializes > 10000 patterns for riscv targets.  This
leads to pathological situations in dataflow analysis (which can occur
with many adjacent stores).
To alleviate this, this patch makes genopinit split the init_all_optabs
function into several init_optabs_xx functions that each initialize 1000
patterns.
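
The splitting scheme can be sketched as a tiny generator.  This is not genopinit's code: chunk_count and emit_init_all_optabs are hypothetical helpers, the emitted function signatures are simplified to take no arguments, and the two-digit numbering of the init_optabs_xx functions is an assumption.

```c
#include <stddef.h>
#include <stdio.h>

/* Number of chunk functions needed for n_patterns patterns when each
   function initializes at most per_function of them.  */
size_t
chunk_count (size_t n_patterns, size_t per_function)
{
  return (n_patterns + per_function - 1) / per_function;
}

/* Emit a driver that calls each chunk function in turn and return the
   number of calls emitted.  */
size_t
emit_init_all_optabs (FILE *out, size_t n_patterns, size_t per_function)
{
  size_t n = chunk_count (n_patterns, per_function);
  fprintf (out, "void\ninit_all_optabs (void)\n{\n");
  for (size_t i = 0; i < n; i++)
    fprintf (out, "  init_optabs_%02zu ();\n", i);
  fprintf (out, "}\n");
  return n;
}
```

Keeping each generated function small bounds the number of adjacent stores the compiler has to analyze within a single function.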

With this change insn-opinit.cc's compilation time is reduced from 4+
minutes to 1:30 and memory consumption decreases from 1.2G to 630M.

gcc/ChangeLog:

PR other/113575

* genopinit.cc (main): Split init_all_optabs into functions
of 1000 patterns each.

18 months ago modula2: detect string and pointer formal and actual parameter incompatibility
Gaius Mulley [Fri, 26 Jan 2024 19:04:48 +0000 (19:04 +0000)] 
modula2: detect string and pointer formal and actual parameter incompatibility

This patch improves the location accuracy of parameters and fixes bugs
in parameter checking in M2Check.  It also corrects the location
of constant declarations.

gcc/m2/ChangeLog:

* gm2-compiler/M2Check.mod (dumpIndice): New procedure.
(dumpIndex): New procedure.
(dumptInfo): New procedure.
(buildError4): Add comment and pass formal and actual to
MetaError4.  Improve text describing error.
(buildError2): Generate different error descriptions for
the three error kinds.
(checkConstMeta): Add block comment.  Add more meta checks
and call doCheckPair to complete string const checking.
Add tinfo parameter.
(checkConstEquivalence): Add tinfo parameter.
* gm2-compiler/M2GCCDeclare.mod (PrintVerboseFromList):
Print the length of a const string.
* gm2-compiler/M2GenGCC.mod (CodeParam): Remove parameters
op1, op2 and op3.
(doParam): Add paramtok parameter.  Use paramtok rather
than CurrentQuadToken.
(CodeParam): Rewrite.
* gm2-compiler/M2Quads.mod (CheckProcedureParameters):
Add comments explaining that const strings are not checked
in M2Quads.mod.
(FailParameter): Use MetaErrorT2 with tokpos rather than
MetaError2.
(doBuildBinaryOp): Assign OldPos and OperatorPos before the
IF block.
* gm2-compiler/SymbolTable.mod (PutConstString): Add call to
InitWhereDeclaredTok.

gcc/testsuite/ChangeLog:

* gm2/pim/fail/badpointer4.mod: New test.
* gm2/pim/fail/strconst.def: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>

18 months ago Avoid registering unsupported OMP offload devices
Richard Biener [Fri, 26 Jan 2024 11:57:10 +0000 (12:57 +0100)] 
Avoid registering unsupported OMP offload devices

The following avoids registering unsupported GCN offload devices
when iterating over available ones.  With a Zen4 desktop CPU
you will have an IGPU (unsupported) which will otherwise be made
available.  This causes testcases like
libgomp.c-c++-common/non-rect-loop-1.c which iterate over all
devices to FAIL.

libgomp/
* plugin/plugin-gcn.c (suitable_hsa_agent_p): Filter out
agents with unsupported ISA.

18 months ago Fix architecture support in OMP_OFFLOAD_init_device for gcn
Richard Biener [Fri, 26 Jan 2024 11:35:57 +0000 (12:35 +0100)] 
Fix architecture support in OMP_OFFLOAD_init_device for gcn

The following makes the existing architecture support check work
instead of being optimized away (enum vs. -1).  This avoids
later asserts when we assume such devices are never actually
used.
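
One way such a check can vanish: when every enumerator is non-negative, GCC normally gives the enum an unsigned underlying type, so a test such as "code < 0" can be folded to false.  The exact form of the original check is an assumption here.  The sketch below shows the shape of the fix with hypothetical names; EXAMPLE_ISA_UNSUPPORTED plays the role of EF_AMDGPU_MACH_UNSUPPORTED, and the ISA names and values are placeholders.

```c
#include <string.h>

/* Hypothetical ISA codes with an explicit "unsupported" member.  */
typedef enum
{
  EXAMPLE_ISA_UNSUPPORTED = 0,
  EXAMPLE_ISA_ALPHA = 1,
  EXAMPLE_ISA_BETA = 2
} example_isa;

/* Map a name to an ISA code.  Unknown names yield the dedicated
   enumerator instead of -1, so callers compare against a value that
   is part of the enum.  */
example_isa
example_isa_code (const char *name)
{
  if (strcmp (name, "alpha") == 0)
    return EXAMPLE_ISA_ALPHA;
  if (strcmp (name, "beta") == 0)
    return EXAMPLE_ISA_BETA;
  return EXAMPLE_ISA_UNSUPPORTED;
}

/* Nonzero if the named ISA is supported.  */
int
example_isa_supported (const char *name)
{
  return example_isa_code (name) != EXAMPLE_ISA_UNSUPPORTED;
}
```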

libgomp/
* plugin/plugin-gcn.c
(EF_AMDGPU_MACH::EF_AMDGPU_MACH_UNSUPPORTED): Add.
(isa_code): Return that instead of -1.
(GOMP_OFFLOAD_init_device): Adjust.

18 months ago amdgcn: config.gcc - enable gfx1030 and gfx1100 multilib; add them to the docs
Tobias Burnus [Fri, 26 Jan 2024 14:11:09 +0000 (15:11 +0100)] 
amdgcn: config.gcc - enable gfx1030 and gfx1100 multilib; add them to the docs

gcc/ChangeLog:

* config.gcc (amdgcn-*-*): Add gfx1030 and gfx1100 to
TM_MULTILIB_CONFIG.
* doc/install.texi (Configuration amdgcn-*-*): Mention gfx1030/gfx1100.
* doc/invoke.texi (AMD GCN Options): Add gfx1030 and gfx1100 to
-march/-mtune.

libgomp/ChangeLog:

* testsuite/libgomp.c/declare-variant-4.h: Add variant functions
for gfx1030 and gfx1100.
* testsuite/libgomp.c/declare-variant-4-gfx1030.c: New test.
* testsuite/libgomp.c/declare-variant-4-gfx1100.c: New test.

Signed-off-by: Tobias Burnus <tburnus@baylibre.com>

18 months ago amdgcn: additional gfx1030/gfx1100 support
Andrew Stubbs [Wed, 24 Jan 2024 11:07:28 +0000 (11:07 +0000)] 
amdgcn: additional gfx1030/gfx1100 support

This is enough to get gfx1030 and gfx1100 working; there are still some test
failures to investigate, and probably some tuning to do.

gcc/ChangeLog:

* config/gcn/gcn-opts.h (TARGET_PACKED_WORK_ITEMS): Add TARGET_RDNA3.
* config/gcn/gcn-valu.md (all_convert): New iterator.
(<convop><V_INT_1REG_ALT:mode><V_INT_1REG:mode>2<exec>): New
define_expand, and rename the old one to ...
(*<convop><V_INT_1REG_ALT:mode><V_INT_1REG:mode>_sdwa<exec>): ... this.
(extend<V_INT_1REG_ALT:mode><V_INT_1REG:mode>2<exec>): Likewise, to ...
(extend<V_INT_1REG_ALT:mode><V_INT_1REG:mode>_sdwa<exec>): ... this.
(*<convop><V_INT_1REG_ALT:mode><V_INT_1REG:mode>_shift<exec>): New.
* config/gcn/gcn.cc (gcn_global_address_p): Use "offsetbits" correctly.
(gcn_hsa_declare_function_name): Update the vgpr counting for gfx1100.
* config/gcn/gcn.md (<u>mulhisi3): Disable on RDNA3.
(<u>mulqihi3_scalar): Likewise.

libgcc/ChangeLog:

* config/gcn/amdgcn_veclib.h (CDNA3_PLUS): Handle RDNA3.

libgomp/ChangeLog:

* config/gcn/time.c (RTC_TICKS): Configure RDNA3.
(omp_get_wtime): Add RDNA3-compatible variant.
* plugin/plugin-gcn.c (max_isa_vgprs): Tune for gfx1030 and gfx1100.

Signed-off-by: Andrew Stubbs <ams@baylibre.com>

18 months ago c++: Emit definitions of ODR-used static members imported from modules [PR112899]
Nathaniel Shead [Tue, 2 Jan 2024 22:27:06 +0000 (09:27 +1100)] 
c++: Emit definitions of ODR-used static members imported from modules [PR112899]

Static data members marked 'inline' should be emitted in TUs where they
are ODR-used.  We need to make sure that inlines imported from modules
are correctly added to the 'pending_statics' map so that they get
emitted if needed, otherwise the attached testcase fails to link.

PR c++/112899

gcc/cp/ChangeLog:

* cp-tree.h (note_variable_template_instantiation): Rename to...
(note_vague_linkage_variable): ...this.
* decl2.cc (note_variable_template_instantiation): Rename to...
(note_vague_linkage_variable): ...this.
* pt.cc (instantiate_decl): Rename usage of above function.
* module.cc (trees_in::read_var_def): Remember pending statics
that we stream in.

gcc/testsuite/ChangeLog:

* g++.dg/modules/init-4_a.C: New test.
* g++.dg/modules/init-4_b.C: New test.
* g++.dg/modules/init-6_a.H: New test.
* g++.dg/modules/init-6_b.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Patrick Palka <ppalka@redhat.com>
Reviewed-by: Jason Merrill <jason@redhat.com>

18 months ago tree-optimization/113602 - datarefs of non-addressables
Richard Biener [Fri, 26 Jan 2024 08:29:22 +0000 (09:29 +0100)] 
tree-optimization/113602 - datarefs of non-addressables

We can end up creating ADDR_EXPRs of non-addressable entities during
for example vectorization.  The following plugs this in data-ref
analysis when that would create such invalid ADDR_EXPR as part of
analyzing the ref structure.

PR tree-optimization/113602
* tree-data-ref.cc (dr_analyze_innermost): Fail when
the base object isn't addressable.

* gcc.dg/pr113602.c: New testcase.

18 months ago gcn/gcn-hsa.h: Always pass --amdhsa-code-object-version= in ASM_SPEC
Tobias Burnus [Fri, 26 Jan 2024 09:14:09 +0000 (10:14 +0100)] 
gcn/gcn-hsa.h: Always pass --amdhsa-code-object-version= in ASM_SPEC

Since LLVM commit 082f87c9d418 (Pull Req. #79038; will become LLVM 18)
  "[AMDGPU] Change default AMDHSA Code Object version to 5"
the default - when no --amdhsa-code-object-version= is used - was bumped.

Using --amdhsa-code-object-version=5 is supported (with unknown limitations)
since LLVM 14.  For proper support, GCC requires at least LLVM 13.0.1, so
explicitly using COV5 is not possible.

Unfortunately, the COV number matters for debugging ("-g") as mkoffload.cc
extracts debugging data from the host's object file and writes it into an
AMD GPU object file it creates.  All object files linked together
must have the same ABI version.

gcc/ChangeLog:

* config/gcn/gcn-hsa.h (ABI_VERSION_SPEC): New; creates the
"--amdhsa-code-object-version=" argument.
(ASM_SPEC): Use it; replace previous version of it.

Signed-off-by: Tobias Burnus <tburnus@baylibre.com>

18 months ago RISC-V: Refine some codes of VSETVL PASS [NFC]
Juzhe-Zhong [Fri, 26 Jan 2024 08:31:09 +0000 (16:31 +0800)] 
RISC-V: Refine some codes of VSETVL PASS [NFC]

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pre_vsetvl::earliest_fuse_vsetvl_info): Refine some codes.
(pre_vsetvl::emit_vsetvl): Ditto.

18 months ago LoongArch: Split vec_selects of bottom elements into simple move
Jiahao Xu [Tue, 16 Jan 2024 02:23:20 +0000 (10:23 +0800)] 
LoongArch: Split vec_selects of bottom elements into simple move

The pattern below can be treated as a simple move because floating-point
and vector values share the same registers on loongarch64.

(set (reg/v:SF 32 $f0 [orig:93 res ] [93])
      (vec_select:SF (reg:V8SF 32 $f0 [115])
          (parallel [
                  (const_int 0 [0])
              ])))

gcc/ChangeLog:

* config/loongarch/lasx.md (vec_extract<mode>_0):
New define_insn_and_split pattern.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/vect-extract.c: New test.

18 months ago LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT
Jiahao Xu [Tue, 16 Jan 2024 02:32:31 +0000 (10:32 +0800)] 
LoongArch: Define LOGICAL_OP_NON_SHORT_CIRCUIT

Define LOGICAL_OP_NON_SHORT_CIRCUIT as 0, so that a short-circuit branch uses
the short-circuit operation instead of the non-short-circuit operation.
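
To illustrate the two forms at the source level: when LOGICAL_OP_NON_SHORT_CIRCUIT is nonzero, GCC may evaluate a side-effect-free "a && b" like the second function below; with the macro defined as 0 it keeps the branching form of the first.  Both function names are hypothetical, and the sketch only shows the semantics, not the code GCC generates.

```c
/* Short-circuit form: the second test is skipped when the first one
   fails, which normally needs a conditional branch.  */
int
both_positive_sc (int a, int b)
{
  return a > 0 && b > 0;
}

/* Non-short-circuit form: both tests are always evaluated and then
   combined with a bitwise AND, with no branch in between.  */
int
both_positive_nsc (int a, int b)
{
  return (a > 0) & (b > 0);
}
```

The two functions return the same value for every input; they differ only in how much work is done when the first test fails.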

SPEC2017 performance evaluation shows 1% performance improvement for fprate
GEOMEAN and no obvious regression for others. Especially, 526.blender_r +10.6%
on 3A6000.

This modification will introduce the following FAIL items:

FAIL: gcc.dg/tree-ssa/copy-headers-8.c scan-tree-dump-times ch2 "Conditional combines static and invariant" 1
FAIL: gcc.dg/tree-ssa/copy-headers-8.c scan-tree-dump-times ch2 "Will duplicate bb" 2
FAIL: gcc.dg/tree-ssa/update-threading.c scan-tree-dump-times optimized "Invalid sum" 0

gcc/ChangeLog:

* config/loongarch/loongarch.h (LOGICAL_OP_NON_SHORT_CIRCUIT): Define.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/short-circuit.c: New test.

18 months ago LoongArch: testsuite:Added additional vectorization "-mlsx" option.
chenxiaolong [Fri, 26 Jan 2024 06:22:31 +0000 (14:22 +0800)] 
LoongArch: testsuite:Added additional vectorization "-mlsx" option.

gcc/testsuite/ChangeLog:

* gcc.dg/signbit-2.c: Added additional "-mlsx" compilation options.
* gfortran.dg/graphite/vect-pr40979.f90: Ditto.
* gfortran.dg/vect/fast-math-mgrid-resid.f: Ditto.

18 months ago LoongArch: Optimize implementation of single-precision floating-point approximate...
Li Wei [Wed, 24 Jan 2024 09:44:17 +0000 (17:44 +0800)] 
LoongArch: Optimize implementation of single-precision floating-point approximate division.

We found that in the SPEC2017 521.wrf program, some loop-invariant code
generated from single-precision floating-point approximate division was not
hoisted out of its loop. This is because the pseudo-register that stores the
intermediate temporary results is rewritten in the implementation of
single-precision floating-point approximate division, so the loop2_invariant
pass fails to identify the invariants. To fix this, the intermediate temporary
results are stored in new pseudo-registers without destroying the
read-write dependency, so that they can be recognized as loop invariants in
the loop2_invariant pass.
After optimization, the number of instructions of 521.wrf is reduced by 0.18%
compared with before optimization (1716612948501 -> 1713471771364).

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_emit_swdivsf): Adjust.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/invariant-recip.c: New test.

18 months ago testsuite/vect: Fix pr25413a.c expectations [PR109705]
Andrew Pinski [Thu, 25 Jan 2024 21:58:10 +0000 (13:58 -0800)] 
testsuite/vect: Fix pr25413a.c expectations [PR109705]

The 2 loops in octfapg_universe can and will be vectorized now
after r14-333-g6d4b59a9356ac4 on targets that support multiplication
in the long type. But the testcase does not check vect_long_mult for
that, so this patch corrects that error and now the testcase passes correctly
on aarch64-linux-gnu (with and without SVE).

Built and tested on aarch64-linux-gnu (with and without SVE).

gcc/testsuite/ChangeLog:

PR testsuite/109705
* gcc.dg/vect/pr25413a.c: Expect 1 vectorized loop for !vect_long_mult
and 2 for vect_long_mult.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

18 months ago RISC-V: Fix incorrect LCM delete bug [VSETVL PASS]
Juzhe-Zhong [Fri, 26 Jan 2024 06:46:21 +0000 (14:46 +0800)] 
RISC-V: Fix incorrect LCM delete bug [VSETVL PASS]

This patch fixes the recently noticed bug in RV32 glibc.

We incorrectly deleted a vsetvl:

        ...
        and a4,a4,a3
        vmv.v.i v1,0                 ---> Missing vsetvl causes an illegal instruction report.
        vse8.v v1,0(a5)

The root cause is that the laterin in LCM is incorrect.

      BB 358:
        avloc: n_bits = 2, set = {}
        kill: n_bits = 2, set = {}
        antloc: n_bits = 2, set = {}
        transp: n_bits = 2, set = {}
        avin: n_bits = 2, set = {}
        avout: n_bits = 2, set = {}
        del: n_bits = 2, set = {}

which causes LCM to let BB 360 delete the vsetvl:

      BB 360:
        avloc: n_bits = 2, set = {}
        kill: n_bits = 2, set = {}
        antloc: n_bits = 2, set = {}
        transp: n_bits = 2, set = {0 1 }
        avin: n_bits = 2, set = {}
        avout: n_bits = 2, set = {}
        del: n_bits = 2, set = {1}

Also, remove unknown vsetvl info from the local computation since it is unnecessary.

Tested on both RV32 and RV64 with no regressions.

PR target/113469

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pre_vsetvl::compute_lcm_local_properties): Fix bug.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr113469.c: New test.

18 months agoaarch64: Fix/avoid undefinedness in aarch64_classify_index [PR100212]
Andrew Pinski [Thu, 25 Jan 2024 21:45:59 +0000 (13:45 -0800)] 
aarch64: Fix/avoid undefinedness in aarch64_classify_index [PR100212]

The problem here is that we don't check the return value of exact_log2
and always use that result as the shift amount. This fixes the issue by avoiding
the shift if the value was `-1` (which means the value was not exactly a power of 2).
In this case we could check either whether the value was equal to -1 or whether it
was not, because we then assign -1 to shift if the constant value was not equal.
I chose `!=` as it makes it more obvious what the code is doing.
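
The guard described above can be sketched outside of GCC as follows. This is
only an illustration: exact_log2_sketch and checked_shift are hypothetical
stand-ins, not the code in aarch64_classify_index.

```cpp
#include <cstdint>

// Hypothetical stand-in for GCC's exact_log2: returns log2(x) when x is a
// power of two, and -1 otherwise.
constexpr int exact_log2_sketch(std::uint64_t x)
{
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  int n = 0;
  while ((x >>= 1) != 0)
    ++n;
  return n;
}

// Hypothetical helper: returns the shift amount for `multiplier`, or -1 when
// it is not a power of two. The shift is evaluated only after the `!= -1`
// check, so a shift by -1 can never happen.
constexpr int checked_shift(std::uint64_t multiplier)
{
  int shift = exact_log2_sketch(multiplier);
  if (shift != -1 && multiplier != (std::uint64_t{1} << shift))
    shift = -1;
  return shift;
}
```

With this stand-in the second comparison never fires; it is kept only to
mirror the shape of the check described in the message.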

Committed as obvious after a build/test for aarch64-linux-gnu.

gcc/ChangeLog:

PR target/100212
* config/aarch64/aarch64.cc (aarch64_classify_index): Avoid
undefined shift after the call to exact_log2.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
18 months agoDaily bump.
GCC Administrator [Fri, 26 Jan 2024 00:18:33 +0000 (00:18 +0000)] 
Daily bump.

18 months agoc++: Fix up build_m_component_ref [PR113599]
Jakub Jelinek [Thu, 25 Jan 2024 23:08:36 +0000 (00:08 +0100)] 
c++: Fix up build_m_component_ref [PR113599]

The following testcase reduced from GDB is miscompiled starting with
r14-5503 PR112427 change.
The problem is in the build_m_component_ref hunk, which changed
-      datum = fold_build_pointer_plus (fold_convert (ptype, datum), component);
+      datum = cp_convert (ptype, datum, complain);
+      if (!processing_template_decl)
+       datum = build2 (POINTER_PLUS_EXPR, ptype,
+                       datum, convert_to_ptrofftype (component));
+      datum = cp_fully_fold (datum);
Component is e, (sizetype) e is 16, offset of c inside of C.
ptype is A *, pointer to type of C::c and datum is &d.
Now, previously the above created ((A *) &d) p+ (sizetype) e which is correct,
but in the new code cp_convert sees that C has A as base class and
instead of returning (A *) &d, it returns &d.D.2800 where D.2800 is
the FIELD_DECL for the A base at offset 8 into C.
So, instead of computing ((A *) &d) p+ (sizetype) e it computes
&d.D.2800 p+ (sizetype) e, which is ((A *) &d) p+ 24.

The following patch fixes it by using convert instead of cp_convert; convert
eventually calls build_nop (ptype, datum).
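
A minimal sketch of the kind of code affected, assuming a class that has A
both as a non-first base and as the type of a data member. This is not the
ptrmem11.C testcase; the type and function names are made up for
illustration.

```cpp
struct A { long x = 0; };
struct B { long y = 0; };

// C has A as a base class and also a data member c of type A.
struct C : B, A
{
  A c;
};

// Reads the member designated by `pm`. The access must land on obj.c and
// not on the A base subobject of obj.
constexpr long read_through(const C &obj, A C::*pm)
{
  return (obj.*pm).x;
}

constexpr long demo_ptrmem()
{
  C d{};
  d.A::x = 1;          // value stored in the A base subobject
  d.c.x = 2;           // value stored in the member c
  A C::*e = &C::c;     // pointer to data member of type A
  return read_through(d, e);
}
```

A correct compiler yields 2 here; converting &d to the A base before adding
the member offset is the mistake the message describes.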

2024-01-26  Jakub Jelinek  <jakub@redhat.com>

PR c++/113599
* typeck2.cc (build_m_component_ref): Use convert instead of
cp_convert for pointer conversion.

* g++.dg/expr/ptrmem11.C: New test.

18 months agoc++: array of PMF [PR113598]
Jason Merrill [Thu, 25 Jan 2024 17:02:07 +0000 (12:02 -0500)] 
c++: array of PMF [PR113598]

Here AGGREGATE_TYPE_P includes pointers to member functions, which is not
what we want.  Instead we should use class||array, as elsewhere in the
function.
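
As a rough user-level analogue of the class||array test, assuming standard
type traits in place of GCC's internal predicates (class_or_array, S and
pmf_t are hypothetical names):

```cpp
#include <type_traits>

struct S { void f() {} };
using pmf_t = void (S::*)();

// True only for class types and arrays. A pointer to member function is
// neither, even though GCC's AGGREGATE_TYPE_P accepts it according to the
// message above.
template <typename T>
constexpr bool class_or_array =
  std::is_class<T>::value || std::is_array<T>::value;
```

An array of PMFs still satisfies the predicate because of the array type,
while each element does not.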

PR c++/113598

gcc/cp/ChangeLog:

* init.cc (build_vec_init): Don't use {} for PMF.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist-pmf2.C: New test.

18 months agoc++: co_await and initializer_list [PR109227]
Jason Merrill [Thu, 25 Jan 2024 19:45:35 +0000 (14:45 -0500)] 
c++: co_await and initializer_list [PR109227]

Here we end up with an initializer_list of 'aa', a type with a non-trivial
destructor, and need to destroy it.  The code called
build_special_member_call for cleanups, but that doesn't work for arrays, so
use cxx_maybe_build_cleanup instead.  Let's go ahead and do that
everywhere that has been calling the destructor directly.

PR c++/109227

gcc/cp/ChangeLog:

* coroutines.cc (build_co_await): Use cxx_maybe_build_cleanup.
(build_actor_fn, process_conditional, maybe_promote_temps)
(morph_fn_to_coro): Likewise.
(expand_one_await_expression): Use build_cleanup.

gcc/testsuite/ChangeLog:

* g++.dg/coroutines/co-await-initlist2.C: New test.

18 months agoaarch64: Fix undefinedness while testing the J constraint [PR100204]
Andrew Pinski [Thu, 25 Jan 2024 16:30:36 +0000 (08:30 -0800)] 
aarch64: Fix undefinedness while testing the J constraint [PR100204]

The J constraint can invoke undefined behavior because it takes the
negative of ival even when ival is HWI_MIN. The fix is as simple as casting
to `unsigned HOST_WIDE_INT` before negating, which is what this patch
does.
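
The same idea in standalone form, assuming a 64-bit signed type in place of
HOST_WIDE_INT; negate_wrapping is a hypothetical helper, not the code in
constraints.md.

```cpp
#include <cstdint>
#include <limits>

// Negates `ival` by doing the arithmetic in the unsigned type. Unsigned
// arithmetic wraps, so the minimum value does not overflow.
constexpr std::uint64_t negate_wrapping(std::int64_t ival)
{
  return 0 - static_cast<std::uint64_t>(ival);
}

// The most negative value, for which a signed negation would overflow.
constexpr std::int64_t kMin = std::numeric_limits<std::int64_t>::min();
```

Negating kMin this way gives 1 << 63 as an unsigned value instead of
overflowing.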

Committed as obvious after build/test for aarch64-linux-gnu.

gcc/ChangeLog:

PR target/100204
* config/aarch64/constraints.md (J): Cast to `unsigned HOST_WIDE_INT`
before taking the negative of it.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
18 months ago[PR113526][LRA]: Fixing asm-flag-1.c failure on ARM
Vladimir N. Makarov [Thu, 25 Jan 2024 19:41:17 +0000 (14:41 -0500)] 
[PR113526][LRA]: Fixing asm-flag-1.c failure on ARM

My recent patch for PR113356 causes the asm-flag-1.c test to fail on arm.
After the patch LRA treats asm operand pseudos as general regs.  There
are too many such operands and LRA cannot assign hard regs to all
operand pseudos.  Actually we should not assign hard regs to the
operand pseudos at all.  The following patch fixes this.

gcc/ChangeLog:

PR target/113526
* lra-constraints.cc (curr_insn_transform): Change class even for
spilled pseudo successfully matched with NO_REGS.

18 months agoMAINTAINERS: Update my work email address
Chung-Lin Tang [Thu, 25 Jan 2024 18:20:43 +0000 (18:20 +0000)] 
MAINTAINERS: Update my work email address

* MAINTAINERS: Update my work email address.

18 months agoAVR: target/113601 - Fix wrong data start for ATmega3208 and ATmega3209.
Georg-Johann Lay [Thu, 25 Jan 2024 17:51:04 +0000 (18:51 +0100)] 
AVR: target/113601 - Fix wrong data start for ATmega3208 and ATmega3209.

gcc/
PR target/113601
* config/avr/avr-mcus.def (atmega3208, atmega3209): Fix data_section_start.

18 months agoaarch64: Fix eh_return for -mtrack-speculation [PR112987]
Szabolcs Nagy [Wed, 24 Jan 2024 18:50:19 +0000 (18:50 +0000)] 
aarch64: Fix eh_return for -mtrack-speculation [PR112987]

A recent commit introduced a conditional branch in eh_return epilogues
that is not compatible with speculation tracking:

  commit 426fddcbdad6746fe70e031f707fb07f55dfb405
  Author:     Szabolcs Nagy <szabolcs.nagy@arm.com>
  CommitDate: 2023-11-27 15:52:48 +0000

  aarch64: Use br instead of ret for eh_return

Refactor the compare zero and jump pattern and use it to fix the issue.

gcc/ChangeLog:

PR target/112987
* config/aarch64/aarch64.cc (aarch64_gen_compare_zero_and_branch): New.
(aarch64_expand_epilogue): Use the new function.
(aarch64_split_compare_and_swap): Likewise.
(aarch64_split_atomic_op): Likewise.

18 months agomodula2: add project regression test and badpointer tests
Gaius Mulley [Thu, 25 Jan 2024 16:29:02 +0000 (16:29 +0000)] 
modula2: add project regression test and badpointer tests

This patch adds four modula-2 testcases to the regression testsuite.
The project example stresses INC/DEC and range checking, and the bad
pointer tests stress attempts to pass a string actual parameter to a
procedure with a pointer formal parameter.

gcc/testsuite/ChangeLog:

* gm2/pim/fail/badpointer.mod: New test.
* gm2/pim/fail/badpointer2.mod: New test.
* gm2/pim/fail/badpointer3.mod: New test.
* gm2/projects/pim/run/pass/pegfive/pegfive.mod: New test.
* gm2/projects/pim/run/pass/pegfive/projects-pim-run-pass-pegfive.exp: New test.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
18 months agofold-const: Handle AND, IOR, XOR with stepped vectors [PR112971].
Robin Dapp [Mon, 15 Jan 2024 15:23:30 +0000 (16:23 +0100)] 
fold-const: Handle AND, IOR, XOR with stepped vectors [PR112971].

Found in PR112971, this patch adds folding support for bitwise operations
of constant duplicate zero/one vectors with stepped vectors.
On riscv we have the situation that folding would continue perpetually
without simplifying because e.g. {0, 0, 0, ...} & {7, 6, 5, ...} would
not be folded to {0, 0, 0, ...}.
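
The identities that make such folds possible without looking at individual
elements can be sketched as below. This is a hypothetical model; the enum
names and simplify_sketch are assumptions, and the cases handled by the real
simplify_const_binop may differ.

```cpp
enum class BitOp { And, Ior, Xor };

// Hypothetical classification of one constant operand: a duplicate of
// all-zeros, a duplicate of all-ones, or anything else (e.g. stepped).
enum class VecKind { AllZeros, AllOnes, Other };

// What the operation folds to without touching individual elements.
enum class Folded { None, Zeros, Ones, OtherOperand };

// `dup` describes one operand; the other operand is arbitrary. The three
// operations are commutative, so operand order does not matter.
constexpr Folded simplify_sketch(BitOp op, VecKind dup)
{
  if (dup == VecKind::AllZeros)
    return op == BitOp::And ? Folded::Zeros : Folded::OtherOperand;
  if (dup == VecKind::AllOnes)
    {
      if (op == BitOp::And)
        return Folded::OtherOperand;
      if (op == BitOp::Ior)
        return Folded::Ones;
      return Folded::None;   // x ^ ~0 is ~x, which depends on the elements
    }
  return Folded::None;
}
```

For the example in the message, And with an all-zeros operand folds to the
zero vector regardless of the stepped operand.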

gcc/ChangeLog:

PR middle-end/112971

* fold-const.cc (simplify_const_binop): New function for binop
simplification of two constant vectors when element-wise
handling is not necessary.
(const_binop): Call new function.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/pr112971.c: New test.

18 months agotestsuite/vect: Add target checks to refined patterns.
Robin Dapp [Tue, 23 Jan 2024 11:44:20 +0000 (12:44 +0100)] 
testsuite/vect: Add target checks to refined patterns.

On Solaris/SPARC several vector tests appeared to be regressing.  They
were never vectorized but the checks before r14-3612-ge40edf64995769
would match regardless of whether a loop was actually vectorized or not.
The refined checks only match a successful vectorization attempt
but are run unconditionally.  This patch adds target checks to them.

gcc/testsuite/ChangeLog:

PR testsuite/113558

* gcc.dg/vect/no-scevccp-outer-7.c: Add target check.
* gcc.dg/vect/vect-outer-4c-big-array.c: Ditto.
* gcc.dg/vect/vect-reduc-dot-s16a.c: Ditto.
* gcc.dg/vect/vect-reduc-dot-s8a.c: Ditto.
* gcc.dg/vect/vect-reduc-dot-s8b.c: Ditto.
* gcc.dg/vect/vect-reduc-dot-u16b.c: Ditto.
* gcc.dg/vect/vect-reduc-dot-u8a.c: Ditto.
* gcc.dg/vect/vect-reduc-dot-u8b.c: Ditto.
* gcc.dg/vect/vect-reduc-pattern-1a.c: Ditto.
* gcc.dg/vect/vect-reduc-pattern-1b-big-array.c: Ditto.
* gcc.dg/vect/vect-reduc-pattern-1c-big-array.c: Ditto.
* gcc.dg/vect/vect-reduc-pattern-2a.c: Ditto.
* gcc.dg/vect/vect-reduc-pattern-2b-big-array.c: Ditto.
* gcc.dg/vect/wrapv-vect-reduc-dot-s8b.c: Ditto.

18 months agoanalyzer: fix defaults in compound assignments from non-zero offsets [PR112969]
David Malcolm [Thu, 25 Jan 2024 15:06:12 +0000 (10:06 -0500)] 
analyzer: fix defaults in compound assignments from non-zero offsets [PR112969]

Confusion in binding_cluster::maybe_get_compound_binding about whether
offsets are relative to the start of the region or to the start of the
cluster was leading to incorrect handling of default values, leading
to false positives from -Wanalyzer-use-of-uninitialized-value, from
-Wanalyzer-exposure-through-uninit-copy, and other logic errors.

Fixed thusly.

gcc/analyzer/ChangeLog:
PR analyzer/112969
* store.cc (binding_cluster::maybe_get_compound_binding): When
populating default_map, express the bit-range of the default key
for REG relative to REG, rather than to the base region.

gcc/testsuite/ChangeLog:
PR analyzer/112969
* c-c++-common/analyzer/compound-assignment-5.c (test_3): Remove
xfails, reorder tests.
* c-c++-common/analyzer/compound-assignment-pr112969.c: New test.
* gcc.dg/plugin/infoleak-pr112969.c: New test.
* gcc.dg/plugin/plugin.exp: Add infoleak-pr112969.c to
analyzer_kernel_plugin.c tests.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
18 months agomodula2: correct prototype for lseek within gcc/m2/gm2-libs/libc.def
Gaius Mulley [Thu, 25 Jan 2024 15:04:53 +0000 (15:04 +0000)] 
modula2: correct prototype for lseek within gcc/m2/gm2-libs/libc.def

This patch corrects the definition of lseek by changing the second
parameter to a CSSIZE_T rather than LONGINT and allowing the return value
to be ignored.

gcc/m2/ChangeLog:

* gm2-libs/libc.def (lseek): Change the second parameter
type to CSSIZE_T and make the return value optional.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
18 months agoRISC-V: Add support for XCVsimd extension in CV32E40P
Mary Bennett [Tue, 16 Jan 2024 17:13:50 +0000 (17:13 +0000)] 
RISC-V: Add support for XCVsimd extension in CV32E40P

Spec: github.com/openhwgroup/core-v-sw/blob/master/specifications/corev-builtin-spec.md

Contributors:
  Mary Bennett <mary.bennett@embecosm.com>
  Nandni Jamnadas <nandni.jamnadas@embecosm.com>
  Pietra Ferreira <pietra.ferreira@embecosm.com>
  Charlie Keaney
  Jessica Mills
  Craig Blackmore <craig.blackmore@embecosm.com>
  Simon Cook <simon.cook@embecosm.com>
  Jeremy Bennett <jeremy.bennett@embecosm.com>
  Helene Chelin <helene.chelin@embecosm.com>

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add XCVsimd.
* config/riscv/constraints.md: Likewise.
* config/riscv/corev.def: Likewise.
* config/riscv/corev.md: Likewise.
* config/riscv/predicates.md: Likewise.
* config/riscv/riscv-builtins.cc (AVAIL): Likewise.
* config/riscv/riscv-ftypes.def: Likewise.
* config/riscv/riscv.opt: Likewise.
* config/riscv/riscv.cc (riscv_print_operand): Add new operand 'Y'.
* doc/extend.texi: Add XCVsimd builtin documentation.
* doc/sourcebuild.texi: Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/cv-simd-abs-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-abs-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-add-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-add-div2-compile-1.c: New test.
* gcc.target/riscv/cv-simd-add-div4-compile-1.c: New test.
* gcc.target/riscv/cv-simd-add-div8-compile-1.c: New test.
* gcc.target/riscv/cv-simd-add-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-add-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-add-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-and-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-and-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-and-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-and-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-avg-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-avg-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-avg-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-avg-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-avgu-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-avgu-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-avgu-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-avgu-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpeq-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpeq-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpeq-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpeq-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpge-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpge-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpge-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpge-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgeu-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgeu-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgeu-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgeu-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgt-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgt-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgt-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgt-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgtu-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgtu-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgtu-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpgtu-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmple-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmple-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmple-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmple-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpleu-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpleu-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpleu-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpleu-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmplt-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmplt-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmplt-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmplt-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpltu-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpltu-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpltu-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpltu-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpne-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpne-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpne-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cmpne-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cplxconj-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cplxmul-i-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cplxmul-i-div2-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cplxmul-i-div4-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cplxmul-i-div8-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cplxmul-r-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cplxmul-r-div2-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cplxmul-r-div4-compile-1.c: New test.
* gcc.target/riscv/cv-simd-cplxmul-r-div8-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotsp-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotsp-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotsp-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotsp-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotup-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotup-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotup-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotup-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotusp-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotusp-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotusp-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-dotusp-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-extract-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-extract-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-extractu-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-extractu-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-insert-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-insert-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-march-compile-1.c: New test.
* gcc.target/riscv/cv-simd-max-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-max-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-max-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-max-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-maxu-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-maxu-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-maxu-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-maxu-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-min-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-min-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-min-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-min-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-minu-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-minu-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-minu-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-minu-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-neg-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-neg-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-or-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-or-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-or-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-or-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-pack-compile-1.c: New test.
* gcc.target/riscv/cv-simd-pack-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-packhi-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-packlo-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotsp-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotsp-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotsp-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotsp-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotup-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotup-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotup-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotup-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotusp-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotusp-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotusp-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sdotusp-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-shuffle-sci-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-shuffle2-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-shuffle2-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-shufflei0-sci-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-shufflei1-sci-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-shufflei2-sci-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-shufflei3-sci-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sll-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sll-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sll-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sll-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sra-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sra-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sra-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sra-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-srl-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-srl-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-srl-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-srl-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sub-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sub-div2-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sub-div4-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sub-div8-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sub-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sub-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-sub-sc-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-subrotmj-compile-1.c: New test.
* gcc.target/riscv/cv-simd-subrotmj-div2-compile-1.c: New test.
* gcc.target/riscv/cv-simd-subrotmj-div4-compile-1.c: New test.
* gcc.target/riscv/cv-simd-subrotmj-div8-compile-1.c: New test.
* gcc.target/riscv/cv-simd-xor-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-xor-h-compile-1.c: New test.
* gcc.target/riscv/cv-simd-xor-sc-b-compile-1.c: New test.
* gcc.target/riscv/cv-simd-xor-sc-h-compile-1.c: New test.
* lib/target-supports.exp: Add proc for XCVsimd extension.

18 months agogcn: Add missing space to ASM_SPEC in gcn-hsa.h
Tobias Burnus [Thu, 25 Jan 2024 14:43:50 +0000 (15:43 +0100)] 
gcn: Add missing space to ASM_SPEC in gcn-hsa.h

gcc/
* config/gcn/gcn-hsa.h (ASM_SPEC): Add space after -mxnack= argument.

18 months agoRISC-V: remove param riscv-vector-abi. [PR113538]
Yanzhang Wang [Thu, 25 Jan 2024 13:06:09 +0000 (21:06 +0800)] 
RISC-V: remove param riscv-vector-abi. [PR113538]

Also adjust some of the tests for scan-assembly. The behavior is the
same as --param=riscv-vector-abi before.

gcc/ChangeLog:

PR target/113538
* config/riscv/riscv.cc (riscv_get_arg_info): Remove the flag.
(riscv_fntype_abi): Ditto.
* config/riscv/riscv.opt: Ditto.

gcc/testsuite/ChangeLog:

PR target/113538
* gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul1-7.c: Fix the asm
check.
* gcc.target/riscv/rvv/base/abi-call-args-1-run.c: Ditto.
* gcc.target/riscv/rvv/base/abi-call-args-1.c: Ditto.
* gcc.target/riscv/rvv/base/abi-call-args-2-run.c: Ditto.
* gcc.target/riscv/rvv/base/abi-call-args-2.c: Ditto.
* gcc.target/riscv/rvv/base/abi-call-args-3-run.c: Ditto.
* gcc.target/riscv/rvv/base/abi-call-args-3.c: Ditto.
* gcc.target/riscv/rvv/base/abi-call-args-4-run.c: Ditto.
* gcc.target/riscv/rvv/base/abi-call-args-4.c: Ditto.
* gcc.target/riscv/rvv/base/abi-call-error-1.c: Ditto.
* gcc.target/riscv/rvv/base/abi-call-return-run.c: Ditto.
* gcc.target/riscv/rvv/base/abi-call-return.c: Ditto.
* gcc.target/riscv/rvv/base/abi-call-variant_cc.c: Ditto.
* gcc.target/riscv/rvv/base/abi-callee-saved-1-fixed-1.c: Ditto.
* gcc.target/riscv/rvv/base/abi-callee-saved-1-fixed-2.c: Ditto.
* gcc.target/riscv/rvv/base/abi-callee-saved-1-save-restore.c: Ditto.
* gcc.target/riscv/rvv/base/abi-callee-saved-1-zcmp.c: Ditto.
* gcc.target/riscv/rvv/base/abi-callee-saved-1.c: Ditto.
* gcc.target/riscv/rvv/base/abi-callee-saved-2-save-restore.c: Ditto.
* gcc.target/riscv/rvv/base/abi-callee-saved-2-zcmp.c: Ditto.
* gcc.target/riscv/rvv/base/abi-callee-saved-2.c: Ditto.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-69.c: Ditto.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-70.c: Ditto.
* gcc.target/riscv/rvv/base/float-point-dynamic-frm-71.c: Ditto.
* gcc.target/riscv/rvv/base/misc_vreinterpret_vbool_vint.c: Ditto.
* gcc.target/riscv/rvv/base/overloaded_rv32_vadd.c: Ditto.
* gcc.target/riscv/rvv/base/overloaded_rv32_vfadd.c: Ditto.
* gcc.target/riscv/rvv/base/overloaded_rv32_vget_vset.c: Ditto.
* gcc.target/riscv/rvv/base/overloaded_rv32_vloxseg2ei16.c: Ditto.
* gcc.target/riscv/rvv/base/overloaded_rv32_vreinterpret.c: Ditto.
* gcc.target/riscv/rvv/base/overloaded_rv64_vadd.c: Ditto.
* gcc.target/riscv/rvv/base/overloaded_rv64_vfadd.c: Ditto.
* gcc.target/riscv/rvv/base/overloaded_rv64_vget_vset.c: Ditto.
* gcc.target/riscv/rvv/base/overloaded_rv64_vloxseg2ei16.c: Ditto.
* gcc.target/riscv/rvv/base/overloaded_rv64_vreinterpret.c: Ditto.
* gcc.target/riscv/rvv/base/spill-10.c: Ditto.
* gcc.target/riscv/rvv/base/spill-11.c: Ditto.
* gcc.target/riscv/rvv/base/spill-9.c: Ditto.
* gcc.target/riscv/rvv/base/tuple_vundefined.c: Ditto.
* gcc.target/riscv/rvv/base/vcreate.c: Ditto.
* gcc.target/riscv/rvv/base/vlmul_ext-1.c: Ditto.
* gcc.target/riscv/rvv/base/zvfh-over-zvfhmin.c: Ditto.
* gcc.target/riscv/rvv/base/zvfhmin-intrinsic.c: Ditto.
* lib/target-supports.exp: Remove the flag.

Signed-off-by: Yanzhang Wang <yanzhang.wang@intel.com>
18 months agoconvert: Fix test for out of bounds shift count [PR113574]
Jakub Jelinek [Thu, 25 Jan 2024 12:15:23 +0000 (13:15 +0100)] 
convert: Fix test for out of bounds shift count [PR113574]

The following testcase is miscompiled, because convert_to_integer_1 for
LSHIFT_EXPR tests if the INTEGER_CST shift count is too high, but
incorrectly compares it against TYPE_SIZE rather than TYPE_PRECISION.
The type in question is unsigned _BitInt(1), which has TYPE_PRECISION 1,
TYPE_SIZE 8, and the shift count is 2 in that case.
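
The difference between the two comparisons can be shown with a small model.
IntTypeInfo and both functions are hypothetical; they only mirror the
TYPE_PRECISION versus TYPE_SIZE distinction described above.

```cpp
// Hypothetical description of an integer type: its precision in value bits
// and its storage size in bits.
struct IntTypeInfo
{
  unsigned precision;
  unsigned size_bits;
};

// Correct check: a left shift count is out of bounds once it reaches the
// precision of the type.
constexpr bool shift_out_of_bounds(IntTypeInfo type, unsigned count)
{
  return count >= type.precision;
}

// Incorrect variant that compares against the storage size instead.
constexpr bool shift_out_of_bounds_by_size(IntTypeInfo type, unsigned count)
{
  return count >= type.size_bits;
}
```

For a type with precision 1 and size 8, a shift count of 2 is rejected by
the first check but wrongly accepted by the second.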

2024-01-25  Jakub Jelinek  <jakub@redhat.com>

PR middle-end/113574
* convert.cc (convert_to_integer_1) <case LSHIFT_EXPR>: Compare shift
count against TYPE_PRECISION rather than TYPE_SIZE.

* gcc.dg/torture/bitint-52.c: New test.

18 months agoaarch64: Fix out-of-bounds ENCODED_ELT access [PR113572]
Richard Sandiford [Thu, 25 Jan 2024 12:03:18 +0000 (12:03 +0000)] 
aarch64: Fix out-of-bounds ENCODED_ELT access [PR113572]

When generalising vector_cst_all_same, I'd forgotten to update
VECTOR_CST_ENCODED_ELT to VECTOR_CST_ELT.  The check deliberately
looks at implicitly encoded elements in some cases.

gcc/
PR target/113572
* config/aarch64/aarch64-sve-builtins.cc (vector_cst_all_same):
Check VECTOR_CST_ELT instead of VECTOR_CST_ENCODED_ELT.

gcc/testsuite/
PR target/113572
* gcc.target/aarch64/sve/pr113572.c: New test.

18 months agoaarch64: Handle overlapping registers in movv8di [PR113550]
Richard Sandiford [Thu, 25 Jan 2024 12:03:18 +0000 (12:03 +0000)] 
aarch64: Handle overlapping registers in movv8di [PR113550]

The LS64 movv8di pattern didn't handle loads that overlapped with
the address register (unless the overlap happened to be in the
last subload).
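
The ordering problem can be modelled with a toy register file. Everything
here is a simplification made for illustration: Machine and load_eight are
hypothetical, addresses are array indices, and each sub-load moves a single
word, which need not match how the real splitter divides the move.

```cpp
#include <cstdint>

// Toy machine: 16 registers and 32 words of memory.
struct Machine
{
  std::uint64_t regs[16];
  std::uint64_t mem[32];
};

// Loads 8 consecutive words from the address held in regs[base] into
// regs[dst] .. regs[dst + 7]. If one destination is the base register,
// that sub-load is deferred so the address stays intact for the others.
constexpr Machine load_eight(Machine m, int dst, int base)
{
  int deferred = -1;
  for (int i = 0; i < 8; ++i)
    {
      if (dst + i == base)
        {
          deferred = i;   // would clobber the address, so emit it last
          continue;
        }
      m.regs[dst + i] = m.mem[m.regs[base] + i];
    }
  if (deferred >= 0)
    m.regs[base] = m.mem[m.regs[base] + deferred];
  return m;
}

// Checks the overlapping case: destinations x0..x7 include the base x3.
constexpr bool demo_overlap()
{
  Machine m{};
  for (int i = 0; i < 32; ++i)
    m.mem[i] = 100 + i;
  m.regs[3] = 4;                       // base register holds "address" 4
  Machine r = load_eight(m, 0, 3);
  for (int i = 0; i < 8; ++i)
    if (r.regs[i] != 104u + i)
      return false;
  return true;
}
```

Without the deferral, every sub-load after the one that writes the base
register would read from the wrong address.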

gcc/
PR target/113550
* config/aarch64/aarch64-simd.md: In the movv8di splitter, check
whether each split instruction is a load that clobbers the source
address.  Emit that instruction last if so.

gcc/testsuite/
PR target/113550
* gcc.target/aarch64/pr113550.c: New test.

18 months agoaarch64: Avoid paradoxical subregs in UXTL split [PR113485]
Richard Sandiford [Thu, 25 Jan 2024 12:03:17 +0000 (12:03 +0000)] 
aarch64: Avoid paradoxical subregs in UXTL split [PR113485]

g:74e3e839ab2d36841320 handled the UXTL{,2}-ZIP[12] optimisation
in split1.  The UXTL input is a 64-bit vector of N-bit elements
and the result is a 128-bit vector of 2N-bit elements.  The
corresponding ZIP1 operates on 128-bit vectors of N-bit elements.

This meant that the ZIP1 input had to be a 128-bit paradoxical subreg
of the 64-bit UXTL input.  In the PRs, it wasn't possible to generate
this subreg because the inputs were already subregs of a x[234]
structure of 64-bit vectors.

I don't think the same thing can happen for UXTL2->ZIP2 because
UXTL2 input is a 128-bit vector rather than a 64-bit vector.

It isn't really necessary for ZIP1 to take 128-bit inputs,
since the upper 64 bits are ignored.  This patch therefore adds
a pattern for 64-bit → 128-bit ZIP1s.

In principle, we should probably use this form for all ZIP1s.
But in practice, that creates an awkward special case, and
would be quite invasive for stage 4.
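
Why ZIP1 only needs the low 64 bits of its inputs can be seen in a byte-level
model. This sketch assumes that the optimisation expresses the zero extension
as a ZIP1 against a zero vector, and it assumes little-endian element layout;
the types and functions are hypothetical.

```cpp
#include <cstdint>

struct Bytes8 { std::uint8_t b[8]; };
struct Bytes16 { std::uint8_t b[16]; };

// ZIP1-style interleave of the low halves: a0, b0, a1, b1, ...
// Only 8 bytes of each input are read, so a 64-bit input is sufficient.
constexpr Bytes16 zip1_low(Bytes8 a, Bytes8 b)
{
  Bytes16 r{};
  for (int i = 0; i < 8; ++i)
    {
      r.b[2 * i] = a.b[i];
      r.b[2 * i + 1] = b.b[i];
    }
  return r;
}

// Reads 16-bit element i of r, assuming little-endian element layout.
constexpr std::uint16_t halfword(const Bytes16 &r, int i)
{
  return static_cast<std::uint16_t>(r.b[2 * i] | (r.b[2 * i + 1] << 8));
}

// Interleaving with zeros gives the zero-extended elements of the input.
constexpr bool zip1_with_zero_is_zero_extend()
{
  Bytes8 in{{1, 2, 3, 250, 5, 6, 7, 255}};
  Bytes16 r = zip1_low(in, Bytes8{});
  for (int i = 0; i < 8; ++i)
    if (halfword(r, i) != in.b[i])
      return false;
  return true;
}
```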

gcc/
PR target/113485
* config/aarch64/aarch64-simd.md (aarch64_zip1<mode>_low): New
pattern.
(<optab><Vnarrowq><mode>2): Use it instead of generating a
paradoxical subreg for the input.

gcc/testsuite/
PR target/113485
* gcc.target/aarch64/pr113485.c: New test.
* gcc.target/aarch64/pr113573.c: Likewise.

18 months agoRISC-V: Add LCM delete block predecessors dump information
Juzhe-Zhong [Thu, 25 Jan 2024 08:59:42 +0000 (16:59 +0800)] 
RISC-V: Add LCM delete block predecessors dump information

While looking into PR113469, I noticed that LCM deleted a vsetvl incorrectly.

This patch adds dump information for all predecessors of a block whose vsetvl
LCM deletes, for better debugging.

Tested no regression.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (get_all_predecessors): New function.
(pre_vsetvl::pre_global_vsetvl_info): Add LCM delete block all
predecessors dump information.

18 months agoRISC-V: Remove redundant full available computation [NFC]
Juzhe-Zhong [Thu, 25 Jan 2024 08:24:26 +0000 (16:24 +0800)] 
RISC-V: Remove redundant full available computation [NFC]

I noticed that full availability is computed in every round of earliest fusion,
which is redundant. Actually we only need to compute it once in phase 3.

It's an NFC patch and was tested with no regressions. Committed.

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (pre_vsetvl::compute_vsetvl_def_data): Remove
redundant full available computation.
(pre_vsetvl::pre_global_vsetvl_info): Ditto.

18 months agoFix a few vect gimple testcases for LLP64 targets (e.g. mingw) [PR113548]
Andrew Pinski [Wed, 24 Jan 2024 18:28:37 +0000 (10:28 -0800)] 
Fix a few vect gimple testcases for LLP64 targets (e.g. mingw) [PR113548]

This fixes some of the vect testcases which use the gimple FE for LLP64 targets.
The testcases directly use `unsigned long` for the addition to pointers
when they should be using `__SIZETYPE__`. This changes them to use that instead.
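
In ordinary C++ the equivalent habit is to use std::size_t for pointer
offsets. The sketch below is an analogue only; kSample and element_at are
hypothetical, and the static_assert records an assumption about the target
rather than a language guarantee.

```cpp
#include <cstddef>

// On LLP64 targets `unsigned long` is 32 bits while pointers are 64 bits.
// std::size_t is assumed here to be pointer-sized on both LP64 and LLP64.
static_assert(sizeof(std::size_t) == sizeof(void *),
              "this sketch assumes a pointer-sized size_t");

constexpr int kSample[4] = {10, 20, 30, 40};

// Reads the int located `byte_offset` bytes past `base`, with the offset
// carried in std::size_t rather than unsigned long.
constexpr int element_at(const int *base, std::size_t byte_offset)
{
  return *(base + byte_offset / sizeof(int));
}
```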

gcc/testsuite/ChangeLog:

PR testsuite/113548
* gcc.dg/vect/slp-reduc-10a.c: Use `__SIZETYPE__` instead of `unsigned long`.
* gcc.dg/vect/slp-reduc-10b.c: Likewise.
* gcc.dg/vect/slp-reduc-10c.c: Likewise.
* gcc.dg/vect/slp-reduc-10d.c: Likewise.
* gcc.dg/vect/slp-reduc-10e.c: Likewise.
* gcc.dg/vect/vect-cond-arith-2.c: Likewise.
* gcc.dg/vect/vect-ifcvt-19.c: Likewise.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
18 months agodocs: Fix 2 typos
Jakub Jelinek [Thu, 25 Jan 2024 08:10:08 +0000 (09:10 +0100)] 
docs: Fix 2 typos

When looking into PR113572, I've noticed a typo in VECTOR_CST documentation
and grep found pasto of it elsewhere.

2024-01-25  Jakub Jelinek  <jakub@redhat.com>

* doc/generic.texi (VECTOR_CST): Fix typo - petterns -> patterns.
* doc/rtl.texi (CONST_VECTOR): Likewise.

18 months agoRISC-V: Add optim-no-fusion compile option [VSETVL PASS]
Juzhe-Zhong [Thu, 25 Jan 2024 07:55:39 +0000 (15:55 +0800)] 
RISC-V: Add optim-no-fusion compile option [VSETVL PASS]

This patch adds no fusion compile option to disable phase 2 global fusion.

It can help us analyze compile time and debug.

Committed.

gcc/ChangeLog:

* config/riscv/riscv-opts.h (enum vsetvl_strategy_enum): Add optim-no-fusion option.
* config/riscv/riscv-vsetvl.cc (pass_vsetvl::lazy_vsetvl): Ditto.
(pass_vsetvl::execute): Ditto.
* config/riscv/riscv.opt: Ditto.

18 months agoLoongArch: Remove vec_concatz<mode> pattern.
Jiahao Xu [Wed, 24 Jan 2024 09:19:13 +0000 (17:19 +0800)] 
LoongArch: Remove vec_concatz<mode> pattern.

It is incorrect to use vld/vori to implement the vec_concatz<mode> because when the LSX
instruction is used to update the value of the vector register, the upper 128 bits of
the vector register will not be zeroed.

gcc/ChangeLog:

* config/loongarch/lasx.md (@vec_concatz<mode>): Remove this define_insn pattern.
* config/loongarch/loongarch.cc (loongarch_expand_vector_group_init): Use vec_concat<mode>.

18 months agotree-optimization/113576 - non-empty latch and may_be_zero vectorization
Richard Biener [Wed, 24 Jan 2024 13:55:49 +0000 (14:55 +0100)] 
tree-optimization/113576 - non-empty latch and may_be_zero vectorization

We can't support niters with may_be_zero when we end up with a
non-empty latch due to early exit peeling.  At least not in
the simplistic way the vectorizer handles this now.  Disallow
it again for exits that are not the last one.

PR tree-optimization/113576
* tree-vect-loop.cc (vec_init_loop_exit_info): Only allow
exits with may_be_zero niters when it's the last one.

* gcc.dg/vect/pr113576.c: New testcase.

18 months agoLoongArch: Disable TLS type symbols from generating non-zero offsets.
Lulu Cheng [Tue, 23 Jan 2024 03:28:09 +0000 (11:28 +0800)] 
LoongArch: Disable TLS type symbols from generating non-zero offsets.

TLS gd, ld and ie type symbols will generate corresponding GOT entries,
so non-zero offsets cannot be generated.
The address of TLS le type symbol+addend is not implemented in binutils,
so non-zero offset is not generated here for the time being.

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_symbolic_constant_p):
For symbols of type tls, non-zero Offset is not generated.

18 months agors6000: Enable block compare expand on P9 with m32 and mpowerpc64
Haochen Gui [Thu, 25 Jan 2024 06:54:42 +0000 (14:54 +0800)] 
rs6000: Enable block compare expand on P9 with m32 and mpowerpc64

gcc/
* config/rs6000/rs6000-string.cc (expand_block_compare): Enable
P9 with m32 and mpowerpc64.

gcc/testsuite/
* gcc.target/powerpc/block-cmp-1.c: Exclude m32 and mpowerpc64.
* gcc.target/powerpc/block-cmp-4.c: Likewise.
* gcc.target/powerpc/block-cmp-8.c: New.

18 months agoEnable -mlam=u57 by default when compiled with -fsanitize=hwaddress.
liuhongt [Tue, 23 Jan 2024 06:21:58 +0000 (14:21 +0800)] 
Enable -mlam=u57 by default when compiled with -fsanitize=hwaddress.

gcc/ChangeLog:

* config/i386/i386-options.cc (ix86_option_override_internal):
Enable -mlam=u57 by default when compiled with
-fsanitize=hwaddress.

18 months agoAdjust hwasan testcase for x86 target.
liuhongt [Tue, 23 Jan 2024 05:35:39 +0000 (13:35 +0800)] 
Adjust hwasan testcase for x86 target.

There're 2 cases:
1. hwasan-poison-optimisation.c is supposed to scan call to
__hwasan_tag_mismatch4, and x86 has a different mnemonic (call) from
aarch64 (bl), so adjust testcase to scan either call or bl.

2. alloca-outside-caught.c/vararray-outside-caught.c are supposed to
scan mismatched tags and expect the tag corresponding to
out-of-bounds memory to be 00, but on x86 the contiguous stack is
allocated by another local variable/array which is assigned a
different tag, so there are still mismatches. So adjust testcase to
scan XX/XX instead of XX/00.

gcc/testsuite/ChangeLog:

* c-c++-common/hwasan/alloca-outside-caught.c: Adjust
testcase.
* c-c++-common/hwasan/hwasan-poison-optimisation.c: Ditto.
* c-c++-common/hwasan/vararray-outside-caught.c: Ditto.

18 months agoc++: Handle partial specialisations in GMF [PR113405]
Nathaniel Shead [Fri, 19 Jan 2024 11:24:18 +0000 (22:24 +1100)] 
c++: Handle partial specialisations in GMF [PR113405]

Currently, when exporting names from the GMF, or within header modules,
for a set of constrained partial specialisations we only emit the first
one. This is because the 'type_specialization' list only includes a
single specialization per template+argument list; constraints are not
considered here.

The existing code uses a separate 'partial_specializations' list to
track this instead, but currently it's only used for declarations in the
module purview. This patch makes use of this list for all declarations.

PR c++/113405

gcc/cp/ChangeLog:

* module.cc (set_defining_module): Track partial specialisations
for all declarations.

gcc/testsuite/ChangeLog:

* g++.dg/modules/concept-9.h: New test.
* g++.dg/modules/concept-9_a.C: New test.
* g++.dg/modules/concept-9_b.C: New test.
* g++.dg/modules/concept-10_a.H: New test.
* g++.dg/modules/concept-10_b.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
18 months agoc++: Fix importing nested namespace declarations [PR100707]
Nathaniel Shead [Sat, 20 Jan 2024 13:45:37 +0000 (00:45 +1100)] 
c++: Fix importing nested namespace declarations [PR100707]

Currently, importing a namespace declaration marks it as imported, and
so marks it as originating from the module that it was imported from.
This is usually harmless, but causes problems with nested namespaces.

In the linked PR, what happens is that the namespace 'A' imported from
the module ends up not being considered when creating the 'A' namespace
within its own TU, and thus it has its 'cp_binding_level' recreated.
However, by this point 'A::B' has already been imported, and so the
'level_chain' member no longer correctly points at 'A's binding level,
so the sanity check for this in 'resume_scope' ICEs.

Since as far as I can tell there's no reason for imported namespaces to
be attached to any specific module (namespace declarations with external
linkage are always attached to the global module by [module.unit] p7.2),
this patch just removes the 'imported' flag, which stops code from
caring about its originating module.

This patch also makes some minor adjustments to existing tests to cater
for the new dumped name.

PR c++/100707

gcc/cp/ChangeLog:

* name-lookup.cc (add_imported_namespace): Don't mark namespaces
as imported.

gcc/testsuite/ChangeLog:

* g++.dg/modules/indirect-1_b.C: Adjust to handle namespaces not
being attached to the module they were imported from.
* g++.dg/modules/indirect-1_c.C: Likewise.
* g++.dg/modules/indirect-2_b.C: Likewise.
* g++.dg/modules/indirect-2_c.C: Likewise.
* g++.dg/modules/indirect-3_b.C: Likewise.
* g++.dg/modules/indirect-3_c.C: Likewise.
* g++.dg/modules/indirect-4_b.C: Likewise.
* g++.dg/modules/indirect-4_c.C: Likewise.
* g++.dg/modules/namespace-5_a.C: New test.
* g++.dg/modules/namespace-5_b.C: New test.
* g++.dg/modules/namespace-5_c.C: New test.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
18 months agoi386: Modify testcases failed under -DDEBUG
Haochen Jiang [Fri, 19 Jan 2024 07:06:47 +0000 (15:06 +0800)] 
i386: Modify testcases failed under -DDEBUG

gcc/testsuite/ChangeLog:

* gcc.target/i386/adx-check.h: Include stdio.h when DEBUG
is defined.
* gcc.target/i386/avx512fp16-vscalefph-1b.c: Do not define
DEBUG.
* gcc.target/i386/avx512fp16vl-vaddph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vcmpph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vdivph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vfpclassph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vgetexpph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vgetmantph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vmaxph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vminph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vmulph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vrcpph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vreduceph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vrndscaleph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vrsqrtph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vscalefph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vsqrtph-1b.c: Ditto.
* gcc.target/i386/avx512fp16vl-vsubph-1b.c: Ditto.
* gcc.target/i386/readeflags-1.c: Include stdio.h when DEBUG
is defined.
* gcc.target/i386/rtm-check.h: Ditto.
* gcc.target/i386/sha-check.h: Ditto.
* gcc.target/i386/writeeflags-1.c: Ditto.

18 months agoFix check_effective_target_vect_long_mult
Andrew Pinski [Thu, 25 Jan 2024 00:33:38 +0000 (16:33 -0800)] 
Fix check_effective_target_vect_long_mult

My last commit I tested on aarch64 but vect_long_mult was not actually invoked
and I didn't notice that I was missing a `[` in front of check_effective_target_aarch64_sve.
When I ran the testsuite on x86_64, I got the failure.

Committed as obvious after testing on x86_64.

gcc/testsuite/ChangeLog:

* lib/target-supports.exp (check_effective_target_vect_long_mult): Fix
small typo for aarch64*-*-*.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
18 months agoRISC-V: Don't make Ztso imply A
Palmer Dabbelt [Wed, 13 Dec 2023 03:54:05 +0000 (19:54 -0800)] 
RISC-V: Don't make Ztso imply A

I can't actually find anything in the ISA manual that makes Ztso imply
A.  In theory the memory ordering is just a different thing than the set
of available instructions (ie, Ztso without A would still imply TSO for
loads and stores).  It also seems like a configuration that could be
sane to build: without A it's all but impossible to write any meaningful
multi-core code, and TSO is really cheap for a single core.

That said, I think it's kind of reasonable to provide A to users asking
for Ztso.  So maybe even if this was a mistake it's the right thing to
do?

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_implied_info):
Remove {"ztso", "a"}.

18 months agoDaily bump.
GCC Administrator [Thu, 25 Jan 2024 00:19:12 +0000 (00:19 +0000)] 
Daily bump.

18 months agoc++: ambiguous member lookup for rewritten cands [PR113529]
Patrick Palka [Wed, 24 Jan 2024 22:11:09 +0000 (17:11 -0500)] 
c++: ambiguous member lookup for rewritten cands [PR113529]

Here we handle the operator expression u < v inconsistently: in a SFINAE
context we accept it, and in a non-SFINAE context we reject it with

  error: request for member 'operator<=>' is ambiguous

as per [class.member.lookup]/6.  This inconsistency is ultimately
because we neglect to propagate error_mark_node after recursing in
add_operator_candidates, fixed like so.

PR c++/113529

gcc/cp/ChangeLog:

* call.cc (add_operator_candidates): Propagate error_mark_node
result after recursing to find rewritten candidates.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/spaceship-sfinae3.C: New test.

Reviewed-by: Jason Merrill <jason@redhat.com>
18 months agoc++: add test [PR113347]
Jason Merrill [Tue, 23 Jan 2024 21:09:15 +0000 (16:09 -0500)] 
c++: add test [PR113347]

The patch for this PR is unneeded on trunk, but let's add the test.

PR c++/113347

gcc/testsuite/ChangeLog:

* g++.dg/eh/return3.C: New test.

18 months agoFortran: passing of optional dummies to elemental procedures [PR113377]
Harald Anlauf [Wed, 24 Jan 2024 19:27:36 +0000 (20:27 +0100)] 
Fortran: passing of optional dummies to elemental procedures [PR113377]

gcc/fortran/ChangeLog:

PR fortran/113377
* trans-expr.cc (conv_dummy_value): New.
(gfc_conv_procedure_call): Factor code for handling dummy arguments
with the VALUE attribute in the scalar case into conv_dummy_value().
Reuse and adjust for calling elemental procedures.

gcc/testsuite/ChangeLog:

PR fortran/113377
* gfortran.dg/optional_absent_10.f90: New test.

18 months agoFix vect_long_mult for aarch64 [PR109705]
Andrew Pinski [Wed, 24 Jan 2024 08:00:34 +0000 (00:00 -0800)] 
Fix vect_long_mult for aarch64 [PR109705]

On aarch64, vectorization of `long` multiply can be done if SVE is enabled
or if long is 32bit (ILP32). It can also be done for constants too but there
is no effective target test for that just yet.
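As a rough illustration of the kind of loop the `vect_long_mult` effective target is about, here is a minimal sketch; the function name `mul_long` is hypothetical and not taken from the testsuite. Whether the loop is actually vectorized depends on the target and flags: on aarch64, per the description above, it would need SVE or a 32-bit `long` (ILP32).

```c
#include <stddef.h>

/* Element-wise multiply of two `long` arrays.  A vectorizer can only
   turn this into vector code if the target has a vector multiply for
   elements of the width of `long`.  */
void mul_long(long *restrict out, const long *restrict a,
              const long *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] * b[i];
}
```

The scalar result is the same regardless of whether vectorization happens; the effective-target check only decides which dump scans the testsuite expects to pass.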

Built and tested on aarch64-linux-gnu with no regressions (also tested with SVE enabled).

gcc/testsuite/ChangeLog:

PR testsuite/109705
* lib/target-supports.exp (check_effective_target_vect_long_mult):
Fix aarch64*-*-* checks.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
18 months agoipa: Self-DCE of uses of removed call LHSs (PR 108007)
Martin Jambor [Wed, 24 Jan 2024 18:12:31 +0000 (19:12 +0100)] 
ipa: Self-DCE of uses of removed call LHSs (PR 108007)

PR 108007 is another manifestation where we rely on DCE to clean-up
after IPA-SRA and if the user explicitly switches DCE off, IPA-SRA
can leave behind statements which are fed uninitialized values and
trap, even though their results are themselves never used.

I have already fixed this for unused parameters in callees, this bug
shows that almost the same thing can happen for removed returns, on
the side of callers.  This means that the issue has to be fixed
elsewhere, in call redirection.  This patch adds a function which
looks for (and through, using a work-list) uses of operations fed
specific SSA names and removes them all.

That would have been easy if it wasn't for debug statements during
tree-inline (from which call redirection is also invoked).  Debug
statements are decoupled from the rest at this point and iterating
over uses of SSAs does not bring them up.  During tree-inline they are
handled especially at the end, I assume in order to make sure that
relative ordering of UIDs are the same with and without debug info.

This means that during tree-inline we need to make a hash of killed
SSAs, that we already have in copy_body_data, available to the
function making the purging.  So the patch duly does also that, making
the interface slightly ugly.  Moreover, all newly unused SSA names
need to be freed and as PR 112616 showed, it must be done in a defined
order, which is what newly added ipa_release_ssas_in_hash does.

gcc/ChangeLog:

2024-01-12  Martin Jambor  <mjambor@suse.cz>

PR ipa/108007
PR ipa/112616
* cgraph.h (cgraph_edge): Add a parameter to
redirect_call_stmt_to_callee.
* ipa-param-manipulation.h (ipa_param_adjustments): Add a
parameter to modify_call.
(ipa_release_ssas_in_hash): Declare.
* cgraph.cc (cgraph_edge::redirect_call_stmt_to_callee): New
parameter killed_ssas, pass it to padjs->modify_call.
* ipa-param-manipulation.cc (purge_all_uses): New function.
(ipa_param_adjustments::modify_call): New parameter killed_ssas.
Instead of substituting uses, invoke purge_all_uses.  If
hash of killed SSAs has not been provided, create a temporary one
and release SSAs that have been added to it.
(compare_ssa_versions): New function.
(ipa_release_ssas_in_hash): Likewise.
* tree-inline.cc (redirect_all_calls): Create
id->killed_new_ssa_names earlier, pass it to edge redirection,
adjust a comment.
(copy_body): Release SSAs in id->killed_new_ssa_names.

gcc/testsuite/ChangeLog:

2024-01-15  Martin Jambor  <mjambor@suse.cz>

PR ipa/108007
PR ipa/112616
* gcc.dg/ipa/pr108007.c: New test.
* gcc.dg/ipa/pr112616.c: Likewise.

18 months agoaarch64: Fix __builtin_apply with -mgeneral-regs-only [PR113486]
Andrew Pinski [Thu, 18 Jan 2024 18:40:04 +0000 (10:40 -0800)] 
aarch64: Fix __builtin_apply with -mgeneral-regs-only [PR113486]

The problem here is the builtin apply mechanism thinks the FP registers
are to be used due to get_raw_arg_mode not returning VOIDmode. This
fixes that oversight and the backend now returns VOIDmode for non-general-regs
if TARGET_GENERAL_REGS_ONLY is true.

Built and tested for aarch64-linux-gnu with no regressions.

PR target/113486

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_get_reg_raw_mode): For
TARGET_GENERAL_REGS_ONLY, return VOIDmode for non-GP_REGNUM_P regno.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/builtin_apply-1.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
18 months ago[PATCH v3] RISC-V: Add split pattern to generate SFB instructions. [PR113095]
Monk Chiang [Wed, 24 Jan 2024 17:19:28 +0000 (10:19 -0700)] 
[PATCH v3] RISC-V: Add split pattern to generate SFB instructions. [PR113095]

Since match.pd transforms (zero_one == 0) ? y : z <op> y
into ((typeof(y))zero_one * z) <op> y, add splitters to recognize
this expression and generate SFB instructions.
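The equivalence that the match.pd transform relies on can be checked with a small sketch. This assumes `<op>` is `+` and that `zero_one` only takes the values 0 or 1; the function names `select_form` and `multiply_form` are hypothetical and only serve the illustration.

```c
/* Original form: keep y when the flag is 0, otherwise compute z + y.  */
long select_form(long zero_one, long y, long z)
{
    return (zero_one == 0) ? y : (z + y);
}

/* Transformed form: multiplying by a 0/1 flag either zeroes z out
   (leaving y + 0) or keeps it (giving z + y).  */
long multiply_form(long zero_one, long y, long z)
{
    return (zero_one * z) + y;
}
```

For both flag values the two functions return the same result, which is why a splitter can match the multiply form and still emit a conditional sequence.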

gcc/ChangeLog:
PR target/113095
* config/riscv/sfb.md: New splitters to rewrite single bit
sign extension as the condition to SFB instructions.

gcc/testsuite/ChangeLog:
* gcc.target/riscv/sfb.c: New test.
* gcc.target/riscv/pr113095.c: New test.

18 months agoAdd -fmin-function-alignment
Jan Hubicka [Wed, 24 Jan 2024 17:13:17 +0000 (18:13 +0100)] 
Add -fmin-function-alignment

-falign-functions is ignored in cold code, since it is an optimization intended to
improve instruction prefetch.  In some cases it is necessary to force alignment for
all functions, so this patch adds -fmin-function-alignment for this purpose.

gcc/ChangeLog:

PR middle-end/88345
* common.opt: (flimit-function-alignment): Reorder alphabetically
(fmin-function-alignment): New parameter.
* doc/invoke.texi: (-fmin-function-alignment): Document.
(-falign-functions,-falign-loops,-falign-labels): Mention that
alignments are ignored in cold code.
* varasm.cc (assemble_start_function): Handle min-function-alignment.

18 months agoAArch64: Fix expansion of Advanced SIMD div and mul using SVE [PR109636]
Tamar Christina [Wed, 24 Jan 2024 15:58:34 +0000 (15:58 +0000)] 
AArch64: Fix expansion of Advanced SIMD div and mul using SVE [PR109636]

As suggested in the ticket this replaces the expansion by converting the
Advanced SIMD types to SVE types by simply printing out an SVE register for
these instructions.

This fixes the subreg issues since there are no subregs involved anymore.

gcc/ChangeLog:

PR target/109636
* config/aarch64/aarch64-simd.md (<su_optab>div<mode>3,
mulv2di3): Remove.
* config/aarch64/iterators.md (VQDIV): Remove.
(SVE_FULL_SDI_SIMD, SVE_FULL_HSDI_SIMD_DI,
SVE_I_SIMD_DI): New.
(VPRED, sve_lane_con): Add V4SI and V2DI.
* config/aarch64/aarch64-sve.md (<optab><mode>3,
@aarch64_pred_<optab><mode>): Support Advanced SIMD types.
(mul<mode>3): New, split from <optab><mode>3.
(@aarch64_pred_<optab><mode>, *post_ra_<optab><mode>3): New.
* config/aarch64/aarch64-sve2.md (@aarch64_mul_lane_<mode>,
*aarch64_mul_unpredicated_<mode>): Change SVE_FULL_HSDI to
SVE_FULL_HSDI_SIMD_DI.

gcc/testsuite/ChangeLog:

PR target/109636
* gcc.target/aarch64/sve/pr109636_1.c: New test.
* gcc.target/aarch64/sve/pr109636_2.c: New test.
* gcc.target/aarch64/sve2/pr109636_1.c: New test.

18 months agoAArch64: Do not allow SIMD clones with simdlen 1 [PR113552]
Tamar Christina [Wed, 24 Jan 2024 15:56:50 +0000 (15:56 +0000)] 
AArch64: Do not allow SIMD clones with simdlen 1 [PR113552]

The AArch64 vector PCS does not allow simd calls with simdlen 1,
however due to a bug we currently do allow it for num == 0.

This causes us to emit a symbol that doesn't exist and we fail to link.

gcc/ChangeLog:

PR tree-optimization/113552
* config/aarch64/aarch64.cc
(aarch64_simd_clone_compute_vecsize_and_simdlen): Block simdlen 1.

gcc/testsuite/ChangeLog:

PR tree-optimization/113552
* gcc.target/aarch64/pr113552.c: New test.
* gcc.target/aarch64/simd_pcs_attribute-3.c: Remove bogus check.

18 months agoipa-cp: Fix check for exceeding param_ipa_cp_value_list_size (PR 113490)
Martin Jambor [Wed, 24 Jan 2024 15:19:48 +0000 (16:19 +0100)] 
ipa-cp: Fix check for exceeding param_ipa_cp_value_list_size  (PR 113490)

When the check for exceeding param_ipa_cp_value_list_size limit was
modified to be ignored for generating values from self-recursive
calls, it should have been changed from "equal to" to "equal to or
greater than".  This omission manifests itself as PR 113490.

When I examined the condition I also noticed that the parameter should
come from the callee rather than the caller, since the value list is
associated with the former and not the latter.  In practice the limit
is of course very likely to be the same, but I fixed this aspect of
the condition too.  I briefly audited all other uses of opt_for_fn in
ipa-cp.cc and all the others looked OK.
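The effect of the comparison can be shown with a simplified model. This is not the code in ipa-cp.cc: the `value_list` struct and both function names are hypothetical, and the only assumption carried over from the description is that values from self-recursive calls bypass the limit.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a lattice's value list.  */
struct value_list { int count; };

/* Old-style check: bails out only when count is exactly at the limit.  */
bool add_value_eq(struct value_list *l, int limit, bool self_recursive)
{
    if (!self_recursive && l->count == limit)
        return false;
    l->count++;
    return true;
}

/* Fixed check: bails out whenever count has reached or passed the limit.  */
bool add_value_ge(struct value_list *l, int limit, bool self_recursive)
{
    if (!self_recursive && l->count >= limit)
        return false;
    l->count++;
    return true;
}
```

Once a self-recursive value pushes the count past the limit, the equality check never fires again and ordinary values keep being added, whereas the `>=` check still rejects them.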

gcc/ChangeLog:

2024-01-19  Martin Jambor  <mjambor@suse.cz>

PR ipa/113490
* ipa-cp.cc (ipcp_lattice<valtype>::add_value): Bail out if value
count is equal or greater than the limit.  Use the limit from the
callee.

gcc/testsuite/ChangeLog:

2024-01-22  Martin Jambor  <mjambor@suse.cz>

PR ipa/113490
* gcc.dg/ipa/pr113490.c: New test.

18 months agoanalyzer: fix taint false +ve due to overzealous state purging [PR112977]
David Malcolm [Wed, 24 Jan 2024 15:11:35 +0000 (10:11 -0500)] 
analyzer: fix taint false +ve due to overzealous state purging [PR112977]

gcc/analyzer/ChangeLog:
PR analyzer/112977
* engine.cc (impl_region_model_context::on_liveness_change): Pass
m_ext_state to sm_state_map::on_liveness_change.
* program-state.cc (sm_state_map::on_svalue_leak): Guard removal
of map entry based on can_purge_p.
(sm_state_map::on_liveness_change): Add ext_state param.  Add
workaround for bad interaction between state purging and
alt-inherited sm-state.
* program-state.h (sm_state_map::on_liveness_change): Add
ext_state param.
* sm-taint.cc
(taint_state_machine::has_alt_get_inherited_state_p): New.
(taint_state_machine::can_purge_p): Return false for "has_lb" and
"has_ub".
* sm.h (state_machine::has_alt_get_inherited_state_p): New vfunc.

gcc/testsuite/ChangeLog:
PR analyzer/112977
* gcc.dg/plugin/plugin.exp: Add taint-pr112977.c.
* gcc.dg/plugin/taint-pr112977.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
18 months agoanalyzer kernel plugin: implement __check_object_size [PR112927]
David Malcolm [Wed, 24 Jan 2024 15:11:09 +0000 (10:11 -0500)] 
analyzer kernel plugin: implement __check_object_size [PR112927]

PR analyzer/112927 reports a false positive from -Wanalyzer-tainted-size
seen on the Linux kernel's drivers/char/ipmi/ipmi_devintf.c with the
analyzer kernel plugin.

The issue is that in:

(A):
  if (msg->data_len > 272) {
    return -90;
  }

(B):
  n = msg->data_len;
  __check_object_size(to, n);
  n = copy_from_user(to, from, n);

the analyzer is treating __check_object_size as having arbitrary side
effects, and, in particular could modify msg->data_len.  Hence the
sanitization that occurs at (A) above is treated as being for a
different value than the size obtained at (B), hence the bogus warning
at the call to copy_from_user.

Fixed by extending the analyzer kernel plugin to "teach" it that
__check_object_size has no side effects.
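The pattern above can be written out as a self-contained userland sketch. The stub functions below are hypothetical stand-ins for the kernel's `__check_object_size` and `copy_from_user`, and `struct msg` and `read_msg` are invented for illustration; the point is only that the check between (A) and (B) does not modify `msg->data_len`, so the bound established at (A) still applies at (B).

```c
#include <stddef.h>
#include <string.h>

struct msg { size_t data_len; };

/* Hypothetical stand-in for __check_object_size: inspects its
   arguments only and has no side effects.  */
static void check_object_size_stub(const void *ptr, size_t n)
{
    (void) ptr;
    (void) n;
}

/* Hypothetical stand-in for copy_from_user: returns the number of
   bytes that could not be copied (0 on success).  */
static size_t copy_from_user_stub(void *to, const void *from, size_t n)
{
    memcpy(to, from, n);
    return 0;
}

long read_msg(char *to, const char *from, const struct msg *msg)
{
    size_t n;

    /* (A): sanitize the size.  */
    if (msg->data_len > 272)
        return -90;

    /* (B): the size used here is the one checked at (A).  */
    n = msg->data_len;
    check_object_size_stub(to, n);
    n = copy_from_user_stub(to, from, n);
    return (long) n;
}
```

If the analyzer assumed the stub could write to `msg->data_len`, it could no longer connect the value read at (B) with the bound checked at (A), which is the false positive described above.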

gcc/testsuite/ChangeLog:
PR analyzer/112927
* gcc.dg/plugin/analyzer_kernel_plugin.c
(class known_function___check_object_size): New.
(kernel_analyzer_init_cb): Register it.
* gcc.dg/plugin/plugin.exp: Add taint-pr112927.c.
* gcc.dg/plugin/taint-pr112927.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
18 months agoPR modula2/113559 FIO.mod lseek requires cssize_t rather than longint
Gaius Mulley [Wed, 24 Jan 2024 13:11:46 +0000 (13:11 +0000)] 
PR modula2/113559 FIO.mod lseek requires cssize_t rather than longint

This patch fixes a bug in gcc/m2/gm2-libs/FIO.mod which failed to cast the
whence parameter into the correct type.  The patch casts the whence
parameter for lseek to SYSTEM.CSSIZE_T.

gcc/m2/ChangeLog:

PR modula2/113559
* gm2-libs/FIO.mod (SetPositionFromBeginning): Convert pos into
CSSIZE_T during call to lseek.
(SetPositionFromEnd): Convert pos into CSSIZE_T during call to
lseek.

Signed-off-by: Gaius Mulley <gaiusmod2@gmail.com>
18 months agotestsuite: i386: Don't restrict gcc.dg/vect/vect-simd-clone-16c.c etc. to i686 [PR113556]
Rainer Orth [Wed, 24 Jan 2024 12:56:23 +0000 (13:56 +0100)] 
testsuite: i386: Don't restrict gcc.dg/vect/vect-simd-clone-16c.c etc. to i686 [PR113556]

A couple of gcc.dg/vect/vect-simd-clone-1*.c tests FAIL on 32-bit
Solaris/x86 since 20230222:

FAIL: gcc.dg/vect/vect-simd-clone-16c.c scan-tree-dump-times vect
"[\\\\n\\\\r] [^\\\\n]* = foo\\\\.simdclone" 2
FAIL: gcc.dg/vect/vect-simd-clone-16d.c scan-tree-dump-times vect
"[\\\\n\\\\r] [^\\\\n]* = foo\\\\.simdclone" 2
FAIL: gcc.dg/vect/vect-simd-clone-17c.c scan-tree-dump-times vect
"[\\\\n\\\\r] [^\\\\n]* = foo\\\\.simdclone" 2
FAIL: gcc.dg/vect/vect-simd-clone-17d.c scan-tree-dump-times vect
"[\\\\n\\\\r] [^\\\\n]* = foo\\\\.simdclone" 2
FAIL: gcc.dg/vect/vect-simd-clone-18c.c scan-tree-dump-times vect
"[\\\\n\\\\r] [^\\\\n]* = foo\\\\.simdclone" 2
FAIL: gcc.dg/vect/vect-simd-clone-18d.c scan-tree-dump-times vect
"[\\\\n\\\\r] [^\\\\n]* = foo\\\\.simdclone" 2

The problem is that the 32-bit Solaris/x86 triple still uses i386,
although gcc defaults to -mpentium4.  However, the tests only handle
x86_64* and i686*, although the tests don't seem to require some
specific ISA extension not covered by vect_simd_clones.

To fix this, the tests now allow generic i?86.  At the same time, I've
removed the wildcards from x86_64* and i686* since DejaGnu uses the
canonical forms.
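To show why a generic `i?86` pattern covers the Solaris triple while an `i686` pattern does not, here is a small sketch using POSIX `fnmatch` as a stand-in for the glob matching applied to target triples. That substitution is an assumption, the patterns are illustrative rather than copied from the tests, and `triple_matches` is a hypothetical helper.

```c
#include <fnmatch.h>

/* Returns 1 if `triple` matches the glob `pattern`, else 0.  */
int triple_matches(const char *pattern, const char *triple)
{
    return fnmatch(pattern, triple, 0) == 0;
}
```

With this helper, `i?86-*-*` matches both `i386-pc-solaris2.11` and `i686-pc-linux-gnu`, while a pattern starting with `i686` matches only the latter.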

Tested on i386-pc-solaris2.11 and i686-pc-linux-gnu.

2024-01-24  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
PR target/113556
* gcc.dg/vect/vect-simd-clone-16c.c: Don't wildcard x86_64 in
target specs.  Allow any i?86 target instead of i686 only.
* gcc.dg/vect/vect-simd-clone-16d.c: Likewise.
* gcc.dg/vect/vect-simd-clone-17c.c: Likewise.
* gcc.dg/vect/vect-simd-clone-17d.c: Likewise.
* gcc.dg/vect/vect-simd-clone-18c.c: Likewise.
* gcc.dg/vect/vect-simd-clone-18d.c: Likewise.

18 months agotestsuite: i386: Fix gcc.target/i386/pr80833-1.c on 32-bit Solaris/x86
Rainer Orth [Wed, 24 Jan 2024 12:52:54 +0000 (13:52 +0100)] 
testsuite: i386: Fix gcc.target/i386/pr80833-1.c on 32-bit Solaris/x86

gcc.target/i386/pr80833-1.c FAILs on 32-bit Solaris/x86 since 20220609:

FAIL: gcc.target/i386/pr80833-1.c scan-assembler pextrd

Unlike e.g. Linux/i686, 32-bit Solaris/x86 defaults to -mstackrealign,
so this patch overrides that to match.

Tested on i386-pc-solaris2.11 and i686-pc-linux-gnu.

2024-01-23  Rainer Orth  <ro@CeBiTec.Uni-Bielefeld.DE>

gcc/testsuite:
* gcc.target/i386/pr80833-1.c: Add -mno-stackrealign to dg-options.

18 months agoMIPS: Accept arguments for -mexplicit-relocs
YunQiang Su [Fri, 19 Jan 2024 10:23:21 +0000 (18:23 +0800)] 
MIPS: Accept arguments for -mexplicit-relocs

GAS has supported explicit relocs since 2001, and %pcrel_hi/low were
introduced in 2014.  In future, we may introduce more.

Let's convert the -mexplicit-relocs option to accept the arguments:
    none, base, pcrel.

We also update gcc/configure.ac to set the default value to the option
that gas supports when GCC itself is built.

gcc
* configure.ac: Detect the explicit relocs support for
mips, and define C macro MIPS_EXPLICIT_RELOCS.
* config.in: Regenerated.
* configure: Regenerated.
* doc/invoke.texi(MIPS Options): Add -mexplicit-relocs.
* config/mips/mips-opts.h: Define enum mips_explicit_relocs.
* config/mips/mips.cc(mips_set_compression_mode): Sorry if
!TARGET_EXPLICIT_RELOCS instead of just set it.
* config/mips/mips.h: Define TARGET_EXPLICIT_RELOCS and
TARGET_EXPLICIT_RELOCS_PCREL with mips_opt_explicit_relocs.
* config/mips/mips.opt: Introduce -mexplicit-relocs= option
and define -m(no-)explicit-relocs as aliases.

18 months agoMAINTAINERS: Update my work email address
Thomas Schwinge [Wed, 24 Jan 2024 11:03:03 +0000 (12:03 +0100)] 
MAINTAINERS: Update my work email address

* MAINTAINERS: Update my work email address.

18 months agoaarch64: Re-enable ldp/stp fusion pass
Alex Coplan [Wed, 24 Jan 2024 09:22:19 +0000 (09:22 +0000)] 
aarch64: Re-enable ldp/stp fusion pass

Since, to the best of my knowledge, all reported regressions related to
the ldp/stp fusion pass have now been fixed, and PGO+LTO bootstrap with
--enable-languages=all is working again with the passes enabled, this
patch turns the passes back on by default, as agreed with Jakub here:

https://gcc.gnu.org/pipermail/gcc-patches/2024-January/642478.html

gcc/ChangeLog:

* config/aarch64/aarch64.opt (-mearly-ldp-fusion): Set default
to 1.
(-mlate-ldp-fusion): Likewise.

18 months agomiddle-end: rename main_exit_p in reduction code.
Tamar Christina [Wed, 24 Jan 2024 07:38:18 +0000 (07:38 +0000)] 
middle-end: rename main_exit_p in reduction code.

This renames main_exit_p to last_val_reduc_p to more accurately
reflect what the value is calculating.

gcc/ChangeLog:

* tree-vect-loop.cc (vect_get_vect_def,
vect_create_epilog_for_reduction): Rename main_exit_p to
last_val_reduc_p.

18 months agomiddle-end: fix epilog reductions when vector iters peeled [PR113364]
Tamar Christina [Wed, 24 Jan 2024 07:37:17 +0000 (07:37 +0000)] 
middle-end: fix epilog reductions when vector iters peeled [PR113364]

This fixes a bug where vect_create_epilog_for_reduction does not handle the
case where all exits are early exits.  In this case we should do as the induction
handling code does and not have a main exit.

This shows that some new miscompiles are happening (stage3 is likely miscompiled)
but that's unrelated to this patch and I'll look at it next.

gcc/ChangeLog:

PR tree-optimization/113364
* tree-vect-loop.cc (vect_create_epilog_for_reduction): If all exits are
early exits then we must reduce from the first offset for all of them.

gcc/testsuite/ChangeLog:

PR tree-optimization/113364
* gcc.dg/vect/vect-early-break_107-pr113364.c: New test.

18 months agolibgomp.texi: Document omp_pause_resource{,_all} and omp_target_memcpy*
Tobias Burnus [Wed, 24 Jan 2024 07:06:28 +0000 (08:06 +0100)] 
libgomp.texi: Document omp_pause_resource{,_all} and omp_target_memcpy*

libgomp/ChangeLog:

* libgomp.texi (Runtime Library Routines): Document
omp_pause_resource, omp_pause_resource_all and
omp_target_memcpy{,_rect}{,_async}.

Co-authored-by: Sandra Loosemore <sandra@codesourcery.com>
Signed-off-by: Tobias Burnus <tburnus@baylibre.com>
18 months agolibstdc++: [_Hashtable] Remove useless check for _M_before_begin node
Huanghui Nie [Mon, 22 Jan 2024 05:45:48 +0000 (06:45 +0100)] 
libstdc++: [_Hashtable] Remove useless check for _M_before_begin node

When removing the first node of a bucket it is useless to check if this bucket
is the one containing the _M_before_begin node. The bucket before-begin node is
already transferred to the next pointed-to bucket regardless of whether it is the
container before-begin node.

libstdc++-v3/ChangeLog:

* include/bits/hashtable.h (_Hashtable<>::_M_remove_bucket_begin): Remove
_M_before_begin check and cleanup implementation.

Co-authored-by: Théo Papadopoulo <papadopoulo@gmail.com>
18 months agoRISC-V: Add regression test for vsetvl bug pr113429
Patrick O'Neill [Wed, 24 Jan 2024 00:36:53 +0000 (16:36 -0800)] 
RISC-V: Add regression test for vsetvl bug pr113429

The reduced testcase for pr113429 (cam4 failure) needed additional
modules so it wasn't committed.
The fuzzer found a C testcase that was also fixed with pr113429's fix.
Adding it as a regression test.

PR target/113429

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/vsetvl/pr113429.c: New test.

Signed-off-by: Patrick O'Neill <patrick@rivosinc.com>
18 months agoRISC-V: Fix large memory usage of VSETVL PASS [PR113495]
Juzhe-Zhong [Tue, 23 Jan 2024 10:12:49 +0000 (18:12 +0800)] 
RISC-V: Fix large memory usage of VSETVL PASS [PR113495]

The SPEC 2017 wrf benchmark exposes unreasonable memory usage in the VSETVL PASS:
the pass consumes over 33 GB of memory, which makes it impossible
to compile SPEC 2017 wrf on a laptop.

The root cause is these memory-wasting variables:

unsigned num_exprs = num_bbs * num_regs;
sbitmap *avl_def_loc = sbitmap_vector_alloc (num_bbs, num_exprs);
sbitmap *m_kill = sbitmap_vector_alloc (num_bbs, num_exprs);
m_avl_def_in = sbitmap_vector_alloc (num_bbs, num_exprs);
m_avl_def_out = sbitmap_vector_alloc (num_bbs, num_exprs);

I found that what compute_avl_def_data computes can be obtained from the
RTL_SSA framework, so this patch replaces that code with an implementation
based on the RTL_SSA framework.
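
To see why those allocations blow up, here is a rough back-of-the-envelope
estimate. It is only a sketch: bitmap_vector_bytes and avl_def_bytes are
hypothetical helpers, bitmaps are assumed to be stored in 64-bit words, and
per-bitmap header overhead is ignored. The point is that each vector holds
num_bbs bitmaps of num_bbs * num_regs bits, so memory grows quadratically with
the number of basic blocks.

```cpp
#include <cassert>
#include <cstdint>

// Bytes for `n_vecs` bitmaps of `n_bits` bits each, rounded up to whole
// 64-bit words (assumption; headers are not counted).
constexpr std::uint64_t bitmap_vector_bytes(std::uint64_t n_vecs,
                                            std::uint64_t n_bits)
{
  const std::uint64_t words = (n_bits + 63) / 64;
  return n_vecs * words * 8;
}

// Total for the four vectors quoted above, each allocated as
// sbitmap_vector_alloc (num_bbs, num_bbs * num_regs).
constexpr std::uint64_t avl_def_bytes(std::uint64_t num_bbs,
                                      std::uint64_t num_regs)
{
  const std::uint64_t num_exprs = num_bbs * num_regs;
  return 4 * bitmap_vector_bytes(num_bbs, num_exprs);
}
```

With illustrative inputs of 10000 basic blocks and 64 registers (made-up
numbers, not measurements from wrf), avl_def_bytes(10000, 64) evaluates to
3,200,000,000 bytes, and doubling the block count quadruples that.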

After this patch, the memory-hog issue is fixed.

simple vsetvl memory usage (valgrind --tool=massif --pages-as-heap=yes --massif-out-file=massif.out)
is 1.673 GB.

lazy vsetvl memory usage (valgrind --tool=massif --pages-as-heap=yes --massif-out-file=massif.out)
is 2.441 GB.

Tested on both RV32 and RV64, no regression.

gcc/ChangeLog:

PR target/113495
* config/riscv/riscv-vsetvl.cc (get_expr_id): Remove.
(get_regno): Ditto.
(get_bb_index): Ditto.
(pre_vsetvl::compute_avl_def_data): Ditto.
(pre_vsetvl::earliest_fuse_vsetvl_info): Fix large memory usage.
(pre_vsetvl::pre_global_vsetvl_info): Ditto.

gcc/testsuite/ChangeLog:

PR target/113495
* gcc.target/riscv/rvv/vsetvl/avl_single-107.c: Adapt test.

18 months ago Daily bump.
GCC Administrator [Wed, 24 Jan 2024 00:18:36 +0000 (00:18 +0000)] 
Daily bump.

18 months ago testsuite: Disable new test for PR113292 on targets without TLS support
Nathaniel Shead [Fri, 19 Jan 2024 07:15:22 +0000 (18:15 +1100)] 
testsuite: Disable new test for PR113292 on targets without TLS support

This disables the new test added by r14-8168 on machines that don't have
TLS support, such as bare-metal ARM.

gcc/testsuite/ChangeLog:

* g++.dg/modules/pr113292_c.C: Require TLS.

Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
18 months ago c++: -Wdangling-reference and lambda false warning [PR109640]
Marek Polacek [Fri, 19 Jan 2024 18:59:41 +0000 (13:59 -0500)] 
c++: -Wdangling-reference and lambda false warning [PR109640]

-Wdangling-reference checks if a function receives a temporary as its
argument, and only warns if any of the arguments was a temporary.  But
we should not warn when the temporary represents a lambda or we generate
false positives as in the attached testcases.
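
The kind of code affected looks roughly like the sketch below. This is an
illustrative example, not one of the attached testcases: first_matching and
caller_keeps_valid_reference are hypothetical names. The lambda is a temporary
bound to a reference parameter, yet the returned reference refers to one of
the int arguments, which outlive the call, so a warning here would be a false
positive.

```cpp
#include <cassert>

// Returns a reference to one of its int arguments. The callable is only
// invoked during the call and is never what the result refers to.
template <typename Pred>
const int& first_matching(const Pred& pred, const int& a, const int& b)
{
  return pred(a) ? a : b;
}

// Hypothetical caller: the lambda temporary dies at the end of the full
// expression, but `r` refers to `x` or `y`, which are still alive.
inline bool caller_keeps_valid_reference()
{
  int x = 1, y = 2;
  const int& r = first_matching([](int v) { return v > 1; }, x, y);
  return &r == &y;   // pred(1) is false, so the result refers to y
}
```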

PR c++/113256
PR c++/111607
PR c++/109640

gcc/cp/ChangeLog:

* call.cc (do_warn_dangling_reference): Don't warn if the temporary
is of lambda type.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wdangling-reference14.C: New test.
* g++.dg/warn/Wdangling-reference15.C: New test.
* g++.dg/warn/Wdangling-reference16.C: New test.

18 months ago MAINTAINERS: Update my email address
Tobias Burnus [Tue, 23 Jan 2024 21:18:57 +0000 (22:18 +0100)] 
MAINTAINERS: Update my email address

ChangeLog:

* MAINTAINERS: Update my email address.

Signed-off-by: Tobias Burnus <tburnus@baylibre.com>
18 months ago c: Call c_fully_fold on __atomic_* operands in atomic_bitint_fetch_using_cas_loop...
Jakub Jelinek [Tue, 23 Jan 2024 18:59:00 +0000 (19:59 +0100)] 
c: Call c_fully_fold on __atomic_* operands in atomic_bitint_fetch_using_cas_loop [PR113518]

As the following testcase shows, I forgot to call c_fully_fold on the
operands of __atomic_*/__sync_* builtins called on a _BitInt address. The
expressions are then used inside of TARGET_EXPR initializers etc. and are
never fully folded later, which means we can ICE e.g. on C_MAYBE_CONST_EXPR
trees inside of those.

The following patch fixes it. Although the function is currently only called
in the C FE, because C++ doesn't support BITINT_TYPE, I think guarding the
calls on !c_dialect_cxx () is safer.

2024-01-23  Jakub Jelinek  <jakub@redhat.com>

PR c/113518
* c-common.cc (atomic_bitint_fetch_using_cas_loop): Call c_fully_fold
on lhs_addr, val and model for C.

* gcc.dg/bitint-77.c: New test.

18 months ago aarch64/expr: Use ccmp when the outer expression is used twice [PR100942]
Andrew Pinski [Tue, 23 Jan 2024 17:42:51 +0000 (17:42 +0000)] 
aarch64/expr: Use ccmp when the outer expression is used twice [PR100942]

Ccmp is not used if the result of the and/ior is used by both
a GIMPLE_COND and a GIMPLE_ASSIGN. This improves the code generation
by using ccmp in this case.
Two changes are required. First, we need to allow the outer statement's
result to be used more than once.
Second, during the expansion of the gimple, we need to try using ccmp.
This is needed because we don't expand the SSA name of the lhs but
rather expand directly from the gimple.
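
A source-level shape that produces this situation is sketched below. It is an
illustration only, not one of the new testcases: f1, foo and the calls counter
are hypothetical. The combined comparison result is both tested by the branch
and returned as a value. Whether ccmp is actually emitted depends on the
target and optimization level.

```cpp
#include <cassert>

static int calls = 0;          // hypothetical counter so the effect is observable
void foo() { ++calls; }

int f1(int a, int b)
{
  // One ior of two comparisons ...
  int c = (a == 0) | (b == 0);
  // ... used by a condition ...
  if (c)
    foo();
  // ... and also used as a value.
  return c;
}
```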

A small note on the ccmp_4.c testcase: we should be able to do slightly
better than with this patch, but it is one extra instruction compared to
before.

PR target/100942

gcc/ChangeLog:

* ccmp.cc (ccmp_candidate_p): Add outer argument.
Allow if the outer is true and the lhs is used more
than once.
(expand_ccmp_expr): Update call to ccmp_candidate_p.
* expr.h (expand_expr_real_gassign): Declare.
* expr.cc (expand_expr_real_gassign): New function, split out from...
(expand_expr_real_1): ...here.
* cfgexpand.cc (expand_gimple_stmt_1): Use expand_expr_real_gassign.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/ccmp_3.c: New test.
* gcc.target/aarch64/ccmp_4.c: New test.
* gcc.target/aarch64/ccmp_5.c: New test.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Co-Authored-By: Richard Sandiford <richard.sandiford@arm.com>
18 months ago Update my email in MAINTAINERS
Andrew Stubbs [Tue, 23 Jan 2024 11:21:55 +0000 (11:21 +0000)] 
Update my email in MAINTAINERS

ChangeLog:

* MAINTAINERS: Update

Signed-off-by: Andrew Stubbs <ams@baylibre.com>