thirdparty/gcc.git
9 months ago tree-optimization/117147 - add testcase
Richard Biener [Tue, 15 Oct 2024 09:36:33 +0000 (11:36 +0200)] 
tree-optimization/117147 - add testcase

The following adds a testcase for the PR.

PR tree-optimization/117147
* gcc.dg/vect/pr117147.c: New testcase.

9 months ago tree-optimization/117138 - fix ICE with vector comparison in COND_EXPR
Richard Biener [Tue, 15 Oct 2024 08:23:06 +0000 (10:23 +0200)] 
tree-optimization/117138 - fix ICE with vector comparison in COND_EXPR

The range folding code of COND_EXPRs missed a check whether the
comparison operand type is supported.

PR tree-optimization/117138
* gimple-range-fold.cc (fold_using_range::condexpr_adjust):
Check if the comparison operand type is supported.

* gcc.dg/torture/pr117138.c: New testcase.

9 months ago middle-end/117137 - expansion issue with vector equality compares
Richard Biener [Tue, 15 Oct 2024 07:48:10 +0000 (09:48 +0200)] 
middle-end/117137 - expansion issue with vector equality compares

When expanding a COND_EXPR with a vector equality compare as condition
expand_cond_expr_using_cmove fails to properly go the cbranch path.
I failed to massage its twisted logic so the simple fix is to make
sure to expand a vector condition separately which also generates
the expected code for the testcase:

        ptest   %xmm0, %xmm0
        cmovne  %edi, %eax

PR middle-end/117137
* expr.cc (expand_cond_expr_using_cmove): Make sure to
expand vector comparisons separately.

* gcc.dg/torture/pr117137.c: New testcase.

9 months ago tree-optimization/117147 - bogus re-use of previous ldst_p
Richard Biener [Tue, 15 Oct 2024 07:22:09 +0000 (09:22 +0200)] 
tree-optimization/117147 - bogus re-use of previous ldst_p

The following shows that in vect_build_slp_tree_1 we're eventually
re-using the previous lane set ldst_p flag.  Fixed by some
refactoring.

PR tree-optimization/117147
* tree-vect-slp.cc (vect_build_slp_tree_1): Put vars and
initialization of per-lane data into the per-lane processing
loop to avoid re-using previous lane state.
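
The bug pattern described above can be sketched in isolation. This is a minimal illustration, not the vectorizer code: `LaneKind`, `classify_stale` and `classify_per_lane` are hypothetical names, and the real `vect_build_slp_tree_1` works on GIMPLE statements rather than an enum.

```cpp
#include <cassert>
#include <vector>

// Hypothetical lane kinds; the real vectorizer inspects GIMPLE statements.
enum class LaneKind { LoadStore, Other };

// Buggy shape: the flag lives outside the loop, so once one lane sets it,
// every later lane sees the stale value.
std::vector<bool> classify_stale(const std::vector<LaneKind>& lanes) {
  std::vector<bool> result;
  bool ldst_p = false;
  for (LaneKind kind : lanes) {
    if (kind == LaneKind::LoadStore)
      ldst_p = true;
    result.push_back(ldst_p);
  }
  return result;
}

// Fixed shape: the flag is declared and initialized inside the per-lane loop.
std::vector<bool> classify_per_lane(const std::vector<LaneKind>& lanes) {
  std::vector<bool> result;
  for (LaneKind kind : lanes) {
    const bool ldst_p = (kind == LaneKind::LoadStore);
    result.push_back(ldst_p);
  }
  return result;
}

// Usage: a load/store lane followed by an ordinary lane.
void check_lane_flags() {
  const std::vector<LaneKind> lanes = {LaneKind::LoadStore, LaneKind::Other};
  // The stale flag leaks from lane 0 into lane 1.
  assert((classify_stale(lanes) == std::vector<bool>{true, true}));
  // Per-lane state classifies each lane on its own.
  assert((classify_per_lane(lanes) == std::vector<bool>{true, false}));
}
```

Moving the declaration into the loop body makes it impossible to read a value left over from the previous lane, which is the point of the refactoring.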

9 months ago Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device...
Thomas Schwinge [Tue, 15 Oct 2024 07:29:53 +0000 (09:29 +0200)] 
Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device: Fix 'is_builtin' initialization

Bug fix for commit 3269a722b7a03613e9c4e2862bc5088c4a17cc11
"Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device".

PR fortran/82250
PR fortran/82251
PR fortran/117136
gcc/fortran/
* trans-expr.cc (gfc_conv_procedure_call): Initialize
'is_builtin'.
(conv_function_val): Clean up.

Co-authored-by: Harald Anlauf <anlauf@gmx.de>
9 months ago SVE intrinsics: Fold svmul with constant power-of-2 operand to svlsl
Jennifer Schmitz [Thu, 19 Sep 2024 10:18:05 +0000 (03:18 -0700)] 
SVE intrinsics: Fold svmul with constant power-of-2 operand to svlsl

For svmul, if one of the operands is a constant vector with a uniform
power of 2, this patch folds the multiplication to a left-shift by
immediate (svlsl).
Because the shift amount in svlsl is the second operand, the order of the
operands is switched, if the first operand contained the powers of 2. However,
this switching is not valid for some predications: If the predication is
_m and the predicate not ptrue, the result of svlsl might not be the
same as for svmul. Therefore, we do not apply the fold in this case.
The transform is also not applied to constant vectors of 1 (this case is
partially covered by constant folding already and the missing cases will be
addressed by the follow-up patch suggested in
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/663275.html).
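
The predication caveat can be illustrated with a scalar model. This is a sketch under stated assumptions, not the SVE intrinsics themselves: `mul_m` and `lsl_m` are hypothetical helpers that model merging (_m) predication on a fixed 4-lane vector, where inactive lanes keep the value of the first operand.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 4;  // illustrative fixed vector length
using Vec = std::array<std::uint32_t, kLanes>;
using Pred = std::array<bool, kLanes>;

// Model of a merging multiply: inactive lanes keep the first operand.
Vec mul_m(const Pred& pg, const Vec& op1, const Vec& op2) {
  Vec r{};
  for (std::size_t i = 0; i < kLanes; ++i)
    r[i] = pg[i] ? op1[i] * op2[i] : op1[i];
  return r;
}

// Model of a merging left shift by an immediate.
Vec lsl_m(const Pred& pg, const Vec& op1, unsigned shift) {
  Vec r{};
  for (std::size_t i = 0; i < kLanes; ++i)
    r[i] = pg[i] ? op1[i] << shift : op1[i];
  return r;
}

void check_mul_to_lsl() {
  const Vec x = {1, 2, 3, 5};
  const Vec four = {4, 4, 4, 4};  // uniform power of 2, i.e. shift by 2
  const Pred all = {true, true, true, true};
  const Pred some = {true, false, true, true};

  // Power of 2 in the second operand: the fold holds for any predicate.
  assert(mul_m(some, x, four) == lsl_m(some, x, 2));
  // Power of 2 in the first operand, all-true predicate: swapping is fine.
  assert(mul_m(all, four, x) == lsl_m(all, x, 2));
  // Power of 2 in the first operand, partial predicate: the inactive lane
  // yields 4 for the multiply but 2 for the shift, so the fold is invalid.
  assert(mul_m(some, four, x) != lsl_m(some, x, 2));
}
```

The third check is the case the patch excludes: after swapping the operands, the inactive lanes would merge from a different vector.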

Tests were added in the existing test harness to check the produced assembly
- when the first or second operand contains the power of 2
- when the second operand is a vector or scalar (_n)
- for _m, _z, _x predication
- for _m with ptrue or non-ptrue
- for intmin for signed integer types
- for the maximum power of 2 for signed and unsigned integer types.
Note that we used 4 as a power of 2, instead of 2, because a recent
patch optimizes left-shifts by 1 to an add instruction. But since we
wanted to highlight the change to an lsl instruction we used a higher
power of 2.
To also check correctness, runtime tests were added.

The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression.
OK for mainline?

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
* config/aarch64/aarch64-sve-builtins-base.cc (svmul_impl::fold):
Implement fold to svlsl for power-of-2 operands.

gcc/testsuite/
* gcc.target/aarch64/sve/acle/asm/mul_s8.c: New test.
* gcc.target/aarch64/sve/acle/asm/mul_s16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_s32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_s64.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_u8.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_u16.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_u32.c: Likewise.
* gcc.target/aarch64/sve/acle/asm/mul_u64.c: Likewise.
* gcc.target/aarch64/sve/mul_const_run.c: Likewise.

9 months ago libcpp: Add -Wtrailing-blanks warning
Jakub Jelinek [Tue, 15 Oct 2024 05:53:56 +0000 (07:53 +0200)] 
libcpp: Add -Wtrailing-blanks warning

Trailing blanks is something even git diff diagnoses; while it is a coding
style issue, if it is so common that git diff diagnoses it, I think it could
be useful to various projects to check that at compile time.

Dunno if it should be included in -Wextra, currently it isn't, and due to
tons of trailing whitespace in our sources, haven't enabled it for when
building gcc itself either.

Note, git diff also diagnoses indentation with tab following space, wonder
if we couldn't have trivial warning options where one would simply ask for
checking of indentation with no tabs, just spaces vs. indentation with
tabs followed by spaces (but never tab width or more spaces in the
indentation).  I think that would be easy to do also on the libcpp side.
Checking how much something should be exactly indented requires syntax
analysis (at least some limited one) and can consider columns of first token
on line, but what the exact indentation blanks were is something only libcpp
knows.

On Thu, Sep 19, 2024 at 08:17:24AM +0200, Richard Biener wrote:
> Generally I like diagnosing this early.  For the above I'd say -Wtrailing-whitespace=
> with a set of things to diagnose (and a sane default - just spaces and tabs - for
> -Wtrailing-whitespace) would be nice.  As for naming possibly follow the
> is{space,blank,cntrl} character classifications?  If those are a good
> fit, that is.

The patch currently allows blank (' ' '\t') and space (' ' '\t' '\f' '\v'),
cntrl not yet added, not anything non-ASCII, but in theory could
be added later (though, non-ASCII would be just for inside of comments,
say non-breaking space etc. in the source is otherwise an error).
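
The two character classes can be sketched as a standalone check. This is a simplified model and not the libcpp implementation: `TrailingClass`, `in_class` and `has_trailing_whitespace` are hypothetical names, the check looks only at the last character of a line, and the backslash special case from the ChangeLog is not modelled.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical selector for the two classes named in the message.
enum class TrailingClass { Blank, Space };

// blank: ' ' and '\t'; space: blank plus '\f' and '\v'.
constexpr bool in_class(char c, TrailingClass cls) {
  if (c == ' ' || c == '\t')
    return true;
  return cls == TrailingClass::Space && (c == '\f' || c == '\v');
}

// True if the line (without its newline) ends in a character of the class.
constexpr bool has_trailing_whitespace(std::string_view line, TrailingClass cls) {
  return !line.empty() && in_class(line.back(), cls);
}

void check_trailing_whitespace() {
  assert(!has_trailing_whitespace("int x;", TrailingClass::Space));
  assert(has_trailing_whitespace("int x; ", TrailingClass::Blank));
  assert(has_trailing_whitespace("int x;\t", TrailingClass::Blank));
  // A form feed counts only for the wider "space" class.
  assert(!has_trailing_whitespace("int x;\f", TrailingClass::Blank));
  assert(has_trailing_whitespace("int x;\f", TrailingClass::Space));
}
```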

2024-10-15  Jakub Jelinek  <jakub@redhat.com>

libcpp/
* include/cpplib.h (struct cpp_options): Add
cpp_warn_trailing_whitespace member.
(enum cpp_warning_reason): Add CPP_W_TRAILING_WHITESPACE.
* internal.h (struct _cpp_line_note): Document 'W' line note.
* lex.cc (_cpp_clean_line): Add 'W' line note for trailing whitespace
except for trailing whitespace after backslash.  Formatting fix.
(_cpp_process_line_notes): Emit -Wtrailing-whitespace diagnostics.
Formatting fixes.
(lex_raw_string): Clear type on 'W' notes.
gcc/
* doc/invoke.texi (Wtrailing-whitespace): Document.
gcc/c-family/
* c.opt (Wtrailing-whitespace=): New option.
(Wtrailing-whitespace): New alias.
* c.opt.urls: Regenerate.
gcc/testsuite/
* c-c++-common/cpp/Wtrailing-whitespace-1.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-2.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-3.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-4.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-5.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-6.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-7.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-8.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-9.c: New test.
* c-c++-common/cpp/Wtrailing-whitespace-10.c: New test.

9 months ago genmatch: Revert recent genmatch changes, instead add custom diag_vfprintf routine...
Jakub Jelinek [Tue, 15 Oct 2024 05:50:35 +0000 (07:50 +0200)] 
genmatch: Revert recent genmatch changes, instead add custom diag_vfprintf routine [PR117110]

My recent changes to genmatch apparently broke bootstrap on FreeBSD
and Darwin and perhaps others, and also broke $build != $host
builds including canadian cross.

The change was to link in libcommon.a into build/genmatch, so that
we can use pp_format_verbatim.  Unfortunately that has various
dependencies in libcommon.a, and more importantly, libcommon.a is
a host library, while build/genmatch carefully links with build/vec.o
etc., build version of libcpp.
So, in order to use pretty-print.o stuff, we'd need to build a build/
version of all those objects and, worse, ensure that a build version of
libintl and/or libiconv exists and is properly linked when needed (those
2 are the reasons for FreeBSD/Darwin failures).

The following patch just reverts those changes and instead adds a very
simple variant of gcc_diag style vfprintf, which prints the result
directly into a stream.
We don't need anything fancy, like UTF-8 quotes, colors, URLs, in the
usual case genmatch shouldn't print anything at all.
The patch implements what pretty-print.cc implements, except the fancy
stuff (no colors, no URLs printed, quotes always printed just as
'something', strings even in %qs printed normally rather than trying to
pass through ASCII and valid UTF-8 and use <80><35> style printing for the
rest) and except %@ and %e (neither libcpp nor genmatch.cc use those
currently and they need extra structures etc. which aren't used in libcpp
at all).  It handles both "%.*s %d" and "%3$.*2$s %1$d" styles just in case
something got translated (although at least the cross-compiler and stage1
genmatch shouldn't be translating anything, but stage2+ native can).
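
The two argument styles mentioned above can be seen with the ordinary C library. This sketch uses `snprintf`, not the new `diag_vfprintf`, and assumes a POSIX C library (such as glibc) that supports numbered `%N$` arguments; `format_sequential` and `format_positional` are hypothetical helper names.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Sequential style: arguments are consumed in the order they are passed.
std::string format_sequential(int number, int precision, const char* text) {
  char buf[64];
  const int n = std::snprintf(buf, sizeof buf, "%.*s %d", precision, text, number);
  assert(n >= 0 && static_cast<std::size_t>(n) < sizeof buf);
  return buf;
}

// Positional style (POSIX extension): each conversion names its argument,
// so a translation can reorder the arguments without touching the call.
std::string format_positional(int number, int precision, const char* text) {
  char buf[64];
  const int n = std::snprintf(buf, sizeof buf, "%3$.*2$s %1$d", number, precision, text);
  assert(n >= 0 && static_cast<std::size_t>(n) < sizeof buf);
  return buf;
}

void check_format_styles() {
  // Both calls print the first three characters of the string, then the number.
  assert(format_sequential(42, 3, "abcdef") == "abc 42");
  assert(format_positional(42, 3, "abcdef") == "abc 42");
}
```

Supporting the positional form matters only once format strings are translated, which is why the message notes that it is handled just in case.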

I've tested it with hacking up most of pretty-print.cc self-tests
to just use warning_at ((location_t) 1, ...) and doing manual verification
of what was printed vs. what was expected (with a few additions for the
%M$ style formats); as it goes into a FILE * directly, I'm afraid self-tests
of this aren't easily possible.

2024-10-15  Jakub Jelinek  <jakub@redhat.com>

PR bootstrap/117110
* Makefile.in (generated_files, generated_match_files,
build/genmatch$(build_exeext), LINKER_FOR_BUILD): Revert
2024-10-12 changes.
* genmatch.cc: Don't include pretty-print.h and input.h.
(fatal, ggc_internal_cleared_alloc, ggc_free, line_table,
linemap_client_expand_location_to_spelling_point): Revert
2024-10-12 changes.
(DIAG_ARGMAX): Define.
(diag_integer_with_precision): Define.
(diag_vfprintf): New function.
(diagnostic_cb): Use diag_vfprintf instead of pp_format_verbatim.
(output_line_directive): Revert 2024-10-12 changes.

9 months ago RISC-V: Fix UNRESOLVED testcases for SAT alu vector mode
Pan Li [Tue, 15 Oct 2024 01:19:44 +0000 (09:19 +0800)] 
RISC-V: Fix UNRESOLVED testcases for SAT alu vector mode

Some saturation-related ALU testcases were missing the additional option
for the expand check, which resulted in some UNRESOLVED issues.  This
patch fixes that by adding the option, as in the other testcases.

The following test passes with this patch:
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; it will be committed
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-13.c: Add
compile option for expanding check.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-15.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-22.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-29.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-15.c: Ditto.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add_imm-7.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-11.c: Ditto.
* gcc.target/riscv/rvv/autovec/unop/vec_sat_u_trunc-12.c: Ditto.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months ago diagnostics: fix overload of emit_diagnostic [PR117109]
David Malcolm [Mon, 14 Oct 2024 23:22:46 +0000 (19:22 -0400)] 
diagnostics: fix overload of emit_diagnostic [PR117109]

I accidentally broke "make gcc.pot" in r15-4081 by adding
a member function diagnostic_context::emit_diagnostic with a
gmsgid in a different position to the existing emit_diagnostic
functions, which exgettext's parser can't handle.

Fixed thusly.

gcc/ChangeLog:
PR bootstrap/117109
* diagnostic-format-sarif.cc
(diagnostic_output_format_init_sarif_file): Rename
diagnostic_context::emit_diagnostic to
diagnostic_context::emit_diagnostic_with_group.
* diagnostic.cc (diagnostic_context::emit_diagnostic): Rename
to...
(diagnostic_context::emit_diagnostic_with_group): ...this.
(diagnostic_context::emit_diagnostic_va): Rename to...
(diagnostic_context::emit_diagnostic_with_group_va): ...this.
* diagnostic.h (diagnostic_context::emit_diagnostic): Rename to...
(diagnostic_context::emit_diagnostic_with_group): ...this.
(diagnostic_context::emit_diagnostic_va): Rename to...
(diagnostic_context::emit_diagnostic_with_group_va): ...this.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
9 months ago libcpp: avoid extra spaces in module preprocessing
Jason Merrill [Tue, 8 Oct 2024 22:26:40 +0000 (18:26 -0400)] 
libcpp: avoid extra spaces in module preprocessing

Within the compiler, module keywords "import", "module", and "export" that
are recognized as part of module directives gain an extra trailing space to
distinguish them from other non-keyword uses of those words in the code.
But when dumping preprocessed output, printing those spaces creates a
gratuitous inconsistency with non-modules preprocessing, as revealed by
several of the g++.dg/modules/cpp* tests if modules are enabled by default
in C++20 mode.

libcpp/ChangeLog:

* lex.cc (cpp_output_token): Omit terminal space from name.

gcc/testsuite/ChangeLog:

* g++.dg/modules/cpp-2_c.C: Expect only one space after import.
* g++.dg/modules/cpp-5_c.C
* g++.dg/modules/dep-2.C
* g++.dg/modules/dir-only-2_b.C
* g++.dg/modules/pr99050_b.C
* g++.dg/modules/inc-xlate-1_b.H
* g++.dg/modules/legacy-3_b.H
* g++.dg/modules/legacy-3_c.H: Likewise.

9 months ago libstdc++: Implement LWG 3564 for ranges::transform_view
Jonathan Wakely [Sun, 13 Oct 2024 20:47:14 +0000 (21:47 +0100)] 
libstdc++: Implement LWG 3564 for ranges::transform_view

The _Iterator<true> type returned by begin() const uses const F& to
transform the elements, so it should use const F& to determine the
iterator's value_type and iterator_category as well.

This was accepted into the WP in July 2022.
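
Why the choice between F& and const F& matters can be shown without any ranges code. This is a minimal sketch with a hypothetical functor `Doubler` whose const and non-const call operators return different types; it does not test the library change itself.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical functor: the call operator's result type depends on constness.
struct Doubler {
  int operator()(int x) { return 2 * x; }          // non-const call yields int
  long operator()(int x) const { return 2L * x; }  // const call yields long
};

// Invoking through F& and through const F& gives different result types,
// so a const iterator must compute its value_type from const F&.
static_assert(std::is_same_v<std::invoke_result_t<Doubler&, int&>, int>);
static_assert(std::is_same_v<std::invoke_result_t<const Doubler&, int&>, long>);

void check_const_call() {
  Doubler f;
  const Doubler& cf = f;
  assert(f(21) == 42);
  assert(cf(21) == 42L);
  static_assert(!std::is_same_v<decltype(f(21)), decltype(cf(21))>);
}
```

With such a functor, deriving value_type from F& for the const iterator would describe a type that the iterator never actually produces.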

libstdc++-v3/ChangeLog:

* include/std/ranges (transform_view:_Iterator): Use const F&
to determine value_type and iterator_category of
_Iterator<true>, as per LWG 3564.
* testsuite/std/ranges/adaptors/transform.cc: Check value_type
and iterator_category.

Reviewed-by: Patrick Palka <ppalka@redhat.com>
9 months ago c++: address deduction and concepts [CWG2918]
Jason Merrill [Fri, 11 Oct 2024 18:52:43 +0000 (14:52 -0400)] 
c++: address deduction and concepts [CWG2918]

CWG2918 changes deduction from an overload set for the case where multiple
candidates succeed and have the same type; previously this made the overload
set a non-deduced context, now it succeeds since the result is consistent
between the candidates.

This is needed for cases of overloading based on requirements, where we want
to choose the most constrained overload.  I also needed to adjust
resolve_address_of_overloaded_function accordingly; we already handled the
comparison for template candidates in most_specialized_instantiation, but
need to also do the comparison for non-template candidates such as member
functions of a class template.

CWG 2918 (proposed)

gcc/cp/ChangeLog:

* cp-tree.h (most_constrained_function): Declare.
* class.cc (resolve_address_of_overloaded_function): Call it.
* pt.cc (get_template_for_ordering): Handle list from
resolve_address_of_overloaded_function.
(most_constrained_function): No longer static.
(resolve_overloaded_unification): Always compare type rather
than decl.

gcc/testsuite/ChangeLog:

* g++.dg/DRs/dr2918.C: New test.

9 months ago OpenACC 'nohost' clause: harmonize 'libgomp.oacc-{c-c++-common,fortran}/routine-nohos...
Thomas Schwinge [Mon, 14 Oct 2024 12:38:13 +0000 (14:38 +0200)] 
OpenACC 'nohost' clause: harmonize 'libgomp.oacc-{c-c++-common,fortran}/routine-nohost-1.*'

The test case 'libgomp.oacc-fortran/routine-nohost-1.f90' added in 2021
commit a61f6afbee370785cf091fe46e2e022748528307 "OpenACC 'nohost' clause" was
dependent on inlining being enabled, and otherwise ('-fno-inline') failed to
optimize/link:

    /tmp/ccb2hsPd.o: In function `MAIN__._omp_fn.0':
    routine-nohost-1.f90:(.text+0xf4): undefined reference to `fact_nohost_'

However, as of recent commit 3269a722b7a03613e9c4e2862bc5088c4a17cc11
"Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device",
we're now properly handling OpenACC/Fortran 'acc_on_device', and may specify
'-fno-inline', like done in 'libgomp.oacc-c-c++-common/routine-nohost-1.c'.

libgomp/
* testsuite/libgomp.oacc-fortran/routine-nohost-1.f90: Add
'-fno-inline'.

9 months ago libstdc++: Populate generic std::time_get's wide %c format [PR117135]
Jonathan Wakely [Tue, 24 Sep 2024 22:20:56 +0000 (23:20 +0100)] 
libstdc++: Populate generic std::time_get's wide %c format [PR117135]

I missed out the __timepunct<wchar_t> specialization for the "generic"
implementation when defining the %c format in r15-4016-gc534e37faccf48.

libstdc++-v3/ChangeLog:

PR libstdc++/117135
* config/locale/generic/time_members.cc
(__timepunct<wchar_t>::_M_initialize_timepunct): Set
_M_date_time_format for C locale. Set %Ex formats to the same
values as the %x formats.

9 months ago libstdc++: Constrain std::expected comparisons (P3379R0)
Jonathan Wakely [Wed, 28 Aug 2024 11:01:18 +0000 (12:01 +0100)] 
libstdc++: Constrain std::expected comparisons (P3379R0)

This proposal of mine has been approved by LEWG and forwarded to LWG. I
expect it to be voted into the draft without significant changes.

libstdc++-v3/ChangeLog:

* include/bits/version.def (constrained_equality): Bump value.
* include/bits/version.h: Regenerate.
* include/std/expected (operator==): Add constraints and
noexcept specifiers.
* testsuite/20_util/optional/relops/constrained.cc: Adjust
check for feature test macro.
* testsuite/20_util/pair/comparison_operators/constrained.cc:
Likewise.
* testsuite/20_util/tuple/comparison_operators/constrained.cc:
Likewise.
* testsuite/20_util/variant/relops/constrained.cc: Likewise.
* testsuite/20_util/expected/equality_constrained.cc: New test.

9 months ago fold-const: Fix BIT_INSERT_EXPR folding for BYTES_BIG_ENDIAN [PR116997]
Andre Vieira [Mon, 14 Oct 2024 15:24:07 +0000 (16:24 +0100)] 
fold-const: Fix BIT_INSERT_EXPR folding for BYTES_BIG_ENDIAN [PR116997]

Fix constant folding of BIT_INSERT_EXPR for BYTES_BIG_ENDIAN targets.

gcc/ChangeLog:

PR middle-end/116997
* fold-const.cc (fold_ternary_loc): Fix BIT_INSERT_EXPR constant folding
for BYTES_BIG_ENDIAN targets.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/pr116997.c: New test.

Co-authored-by: Andrew Pinski <quic_apinski@quicinc.com>
9 months ago dce: Use a common base class for pass_cd_dce and pass_dce
Andrew Pinski [Sun, 13 Oct 2024 18:40:39 +0000 (11:40 -0700)]
dce: Use a common base class for pass_cd_dce and pass_dce

The classes pass_dce and pass_cd_dce share the same mechanism for their
params and almost the same execute functionality so let's create a new
base class which will be used for these two classes and move the common
code into the same one.

Note update_address_taken_p was updated to be a NSDMI instead of initializing
it explicitly in the constructor.
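
The shape of this refactoring can be sketched outside GCC. Everything below is a hypothetical stand-in: `dce_base_sketch`, `dce_sketch`, `cd_dce_sketch` and `kTodoUpdateAddressTaken` are illustrative names, and the real classes derive from GCC's pass infrastructure, which is not reproduced here.

```cpp
#include <cassert>

// Hypothetical TODO flag; the value is illustrative only.
constexpr unsigned kTodoUpdateAddressTaken = 1u << 0;

// Common base: owns the parameter handling and the shared execute logic.
class dce_base_sketch {
public:
  virtual ~dce_base_sketch() = default;

  void set_pass_param(unsigned n, bool param) {
    assert(n == 0);  // this sketch has a single parameter
    update_address_taken_p = param;
  }

  unsigned execute() const {
    unsigned todo = run_dce(aggressive());
    if (update_address_taken_p)
      todo |= kTodoUpdateAddressTaken;
    return todo;
  }

private:
  // The only difference between the two passes in this sketch.
  virtual bool aggressive() const = 0;

  // Placeholder for the actual dead-code elimination work.
  static unsigned run_dce(bool /*aggressive*/) { return 0; }

  // NSDMI: initialized here instead of in a constructor.
  bool update_address_taken_p = false;
};

class dce_sketch final : public dce_base_sketch {
  bool aggressive() const override { return false; }
};

class cd_dce_sketch final : public dce_base_sketch {
  bool aggressive() const override { return true; }
};

void check_dce_params() {
  dce_sketch dce;
  assert(dce.execute() == 0u);  // default from the NSDMI
  dce.set_pass_param(0, true);
  assert(dce.execute() == kTodoUpdateAddressTaken);
  cd_dce_sketch cd_dce;
  assert(cd_dce.execute() == 0u);
}
```

Using a default member initializer keeps the default next to the declaration, so neither derived class needs a constructor just to set the flag.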

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* tree-ssa-dce.cc (tree_ssa_dce): Remove.
(tree_ssa_cd_dce): Remove.
(class pass_dce_base): New class.
(class pass_dce): Use pass_dce_base as the base class.
(class pass_cd_dce): Likewise.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
9 months ago dce: add remove_unused_locals conditionally to the todos [PR117096]
Andrew Pinski [Sun, 13 Oct 2024 18:16:51 +0000 (11:16 -0700)] 
dce: add remove_unused_locals conditionally to the todos [PR117096]

This is the updated patch with the suggestion from:
https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665217.html
to use a second arg/param to select which dce passes should also
request remove_unused_locals.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

PR tree-optimization/117096
* passes.def: Update some of the dce/cd-dce passes setting
the 2nd arg to true.
Also remove comment about stdarg since dce does it.
* tree-ssa-dce.cc (pass_dce): Add remove_unused_locals_p field.
Update set_pass_param to allow for 2nd param.
Use remove_unused_locals_p in execute to return TODO_remove_unused_locals.
(pass_cd_dce): Likewise.
* tree-stdarg.cc (pass_data_stdarg): Remove TODO_remove_unused_locals.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
9 months ago passes: Allow for second param for NEXT_PASS
Andrew Pinski [Sun, 13 Oct 2024 18:09:14 +0000 (11:09 -0700)] 
passes: Allow for second param for NEXT_PASS

Currently we only support 1 parameter for each pass in NEXT_PASS.
We also don't error out if someone tries to use more than 1.
This adds support for more than one but only to a max of max_number_args
(which is currently 2).
In the next patch, this will be used for DCE, adding a new parameter.
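
How a two-argument form can forward to a per-index setter is sketched below. The macros are hypothetical and only modelled on the names in the ChangeLog; their real definitions and argument lists in passes.cc and pass_manager.h may differ, and `sketch_pass` is an invented stand-in for a pass object.

```cpp
#include <cassert>

// Mirrors the limit described in the message; the name is an assumption.
constexpr unsigned kMaxNumberArgs = 2;

// Invented stand-in for a pass that accepts boolean parameters by index.
struct sketch_pass {
  bool params[kMaxNumberArgs] = {false, false};
  void set_pass_param(unsigned n, bool value) {
    assert(n < kMaxNumberArgs);
    params[n] = value;
  }
};

// Hypothetical one-argument form: sets parameter 0.
#define SKETCH_NEXT_PASS_WITH_ARG(PASS, ARG0) \
  do { (PASS).set_pass_param(0, (ARG0)); } while (0)

// Hypothetical two-argument form: reuses the one-argument form, then sets
// parameter 1.
#define SKETCH_NEXT_PASS_WITH_ARG2(PASS, ARG0, ARG1) \
  do {                                               \
    SKETCH_NEXT_PASS_WITH_ARG(PASS, ARG0);           \
    (PASS).set_pass_param(1, (ARG1));                \
  } while (0)

sketch_pass make_with_one_arg() {
  sketch_pass p;
  SKETCH_NEXT_PASS_WITH_ARG(p, true);
  return p;
}

sketch_pass make_with_two_args() {
  sketch_pass p;
  SKETCH_NEXT_PASS_WITH_ARG2(p, false, true);
  return p;
}

void check_next_pass_args() {
  const sketch_pass one = make_with_one_arg();
  assert(one.params[0] && !one.params[1]);
  const sketch_pass two = make_with_two_args();
  assert(!two.params[0] && two.params[1]);
}
```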

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* gen-pass-instances.awk (END): Handle processing
of multiple arguments to NEXT_PASS. Also error out
if using more than max_number_args (2).
* pass_manager.h (NEXT_PASS_WITH_ARG2): New define.
* passes.cc (NEXT_PASS_WITH_ARG2): New define.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
9 months ago passes: Move #undef to pass-instances.def
Andrew Pinski [Sun, 13 Oct 2024 16:46:03 +0000 (09:46 -0700)] 
passes: Move #undef to pass-instances.def

Like what was done in r6-4608-g0aad01985747ab for builtins.def/DEF_BUILTIN,
the same should be done for the defines that are used for pass-instances.def.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* gen-pass-instances.awk: Print out the #undefs.
* pass_manager.h: Don't #undef INSERT_PASSES_AFTER,
PUSH_INSERT_PASSES_WITHIN, POP_INSERT_PASSES, NEXT_PASS,
NEXT_PASS_WITH_ARG, and TERMINATE_PASS_LIST.
* passes.cc: Likewise.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
9 months ago libcpp: Fix _Pragma("GCC system_header") [PR114436]
Lewis Hyatt [Fri, 22 Mar 2024 19:42:43 +0000 (15:42 -0400)] 
libcpp: Fix _Pragma("GCC system_header") [PR114436]

_Pragma("GCC system_header") currently takes effect only partially. It does
succeed in updating the line_map, so that checks like in_system_header_at()
return correctly, but it does not update pfile->buffer->sysp.  One result is
that a subsequent #include does not set up the system header state properly
for the newly included file, as pointed out in the PR. Fix by propagating
the new system header state back to the buffer after processing the pragma.

libcpp/ChangeLog:

PR preprocessor/114436
* directives.cc (destringize_and_run): If the _Pragma changed the
buffer system header state (e.g. because it was "GCC
system_header"), propagate that change back to the actual buffer
too.

gcc/testsuite/ChangeLog:

PR preprocessor/114436
* c-c++-common/cpp/pragma-system-header-1.h: New test.
* c-c++-common/cpp/pragma-system-header-2.h: New test.
* c-c++-common/cpp/pragma-system-header.c: New test.

9 months ago libcpp: Support extended characters for #pragma {push,pop}_macro [PR109704]
Lewis Hyatt [Fri, 12 Jan 2024 18:26:06 +0000 (13:26 -0500)] 
libcpp: Support extended characters for #pragma {push,pop}_macro [PR109704]

The implementation of #pragma push_macro and #pragma pop_macro has to date
made use of an ad-hoc function, _cpp_lex_identifier(), which lexes an
identifier out of a string. When support was added for extended characters
in identifiers ($, UCNs, or UTF-8), that support was added only for the
"normal" way of lexing identifiers out of a cpp_buffer (_cpp_lex_direct) and
not for the ad-hoc way. Consequently, extended identifiers are not usable
with these pragmas.

The logic for lexing identifiers has become more complicated than it was
when _cpp_lex_identifier() was written -- it now handles things like \N{}
escapes in C++, for instance -- and it no longer seems practical to maintain
a redundant code path for lexing identifiers. Address the issue by changing
the implementation of #pragma {push,pop}_macro to lex identifiers in the
expected way, i.e. by pushing a cpp_buffer and lexing the identifier from
there.

The existing implementation has some quirks because of the ad-hoc parsing
logic. For example:

 #pragma push_macro("X ")
 ...
 #pragma pop_macro("X")

will not restore macro X (note the extra space in the first string). However:

 #pragma push_macro("X ")
 ...
 #pragma pop_macro("X ")

actually does successfully restore "X". This is because the key for looking
up the saved macro on the push stack is the original string passed, so the
string passed to pop_macro needs to match it exactly. It is not that easy to
reproduce this logic in the world of extended characters, given that for
example it should be valid to pass a UCN to push_macro, and the
corresponding UTF-8 to pop_macro. Given that this aspect of the existing
behavior seems unintentional and has no tests (and does not match other
implementations), I opted to make the new logic more straightforward. The
string passed needs to lex to one token, which must be a valid identifier,
or else no action is taken and no error is generated. Any diagnostics
encountered during lexing (e.g., due to a UTF-8 character not permitted to
appear in an identifier) are also suppressed.
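
For reference, the ordinary matched use of the two pragmas looks as follows. This is a minimal sketch with a plain ASCII macro name; it does not exercise the extended-character support added by the patch, and the variable names are illustrative.

```cpp
#include <cassert>

#define X 1

// Save the current definition of X, then replace it temporarily.
#pragma push_macro("X")
#undef X
#define X 2
constexpr int value_while_overridden = X;

// Restore the saved definition; the string lexes to the same identifier.
#pragma pop_macro("X")
constexpr int value_after_pop = X;

static_assert(value_while_overridden == 2, "override is visible after push");
static_assert(value_after_pop == 1, "pop restores the saved definition");

void check_push_pop() {
  assert(value_while_overridden == 2);
  assert(value_after_pop == 1);
}
```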

It could be nice (for GCC 15) to also add a warning if a pop_macro does not
match a previous push_macro.

libcpp/ChangeLog:

PR preprocessor/109704
* include/cpplib.h (class cpp_auto_suppress_diagnostics): New class.
* errors.cc
(cpp_auto_suppress_diagnostics::cpp_auto_suppress_diagnostics): New
function.
(cpp_auto_suppress_diagnostics::~cpp_auto_suppress_diagnostics): New
function.
* charset.cc (noop_diagnostic_cb): Remove.
(cpp_interpret_string_ranges): Refactor diagnostic suppression logic
into new class cpp_auto_suppress_diagnostics.
(count_source_chars): Likewise.
* directives.cc (cpp_pop_definition): Add cpp_hashnode argument.
(lex_identifier_from_string): New static helper function.
(push_pop_macro_common): Refactor common logic from
do_pragma_push_macro and do_pragma_pop_macro; use
lex_identifier_from_string instead of _cpp_lex_identifier.
(do_pragma_push_macro): Reimplement using push_pop_macro_common.
(do_pragma_pop_macro): Likewise.
* internal.h (_cpp_lex_identifier): Remove.
* lex.cc (lex_identifier_intern): Remove.
(_cpp_lex_identifier): Remove.

gcc/testsuite/ChangeLog:

PR preprocessor/109704
* c-c++-common/cpp/pragma-push-pop-utf8.c: New test.
* g++.dg/pch/pushpop-2.C: New test.
* g++.dg/pch/pushpop-2.Hs: New test.
* gcc.dg/pch/pushpop-2.c: New test.
* gcc.dg/pch/pushpop-2.hs: New test.

9 months ago Allow for class type coarray parameters. [PR77871]
Andre Vehreschild [Thu, 15 Aug 2024 11:49:49 +0000 (13:49 +0200)] 
Allow for class type coarray parameters. [PR77871]

gcc/fortran/ChangeLog:

PR fortran/77871

* trans-expr.cc (gfc_conv_derived_to_class): Assign token when
converting a coarray to class.
(gfc_get_tree_for_caf_expr): For classes get the caf decl from
the saved descriptor.
(gfc_get_caf_token_offset): Assert that coarray=lib is set and
cover more cases where the tree having the coarray token can be.
* trans-intrinsic.cc (gfc_conv_intrinsic_caf_get): Use unified
test for pointers.

gcc/testsuite/ChangeLog:

* gfortran.dg/coarray/dummy_3.f90: New test.

9 months ago middle-end: copy STMT_VINFO_STRIDED_P when DR is replaced [PR116956]
Tamar Christina [Mon, 14 Oct 2024 13:01:24 +0000 (14:01 +0100)] 
middle-end: copy STMT_VINFO_STRIDED_P when DR is replaced [PR116956]

When move_dr copies a DR from one statement to another, it seems we've
forgotten to copy the STMT_VINFO_STRIDED_P flag.  This leaves the new DR in a
broken state where it has a non constant stride but isn't marked as strided.

This causes the ICE in the PR: dataref analysis fails during epilogue
vectorization, which violates the assumption that while costing may fail
for epilogue vectorization, DR analysis cannot if it succeeded for the
main loop.

gcc/ChangeLog:

PR tree-optimization/116956
* tree-vectorizer.cc (vec_info::move_dr): Copy STMT_VINFO_STRIDED_P.

gcc/testsuite/ChangeLog:

PR tree-optimization/116956
* gfortran.dg/vect/pr116956.f90: New test.

9 months ago simplify-rtx: Fix incorrect folding of shift and AND [PR117012]
Tamar Christina [Mon, 14 Oct 2024 13:00:25 +0000 (14:00 +0100)] 
simplify-rtx: Fix incorrect folding of shift and AND [PR117012]

The optimization added in r15-1047-g7876cde25cbd2f is using the wrong
operation to check for uniform constant vectors.

The author intended to check that all the lanes in the vector are the same and
so used CONST_VECTOR_DUPLICATE_P.  However this only checks that the vector
is created from a pattern duplication, but doesn't say how many pattern
alternatives make up the duplication.  Normally one would need to check this
separately or use const_vec_duplicate_p.

Without this the optimization incorrectly triggers.
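
The difference between "encoded as a duplication" and "all lanes equal" can be modelled in a few lines. This is a simplified sketch, not GCC's RTL representation: `EncodedVector`, `duplicate_encoding_p`, `uniform_p` and `expand_duplicate` are hypothetical names, and the assumption that a uniform vector has exactly one pattern is this sketch's reading of the message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified model of a pattern-encoded constant vector.
struct EncodedVector {
  std::vector<int> elts;        // npatterns * nelts_per_pattern elements
  unsigned npatterns;           // number of interleaved patterns
  unsigned nelts_per_pattern;   // 1 means each pattern is a plain repeat
};

// Analogue of the weaker check: the encoding is a duplication of patterns.
bool duplicate_encoding_p(const EncodedVector& v) {
  return v.nelts_per_pattern == 1;
}

// Analogue of the stronger check: one pattern, so every lane is the same.
bool uniform_p(const EncodedVector& v) {
  return duplicate_encoding_p(v) && v.npatterns == 1;
}

// Expand a duplicate-encoded vector to a given number of lanes.
std::vector<int> expand_duplicate(const EncodedVector& v, std::size_t lanes) {
  assert(duplicate_encoding_p(v) && v.elts.size() == v.npatterns);
  std::vector<int> out(lanes);
  for (std::size_t i = 0; i < lanes; ++i)
    out[i] = v.elts[i % v.npatterns];
  return out;
}

void check_duplicate_vs_uniform() {
  const EncodedVector alternating{{1, 2}, 2, 1};  // lanes 1, 2, 1, 2, ...
  const EncodedVector splat{{3}, 1, 1};           // lanes 3, 3, 3, 3, ...

  // The weaker check accepts the alternating vector although lanes differ.
  assert(duplicate_encoding_p(alternating) && !uniform_p(alternating));
  assert((expand_duplicate(alternating, 4) == std::vector<int>{1, 2, 1, 2}));

  assert(uniform_p(splat));
  assert((expand_duplicate(splat, 4) == std::vector<int>{3, 3, 3, 3}));
}
```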

gcc/ChangeLog:

PR rtl-optimization/117012
* simplify-rtx.cc (simplify_context::simplify_binary_operation_1): Use
const_vec_duplicate_p instead of CONST_VECTOR_DUPLICATE_P.

gcc/testsuite/ChangeLog:

PR rtl-optimization/117012
* gcc.target/aarch64/pr117012.c: New test.

9 months ago AArch64: rename the SVE2 psel intrinsics to psel_lane [PR116371]
Tamar Christina [Mon, 14 Oct 2024 12:58:09 +0000 (13:58 +0100)] 
AArch64: rename the SVE2 psel intrinsics to psel_lane [PR116371]

The psel intrinsics, similar to pext, should be named psel_lane.  This
corrects the naming.

gcc/ChangeLog:

PR target/116371
* config/aarch64/aarch64-sve-builtins-sve2.cc (class svpsel_impl):
Renamed to ...
(class svpsel_lane_impl): ... This and adjust initialization.
* config/aarch64/aarch64-sve-builtins-sve2.def (svpsel): Renamed to ...
(svpsel_lane): ... This.
* config/aarch64/aarch64-sve-builtins-sve2.h (svpsel): Renamed to
svpsel_lane.

gcc/testsuite/ChangeLog:

PR target/116371
* gcc.target/aarch64/sme2/acle-asm/psel_b16.c,
gcc.target/aarch64/sme2/acle-asm/psel_b32.c,
gcc.target/aarch64/sme2/acle-asm/psel_b64.c,
gcc.target/aarch64/sme2/acle-asm/psel_b8.c,
gcc.target/aarch64/sme2/acle-asm/psel_c16.c,
gcc.target/aarch64/sme2/acle-asm/psel_c32.c,
gcc.target/aarch64/sme2/acle-asm/psel_c64.c,
gcc.target/aarch64/sme2/acle-asm/psel_c8.c: Renamed to....
* gcc.target/aarch64/sme2/acle-asm/psel_lane_b16.c,
gcc.target/aarch64/sme2/acle-asm/psel_lane_b32.c,
gcc.target/aarch64/sme2/acle-asm/psel_lane_b64.c,
gcc.target/aarch64/sme2/acle-asm/psel_lane_b8.c,
gcc.target/aarch64/sme2/acle-asm/psel_lane_c16.c,
gcc.target/aarch64/sme2/acle-asm/psel_lane_c32.c,
gcc.target/aarch64/sme2/acle-asm/psel_lane_c64.c,
gcc.target/aarch64/sme2/acle-asm/psel_lane_c8.c: ... These.

9 months ago RISC-V: Add detailed comments on processing implied extensions. [NFC]
Yangyu Chen [Mon, 14 Oct 2024 10:31:06 +0000 (18:31 +0800)] 
RISC-V: Add detailed comments on processing implied extensions. [NFC]

In some cases, we don't need to handle implied extensions. Add detailed
comments to help developers understand what implied ISAs should be
considered.

libgcc/ChangeLog:

* config/riscv/feature_bits.c (__init_riscv_features_bits_linux):
Add detailed comments on processing implied extensions.

Signed-off-by: Yangyu Chen <chenyangyu@isrc.iscas.ac.cn>
9 months agomiddle-end: support SLP early break
Tamar Christina [Mon, 14 Oct 2024 10:58:59 +0000 (11:58 +0100)] 
middle-end: support SLP early break

This patch introduces feature parity for early break in the SLP-only
vectorizer.

The approach taken here is to treat the early exits as root statements for an
SLP tree.  This means that we don't need any changes to build_slp to support
gconds.

Codegen for the gcond itself now has to be done out of line but the body of the
SLP blocks itself is simply driven by SLP scheduling.  There is a slight
awkwardness in having re-used vectorizable_early_exit for both SLP and non-SLP
but I've documented the differences and when I did try to refactor it it wasn't
really worth it given that this is a temporary state anyway.

This version is restricted to lane = 1, as such we can re-use the existing
move_early_break function instead of having to do safety update through
scheduling.  I have a branch where I'm working on that but lane > 1 is out of
scope for GCC 15 anyway.   The only reason I will try to get moving through
scheduling done as a stretch goal is so we get epilogue vectorization back for
early break.

The example:

unsigned test4(unsigned x)
{
 unsigned ret = 0;
 for (int i = 0; i < N; i++)
 {
   vect_b[i] = x + i;
   if (vect_a[i]*2 != x)
     break;
   vect_a[i] = x;

 }
 return ret;
}

builds the following SLP instance for early break:

note:   Analyzing vectorizable control flow: if (patt_6 != 0)
note:   Starting SLP discovery for
note:     patt_6 = _4 != x_9(D);
note:   starting SLP discovery for node 0x63abc80
note:   Build SLP for patt_6 = _4 != x_9(D);
note:   precomputed vectype: vector(4) <signed-boolean:32>
note:   nunits = 4
note:   vect_is_simple_use: operand x_9(D), type of def: external
note:   vect_is_simple_use: operand # RANGE [irange] unsigned int [0, 0][2, +INF] MASK 0xffff
        _3 * 2, type of def: internal
note:   starting SLP discovery for node 0x63abdc0
note:   Build SLP for _4 = _3 * 2;
note:   precomputed vectype: vector(4) unsigned int
note:   nunits = 4
note:   vect_is_simple_use: operand #
        vect_aD.4416[i_15], type of def: internal
note:   vect_is_simple_use: operand 2, type of def: constant
note:   starting SLP discovery for node 0x63abe60
note:   Build SLP for _3 = vect_a[i_15];
note:   precomputed vectype: vector(4) unsigned int
note:   nunits = 4
note:   SLP discovery for node 0x63abe60 succeeded
note:   SLP discovery for node 0x63abdc0 succeeded
note:   SLP discovery for node 0x63abc80 succeeded
note:   SLP size 3 vs. limit 10.
note:   Final SLP tree for instance 0x6474190:
note:   node 0x63abc80 (max_nunits=4, refcnt=2) vector(4) <signed-boolean:32>
note:   op template: patt_6 = _4 != x_9(D);
note:    stmt 0 patt_6 = _4 != x_9(D);
note:    children 0x63abd20 0x63abdc0
note:   node (external) 0x63abd20 (max_nunits=1, refcnt=1)
note:    { x_9(D) }
note:   node 0x63abdc0 (max_nunits=4, refcnt=2) vector(4) unsigned int
note:   op template: _4 = _3 * 2;
note:    stmt 0 _4 = _3 * 2;
note:    children 0x63abe60 0x63abf00
note:   node 0x63abe60 (max_nunits=4, refcnt=2) vector(4) unsigned int
note:   op template: _3 = vect_a[i_15];
note:    stmt 0 _3 = vect_a[i_15];
note:    load permutation { 0 }
note:   node (constant) 0x63abf00 (max_nunits=1, refcnt=1)
note:    { 2 }

and during codegen:

note:   ------>vectorizing SLP node starting from: patt_6 = _4 != x_9(D);
note:   vect_is_simple_use: operand # RANGE [irange] unsigned int [0, 0][2, +INF] MASK 0xffff
        _3 * 2, type of def: internal
note:   add new stmt: mask_patt_6.18_58 = _53 != vect__4.17_57;
note:    === vectorizable_early_exit ===
note:    transform early-exit.
note:   vectorizing stmts using SLP.
note:   Vectorizing SLP tree:
note:   node 0x63abfa0 (max_nunits=4, refcnt=1) vector(4) int
note:   op template: i_12 = i_15 + 1;
note:    stmt 0 i_12 = i_15 + 1;
note:    children 0x63aba00 0x63ac040
note:   node 0x63aba00 (max_nunits=4, refcnt=2) vector(4) int
note:   op template: i_15 = PHI <i_12(6), 0(14)>
note:    [l] stmt 0 i_15 = PHI <i_12(6), 0(14)>
note:    children (nil) (nil)
note:   node (constant) 0x63ac040 (max_nunits=1, refcnt=1) vector(4) int
note:    { 1 }

gcc/ChangeLog:

* tree-vect-loop.cc (vect_analyze_loop_2): Handle SLP trees with no
children.
* tree-vectorizer.h (enum slp_instance_kind): Add slp_inst_kind_gcond.
(LOOP_VINFO_EARLY_BREAKS_LIVE_IVS): New.
(vectorizable_early_exit): Expose.
(class _loop_vec_info): Add early_break_live_stmts.
* tree-vect-slp.cc (vect_build_slp_instance, vect_analyze_slp_instance):
Support gcond instances.
(vect_analyze_slp): Analyze gcond roots and early break live statements.
(maybe_push_to_hybrid_worklist): Don't sink gconds.
(vect_slp_analyze_operations): Support gconds.
(vect_slp_check_for_roots): Update comments.
(vectorize_slp_instance_root_stmt): Support gconds.
(vect_schedule_slp): Pass vinfo to vectorize_slp_instance_root_stmt.
* tree-vect-stmts.cc (vect_stmt_relevant_p): Record early break live
statements.
(vectorizable_early_exit): Support SLP.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/vect-early-break_126.c: New test.
* gcc.dg/vect/vect-early-break_127.c: New test.
* gcc.dg/vect/vect-early-break_128.c: New test.

9 months agoAdd regression test
Eric Botcazou [Mon, 14 Oct 2024 09:57:57 +0000 (11:57 +0200)] 
Add regression test

gcc/testsuite/
PR ada/114593
* gnat.dg/specs/generic_inst2-child2.ads: New test.
* gnat.dg/specs/generic_inst2.ads: New helper.
* gnat.dg/specs/generic_inst2-child1.ads: Likewise.

9 months agolibstdc++: Use std::move for iterator in ranges::fill [PR117094]
Jonathan Wakely [Sun, 13 Oct 2024 21:48:43 +0000 (22:48 +0100)] 
libstdc++: Use std::move for iterator in ranges::fill [PR117094]

Input iterators aren't required to be copyable.

libstdc++-v3/ChangeLog:

PR libstdc++/117094
* include/bits/ranges_algobase.h (__fill_fn): Use std::move for
iterator that might not be copyable.
* testsuite/25_algorithms/fill/constrained.cc: Check
non-copyable iterator with sized sentinel.

9 months agolibstdc++: Enable memset optimizations for distinct character types [PR93059]
Jonathan Wakely [Thu, 10 Oct 2024 12:36:33 +0000 (13:36 +0100)] 
libstdc++: Enable memset optimizations for distinct character types [PR93059]

Currently we only optimize std::fill to memset when the source and
destination types are the same byte-sized type. This means that we fail
to optimize cases like std::fill(buf, buf+n, 0) because the literal 0 is
not the same type as the character buffer.

Such cases can safely be optimized to use memset, because assigning an
int (or other integer) to a narrow character type has the same effects
as converting the integer to unsigned char then copying it with memset.
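
As a rough C analogue of the equivalence described above (not the libstdc++ code itself), the sketch below fills one buffer by assigning an int to each char and another with memset, then compares the bytes. The helper names are made up for illustration, and the out-of-range int-to-char conversion is assumed to wrap, as it does with GCC.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Fill by plain assignment, as a generic element-wise fill would do.
   The int value is converted to char on each store (assumed to wrap
   modulo 2^CHAR_BIT, which is what GCC does).  */
static void toy_fill_by_assignment(char *buf, size_t n, int value)
{
  assert(buf != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    buf[i] = (char) value;
}

/* Fill with memset, which converts the int value to unsigned char.  */
static void toy_fill_by_memset(char *buf, size_t n, int value)
{
  assert(buf != NULL || n == 0);
  memset(buf, value, n);
}

/* Return 1 if both strategies leave identical bytes behind.  */
static int toy_fills_agree(int value)
{
  char a[16], b[16];
  toy_fill_by_assignment(a, sizeof a, value);
  toy_fill_by_memset(b, sizeof b, value);
  return memcmp(a, b, sizeof a) == 0;
}
```

Under those assumptions toy_fills_agree returns 1 for values such as 0, 'x', -1 and 0x1234, since both paths keep only the low byte of the fill value.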

This patch enables the optimized code path when the fill character is a
memcpy-able integer (using the new __memcpyable_integer trait). We still
need to check is_same<U, T> to enable the memset optimization for
filling a range of std::byte with a std::byte value, because that isn't
a memcpyable integer.

libstdc++-v3/ChangeLog:

PR libstdc++/93059
* include/bits/stl_algobase.h (__fill_a1(T*, T*, const T&)):
Change template parameters and enable_if condition to allow the
fill value to be an integer.

9 months agolibstdc++: Enable memcpy optimizations for distinct integral types [PR93059]
Jonathan Wakely [Thu, 10 Oct 2024 12:36:33 +0000 (13:36 +0100)] 
libstdc++: Enable memcpy optimizations for distinct integral types [PR93059]

Currently we only optimize std::copy, std::copy_n etc. to memmove when
the source and destination types are the same. This means that we fail
to optimize copying between distinct 1-byte types, e.g. copying from a
buffer of unsigned char to a buffer of char8_t or vice versa.

This patch adds more partial specializations of the __memcpyable trait
so that we allow memcpy between integers of equal widths. This will
enable memmove for copies between narrow character types and also
between same-width types like int and unsigned.

Enabling the optimization needs to be based on the width of the integer
type, not just the size in bytes. This is because some targets define
non-standard integral types such as __int20 in msp430, which has padding
bits. It would not be safe to memcpy between e.g. __int20 and int32_t,
even though sizeof(__int20) == sizeof(int32_t). A new trait is
introduced to define the width, __memcpyable_integer, and then the
__memcpyable trait compares the widths.

It's safe to copy between signed and unsigned integers of the same
width, because GCC only supports two's complement integers.
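
The claim about signed and unsigned integers can be checked with a small C sketch (an illustration only, with made-up helper names; it is not the trait machinery added by the patch). It copies an int array to an unsigned array once by value conversion and once with memcpy, assuming int and unsigned have the same width and no padding bits, as on the targets GCC supports.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Element-wise copy: each int is converted to unsigned, which C defines
   as reduction modulo 2^N.  */
static void toy_copy_by_conversion(unsigned *dst, const int *src, size_t n)
{
  assert(dst != NULL && src != NULL);
  for (size_t i = 0; i < n; i++)
    dst[i] = (unsigned) src[i];
}

/* Byte-wise copy: reuses the object representation unchanged.  */
static void toy_copy_by_memcpy(unsigned *dst, const int *src, size_t n)
{
  assert(dst != NULL && src != NULL);
  memcpy(dst, src, n * sizeof *src);
}

/* Return 1 if both copies produce the same unsigned values.  */
static int toy_copies_agree(void)
{
  const int src[5] = { 0, 1, -1, -123456, INT_MIN };
  unsigned a[5], b[5];
  toy_copy_by_conversion(a, src, 5);
  toy_copy_by_memcpy(b, src, 5);
  return memcmp(a, b, sizeof a) == 0;
}
```

With two's complement representation the modulo conversion and the raw byte copy give the same bit pattern, so toy_copies_agree returns 1.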

I initially thought it would be useful to define the specialization
__memcpyable_integer<byte> to enable copying between narrow character
types and std::byte. But that isn't possible with std::copy, because
is_assignable<char&, std::byte> is false. Optimized copies using memmove
will already happen for copying std::byte to std::byte, because
__memcpyable<T*, T*> is true.

libstdc++-v3/ChangeLog:

PR libstdc++/93059
* include/bits/cpp_type_traits.h (__memcpyable): Add partial
specialization for pointers to distinct types.
(__memcpyable_integer): New trait to control which types can use
cross-type memcpy optimizations.

9 months agoRISC-V: Implement __init_riscv_feature_bits, __riscv_feature_bits, and __riscv_vendor...
Kito Cheng [Mon, 14 Oct 2024 08:07:16 +0000 (16:07 +0800)] 
RISC-V: Implement __init_riscv_feature_bits, __riscv_feature_bits, and __riscv_vendor_feature_bits

This provides a common abstraction layer to probe the available extensions at
run-time. These functions can be used to implement function multi-versioning or
to detect available extensions.

The advantages of providing this abstraction layer are:
- Easy to port to other new platforms.
- Easier to maintain in GCC for function multi-versioning.
  - For example, maintaining platform-dependent code in C code/libgcc is much
    easier than maintaining it in GCC by creating GIMPLEs...

This API is intended to provide the capability to query minimal common available extensions on the system.

The API is defined in the riscv-c-api-doc:
https://github.com/riscv-non-isa/riscv-c-api-doc/blob/main/src/c-api.adoc

Proposal to use unsigned long long for marchid and mimpid:
https://github.com/riscv-non-isa/riscv-c-api-doc/pull/91

Full function multi-versioning implementation will come later. We are posting
this first because we intend to backport it to the GCC 14 branch to unblock
LLVM 19 to use this with GCC 14.2, rather than waiting for GCC 15.

Changes since v7:
- Remove vendorID field in __riscv_vendor_feature_bits.
- Fix C implies Zcf only for RV32.
- Add more comments to kernel versions.

Changes since v6:
- Implement __riscv_cpu_model.
- Set new sub-extension bits which are implied by previous extensions.

Changes since v5:
- Minor fixes on indentation.

Changes since v4:
- Bump to newest riscv-c-api-doc with some new extensions like Zve*, Zc*
  Zimop, Zcmop, Zawrs.
- Rename the return variable name of hwprobe syscall.
- Minor fixes on indentation.

Changes since v3:
- Fix non-linux build.
- Let __init_riscv_feature_bits become a constructor.

Changes since v2:
- Prevent it from being initialized more than once.

Changes since v1:
- Fix the format.
- Prevented race conditions by introducing a local variable to avoid load/store
  operations during the computation of the feature bit.
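
The two robustness points in the change history (initialize only once, and compute into a local variable before storing) can be sketched in plain C as follows. This is only a guess at the general pattern: every name below is hypothetical, the probe is faked, and the real code in config/riscv/feature_bits.c is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the published feature-bits structure.  */
struct toy_feature_bits { unsigned length; uint64_t features[2]; };

static struct toy_feature_bits toy_bits;  /* published result */
static int toy_initialized;               /* set after the first init */
static int toy_probe_calls;               /* counts probes, for demonstration */

/* Pretend to query the platform; the bit values are arbitrary.  */
static void toy_probe(struct toy_feature_bits *out)
{
  toy_probe_calls++;
  out->length = 2;
  out->features[0] = 0x5;
  out->features[1] = 0x0;
}

static void toy_init_feature_bits(void)
{
  if (toy_initialized)
    return;                       /* do not initialize more than once */

  struct toy_feature_bits local = { 0, { 0, 0 } };
  toy_probe(&local);              /* all intermediate work uses the local */
  assert(local.length <= 2);

  toy_bits = local;               /* store the finished value in one step */
  toy_initialized = 1;
}
```

Calling toy_init_feature_bits twice runs the probe once. The sketch is not thread-safe by itself; it only shows that readers of toy_bits never observe a partially accumulated bit set.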

Co-Developed-by: Yangyu Chen <chenyangyu@isrc.iscas.ac.cn>
Signed-off-by: Yangyu Chen <chenyangyu@isrc.iscas.ac.cn>
libgcc/ChangeLog:

* config/riscv/feature_bits.c: New.
* config/riscv/t-elf (LIB2ADD): Add feature_bits.c.

9 months agoMAINTAINERS (s390 port): Add myself
Stefan Schulze Frielinghaus [Mon, 14 Oct 2024 09:12:48 +0000 (11:12 +0200)] 
MAINTAINERS (s390 port): Add myself

ChangeLog:

* MAINTAINERS (s390 port): Add myself.

9 months agomiddle-end: [PR middle-end/116926] Allow widening optabs for vec-mode -> scalar-mode
Victor Do Nascimento [Thu, 10 Oct 2024 11:55:04 +0000 (12:55 +0100)] 
middle-end: [PR middle-end/116926] Allow widening optabs for vec-mode -> scalar-mode

The recent refactoring of the dot_prod optab to convert-type exposed a
limitation in how `find_widening_optab_handler_and_mode' is currently
implemented, owing to the fact that, while the function expects the

  GET_MODE_CLASS (from_mode) == GET_MODE_CLASS (to_mode)

condition to hold, the c6x backend implements a dot product from V2HI
to SI, which triggers an ICE.

Consequently, this patch adds some logic to allow widening optabs
which accumulate vector elements to a single scalar.
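
To make the shape of such an operation concrete, here is a scalar C model of a dot product that takes two-element vectors of 16-bit values and produces a single 32-bit scalar. The type and function names are invented for illustration and the accumulator operand is an assumption about the operation's shape; the point is only that the inputs are vectors while the result is a scalar, so the two modes belong to different mode classes.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a V2HI value: two 16-bit lanes.  */
typedef struct { int16_t lane[2]; } toy_v2hi;

/* Multiply lanes pairwise, widen, and accumulate into one 32-bit scalar.  */
static int32_t toy_dot_prod_v2hi(toy_v2hi a, toy_v2hi b, int32_t acc)
{
  int64_t sum = acc;
  for (int i = 0; i < 2; i++)
    sum += (int32_t) a.lane[i] * (int32_t) b.lane[i];
  assert(sum >= INT32_MIN && sum <= INT32_MAX);  /* toy: no wraparound */
  return (int32_t) sum;
}
```

For lanes {3, -4} and {5, 6} with an accumulator of 10, the model returns 3*5 + (-4)*6 + 10 = 1.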

gcc/ChangeLog:

PR middle-end/116926
* optabs-query.cc (find_widening_optab_handler_and_mode): Add
handling of vector -> scalar optab handling.

9 months agoaarch64: Fix folding of degenerate svwhilele case [PR117045]
Richard Sandiford [Mon, 14 Oct 2024 08:52:44 +0000 (09:52 +0100)] 
aarch64: Fix folding of degenerate svwhilele case [PR117045]

The svwhilele folder mishandled the degenerate case in which
the second argument is the maximum integer.  In that case,
the result is all-true regardless of the first parameter:

  If the second scalar operand is equal to the maximum signed integer
  value then a condition which includes an equality test can never fail
  and the result will be an all-true predicate.

This is because the conceptual "increment the first operand
by 1 after each element" is done modulo the range of the operand.
The GCC code was instead treating it as infinite precision.
whilele_5.c even had a test for the incorrect behaviour.

The easiest fix seemed to be to handle that case specially before
doing constant folding.  This also copes with variable first operands.
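
The difference between the two interpretations can be modelled in scalar C. The sketch below is an illustration with made-up names, not the folder code: it builds a lane mask for a 32-bit "while less-or-equal" once with the increment wrapping modulo 2^32 and once with infinite precision.

```c
#include <assert.h>
#include <stdint.h>

/* Add i to a with 32-bit two's complement wraparound (i <= 32).  */
static int32_t toy_wrap_add32(int32_t a, unsigned i)
{
  int64_t wide = (int64_t) a + (int64_t) i;
  if (wide > INT32_MAX)
    wide -= (int64_t) 1 << 32;
  return (int32_t) wide;
}

/* Lane i is active while every lane up to i satisfies a + i <= b,
   with the increment done modulo the operand range.  */
static uint32_t toy_whilele_wrapping(int32_t a, int32_t b, unsigned lanes)
{
  assert(lanes <= 32);
  uint32_t mask = 0;
  int active = 1;
  for (unsigned i = 0; i < lanes; i++)
    {
      active = active && toy_wrap_add32(a, i) <= b;
      if (active)
        mask |= (uint32_t) 1 << i;
    }
  return mask;
}

/* Same, but treating the increment as infinite precision.  */
static uint32_t toy_whilele_infinite(int32_t a, int32_t b, unsigned lanes)
{
  assert(lanes <= 32);
  uint32_t mask = 0;
  int active = 1;
  for (unsigned i = 0; i < lanes; i++)
    {
      active = active && (int64_t) a + (int64_t) i <= (int64_t) b;
      if (active)
        mask |= (uint32_t) 1 << i;
    }
  return mask;
}
```

With a = INT32_MAX - 1, b = INT32_MAX and four lanes, the wrapping model gives 0xF (all-true) while the infinite-precision model gives 0x3. For ordinary bounds such as a = 0, b = 2 both give 0x7.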

gcc/
PR target/116999
PR target/117045
* config/aarch64/aarch64-sve-builtins-base.cc
(svwhilelx_impl::fold): Check for WHILELTs of the minimum value
and WHILELEs of the maximum value.  Fold them to all-false and
all-true respectively.

gcc/testsuite/
PR target/116999
PR target/117045
* gcc.target/aarch64/sve/acle/general/whilele_5.c: Fix bogus
expected result.
* gcc.target/aarch64/sve/acle/general/whilele_11.c: New test.
* gcc.target/aarch64/sve/acle/general/whilele_12.c: Likewise.

9 months agoFortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device...
Thomas Schwinge [Mon, 14 Oct 2024 08:34:34 +0000 (10:34 +0200)] 
Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device: Harmonize 'libgomp.oacc-fortran/acc_on_device-1-*'

The test case 'libgomp.oacc-fortran/acc_on_device-1-1.f90' added in
commit 3269a722b7a03613e9c4e2862bc5088c4a17cc11
"Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device"
was missing '-fno-builtin-acc_on_device', and all
'libgomp.oacc-fortran/acc_on_device-1-*' need comments explaining why that
option is specified.

PR testsuite/82250
libgomp/
* testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90: Add
'-fno-builtin-acc_on_device'.
* testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f: Comment.
* testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f: Comment.

9 months agoFortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device...
Thomas Schwinge [Mon, 14 Oct 2024 08:26:13 +0000 (10:26 +0200)] 
Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device: Fix effective-target keyword in 'libgomp.oacc-fortran/acc_on_device-2.f90'

The test case 'libgomp.oacc-fortran/acc_on_device-2.f90' added in
commit 3269a722b7a03613e9c4e2862bc5088c4a17cc11
"Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device"
had a mismatch between dump file production and its scanning; the former needs
to use 'offload_target_nvptx' (like 'offload_target_amdgcn'), not
'offload_device_nvptx'.

PR testsuite/82250
libgomp/
* testsuite/libgomp.oacc-fortran/acc_on_device-2.f90: Fix
effective-target keyword.

9 months agomiddle-end/116891 - fix (negate (IFN_FNMS@3 @0 @1 @2)) -> (IFN_FMA @0 @1 @2)
Richard Biener [Mon, 14 Oct 2024 06:11:22 +0000 (08:11 +0200)] 
middle-end/116891 - fix (negate (IFN_FNMS@3 @0 @1 @2)) -> (IFN_FMA @0 @1 @2)

Transforming -fma (-a, b, -c) to fma (a, b, c) is only valid when
not rounding towards -inf or +inf as the sign of the multiplication
changes.
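
Why the rounding direction matters can be seen in a toy integer model, which is only an analogy and uses no floating point: "rounding toward +inf" is modelled as rounding an integer up to a multiple of a step. Negating before and after such a rounding turns it into rounding down, so the result differs from rounding the un-negated value directly unless the value is exact. The function name is made up for this sketch.

```c
#include <assert.h>

/* Round x up (toward +infinity) to a multiple of step; step must be > 0.  */
static long toy_round_up(long x, long step)
{
  assert(step > 0);
  long q = x / step;   /* C division truncates toward zero */
  long r = x % step;   /* remainder has the sign of x */
  if (r > 0)
    q += 1;            /* positive inexact values move up */
  return q * step;
}
```

Here toy_round_up(7, 4) is 8, but -toy_round_up(-7, 4) is 4: the negated form effectively rounded down. For an exact value such as 8 both forms give 8.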

PR middle-end/116891
* match.pd ((negate (IFN_FNMS@3 @0 @1 @2)) -> (IFN_FMA @0 @1 @2)):
Only enable for !HONOR_SIGN_DEPENDENT_ROUNDING.

9 months agoRISC-V: Add testcases for form 4 of vector signed SAT_SUB
Pan Li [Sat, 12 Oct 2024 03:08:21 +0000 (11:08 +0800)] 
RISC-V: Add testcases for form 4 of vector signed SAT_SUB

Form 4:
  #define DEF_VEC_SAT_S_SUB_FMT_4(T, UT, MIN, MAX)                     \
  void __attribute__((noinline))                                       \
  vec_sat_s_sub_##T##_fmt_4 (T *out, T *op_1, T *op_2, unsigned limit) \
  {                                                                    \
    unsigned i;                                                        \
    for (i = 0; i < limit; i++)                                        \
      {                                                                \
        T x = op_1[i];                                                 \
        T y = op_2[i];                                                 \
        T minus;                                                       \
        bool overflow = __builtin_sub_overflow (x, y, &minus);         \
        out[i] = !overflow ? minus : x < 0 ? MIN : MAX;                \
      }                                                                \
  }
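
Form 4 is form 3 with the overflow test inverted and the arms of the conditional swapped. A scalar sketch for int8_t, written here only as an illustration with invented function names and relying on the GCC/Clang builtin used in the macro, checks that the two forms agree on every input pair.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Scalar body of form 3: overflow ? saturate : minus.  */
static int8_t toy_sat_s_sub_fmt_3(int8_t x, int8_t y)
{
  int8_t minus;
  bool overflow = __builtin_sub_overflow(x, y, &minus);
  /* Subtraction can only overflow when the operand signs differ.  */
  assert(!overflow || (x < 0) != (y < 0));
  return overflow ? (x < 0 ? INT8_MIN : INT8_MAX) : minus;
}

/* Scalar body of form 4: !overflow ? minus : saturate.  */
static int8_t toy_sat_s_sub_fmt_4(int8_t x, int8_t y)
{
  int8_t minus;
  bool overflow = __builtin_sub_overflow(x, y, &minus);
  return !overflow ? minus : x < 0 ? INT8_MIN : INT8_MAX;
}

/* Count the (x, y) pairs on which both forms return the same value.  */
static int toy_count_agreeing_pairs(void)
{
  int agree = 0;
  for (int x = INT8_MIN; x <= INT8_MAX; x++)
    for (int y = INT8_MIN; y <= INT8_MAX; y++)
      if (toy_sat_s_sub_fmt_3((int8_t) x, (int8_t) y)
          == toy_sat_s_sub_fmt_4((int8_t) x, (int8_t) y))
        agree++;
  return agree;
}
```

All 65536 pairs agree, which is why the same SAT_SUB pattern can serve both spellings once each is matched.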

The test below passes with this patch.
* The rv64gcv full regression test.

This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-4-i16.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-4-i32.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-4-i64.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-4-i8.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-4-i16.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-4-i32.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-4-i64.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-4-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months agoRISC-V: Add testcases for form 3 of vector signed SAT_SUB
Pan Li [Sat, 12 Oct 2024 02:40:30 +0000 (10:40 +0800)] 
RISC-V: Add testcases for form 3 of vector signed SAT_SUB

Form 3:
  #define DEF_VEC_SAT_S_SUB_FMT_3(T, UT, MIN, MAX)                     \
  void __attribute__((noinline))                                       \
  vec_sat_s_sub_##T##_fmt_3 (T *out, T *op_1, T *op_2, unsigned limit) \
  {                                                                    \
    unsigned i;                                                        \
    for (i = 0; i < limit; i++)                                        \
      {                                                                \
        T x = op_1[i];                                                 \
        T y = op_2[i];                                                 \
        T minus;                                                       \
        bool overflow = __builtin_sub_overflow (x, y, &minus);         \
        out[i] = overflow ? x < 0 ? MIN : MAX : minus;                 \
      }                                                                \
  }

The test below passes with this patch.
* The rv64gcv full regression test.

This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-3-i16.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-3-i32.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-3-i64.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-3-i8.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-3-i16.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-3-i32.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-3-i64.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-3-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months agoMatch: Support form 3 for vector signed integer SAT_SUB
Pan Li [Sat, 12 Oct 2024 02:34:55 +0000 (10:34 +0800)] 
Match: Support form 3 for vector signed integer SAT_SUB

This patch adds support for form 3 of the vector signed
integer SAT_SUB, as in the example below:

Form 3:
  #define DEF_VEC_SAT_S_SUB_FMT_3(T, UT, MIN, MAX)                     \
  void __attribute__((noinline))                                       \
  vec_sat_s_sub_##T##_fmt_3 (T *out, T *op_1, T *op_2, unsigned limit) \
  {                                                                    \
    unsigned i;                                                        \
    for (i = 0; i < limit; i++)                                        \
      {                                                                \
        T x = op_1[i];                                                 \
        T y = op_2[i];                                                 \
        T minus;                                                       \
        bool overflow = __builtin_sub_overflow (x, y, &minus);         \
        out[i] = overflow ? x < 0 ? MIN : MAX : minus;                 \
      }                                                                \
  }

Before this patch:
    if (limit_11(D) != 0)
      goto <bb 3>; [89.00%]
    else
      goto <bb 8>; [11.00%]
  ;;    succ:       3
  ;;                8

  ;;   basic block 3, loop depth 0
  ;;    pred:       2
    _13 = (unsigned long) limit_11(D);
  ;;    succ:       4

  ;;   basic block 4, loop depth 1
  ;;    pred:       3
  ;;                7
    # ivtmp.7_34 = PHI <0(3), ivtmp.7_30(7)>
    _26 = op_1_12(D) + ivtmp.7_34;
    x_29 = MEM[(int8_t *)_26];
    _1 = op_2_14(D) + ivtmp.7_34;
    y_24 = MEM[(int8_t *)_1];
    _9 = .SUB_OVERFLOW (x_29, y_24);
    _7 = IMAGPART_EXPR <_9>;
    if (_7 != 0)
      goto <bb 6>; [50.00%]
    else
      goto <bb 5>; [50.00%]
  ;;    succ:       6
  ;;                5

  ;;   basic block 5, loop depth 1
  ;;    pred:       4
    _42 = REALPART_EXPR <_9>;
    _2 = out_17(D) + ivtmp.7_34;
    MEM[(int8_t *)_2] = _42;
    ivtmp.7_27 = ivtmp.7_34 + 1;
    if (_13 != ivtmp.7_27)
      goto <bb 7>; [89.00%]
    else
      goto <bb 8>; [11.00%]
  ;;    succ:       7
  ;;                8

  ;;   basic block 6, loop depth 1
  ;;    pred:       4
    _38 = x_29 < 0;
    _39 = (signed char) _38;
    _40 = -_39;
    _41 = _40 ^ 127;
    _33 = out_17(D) + ivtmp.7_34;
    MEM[(int8_t *)_33] = _41;
    ivtmp.7_25 = ivtmp.7_34 + 1;
    if (_13 != ivtmp.7_25)
      goto <bb 7>; [89.00%]
    else
      goto <bb 8>; [11.00%]

After this patch:
    _94 = .SELECT_VL (ivtmp_92, POLY_INT_CST [16, 16]);
    vect_x_13.9_81 = .MASK_LEN_LOAD (vectp_op_1.7_79, 8B, { -1, ... }, _94, 0);
    vect_y_15.12_85 = .MASK_LEN_LOAD (vectp_op_2.10_83, 8B, { -1, ... }, _94, 0);
    vect_patt_49.13_86 = .SAT_SUB (vect_x_13.9_81, vect_y_15.12_85);
    .MASK_LEN_STORE (vectp_out.14_88, 8B, { -1, ... }, _94, 0, vect_patt_49.13_86);
    vectp_op_1.7_80 = vectp_op_1.7_79 + _94;
    vectp_op_2.10_84 = vectp_op_2.10_83 + _94;
    vectp_out.14_89 = vectp_out.14_88 + _94;
    ivtmp_93 = ivtmp_92 - _94;

The test suites below pass with this patch.
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:

* match.pd: Add matching pattern for vector signed SAT_SUB form 3.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months agoRISC-V: Add testcases for form 2 of vector signed SAT_SUB
Pan Li [Sat, 12 Oct 2024 01:13:54 +0000 (09:13 +0800)] 
RISC-V: Add testcases for form 2 of vector signed SAT_SUB

Form 2:
  #define DEF_VEC_SAT_S_SUB_FMT_2(T, UT, MIN, MAX)                     \
  void __attribute__((noinline))                                       \
  vec_sat_s_sub_##T##_fmt_2 (T *out, T *op_1, T *op_2, unsigned limit) \
  {                                                                    \
    unsigned i;                                                        \
    for (i = 0; i < limit; i++)                                        \
      {                                                                \
        T x = op_1[i];                                                 \
        T y = op_2[i];                                                 \
        T minus = (UT)x - (UT)y;                                       \
        out[i] = (x ^ y) >= 0 || (minus ^ x) >= 0                      \
          ? minus : x < 0 ? MIN : MAX;                                 \
      }                                                                \
  }

DEF_VEC_SAT_S_SUB_FMT_2(int8_t, uint8_t, INT8_MIN, INT8_MAX)
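
Form 2 detects overflow with sign-bit tests instead of a builtin: the subtraction cannot overflow when x and y have the same sign, and it did not overflow when the wrapped result has the same sign as x. The scalar sketch below is an illustration with invented names; it spells out the wrapping subtraction portably and compares the bit trick against a widened, clamped reference for every int8_t pair.

```c
#include <assert.h>
#include <stdint.h>

/* x - y with 8-bit two's complement wraparound, avoiding any
   implementation-defined narrowing conversion.  */
static int8_t toy_wrap_sub8(int8_t x, int8_t y)
{
  uint8_t u = (uint8_t) ((uint8_t) x - (uint8_t) y);
  return u >= 128 ? (int8_t) ((int) u - 256) : (int8_t) u;
}

/* Scalar body of form 2.  */
static int8_t toy_sat_sub_fmt_2(int8_t x, int8_t y)
{
  int8_t minus = toy_wrap_sub8(x, y);
  return (x ^ y) >= 0 || (minus ^ x) >= 0
    ? minus : x < 0 ? INT8_MIN : INT8_MAX;
}

/* Reference: subtract in a wider type, then clamp.  */
static int8_t toy_sat_sub_ref(int8_t x, int8_t y)
{
  int wide = (int) x - (int) y;
  assert(wide >= -255 && wide <= 255);
  if (wide < INT8_MIN)
    return INT8_MIN;
  if (wide > INT8_MAX)
    return INT8_MAX;
  return (int8_t) wide;
}

/* Count the (x, y) pairs on which form 2 matches the reference.  */
static int toy_count_fmt_2_matches(void)
{
  int matches = 0;
  for (int x = INT8_MIN; x <= INT8_MAX; x++)
    for (int y = INT8_MIN; y <= INT8_MAX; y++)
      if (toy_sat_sub_fmt_2((int8_t) x, (int8_t) y)
          == toy_sat_sub_ref((int8_t) x, (int8_t) y))
        matches++;
  return matches;
}
```

All 65536 pairs match the reference; for example -128 - 1 saturates to INT8_MIN and 127 - (-1) saturates to INT8_MAX.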

The test below passes with this patch.
* The rv64gcv full regression test.

This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper macros.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-2-i16.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-2-i32.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-2-i64.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-2-i8.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-2-i16.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-2-i32.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-2-i64.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-2-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months agotree-optimization/116290 - fix compare-debug issue in ldist
Richard Biener [Sun, 13 Oct 2024 13:12:44 +0000 (15:12 +0200)] 
tree-optimization/116290 - fix compare-debug issue in ldist

Loop distribution does different analysis with -g0/-g due to counting
a debug stmt starting a BB against a limit which will eventually
lead to different IVOPTs choices.  I've fixed a possible IVOPTs
issue on the way even though it doesn't make a difference here.

PR tree-optimization/116290
* tree-loop-distribution.cc (determine_reduction_stmt_1): PHIs
have no debug variants.  Start with first non-debug real stmt.
* tree-ssa-loop-ivopts.cc (find_givs_in_bb): Do not analyze
debug stmts.

* gcc.dg/pr116290.c: New testcase.

9 months agoSH: Fix cost estimation of mem load/store
Oleg Endo [Sun, 13 Oct 2024 02:36:38 +0000 (11:36 +0900)] 
SH: Fix cost estimation of mem load/store

For memory loads/stores (that contain a MEM rtx) sh_rtx_costs would wrongly
report a cost lower than 1 insn which is not accurate as it makes loads/stores
appear cheaper than simple arithmetic insns.  The cost of a load/store insn is
at least 1 insn plus the cost of the address expression (some addressing modes
can be considered more expensive than others due to additional constraints).
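
The cost rule described above amounts to "one insn plus the address cost". A toy C model follows; the addressing-mode names and their cost values are invented for illustration and are not taken from sh.cc, and the macro only mirrors the scaling that, to my understanding, GCC's COSTS_N_INSNS applies.

```c
#include <assert.h>

/* Toy version of the insn-cost scaling (assumed: 4 units per insn).  */
#define TOY_COSTS_N_INSNS(n) ((n) * 4)

/* Hypothetical addressing modes with hypothetical extra costs.  */
enum toy_addr_mode { TOY_ADDR_REG, TOY_ADDR_REG_DISP, TOY_ADDR_REG_INDEX };

static int toy_address_cost(enum toy_addr_mode mode)
{
  switch (mode)
    {
    case TOY_ADDR_REG:       return 0;
    case TOY_ADDR_REG_DISP:  return 1;
    case TOY_ADDR_REG_INDEX: return 2;
    }
  assert(!"unknown addressing mode");
  return 0;
}

/* A load/store costs at least one insn, plus the cost of its address.  */
static int toy_mem_cost(enum toy_addr_mode mode)
{
  return TOY_COSTS_N_INSNS(1) + toy_address_cost(mode);
}
```

In this model no memory access is ever cheaper than a simple arithmetic insn costing TOY_COSTS_N_INSNS (1), and more constrained addressing modes rank as more expensive.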

gcc/ChangeLog:

PR target/113533
* config/sh/sh.cc (sh_rtx_costs): Adjust cost estimation of MEM rtx
to be always at least COST_N_INSNS (1).  Forward speed argument to
sh_address_cost.

Co-authored-by: Roger Sayle <roger@nextmovesoftware.com>
9 months agoSH: Add -fno-math-errno to fsca,fsrra tests.
Oleg Endo [Sun, 13 Oct 2024 01:33:17 +0000 (10:33 +0900)] 
SH: Add -fno-math-errno to fsca,fsrra tests.

Without -fno-math-errno some of the tests might fail because the expected insns
will not be generated.

gcc/testsuite/ChangeLog:
* gcc.target/sh/pr53512-1.c: Add -fno-math-errno option.
* gcc.target/sh/pr53512-2.c: Likewise.
* gcc.target/sh/pr53512-3.c: Likewise.
* gcc.target/sh/pr53512-4.c: Likewise.
* gcc.target/sh/pr54680.c: Likewise.

9 months agoDaily bump.
GCC Administrator [Mon, 14 Oct 2024 00:17:37 +0000 (00:17 +0000)] 
Daily bump.

9 months agolibstdc++: testsuite: adjust name_fortify test for pre-defined _FORTIFY_SOURCE
Sam James [Sun, 13 Oct 2024 22:22:02 +0000 (23:22 +0100)] 
libstdc++: testsuite: adjust name_fortify test for pre-defined _FORTIFY_SOURCE

Otherwise we get failures with toolchains that have _FORTIFY_SOURCE
defined already to a different value like 3.

libstdc++-v3/ChangeLog:

* testsuite/17_intro/names_fortify.cc: Undefine _FORTIFY_SOURCE.

9 months agolibstdc++: Fix ranges::copy_backward for a single memcpyable element [PR117121]
Jonathan Wakely [Sun, 13 Oct 2024 18:14:04 +0000 (19:14 +0100)] 
libstdc++: Fix ranges::copy_backward for a single memcpyable element [PR117121]

The result iterator needs to be decremented before writing to it.
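
In pointer terms the rule looks like the C sketch below, which is an analogue with an invented function name rather than the ranges_algobase.h code. The destination pointer names the end of the output range, so it has to be stepped back before each store, including when only one element is copied.

```c
#include <assert.h>
#include <stddef.h>

/* Copy [first, last) so that the copy ends at result_end.
   Returns a pointer to the start of the written range.  */
static int *toy_copy_backward(const int *first, const int *last,
                              int *result_end)
{
  assert(first <= last);
  while (last != first)
    *--result_end = *--last;  /* decrement both before dereferencing */
  return result_end;
}
```

Copying a single element 42 to the end of a zeroed three-element array writes only the last slot and returns a pointer to it; storing through result_end without the decrement would write one past the array.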

Improve the PR 108846 tests for all of std::copy, std::copy_n,
std::copy_backward, and the std::ranges versions.

libstdc++-v3/ChangeLog:

PR libstdc++/117121
* include/bits/ranges_algobase.h (copy_backward): Decrement
output iterator before assigning one element through it.
* testsuite/25_algorithms/copy/108846.cc: Ensure the algorithm's
effects are correct for a single memcpyable element.
* testsuite/25_algorithms/copy_backward/108846.cc: Likewise.
* testsuite/25_algorithms/copy_n/108846.cc: Likewise.

9 months agoMAINTAINERS: Add myself to write after approval
Josef Melcr [Sun, 13 Oct 2024 17:14:13 +0000 (19:14 +0200)] 
MAINTAINERS: Add myself to write after approval

ChangeLog:

* MAINTAINERS: Add myself to write after approval

Signed-off-by: Josef Melcr <melcrjos@fit.cvut.cz>
9 months agoRevert "c++: Fix overeager Woverloaded-virtual with conversion operators [PR109918]"
Simon Martin [Sun, 13 Oct 2024 15:58:14 +0000 (17:58 +0200)] 
Revert "c++: Fix overeager Woverloaded-virtual with conversion operators [PR109918]"

This reverts commit 60163c85730e6b7c566e219222403ac87ddbbddd.

9 months agom68k: replace reload_in_progress by reload_in_progress || lra_in_progress
Andreas Schwab [Fri, 11 Oct 2024 08:28:38 +0000 (10:28 +0200)] 
m68k: replace reload_in_progress by reload_in_progress || lra_in_progress

For now assume that LRA needs the same treatment as reload.

* config/m68k/m68k.md ("movsi", "movxf"): Replace
reload_in_progress by reload_in_progress || lra_in_progress.
* config/m68k/m68k.cc (m68k_legitimate_mem_p)
(emit_move_sequence): Likewise.
* config/m68k/predicates.md ("fp_src_operand"): Likewise.

9 months agotree-optimization/116481 - avoid building function_type[]
Richard Biener [Sun, 13 Oct 2024 09:42:27 +0000 (11:42 +0200)] 
tree-optimization/116481 - avoid building function_type[]

The following avoids building an array type with function or method
element type during diagnosing an array bound violation as this
will result in an error, rejecting a program with a not too useful
error message.  Instead build such array type manually.

PR tree-optimization/116481
* pointer-query.cc (build_printable_array_type):
Build an array type with function or method element type
manually to avoid bogus diagnostic.

* gcc.dg/pr116481.c: New testcase.

9 months agoFortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device
Tobias Burnus [Sun, 13 Oct 2024 08:18:31 +0000 (10:18 +0200)] 
Fortran: Use OpenACC's acc_on_device builtin, fix OpenMP' __builtin_is_initial_device

It turned out that 'if (omp_is_initial_device() .eqv. true)' gave an ICE
due to comparing 'int' with 'logical(4)'. When digging deeper, it also
turned out that when the procedure pointer is needed, the builtin cannot
be used, either.  (Follow-up to r15-2799-gf1bfba3a9b3f31.)

Extend the code to also use the builtin acc_on_device with OpenACC,
which was previously only used in C/C++.  Additionally, fix folding
when offloading is not enabled.

Additionally fix the BT_BOOL data type, which was 'char'/integer(1)
rather than a true boolean type; use bool_type_node like the rest
of GCC.

gcc/fortran/ChangeLog:

* gfortran.h (gfc_option_t): Add disable_acc_on_device.
* options.cc (gfc_handle_option): Handle -fno-builtin-acc_on_device.
* trans-decl.cc (gfc_get_extern_function_decl): Move
__builtin_omp_is_initial_device handling to ...
* trans-expr.cc (get_builtin_fn): ... this new function.
(conv_function_val): Call it.
(update_builtin_function): New.
(gfc_conv_procedure_call): Call it.
* types.def (BT_BOOL): Fix type by using bool_type_node.

gcc/ChangeLog:

* gimple-fold.cc (gimple_fold_builtin_acc_on_device): Also fold
when offloading is not configured.

libgomp/ChangeLog:

* libgomp.texi (TR13): Fix minor typos.
(omp_is_initial_device): Improve wording.
(acc_on_device): Note how to disable the builtin.
* testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90: Remove TODO.
* testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f: Likewise.
Add -fno-builtin-acc_on_device.
* testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f: Likewise.
* testsuite/libgomp.oacc-c-c++-common/routine-nohost-1.c: Update
dg- as !offloading_enabled now compile-time expands acc_on_device.
* testsuite/libgomp.fortran/target-is-initial-device-3.f90: New test.
* testsuite/libgomp.oacc-fortran/acc_on_device-2.f90: New test.

9 months ago[RISC-V] Avoid unnecessary extensions when value is already extended
Jivan Hakobyan [Sun, 13 Oct 2024 01:10:50 +0000 (19:10 -0600)] 
[RISC-V] Avoid unnecessary extensions when value is already extended

This is a minor patch from Jivan from roughly a year ago.  The basic
idea here is similar to what we do when extending values for the sake of
comparisons.  Specifically if the value is already known to be properly
extended, then an extension is just a copy.

The original idea was to use a similar patch, but one which aborted, to
identify cases where these unnecessary promotions were emitted.  All
that showed up when doing a testsuite run with that abort was the
promotions created by the arithmetic with overflow patterns such as addv.

Things like addv aren't *that* common so this never got high on my todo
list, even after a minor issue in this space was raised in bugzilla.

But with stage1 closing soon and no good reason not to go forward, I'm
submitting this into the pre-commit tester now.  My tester has been
using it since roughly Feb :-)  Plan would be to commit after the
pre-commit tester renders its verdict.

* config/riscv/riscv.md (zero_extendsidi2): If RHS is already
zero extended, then this is just a copy.
(extendsidi2): Similarly, but for sign extension.

9 months agoDaily bump.
GCC Administrator [Sun, 13 Oct 2024 00:18:21 +0000 (00:18 +0000)] 
Daily bump.

9 months agoUnsigned constants for ISO_FORTRAN_ENV and ISO_C_BINDING.
Thomas Koenig [Sat, 12 Oct 2024 17:09:14 +0000 (19:09 +0200)] 
Unsigned constants for ISO_FORTRAN_ENV and ISO_C_BINDING.

gcc/fortran/ChangeLog:

* dump-parse-tree.cc (get_c_type_name): Also handle BT_UNSIGNED.
* gfortran.h (NAMED_UINTCST): Define before inclusion
of iso-c-binding.def and iso-fortran-env.def.
(gfc_get_uint_kind_from_width_isofortranenv): Prototype.
* gfortran.texi: Mention new constants in iso_c_binding and
iso_fortran_env.
* iso-c-binding.def: Handle NAMED_UINTCST. Add c_unsigned,
c_unsigned_short, c_unsigned_char, c_unsigned_long,
c_unsigned_long_long, c_uintmax_t, c_uint8_t, c_uint16_t,
c_uint32_t, c_uint64_t, c_uint128_t, c_uint_least8_t,
c_uint_least16_t, c_uint_least32_t, c_uint_least64_t,
c_uint_least128_t, c_uint_fast8_t, c_uint_fast16_t,
c_uint_fast32_t, c_uint_fast64_t and c_uint_fast128_t.
* iso-fortran-env.def: Handle NAMED_UINTCST. Add uint8, uint16,
uint32 and uint64.
* module.cc (parse_integer): Whitespace fix.
(write_module): Whitespace fix.
(NAMED_UINTCST): Define before inclusion of iso-c-binding.def
and iso-fortran-env.def.
* symbol.cc: Likewise.
* trans-types.cc (get_unsigned_kind_from_node): New function.
(get_uint_kind_from_name): New function.
(gfc_get_uint_kind_from_width_isofortranenv): New function.
(get_uint_kind_from_width): New function.
(gfc_init_kinds): Initialize gfc_c_uint_kind.

gcc/testsuite/ChangeLog:

* gfortran.dg/unsigned_36.f90: New test.

9 months agovect: Fix inconsistency in fully-masked lane-reducing op generation [PR116985]
Feng Xue [Fri, 11 Oct 2024 06:55:05 +0000 (14:55 +0800)] 
vect: Fix inconsistency in fully-masked lane-reducing op generation [PR116985]

To align vectorized def/use when lane-reducing op is present in loop reduction,
we may need to insert extra trivial pass-through copies, which would cause
mismatch between lane-reducing vector copy and loop mask index. This could be
fixed by computing the right index around a new counter on effective lane-
reducing vector copies.

2024-10-11 Feng Xue <fxue@os.amperecomputing.com>

gcc/
PR tree-optimization/116985
* tree-vect-loop.cc (vect_transform_reduction): Compute loop mask
index based on effective vector copies for reduction op.

gcc/testsuite/
PR tree-optimization/116985
* gcc.dg/vect/pr116985.c: New testcase.

9 months agotree-optimization/117104 - add missed guards to max(a,b) != a simplification
Richard Biener [Sat, 12 Oct 2024 12:51:37 +0000 (14:51 +0200)] 
tree-optimization/117104 - add missed guards to max(a,b) != a simplification

For vector types we have to make sure the comparison result is a vector
type and the resulting compare operation is supported.  As the resulting
compare is never an equality compare I didn't bother to check for the
cbranch case.
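
As a rough scalar illustration of the identity behind this simplification (the commit itself is about guarding the vector case, which this sketch does not model), max(a,b) != a holds exactly when a < b.  The function names below are made up for the example and are not GCC internals:

```c
#include <assert.h>

/* Scalar stand-in for a max operation; hypothetical helper.  */
int max_int (int a, int b) { return a > b ? a : b; }

/* Unsimplified form: max (a, b) != a.  */
int max_ne_orig (int a, int b) { return max_int (a, b) != a; }

/* Simplified form: a < b.  */
int max_ne_simplified (int a, int b) { return a < b; }

/* Compare both forms over [lo, hi] x [lo, hi] and return the number
   of pairs on which they disagree.  */
int count_mismatches (int lo, int hi)
{
  int bad = 0;
  assert (lo <= hi);
  for (int a = lo; a <= hi; a++)
    for (int b = lo; b <= hi; b++)
      if (max_ne_orig (a, b) != max_ne_simplified (a, b))
        bad++;
  return bad;
}
```

For vectors the comparison yields a per-lane result, which is why the result type and target support have to be checked before the rewrite is applied.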

PR tree-optimization/117104
* match.pd ((cmp:c (minmax:c @0 @1) @0) -> (out @0 @1)): Properly
guard the vector case.

* gcc.dg/pr117104.c: New testcase.

9 months ago[RISC-V] Slightly improve broadcasting small constants into vectors
Jeff Law [Sat, 12 Oct 2024 13:12:53 +0000 (07:12 -0600)] 
[RISC-V] Slightly improve broadcasting small constants into vectors

I probably spent way more time on this than it's worth...

I was looking at the code we generate for vector SAD and noticed that we were
being a bit silly.  Specifically:

        li      a4,0            # 272   [c=4 l=4]  *movsi_internal/1

Followed shortly by:

        vmv.s.x v3,a4   # 261   [c=4 l=4]  *pred_broadcastrvvm1si/6

And no other uses of a4.  We could have used x0 trivially.

First we adjust the expander so that it doesn't force the constant into a
register.  In the matching pattern we change the appropriate source constraints
from "r" to "rJ" and the output template is changed to use %z for the operand.
The net is we drop the li completely and emit vmv.s.x v3,x0.

But wait, there's more.  If we're broadcasting a constant in the range
[-16..15] into a vector, we currently load the constant into a register and use
vmv.v.x.  We can instead use vmv.v.i, which avoids loading the constant into a
GPR.  For that case we again avoid forcing the constant into a register in the
expander and adjust the output template to emit vmv.v.x or vmv.v.i based on
whether or not the appropriate operand is a constant or general purpose
register.  So again, we'll drop a load immediate into a scalar for this case.

Whether or not we should use vmv.v.i vs vmv.s.x for loading [-16..15] into the
0th element is probably uarch dependent.  The tradeoff is loading the GPR vs
the broadcast in the vector unit.  I didn't bother with this case.
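
The choice between the two broadcast forms can be sketched in plain C.  This is only an illustration of the [-16..15] range test described above; the enum and function names are hypothetical and do not correspond to the actual code in vector.md or constraints.md:

```c
#include <assert.h>

/* Hypothetical labels for the two broadcast forms discussed above.  */
enum splat_form { SPLAT_IMM, SPLAT_GPR };

/* Does V fit the 5-bit signed immediate range [-16, 15]?  */
int fits_simm5 (long v) { return v >= -16 && v <= 15; }

/* Pick the form: a small constant can use the immediate variant,
   anything else has to come from a general purpose register.  */
enum splat_form choose_splat_form (int is_const, long v)
{
  if (!is_const)
    return SPLAT_GPR;
  return fits_simm5 (v) ? SPLAT_IMM : SPLAT_GPR;
}

/* Mnemonic corresponding to each form.  */
const char *splat_mnemonic (enum splat_form form)
{
  assert (form == SPLAT_IMM || form == SPLAT_GPR);
  return form == SPLAT_IMM ? "vmv.v.i" : "vmv.v.x";
}
```

With this split, a constant such as 7 would select vmv.v.i and never touch a GPR, while 100 would still need a load immediate followed by vmv.v.x.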

Tested in my tester (which tests rv64gcv as a default codegen option). Will
wait for the pre-commit tester to render a verdict.

gcc/
* config/riscv/constraints.md (P): New constraint.
* config/riscv/vector.md (pred_broadcast<mode> expander): Do
not force small integers into GPRs so aggressively.
(pred_broadcast<mode> insn & splitter): Allow splatting small
constants across the vector register directly.  Allow splatting
(const_int 0) into element 0 directly.

9 months agoFortran/OpenMP: Warn when mapping polymorphic variables
Tobias Burnus [Sat, 12 Oct 2024 12:55:22 +0000 (14:55 +0200)] 
Fortran/OpenMP: Warn when mapping polymorphic variables

OpenMP (TR13) states for Fortran:
* For map: "If a list item has polymorphic type, the behavior is unspecified."
* "If the firstprivate clause is on a target construct and a variable is of
  polymorphic type, the behavior is unspecified."
which this commit now warns for.

gcc/fortran/ChangeLog:

* openmp.cc (resolve_omp_clauses): Diagnose polymorphic mapping.
* trans-openmp.cc (gfc_omp_finish_clause): Warn when
polymorphic variable is implicitly mapped.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/polymorphic-mapping.f90: New test.
* gfortran.dg/gomp/polymorphic-mapping-2.f90: New test.

9 months agobootstrap: Fix genmatch build where system gcc defaults to -fPIE -pie
Jakub Jelinek [Sat, 12 Oct 2024 11:47:45 +0000 (13:47 +0200)] 
bootstrap: Fix genmatch build where system gcc defaults to -fPIE -pie

Seems our buildbot is unhappy about my latest commit to link genmatch with
libcommon.a in order to support gcc_diag diagnostics in libcpp.

We have in gcc/configure.ac:
if test x$enable_host_shared = xyes; then
  PICFLAG=-fPIC
elif test x$enable_host_pie = xyes; then
  PICFLAG=-fPIE
elif test x$gcc_cv_c_no_fpie = xyes; then
  PICFLAG=-fno-PIE
else
  PICFLAG=
fi

if test x$enable_host_pie = xyes; then
  LD_PICFLAG=-pie
elif test x$gcc_cv_no_pie = xyes; then
  LD_PICFLAG=-no-pie
else
  LD_PICFLAG=
fi

if test x$enable_host_bind_now = xyes; then
  LD_PICFLAG="$LD_PICFLAG -Wl,-z,now"
fi

Now, for object files linked into cc1, cc1plus, xgcc etc. we carefully
arrange for them to be compiled with $(PICFLAG) and do the link with
$(LD_PICFLAG).
For the generator programs, we don't do anything like that, we simply
compile their objects without $(PICFLAG) and link without $(LD_PICFLAG).
It isn't that big a deal: the generator programs run once or a couple of
times during the build and that is it; we don't ship them and don't
care much if they are PIE or not.
Except that after my changes to link in libcommon.a into build/genmatch,
we now link -fno-PIE compiled objects into a binary which is linked with
default flags.  Our distro compiler just links a normal executable and
everything works fine (-fPIE/-pie is added through spec file snippet and
just added in rpm default flags), but seems the buildbot system gcc
defaults to -fPIE -pie instead and so building build/genmatch fails.

The following patch is a minimal fix for that, just add -no-pie when
linking build/genmatch, but don't add -pie.

If we wanted to start building even the build/gen* tools with $(PICFLAG)
and $(LD_PICFLAG), that would be much larger change.

2024-10-12  Jakub Jelinek  <jakub@redhat.com>

* Makefile.in (LINKER_FOR_BUILD): Append -no-pie if it is in
$(LD_PICFLAG) when building build/genmatch.

9 months agogcc.target/i386/pr55583.c: Use long long for 64-bit integer
H.J. Lu [Fri, 11 Oct 2024 22:15:28 +0000 (06:15 +0800)] 
gcc.target/i386/pr55583.c: Use long long for 64-bit integer

Since long is 32-bit for x32, use long long for 64-bit integer.
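
A small sketch of why the type matters, assuming nothing about what pr55583.c actually contains: the C standard only guarantees 32 bits for long but at least 64 bits for long long, so code that needs a 64-bit integer on an ILP32 ABI such as x32 should use the latter.  The helper names are made up for illustration:

```c
#include <assert.h>
#include <limits.h>

/* Width in bits of each candidate type on the current target.  */
int long_bits (void) { return (int) (sizeof (long) * CHAR_BIT); }
int long_long_bits (void) { return (int) (sizeof (long long) * CHAR_BIT); }

/* Needs a 64-bit type: shifting a 32-bit long by 63 would be
   undefined, whereas unsigned long long is always wide enough.  */
unsigned long long high_bit (void)
{
  assert (long_long_bits () >= 64);
  return 1ULL << 63;
}
```

On x86-64 LP64 both widths are 64; on x32 long_bits () would report 32 while long_long_bits () stays at 64.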

* gcc.target/i386/pr55583.c: Use long long for 64-bit integer.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
9 months agogcc.target/i386/pr115749.c: Use word_mode integer
H.J. Lu [Fri, 11 Oct 2024 21:22:52 +0000 (05:22 +0800)] 
gcc.target/i386/pr115749.c: Use word_mode integer

Use word_mode integer with func so that 64-bit integer is used with
x32.

* gcc.target/i386/pr115749.c (uword): New.
(func): Replace unsigned long with uword.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
9 months agogcc.target/i386/invariant-ternlog-1.c: Also scan (%edx)
H.J. Lu [Fri, 11 Oct 2024 21:04:33 +0000 (05:04 +0800)] 
gcc.target/i386/invariant-ternlog-1.c: Also scan (%edx)

Since x32 uses (%edx), instead of (%rdx), also scan (%edx).

* gcc.target/i386/invariant-ternlog-1.c: Also scan (%edx).

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
9 months agolibcpp, genmatch: Use gcc_diag instead of printf for libcpp diagnostics
Jakub Jelinek [Sat, 12 Oct 2024 08:44:17 +0000 (10:44 +0200)] 
libcpp, genmatch: Use gcc_diag instead of printf for libcpp diagnostics

When working on #embed support, or -Wheader-guard or other recent libcpp
changes, I've been annoyed by the libcpp diagnostics being visually
different from normal gcc diagnostics, especially in the area of quoting
stuff in the diagnostic messages.
Normal GCC diagnostics are gcc_diag/gcc_tdiag; one can use
%</%>, %qs etc. in there, while libcpp diagnostics was marked as printf
and in libcpp we've been very creative with quoting stuff, either
no quotes at all, or "something" quoting, or 'something' quoting, or
`something' quoting (but in none of the cases it used colors consistently
with the rest of the compiler).
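
To make the quoting directives concrete, here is a toy formatter that handles only %s, %qs, %<, %> and %%.  It is a sketch of the idea, not GCC's pretty-printer: the function name is invented, it always emits plain ASCII quotes, and it ignores colors and locale-dependent quote characters:

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <string.h>

/* Append SRC to OUT (capacity CAP, current length *LEN), truncating
   if it does not fit.  */
static void append (char *out, size_t cap, size_t *len, const char *src)
{
  size_t room = cap - 1 - *len;
  size_t n = strlen (src);
  if (n > room)
    n = room;
  memcpy (out + *len, src, n);
  *len += n;
  out[*len] = '\0';
}

/* Toy gcc_diag-like formatter (hypothetical, for illustration only).  */
void toy_diag_format (char *out, size_t cap, const char *fmt, ...)
{
  va_list ap;
  size_t len = 0;
  const char *p = fmt;

  assert (out != NULL && cap > 0);
  out[0] = '\0';
  va_start (ap, fmt);
  while (*p)
    {
      if (*p != '%')
        {
          char c[2] = { *p++, '\0' };
          append (out, cap, &len, c);
          continue;
        }
      p++;  /* Skip the '%'.  */
      if (*p == '<' || *p == '>')
        {
          /* Opening or closing quote around literal text.  */
          append (out, cap, &len, "'");
          p++;
        }
      else if (*p == 's')
        {
          append (out, cap, &len, va_arg (ap, const char *));
          p++;
        }
      else if (p[0] == 'q' && p[1] == 's')
        {
          /* Quoted string argument.  */
          append (out, cap, &len, "'");
          append (out, cap, &len, va_arg (ap, const char *));
          append (out, cap, &len, "'");
          p += 2;
        }
      else if (*p == '%')
        {
          append (out, cap, &len, "%");
          p++;
        }
      else
        break;  /* Unsupported or truncated directive: stop.  */
    }
  va_end (ap);
}
```

Centralizing the quoting in the formatter, rather than hand-writing quote characters in each message, is what lets every diagnostic be quoted the same way.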

Now, libcpp diagnostics is always emitted using a callback,
pfile->cb.diagnostic.  On the gcc/ side, this callback is initialized with
genmatch.cc:  cb->diagnostic = diagnostic_cb;
c-family/c-opts.cc:  cb->diagnostic = c_cpp_diagnostic;
fortran/cpp.cc:  cb->diagnostic = cb_cpp_diagnostic;
where the latter two just use diagnostic_report_diagnostic, so actually
support all the gcc_diag stuff, only the genmatch.cc case didn't.

So, the following patch changes genmatch.cc to use pp_format* instead
of vfprintf so that it supports the gcc_diag formatting (pretty-print.o
unfortunately has various dependencies, so had to link genmatch with
libcommon.a libbacktrace.a and tweak Makefile.in so that there are no
circular dependencies) and marks the libcpp diagnostic routines as
gcc_diag rather than printf.  That change resulted in hundreds of
-Wformat-diag new warnings (most of them useful and resulting IMHO in
better diagnostics), so the rest of the patch is changing the format
strings to make -Wformat-diag happy and adjusting the testsuite for
the differences in how the diagnostic is reformatted.

Dunno if some out of GCC tree projects use libcpp, that case would
make it harder because one couldn't use vfprintf in the diagnostic
callback anymore, but there is always David's libdiagnostic which could
be used for that purpose IMHO.

2024-10-12  Jakub Jelinek  <jakub@redhat.com>

libcpp/
* include/cpplib.h (ATTRIBUTE_CPP_PPDIAG): Define.
(struct cpp_callbacks): Use ATTRIBUTE_CPP_PPDIAG instead of
ATTRIBUTE_FPTR_PRINTF on diagnostic callback.
(cpp_error, cpp_warning, cpp_pedwarning, cpp_warning_syshdr): Use
ATTRIBUTE_CPP_PPDIAG (3, 4) instead of ATTRIBUTE_PRINTF_3.
(cpp_warning_at, cpp_pedwarning_at): Use ATTRIBUTE_CPP_PPDIAG (4, 5)
instead of ATTRIBUTE_PRINTF_4.
(cpp_error_with_line, cpp_warning_with_line, cpp_pedwarning_with_line,
cpp_warning_with_line_syshdr): Use ATTRIBUTE_CPP_PPDIAG (5, 6)
instead of ATTRIBUTE_PRINTF_5.
(cpp_error_at): Use ATTRIBUTE_CPP_PPDIAG (4, 5) instead of
ATTRIBUTE_PRINTF_4.
* Makefile.in (po/$(PACKAGE).pot): Use --language=GCC-source rather
than --language=c.
* errors.cc (cpp_diagnostic_at, cpp_diagnostic,
cpp_diagnostic_with_line): Use ATTRIBUTE_CPP_PPDIAG instead of
ATTRIBUTE_FPTR_PRINTF.
* charset.cc (cpp_host_to_exec_charset, _cpp_valid_ucn, convert_hex,
convert_oct, convert_escape): Fix up -Wformat-diag warnings.
(cpp_interpret_string_ranges, count_source_chars): Use
ATTRIBUTE_CPP_PPDIAG instead of ATTRIBUTE_FPTR_PRINTF.
(narrow_str_to_charconst): Fix up -Wformat-diag warnings.
* directives.cc (check_eol_1, directive_diagnostics, lex_macro_node,
do_undef, glue_header_name, parse_include, do_include_common,
do_include_next, _cpp_parse_embed_params, do_embed, read_flag,
do_line, do_linemarker, register_pragma_1, do_pragma_once,
do_pragma_push_macro, do_pragma_pop_macro, do_pragma_poison,
do_pragma_system_header, do_pragma_warning_or_error, _cpp_do__Pragma,
do_else, do_elif, do_endif, parse_answer, do_assert,
cpp_define_unused): Likewise.
* expr.cc (cpp_classify_number, parse_defined, eval_token,
_cpp_parse_expr, reduce, check_promotion): Likewise.
* files.cc (_cpp_find_file, finish_base64_embed,
_cpp_pop_file_buffer): Likewise.
* init.cc (sanity_checks): Likewise.
* lex.cc (_cpp_process_line_notes, maybe_warn_bidi_on_char,
_cpp_warn_invalid_utf8, _cpp_skip_block_comment,
warn_about_normalization, forms_identifier_p, maybe_va_opt_error,
identifier_diagnostics_on_lex, cpp_maybe_module_directive): Likewise.
* macro.cc (class vaopt_state, builtin_has_include_1,
builtin_has_include, builtin_has_embed, _cpp_warn_if_unused_macro,
_cpp_builtin_macro_text, builtin_macro, stringify_arg,
_cpp_arguments_ok, collect_args, enter_macro_context,
_cpp_save_parameter, parse_params, create_iso_definition,
_cpp_create_definition, check_trad_stringification): Likewise.
* pch.cc (cpp_valid_state): Likewise.
* traditional.cc (_cpp_scan_out_logical_line, recursive_macro):
Likewise.
gcc/
* Makefile.in (generated_files): Remove {gimple,generic}-match*.
(generated_match_files): New variable.  Add a dependency of
$(filter-out $(OBJS-libcommon),$(ALL_HOST_OBJS)) files on those.
(build/genmatch$(build_exeext)): Depend on and link against
libcommon.a and $(LIBBACKTRACE).
* genmatch.cc: Include pretty-print.h and input.h.
(ggc_internal_cleared_alloc, ggc_free): Remove.
(fatal): New function.
(line_table): Remove.
(linemap_client_expand_location_to_spelling_point): Remove.
(diagnostic_cb): Use gcc_diag rather than printf format.  Use
pp_format_verbatim on a temporary pretty_printer instead of
vfprintf.
(fatal_at, warning_at): Use gcc_diag rather than printf format.
(output_line_directive): Rename location_hash to loc_hash.
(parser::eat_ident, parser::parse_operation, parser::parse_expr,
parser::parse_pattern, parser::finish_match_operand): Fix up
-Wformat-diag warnings.
gcc/c-family/
* c-lex.cc (c_common_has_attribute,
c_common_lex_availability_macro): Fix up -Wformat-diag warnings.
gcc/testsuite/
* c-c++-common/cpp/counter-2.c: Adjust expected diagnostics for
libcpp diagnostic formatting changes.
* c-c++-common/cpp/embed-3.c: Likewise.
* c-c++-common/cpp/embed-4.c: Likewise.
* c-c++-common/cpp/embed-16.c: Likewise.
* c-c++-common/cpp/embed-18.c: Likewise.
* c-c++-common/cpp/eof-2.c: Likewise.
* c-c++-common/cpp/eof-3.c: Likewise.
* c-c++-common/cpp/fmax-include-depth.c: Likewise.
* c-c++-common/cpp/has-builtin.c: Likewise.
* c-c++-common/cpp/line-2.c: Likewise.
* c-c++-common/cpp/line-3.c: Likewise.
* c-c++-common/cpp/macro-arg-count-1.c: Likewise.
* c-c++-common/cpp/macro-arg-count-2.c: Likewise.
* c-c++-common/cpp/macro-ranges.c: Likewise.
* c-c++-common/cpp/named-universal-char-escape-4.c: Likewise.
* c-c++-common/cpp/named-universal-char-escape-5.c: Likewise.
* c-c++-common/cpp/pr88974.c: Likewise.
* c-c++-common/cpp/va-opt-error.c: Likewise.
* c-c++-common/cpp/va-opt-pedantic.c: Likewise.
* c-c++-common/cpp/Wheader-guard-2.c: Likewise.
* c-c++-common/cpp/Wheader-guard-3.c: Likewise.
* c-c++-common/cpp/Winvalid-utf8-1.c: Likewise.
* c-c++-common/cpp/Winvalid-utf8-2.c: Likewise.
* c-c++-common/cpp/Winvalid-utf8-3.c: Likewise.
* c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-1.c:
Likewise.
* c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-3.c:
Likewise.
* c-c++-common/pr68833-3.c: Likewise.
* c-c++-common/raw-string-directive-1.c: Likewise.
* gcc.dg/analyzer/named-constants-Wunused-macros.c: Likewise.
* gcc.dg/binary-constants-4.c: Likewise.
* gcc.dg/builtin-redefine.c: Likewise.
* gcc.dg/cpp/19951025-1.c: Likewise.
* gcc.dg/cpp/c11-warning-1.c: Likewise.
* gcc.dg/cpp/c11-warning-2.c: Likewise.
* gcc.dg/cpp/c11-warning-3.c: Likewise.
* gcc.dg/cpp/c23-elifdef-2.c: Likewise.
* gcc.dg/cpp/c23-warning-2.c: Likewise.
* gcc.dg/cpp/embed-2.c: Likewise.
* gcc.dg/cpp/embed-3.c: Likewise.
* gcc.dg/cpp/embed-4.c: Likewise.
* gcc.dg/cpp/expr.c: Likewise.
* gcc.dg/cpp/gnu11-elifdef-2.c: Likewise.
* gcc.dg/cpp/gnu11-elifdef-3.c: Likewise.
* gcc.dg/cpp/gnu11-elifdef-4.c: Likewise.
* gcc.dg/cpp/gnu11-warning-1.c: Likewise.
* gcc.dg/cpp/gnu11-warning-2.c: Likewise.
* gcc.dg/cpp/gnu11-warning-3.c: Likewise.
* gcc.dg/cpp/gnu23-warning-2.c: Likewise.
* gcc.dg/cpp/include6.c: Likewise.
* gcc.dg/cpp/pr35322.c: Likewise.
* gcc.dg/cpp/tr-warn6.c: Likewise.
* gcc.dg/cpp/undef2.c: Likewise.
* gcc.dg/cpp/warn-comments.c: Likewise.
* gcc.dg/cpp/warn-comments-2.c: Likewise.
* gcc.dg/cpp/warn-comments-3.c: Likewise.
* gcc.dg/cpp/warn-cxx-compat.c: Likewise.
* gcc.dg/cpp/warn-cxx-compat-2.c: Likewise.
* gcc.dg/cpp/warn-deprecated.c: Likewise.
* gcc.dg/cpp/warn-deprecated-2.c: Likewise.
* gcc.dg/cpp/warn-long-long.c: Likewise.
* gcc.dg/cpp/warn-long-long-2.c: Likewise.
* gcc.dg/cpp/warn-normalized-1.c: Likewise.
* gcc.dg/cpp/warn-normalized-2.c: Likewise.
* gcc.dg/cpp/warn-normalized-3.c: Likewise.
* gcc.dg/cpp/warn-normalized-4-bytes.c: Likewise.
* gcc.dg/cpp/warn-normalized-4-unicode.c: Likewise.
* gcc.dg/cpp/warn-redefined.c: Likewise.
* gcc.dg/cpp/warn-redefined-2.c: Likewise.
* gcc.dg/cpp/warn-traditional.c: Likewise.
* gcc.dg/cpp/warn-traditional-2.c: Likewise.
* gcc.dg/cpp/warn-trigraphs-1.c: Likewise.
* gcc.dg/cpp/warn-trigraphs-2.c: Likewise.
* gcc.dg/cpp/warn-trigraphs-3.c: Likewise.
* gcc.dg/cpp/warn-trigraphs-4.c: Likewise.
* gcc.dg/cpp/warn-undef.c: Likewise.
* gcc.dg/cpp/warn-undef-2.c: Likewise.
* gcc.dg/cpp/warn-unused-macros.c: Likewise.
* gcc.dg/cpp/warn-unused-macros-2.c: Likewise.
* gcc.dg/pch/counter-2.c: Likewise.
* g++.dg/cpp0x/udlit-error1.C: Likewise.
* g++.dg/cpp23/named-universal-char-escape1.C: Likewise.
* g++.dg/cpp23/named-universal-char-escape2.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-1.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-2.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-3.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-4.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-5.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-6.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-7.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-8.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-9.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-10.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-11.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-12.C: Likewise.
* g++.dg/cpp/elifdef-3.C: Likewise.
* g++.dg/cpp/elifdef-5.C: Likewise.
* g++.dg/cpp/elifdef-6.C: Likewise.
* g++.dg/cpp/elifdef-7.C: Likewise.
* g++.dg/cpp/embed-1.C: Likewise.
* g++.dg/cpp/embed-2.C: Likewise.
* g++.dg/cpp/pedantic-errors.C: Likewise.
* g++.dg/cpp/warning-1.C: Likewise.
* g++.dg/cpp/warning-2.C: Likewise.
* g++.dg/ext/bitint1.C: Likewise.
* g++.dg/ext/bitint2.C: Likewise.

9 months agoFortran: Unify gfc_get_location handling; fix expr->ts bug
Tobias Burnus [Sat, 12 Oct 2024 08:48:41 +0000 (10:48 +0200)] 
Fortran: Unify gfc_get_location handling; fix expr->ts bug

This commit reduces code duplication by moving gfc_get_location
from trans.cc to error.cc.  The gcc_assert is now used more often
and revealed a bug in gfc_match_array_constructor where the union
expr->ts.u.derived of a derived type is partially overwritten by
the assignment expr->ts.u.cl->... as a ts.type == BT_CHARACTER check
was missing.
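
The class of bug can be sketched with a small tagged union.  The types and names below are invented stand-ins and are much simpler than gfortran's real gfc_typespec; the point is only that the two pointer members share storage, so writing through the character-length member is valid only when the tag says the type is character:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real structures.  */
enum toy_btype { TOY_BT_CHARACTER, TOY_BT_DERIVED };

struct toy_charlen { int length; };
struct toy_derived { const char *name; };

struct toy_typespec
{
  enum toy_btype type;
  union
  {
    struct toy_charlen *cl;       /* Valid only for TOY_BT_CHARACTER.  */
    struct toy_derived *derived;  /* Valid only for TOY_BT_DERIVED.  */
  } u;
};

/* Update the character length only if TS really is a character type.
   Without the tag check, u.cl would alias u.derived and the store
   would scribble over the derived-type object.  Returns 1 if the
   length was updated, 0 otherwise.  */
int toy_set_char_length (struct toy_typespec *ts, int length)
{
  assert (ts != NULL);
  if (ts->type != TOY_BT_CHARACTER)
    return 0;
  ts->u.cl->length = length;
  return 1;
}
```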

gcc/fortran/ChangeLog:

* array.cc (gfc_match_array_constructor): Only update the
character length if the expression is of character type.
* error.cc (gfc_get_location_with_offset): New; split off
from ...
(gfc_format_decoder): ... here; call it.
* gfortran.h (gfc_get_location_with_offset): New prototype.
(gfc_get_location): New inline function.
* trans.cc (gfc_get_location): Remove function definition.
* trans.h (gfc_get_location): Remove declaration.

9 months agotestsuite/i386: Add vector sat_sub testcases [PR112600]
Uros Bizjak [Sat, 12 Oct 2024 08:04:03 +0000 (10:04 +0200)] 
testsuite/i386: Add vector sat_sub testcases [PR112600]

PR middle-end/112600

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr112600-4a.c: New test.
* gcc.target/i386/pr112600-4b.c: New test.

9 months agoMAINTAINERS: Add myself to write after approval
Feng Xue [Sat, 12 Oct 2024 07:45:58 +0000 (15:45 +0800)] 
MAINTAINERS: Add myself to write after approval

ChangeLog:

* MAINTAINERS: Add myself to write after approval.

9 months agoc++: Fix overeager Woverloaded-virtual with conversion operators [PR109918]
Simon Martin [Fri, 11 Oct 2024 08:16:26 +0000 (10:16 +0200)] 
c++: Fix overeager Woverloaded-virtual with conversion operators [PR109918]

We currently emit an incorrect -Woverloaded-virtual warning upon the
following test case

=== cut here ===
struct A {
  virtual operator int() { return 42; }
  virtual operator char() = 0;
};
struct B : public A {
  operator char() { return 'A'; }
};
=== cut here ===

The problem is that when iterating over ovl_range (fns), warn_hidden
gets confused by the conversion operator marker, concludes that
seen_non_override is true and therefore emits a warning for all
conversion operators in A that do not convert to char, even if
-Woverloaded-virtual is 1 (e.g. with -Wall, the case reported).

A second set of problems is highlighted when -Woverloaded-virtual is 2.

First, with the same test case, since base_fndecls contains all
conversion operators in A (except the one to char, that's been removed
when iterating over ovl_range (fns)), we emit a spurious warning for
the conversion operator to int, even though it's unrelated.

Second, in case there are several conversion operators with different
cv-qualifiers to the same type in A, we rightfully emit a warning,
however the note uses the location of the conversion operator marker
instead of the right one; location_of should go over conv_op_marker.

This patch fixes all these by explicitly keeping track of (1) base
methods that are overridden, as well as (2) base methods that are hidden
but not overridden (and by what), and warning about methods that are in
(2) but not (1). It also ignores non virtual base methods, per
"definition" of -Woverloaded-virtual.

PR c++/109918

gcc/cp/ChangeLog:

* class.cc (warn_hidden): Keep track of overloaded and of hidden
base methods. Mention the actual hiding function in the warning,
not the first overload.
* error.cc (location_of): Skip over conv_op_marker.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Woverloaded-virt1.C: Check that no warning is
emitted for non virtual base methods.
* g++.dg/warn/Woverloaded-virt5.C: New test.
* g++.dg/warn/Woverloaded-virt6.C: New test.
* g++.dg/warn/Woverloaded-virt7.C: New test.
* g++.dg/warn/Woverloaded-virt8.C: New test.
* g++.dg/warn/Woverloaded-virt9.C: New test.

9 months agoRISC-V: Add testcases for form 1 of vector signed SAT_SUB
Pan Li [Fri, 11 Oct 2024 04:12:03 +0000 (12:12 +0800)] 
RISC-V: Add testcases for form 1 of vector signed SAT_SUB

Form 1:
  #define DEF_VEC_SAT_S_SUB_FMT_1(T, UT, MIN, MAX)                     \
  void __attribute__((noinline))                                       \
  vec_sat_s_sub_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \
  {                                                                    \
    unsigned i;                                                        \
    for (i = 0; i < limit; i++)                                        \
      {                                                                \
        T x = op_1[i];                                                 \
        T y = op_2[i];                                                 \
        T minus = (UT)x - (UT)y;                                       \
        out[i] = (x ^ y) >= 0                                          \
          ? minus                                                      \
          : (minus ^ x) >= 0                                           \
            ? minus                                                    \
            : x < 0 ? MIN : MAX;                                       \
      }                                                                \
  }

DEF_VEC_SAT_S_SUB_FMT_1(int8_t, uint8_t, INT8_MIN, INT8_MAX)
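
For readers who want to check the branch logic of Form 1 in isolation, here is a scalar int8_t version of the loop body together with a widening reference.  The function names are made up for this sketch, and the narrowing conversion to int8_t relies on the modulo behavior that GCC documents for out-of-range signed conversions:

```c
#include <assert.h>
#include <stdint.h>

/* Scalar int8_t instance of the Form 1 loop body.  */
int8_t sat_s_sub_i8 (int8_t x, int8_t y)
{
  /* Wrapping subtraction done in the unsigned type.  */
  int8_t minus = (int8_t) (uint8_t) ((uint8_t) x - (uint8_t) y);
  return (x ^ y) >= 0
    ? minus                      /* Same signs: cannot overflow.  */
    : (minus ^ x) >= 0
      ? minus                    /* Result keeps the sign of x: fine.  */
      : x < 0 ? INT8_MIN : INT8_MAX;
}

/* Reference: subtract in a wider type, then clamp.  */
int8_t sat_s_sub_i8_ref (int8_t x, int8_t y)
{
  int wide = (int) x - (int) y;
  if (wide < INT8_MIN)
    return INT8_MIN;
  if (wide > INT8_MAX)
    return INT8_MAX;
  return (int8_t) wide;
}

/* Exhaustively compare both versions; returns the mismatch count.  */
int count_sat_sub_mismatches (void)
{
  int bad = 0;
  for (int x = INT8_MIN; x <= INT8_MAX; x++)
    for (int y = INT8_MIN; y <= INT8_MAX; y++)
      if (sat_s_sub_i8 ((int8_t) x, (int8_t) y)
          != sat_s_sub_i8_ref ((int8_t) x, (int8_t) y))
        bad++;
  return bad;
}

/* A few spot checks at the saturation boundaries.  */
void sat_s_sub_i8_spot_checks (void)
{
  assert (sat_s_sub_i8 (100, -100) == INT8_MAX);
  assert (sat_s_sub_i8 (-100, 100) == INT8_MIN);
  assert (sat_s_sub_i8 (5, 3) == 2);
}
```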

The tests below pass with this patch.
* The rv64gcv full regression test.

It is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vec_sat_data.h: Add test
data for run test.
* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper
macros.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-1-i8.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-1-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months agoRISC-V: Implement vector SAT_SUB for signed integer
Pan Li [Fri, 11 Oct 2024 04:05:10 +0000 (12:05 +0800)] 
RISC-V: Implement vector SAT_SUB for signed integer

This patch implements sssub for vector signed integers.

Form 1:
  #define DEF_VEC_SAT_S_SUB_FMT_1(T, UT, MIN, MAX)                     \
  void __attribute__((noinline))                                       \
  vec_sat_s_sub_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \
  {                                                                    \
    unsigned i;                                                        \
    for (i = 0; i < limit; i++)                                        \
      {                                                                \
        T x = op_1[i];                                                 \
        T y = op_2[i];                                                 \
        T minus = (UT)x - (UT)y;                                       \
        out[i] = (x ^ y) >= 0                                          \
          ? minus                                                      \
          : (minus ^ x) >= 0                                           \
            ? minus                                                    \
            : x < 0 ? MIN : MAX;                                       \
      }                                                                \
  }

DEF_VEC_SAT_S_SUB_FMT_1(int8_t, uint8_t, INT8_MIN, INT8_MAX)

Before this patch:
  vle8.v  v1,0(a1)
  vle8.v  v2,0(a2)
  sub a3,a3,a5
  add a1,a1,a5
  add a2,a2,a5
  vsra.vi v4,v1,7
  vsub.vv v3,v1,v2
  vxor.vv v2,v1,v2
  vxor.vv v0,v1,v3
  vmslt.vi    v2,v2,0
  vmslt.vi    v0,v0,0
  vmand.mm    v0,v0,v2
  vxor.vv v3,v4,v5,v0.t
  vse8.v  v3,0(a0)
  add a0,a0,a5

After this patch:
  vle8.v  v1,0(a1)
  vle8.v  v2,0(a2)
  sub a3,a3,a5
  add a1,a1,a5
  add a2,a2,a5
  vssub.vv    v1,v1,v2
  vse8.v  v1,0(a0)
  add a0,a0,a5

The following test suites pass with this patch.
* The rv64gcv full regression test.

gcc/ChangeLog:

* config/riscv/autovec.md (sssub<mode>3): Add new pattern for
signed SAT_SUB.
* config/riscv/riscv-protos.h (expand_vec_sssub): Add new func
decl to expand sssub to vssub.
* config/riscv/riscv-v.cc (expand_vec_sssub): Add new func
impl to expand sssub to vssub.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months agoVect: Try the pattern of vector signed integer SAT_SUB
Pan Li [Fri, 11 Oct 2024 03:58:30 +0000 (11:58 +0800)] 
Vect: Try the pattern of vector signed integer SAT_SUB

Almost the same as vector unsigned integer SAT_SUB, try to match
the signed version during the vector pattern matching.

The following test suites pass with this patch.
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:

* tree-vect-patterns.cc (gimple_signed_integer_sat_sub): Add new
func decl for signed SAT_SUB.
(vect_recog_sat_sub_pattern_transform): Update comments.
(vect_recog_sat_sub_pattern): Try the vector signed SAT_SUB
pattern.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months ago Match: Support form 1 for vector signed integer SAT_SUB
Pan Li [Fri, 11 Oct 2024 03:51:52 +0000 (11:51 +0800)] 
Match: Support form 1 for vector signed integer SAT_SUB

This patch supports form 1 of the vector signed integer SAT_SUB,
as in the example below:

Form 1:
  #define DEF_VEC_SAT_S_SUB_FMT_1(T, UT, MIN, MAX)                     \
  void __attribute__((noinline))                                       \
  vec_sat_s_sub_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \
  {                                                                    \
    unsigned i;                                                        \
    for (i = 0; i < limit; i++)                                        \
      {                                                                \
        T x = op_1[i];                                                 \
        T y = op_2[i];                                                 \
        T minus = (UT)x - (UT)y;                                       \
        out[i] = (x ^ y) >= 0                                          \
          ? minus                                                      \
          : (minus ^ x) >= 0                                           \
            ? minus                                                    \
            : x < 0 ? MIN : MAX;                                       \
      }                                                                \
  }

DEF_VEC_SAT_S_SUB_FMT_1(int8_t, uint8_t, INT8_MIN, INT8_MAX)

Before this patch:
  _108 = .SELECT_VL (ivtmp_106, POLY_INT_CST [16, 16]);
  vect_x_16.11_80 = .MASK_LEN_LOAD (vectp_op_1.9_78, 8B, { -1, ... }, _108, 0);
  _69 = vect_x_16.11_80 >> 7;
  vect_x.12_81 = VIEW_CONVERT_EXPR<vector([16,16]) unsigned char>(vect_x_16.11_80);
  vect_y_18.15_85 = .MASK_LEN_LOAD (vectp_op_2.13_83, 8B, { -1, ... }, _108, 0);
  vect__7.21_91 = vect_x_16.11_80 ^ vect_y_18.15_85;
  mask__44.22_92 = vect__7.21_91 < { 0, ... };
  vect_y.16_86 = VIEW_CONVERT_EXPR<vector([16,16]) unsigned char>(vect_y_18.15_85);
  vect__6.17_87 = vect_x.12_81 - vect_y.16_86;
  vect_minus_19.18_88 = VIEW_CONVERT_EXPR<vector([16,16]) signed char>(vect__6.17_87);
  vect__8.19_89 = vect_x_16.11_80 ^ vect_minus_19.18_88;
  mask__42.20_90 = vect__8.19_89 < { 0, ... };
  mask__41.23_93 = mask__42.20_90 & mask__44.22_92;
  _4 = .COND_XOR (mask__41.23_93, _69, { 127, ... }, vect_minus_19.18_88);
  .MASK_LEN_STORE (vectp_out.31_102, 8B, { -1, ... }, _108, 0, _4);
  vectp_op_1.9_79 = vectp_op_1.9_78 + _108;
  vectp_op_2.13_84 = vectp_op_2.13_83 + _108;
  vectp_out.31_103 = vectp_out.31_102 + _108;
  ivtmp_107 = ivtmp_106 - _108;

After this patch:
  _102 = .SELECT_VL (ivtmp_100, POLY_INT_CST [16, 16]);
  vect_x_16.11_89 = .MASK_LEN_LOAD (vectp_op_1.9_87, 8B, { -1, ... }, _102, 0);
  vect_y_18.14_93 = .MASK_LEN_LOAD (vectp_op_2.12_91, 8B, { -1, ... }, _102, 0);
  vect_patt_38.15_94 = .SAT_SUB (vect_x_16.11_89, vect_y_18.14_93);
  .MASK_LEN_STORE (vectp_out.16_96, 8B, { -1, ... }, _102, 0, vect_patt_38.15_94);
  vectp_op_1.9_88 = vectp_op_1.9_87 + _102;
  vectp_op_2.12_92 = vectp_op_2.12_91 + _102;
  vectp_out.16_97 = vectp_out.16_96 + _102;
  ivtmp_101 = ivtmp_100 - _102;
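
As an illustration only (not part of the patch), the per-element
semantics that .SAT_SUB stands for can be checked in plain C.  The
sketch below compares the form 1 expression, written out by hand for
int8_t, and a masked variant shaped like the "before" dump against a
widening reference.  All function names are hypothetical, and the
narrowing casts and the right shift assume GCC's two's-complement
behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: widen to int, subtract, then clamp to the int8_t range. */
int8_t sat_sub_ref(int8_t x, int8_t y)
{
    int d = (int)x - (int)y;
    if (d < INT8_MIN) return INT8_MIN;
    if (d > INT8_MAX) return INT8_MAX;
    return (int8_t)d;
}

/* Form 1, written out by hand for T = int8_t and UT = uint8_t.
   The narrowing casts assume wraparound conversion, as GCC provides. */
int8_t sat_sub_form1(int8_t x, int8_t y)
{
    int8_t minus = (int8_t)(uint8_t)((uint8_t)x - (uint8_t)y);
    return (x ^ y) >= 0
        ? minus
        : (minus ^ x) >= 0
            ? minus
            : x < 0 ? INT8_MIN : INT8_MAX;
}

/* Masked shape resembling the "before" dump: overflow happens when x and y
   differ in sign and x and minus differ in sign; the saturated value is
   then (x >> 7) ^ 127.  Assumes an arithmetic right shift. */
int8_t sat_sub_masked(int8_t x, int8_t y)
{
    int8_t minus = (int8_t)(uint8_t)((uint8_t)x - (uint8_t)y);
    int overflow = ((x ^ y) < 0) & ((x ^ minus) < 0);
    int8_t sat = (int8_t)((x >> 7) ^ 127);
    return overflow ? sat : minus;
}

/* Exhaustively count disagreements over all 256 * 256 input pairs. */
int sat_sub_mismatches(void)
{
    int bad = 0;
    for (int x = INT8_MIN; x <= INT8_MAX; x++)
        for (int y = INT8_MIN; y <= INT8_MAX; y++) {
            int8_t r = sat_sub_ref((int8_t)x, (int8_t)y);
            if (sat_sub_form1((int8_t)x, (int8_t)y) != r
                || sat_sub_masked((int8_t)x, (int8_t)y) != r)
                bad++;
        }
    return bad;
}
```

Calling sat_sub_mismatches() from a small driver is expected to return
0, which is the property the pattern match relies on.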

The following test suites pass with this patch.
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:

* match.pd: Add case 1 matching pattern for vector signed SAT_SUB.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months ago Daily bump.
GCC Administrator [Sat, 12 Oct 2024 00:18:49 +0000 (00:18 +0000)] 
Daily bump.

9 months ago Introduce GFC_STD_UNSIGNED.
Thomas Koenig [Fri, 11 Oct 2024 20:58:51 +0000 (22:58 +0200)] 
Introduce GFC_STD_UNSIGNED.

This patch creates an unsigned "standard" for the
gfc_option.allow_std field.

One of the main reasons people want UNSIGNED for Fortran is
interfacing with C.

This is a preparation for further work on the ISO_C_BINDING constants.
We handle those via iso-c-binding.def, whose last field names the
standard for which the constant is defined, and which is then checked.
I could try and invent a different method for this, but I'd rather not.

gcc/fortran/ChangeLog:

* intrinsic.cc (add_functions): Convert uint and
selected_unsigned_kind to GFC_STD_UNSIGNED.
(gfc_check_intrinsic_standard): Handle GFC_STD_UNSIGNED.
* libgfortran.h (GFC_STD_UNSIGNED): Add.
* options.cc (gfc_post_options): Set GFC_STD_UNSIGNED
if -funsigned is set.

9 months ago gcc.target/i386: Replace long with long long
H.J. Lu [Thu, 10 Oct 2024 09:22:36 +0000 (17:22 +0800)] 
gcc.target/i386: Replace long with long long

Since long is 32-bit for x32, replace long with long long for x32.

* gcc.target/i386/bmi2-pr112526.c: Replace long with long long.
* gcc.target/i386/pr105854.c: Likewise.
* gcc.target/i386/pr112943.c: Likewise.
* gcc.target/i386/pr67325.c: Likewise.
* gcc.target/i386/pr97971.c: Likewise.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
9 months ago g++.target/i386/pr105953.C: Skip for x32
H.J. Lu [Thu, 10 Oct 2024 11:00:32 +0000 (19:00 +0800)] 
g++.target/i386/pr105953.C: Skip for x32

Since -mabi=ms isn't supported for x32, skip g++.target/i386/pr105953.C
for x32.

* g++.target/i386/pr105953.C: Skip for x32.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
9 months ago gcc.target/i386/pr115407.c: Only run for lp64
H.J. Lu [Thu, 10 Oct 2024 09:29:27 +0000 (17:29 +0800)] 
gcc.target/i386/pr115407.c: Only run for lp64

Since -mcmodel=large is valid only for lp64, run pr115407.c only for
lp64.

* gcc.target/i386/pr115407.c: Only run for lp64.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>
9 months ago Fix thinko in previous change
Eric Botcazou [Fri, 11 Oct 2024 17:29:15 +0000 (19:29 +0200)] 
Fix thinko in previous change

gcc/ada/
PR ada/116498
PR ada/117087
* gcc-interface/decl.cc (validate_size): Fix thinko.

9 months ago libstdc++: Rearrange std::move_iterator helpers in stl_iterator.h
Jonathan Wakely [Thu, 10 Oct 2024 21:47:46 +0000 (22:47 +0100)] 
libstdc++: Rearrange std::move_iterator helpers in stl_iterator.h

The __niter_base(move_iterator<I>) overload and __is_move_iterator trait
were originally immediately after the definition of move_iterator. The
addition of C++20 features after move_iterator meant that those helpers
were no longer anywhere near move_iterator.

This change puts them back where they used to be, before all the new
C++20 additions.

libstdc++-v3/ChangeLog:

* include/bits/stl_iterator.h (__niter_base(move_iterator<I>))
(__is_move_iterator, __miter_base, _GLIBCXX_MAKE_MOVE_ITERATOR)
(_GLIBCXX_MAKE_MOVE_IF_NOEXCEPT_ITERATOR): Move earlier in the
file.

9 months ago PR target/117048 aarch64: Use more canonical and optimization-friendly representation...
Kyrylo Tkachov [Wed, 9 Oct 2024 16:40:33 +0000 (09:40 -0700)] 
PR target/117048 aarch64: Use more canonical and optimization-friendly representation for XAR instruction

The pattern for the Advanced SIMD XAR instruction isn't very
optimization-friendly at the moment.
In the testcase from the PR once simplify-rtx has done its work it
generates the RTL:
(set (reg:V2DI 119 [ _14 ])
    (rotate:V2DI (xor:V2DI (reg:V2DI 114 [ vect__1.12_16 ])
            (reg:V2DI 116 [ *m1_01_8(D) ]))
        (const_vector:V2DI [
                (const_int 32 [0x20]) repeated x2
            ])))

which fails to match our XAR pattern because the pattern expects:
1) A ROTATERT instead of the ROTATE.  However, according to the RTL ops
documentation the preferred form of rotate-by-immediate is ROTATE, which
I take to mean it's the canonical form.
ROTATE (x, C) <-> ROTATERT (x, MODE_WIDTH - C) so it's better to match just
one canonical representation.
2) A CONST_INT shift amount whereas the midend asks for a repeated vector
constant.

These issues are fixed by introducing a dedicated expander for the
aarch64_xarqv2di name, needed by the arm_neon.h intrinsic, that translates
the intrinsic-level CONST_INT immediate (the right-rotate amount) into
a repeated vector constant subtracted from 64 to give the corresponding
left-rotate amount that is fed to the new representation for the XAR
define_insn that uses the ROTATE RTL code.  This is a similar approach
to how we handle the discrepancy between intrinsic-level and RTL-level
vector lane numbers for big-endian.
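
The conversion between the two rotate directions can be sketched in
plain C for a single 64-bit lane.  This is only a model of the
arithmetic, not the backend code; the helper names are hypothetical,
and the immediate is assumed to lie in the range 0 to 63.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left by c bits; c is reduced modulo 64 to avoid undefined shifts. */
uint64_t rotl64(uint64_t x, unsigned c)
{
    c &= 63;
    return c ? (x << c) | (x >> (64 - c)) : x;
}

/* Rotate right by c bits; c is reduced modulo 64 to avoid undefined shifts. */
uint64_t rotr64(uint64_t x, unsigned c)
{
    c &= 63;
    return c ? (x >> c) | (x << (64 - c)) : x;
}

/* One lane of XAR at the intrinsic level: XOR, then rotate right by imm. */
uint64_t xar_lane_rotr(uint64_t a, uint64_t b, unsigned imm)
{
    return rotr64(a ^ b, imm);
}

/* The same lane expressed with a left rotate by 64 - imm, which is the
   amount the expander is described as computing. */
uint64_t xar_lane_rotl(uint64_t a, uint64_t b, unsigned imm)
{
    return rotl64(a ^ b, (64 - imm) & 63);
}

/* Returns 1 if both formulations agree for every immediate in 0..63. */
int xar_forms_agree(void)
{
    const uint64_t a = 0x0123456789abcdefULL;
    const uint64_t b = 0xf0e1d2c3b4a59687ULL;
    for (unsigned imm = 0; imm < 64; imm++)
        if (xar_lane_rotr(a, b, imm) != xar_lane_rotl(a, b, imm))
            return 0;
    return 1;
}
```

Because the two forms agree for every immediate, matching only the
ROTATE form loses nothing.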

With this patch and [1/2] the arithmetic parts of the testcase now simplify
to just one XAR instruction.

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
gcc/
PR target/117048
* config/aarch64/aarch64-simd.md (aarch64_xarqv2di): Redefine into a
define_expand.
(*aarch64_xarqv2di_insn): Define.

gcc/testsuite/
PR target/117048
* g++.target/aarch64/pr117048.C: New test.

9 months ago PR 117048: simplify-rtx: Extend (x << C1) | (X >> C2) --> ROTATE transformation to...
Kyrylo Tkachov [Wed, 9 Oct 2024 16:39:55 +0000 (09:39 -0700)] 
PR 117048: simplify-rtx: Extend (x << C1) | (X >> C2) --> ROTATE transformation to vector operands

In the testcase from patch [2/2] we want to match a vector rotate operation from
an IOR of left and right shifts by immediate.  simplify-rtx has code for just
that but it looks like it's prepared to handle only scalar operands.
In practice most of the code works for vector modes as well except the shift
amounts are checked to be CONST_INT rather than vector constants that we have
here.  This is easily extended by using unwrap_const_vec_duplicate to extract
the repeating constant shift amount.  With this change combine now tries
matching the simpler and expected:
(set (reg:V2DI 119 [ _14 ])
    (rotate:V2DI (xor:V2DI (reg:V2DI 114 [ vect__1.12_16 ])
            (reg:V2DI 116 [ *m1_01_8(D) ]))
        (const_vector:V2DI [
                (const_int 32 [0x20]) repeated x2
            ])))
instead of the previous:
(set (reg:V2DI 119 [ _14 ])
    (ior:V2DI (ashift:V2DI (xor:V2DI (reg:V2DI 114 [ vect__1.12_16 ])
                (reg:V2DI 116 [ *m1_01_8(D) ]))
            (const_vector:V2DI [
                    (const_int 32 [0x20]) repeated x2
                ]))
        (lshiftrt:V2DI (xor:V2DI (reg:V2DI 114 [ vect__1.12_16 ])
                (reg:V2DI 116 [ *m1_01_8(D) ]))
            (const_vector:V2DI [
                    (const_int 32 [0x20]) repeated x2
                ]))))
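
The condition behind this fold can be modelled in C on a two-lane
vector.  The sketch below is an assumption-laden illustration rather
than the simplify-rtx code: the v2di struct and the unwrap_dup helper
are hypothetical stand-ins for a V2DI value and for extracting a
repeated constant, and the fold is taken to apply only when both shift
amounts are uniform and sum to the 64-bit element width.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint64_t lane[2]; } v2di;

/* Hypothetical stand-in for unwrapping a "repeated" vector constant:
   succeeds only when every lane holds the same value. */
int unwrap_dup(v2di c, uint64_t *out)
{
    if (c.lane[0] != c.lane[1])
        return 0;
    *out = c.lane[0];
    return 1;
}

/* (x << c1) | (x >> c2) per lane; each amount must be in 1..63. */
v2di shift_or(v2di x, v2di c1, v2di c2)
{
    v2di r;
    for (int i = 0; i < 2; i++)
        r.lane[i] = (x.lane[i] << c1.lane[i]) | (x.lane[i] >> c2.lane[i]);
    return r;
}

/* Rotate every lane left by the same amount c, with c in 1..63. */
v2di rotate_left(v2di x, uint64_t c)
{
    v2di r;
    for (int i = 0; i < 2; i++)
        r.lane[i] = (x.lane[i] << c) | (x.lane[i] >> (64 - c));
    return r;
}

/* Returns 1 and writes the rotate result when the fold applies. */
int try_fold_to_rotate(v2di x, v2di c1, v2di c2, v2di *out)
{
    uint64_t s1, s2;
    if (!unwrap_dup(c1, &s1) || !unwrap_dup(c2, &s2))
        return 0;
    if (s1 == 0 || s2 == 0 || s1 + s2 != 64)
        return 0;
    *out = rotate_left(x, s1);
    return 1;
}

/* Returns 1 if the fold fires exactly when expected and preserves values. */
int rotate_fold_selfcheck(void)
{
    v2di x = {{0x0123456789abcdefULL, 0xfedcba9876543210ULL}};
    v2di c32 = {{32, 32}}, c8 = {{8, 8}}, c56 = {{56, 56}}, mixed = {{8, 16}};
    v2di got, want;

    if (!try_fold_to_rotate(x, c32, c32, &got)) return 0;
    want = shift_or(x, c32, c32);
    if (got.lane[0] != want.lane[0] || got.lane[1] != want.lane[1]) return 0;

    if (!try_fold_to_rotate(x, c8, c56, &got)) return 0;
    want = shift_or(x, c8, c56);
    if (got.lane[0] != want.lane[0] || got.lane[1] != want.lane[1]) return 0;

    if (try_fold_to_rotate(x, c8, c8, &got)) return 0;      /* 8 + 8 != 64 */
    if (try_fold_to_rotate(x, mixed, c56, &got)) return 0;  /* not uniform */
    return 1;
}
```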

To actually fix the PR the aarch64 backend needs some adjustment as well
which is done in patch [2/2], which adds the testcase as well.

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
PR target/117048
* simplify-rtx.cc (simplify_context::simplify_binary_operation_1):
Handle vector constants in (x << C1) | (x >> C2) -> ROTATE
simplification.

9 months ago Fortran: Dead-function removal in error.cc (shrinking by 40%)
Tobias Burnus [Fri, 11 Oct 2024 15:05:37 +0000 (17:05 +0200)] 
Fortran: Dead-function removal in error.cc (shrinking by 40%)

This patch removes a large number of unused static functions from error.cc,
which previously were used for diagnostic but have been replaced by the common
diagnostic code.

gcc/fortran/ChangeLog:

* error.cc (error_char, error_string, error_uinteger, error_integer,
error_hwuint, error_hwint, gfc_widechar_display_length,
gfc_wide_display_length, error_printf, show_locus, show_loci):
Remove unused static functions.
(IBUF_LEN, MAX_ARGS): Remove now unused #define.

9 months ago match.pd: Fold logarithmic identities.
Jennifer Schmitz [Wed, 25 Sep 2024 10:21:22 +0000 (03:21 -0700)] 
match.pd: Fold logarithmic identities.

This patch implements 4 rules for logarithmic identities in match.pd
under -funsafe-math-optimizations:
1) logN(1.0/a) -> -logN(a). This avoids the division instruction.
2) logN(C/a) -> logN(C) - logN(a), where C is a real constant. Same as 1).
3) logN(a) + logN(b) -> logN(a*b). This reduces the number of calls to
log function.
4) logN(a) - logN(b) -> logN(a/b). Same as 3).
Tests were added for float, double, and long double.
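
One plausible reason these folds sit behind
-funsafe-math-optimizations, offered here as an illustration rather
than as the patch's stated rationale, is that the combined argument can
leave the finite range even when both inputs are ordinary values.  The
sketch below uses hypothetical helper names and only <float.h>, so no
logarithm is actually evaluated.

```c
#include <assert.h>
#include <float.h>

/* Returns 1 when a * b is no longer a finite double.  In that case
   logN(a * b) could not equal the finite sum logN(a) + logN(b). */
int product_overflows(double a, double b)
{
    volatile double p = a * b;   /* volatile forces rounding to double */
    return p > DBL_MAX || p < -DBL_MAX;
}

/* Returns 1 when a / b underflows to zero although a is nonzero.  In that
   case logN(a / b) could not equal the finite logN(a) - logN(b). */
int quotient_underflows(double a, double b)
{
    volatile double q = a / b;   /* volatile forces rounding to double */
    return a != 0.0 && q == 0.0;
}
```

For example, product_overflows(1e200, 1e200) and
quotient_underflows(1e-200, 1e200) both report a problem, while small
inputs such as (2.0, 3.0) do not.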

The patch was bootstrapped and regtested on aarch64-linux-gnu and
x86_64-linux-gnu, no regression.
Additionally, SPEC 2017 fprate was run. While the transform does not seem
to be triggered, we also see no non-noise impact on performance.
OK for mainline?

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
PR tree-optimization/116826
PR tree-optimization/86710
* match.pd: Fold logN(1.0/a) -> -logN(a),
logN(C/a) -> logN(C) - logN(a), logN(a) + logN(b) -> logN(a*b),
and logN(a) - logN(b) -> logN(a/b).

gcc/testsuite/
PR tree-optimization/116826
PR tree-optimization/86710
* gcc.dg/tree-ssa/log_ident.c: New test.

9 months ago libstdc++: Use appropriate feature test macro for std::byte
Jonathan Wakely [Fri, 11 Oct 2024 12:29:06 +0000 (13:29 +0100)] 
libstdc++: Use appropriate feature test macro for std::byte

libstdc++-v3/ChangeLog:

* include/bits/cpp_type_traits.h (__is_byte<byte>): Guard with
__glibcxx_byte macro instead of checking __cplusplus.

9 months ago libstdc++: Fix localized %c formatting for <chrono> [PR117085]
Jonathan Wakely [Fri, 11 Oct 2024 08:40:38 +0000 (09:40 +0100)] 
libstdc++: Fix localized %c formatting for <chrono> [PR117085]

When formatting a time point with %c we call std::vformat_to using the
formatting locale's D_T_FMT string, but we weren't adding the L option
to the format string. This meant we always interpreted D_T_FMT in the C
locale, instead of using the formatting locale as obviously intended
when %c is used.

libstdc++-v3/ChangeLog:

PR libstdc++/117085
* include/bits/chrono_io.h (__formatter_chrono::_M_c): Add L
option to format string.
* testsuite/std/time/format.cc: Move to...
* testsuite/std/time/format/format.cc: ...here.
* testsuite/std/time/format_localized.cc: Move to...
* testsuite/std/time/format/localized.cc: ...here.
* testsuite/std/time/format/pr117085.cc: New test.

9 months ago libstdc++: Add missing whitespace in dg-do directives
Jonathan Wakely [Fri, 11 Oct 2024 14:42:10 +0000 (15:42 +0100)] 
libstdc++: Add missing whitespace in dg-do directives

libstdc++-v3/ChangeLog:

* testsuite/22_locale/time_get/get/char/5.cc: Fix dg-do
directive.
* testsuite/22_locale/time_get/get/wchar_t/5.cc: Likewise.

9 months ago tree-optimization/117080 - Add SLP_TREE_MEMORY_ACCESS_TYPE
Richard Biener [Thu, 6 Jun 2024 13:52:02 +0000 (15:52 +0200)] 
tree-optimization/117080 - Add SLP_TREE_MEMORY_ACCESS_TYPE

It turns out target costing code looks at STMT_VINFO_MEMORY_ACCESS_TYPE
to identify operations from (emulated) gathers for example.  This
doesn't work for SLP loads since we do not set STMT_VINFO_MEMORY_ACCESS_TYPE
there as the vectorization strategy might differ between different
stmt uses.  It seems we got away with setting it for stores though.
The following adds a memory_access_type field to slp_tree and sets it
from load and store vectorization code.  All the costing doesn't record
the SLP node (that was only done selectively for some corner case).  The
costing is really in need of a big overhaul, the following just massages
the two relevant ops to fix gcc.dg/target/pr88531-2[bc].c FAILs when
switching on SLP for non-grouped stores.  In particular currently
we either have a SLP node or a stmt_info in the cost hook but not both.

So the following mitigates this, postponing a rewrite of costing to
next stage1.  Other targets look possibly affected as well but are
left to respective maintainers to update.

PR tree-optimization/117080
* tree-vectorizer.h (_slp_tree::memory_access_type): Add.
(SLP_TREE_MEMORY_ACCESS_TYPE): New.
(record_stmt_cost): Add another overload.
* tree-vect-slp.cc (_slp_tree::_slp_tree): Initialize
memory_access_type.
* tree-vect-stmts.cc (vectorizable_store): Set
SLP_TREE_MEMORY_ACCESS_TYPE.
(vectorizable_load): Likewise.  Also record the SLP node
when costing emulated gather offset decompose and vector
composition.
* config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Also
recognize SLP emulated gather/scatter.

9 months ago aarch64: Add codegen support for SVE2 faminmax
Saurabh Jha [Mon, 30 Sep 2024 14:38:32 +0000 (14:38 +0000)] 
aarch64: Add codegen support for SVE2 faminmax

The AArch64 FEAT_FAMINMAX extension introduces instructions for
computing the floating point absolute maximum and minimum of the
two vectors element-wise.

This patch adds code generation for famax and famin in terms of existing
unspecs. With this patch:
1. famax can be expressed as taking UNSPEC_COND_SMAX of the absolute
   values of the two operands.
2. famin can be expressed as taking UNSPEC_COND_SMIN of the absolute
   values of the two operands.

This fusion of operators is only possible when the
-march=armv9-a+faminmax+sve flags are passed. We also need to pass the
-ffast-math flag; this is what enables the compiler to use
UNSPEC_COND_SMAX and UNSPEC_COND_SMIN.

This code generation is only available at -O2 or -O3, as that is when
auto-vectorization is enabled.
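
For illustration, the element-wise operation that famax and famin
compute (the maximum or minimum of the absolute values) can be written
as the plain C loops below.  This is a sketch of the semantics only;
the function names are hypothetical, the actual tests are not
reproduced here, and whether a given compiler emits the fused
instruction for such a loop depends on the flags listed above.

```c
#include <assert.h>
#include <stddef.h>

/* Absolute value without <math.h>; signed zeros are not distinguished,
   which matches the -ffast-math setting assumed above. */
static inline float abs_f32(float x)
{
    return x < 0.0f ? -x : x;
}

/* out[i] = max(|a[i]|, |b[i]|) */
void famax_f32(float *restrict out, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float x = abs_f32(a[i]), y = abs_f32(b[i]);
        out[i] = x > y ? x : y;
    }
}

/* out[i] = min(|a[i]|, |b[i]|) */
void famin_f32(float *restrict out, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float x = abs_f32(a[i]), y = abs_f32(b[i]);
        out[i] = x < y ? x : y;
    }
}

/* Returns 1 if both loops produce the expected values on a small input. */
int faminmax_selfcheck(void)
{
    const float a[4] = {-3.0f, 2.0f, -1.0f, 0.5f};
    const float b[4] = {1.0f, -4.0f, 0.5f, -0.25f};
    const float want_max[4] = {3.0f, 4.0f, 1.0f, 0.5f};
    const float want_min[4] = {1.0f, 2.0f, 0.5f, 0.25f};
    float got_max[4], got_min[4];

    famax_f32(got_max, a, b, 4);
    famin_f32(got_min, a, b, 4);
    for (int i = 0; i < 4; i++)
        if (got_max[i] != want_max[i] || got_min[i] != want_min[i])
            return 0;
    return 1;
}
```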

gcc/ChangeLog:

* config/aarch64/aarch64-sve2.md
(*aarch64_pred_faminmax_fused): Instruction pattern for faminmax
codegen.
* config/aarch64/iterators.md: Iterator and attribute for
faminmax codegen.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/faminmax_1.c: New test.
* gcc.target/aarch64/sve/faminmax_2.c: New test.

9 months ago aarch64: Add SVE2 faminmax intrinsics
Saurabh Jha [Wed, 25 Sep 2024 22:08:33 +0000 (22:08 +0000)] 
aarch64: Add SVE2 faminmax intrinsics

The AArch64 FEAT_FAMINMAX extension introduces instructions for
computing the floating point absolute maximum and minimum of the
two vectors element-wise.

This patch introduces SVE2 faminmax intrinsics. The intrinsics of this
extension are implemented as the following builtin functions:
* sva[max|min]_[m|x|z]
* sva[max|min]_[f16|f32|f64]_[m|x|z]
* sva[max|min]_n_[f16|f32|f64]_[m|x|z]

gcc/ChangeLog:

* config/aarch64/aarch64-sve-builtins-base.cc
(svamax): Absolute maximum declaration.
(svamin): Absolute minimum declaration.
* config/aarch64/aarch64-sve-builtins-base.def
(REQUIRED_EXTENSIONS): Add faminmax intrinsics behind a flag.
(svamax): Absolute maximum declaration.
(svamin): Absolute minimum declaration.
* config/aarch64/aarch64-sve-builtins-base.h: Declaring function
bases for the new intrinsics.
* config/aarch64/aarch64.h
(TARGET_SVE_FAMINMAX): New flag for SVE2 faminmax.
* config/aarch64/iterators.md: New unspecs, iterators, and attrs
for the new intrinsics.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve2/acle/asm/amax_f16.c: New test.
* gcc.target/aarch64/sve2/acle/asm/amax_f32.c: New test.
* gcc.target/aarch64/sve2/acle/asm/amax_f64.c: New test.
* gcc.target/aarch64/sve2/acle/asm/amin_f16.c: New test.
* gcc.target/aarch64/sve2/acle/asm/amin_f32.c: New test.
* gcc.target/aarch64/sve2/acle/asm/amin_f64.c: New test.

9 months ago middle-end/117086 - fixup vec_cond simplifications
Richard Biener [Fri, 11 Oct 2024 09:46:45 +0000 (11:46 +0200)] 
middle-end/117086 - fixup vec_cond simplifications

The following adds missing checks for a vector type result type
to simplifications that end up creating a vec_cond.

PR middle-end/117086
* match.pd ((op (vec_cond ...) ..) -> (vec_cond ...)): Add
missing checks for VECTOR_TYPE_P (type).

* gcc.dg/torture/pr117086.c: New testcase.

9 months ago RISC-V: Add testcases for form 8 of scalar signed SAT_TRUNC
Pan Li [Thu, 10 Oct 2024 08:24:08 +0000 (16:24 +0800)] 
RISC-V: Add testcases for form 8 of scalar signed SAT_TRUNC

Form 8:
  #define DEF_SAT_S_TRUNC_FMT_8(NT, WT, NT_MIN, NT_MAX) \
  NT __attribute__((noinline))                          \
  sat_s_trunc_##WT##_to_##NT##_fmt_8 (WT x)             \
  {                                                     \
    NT trunc = (NT)x;                                   \
    return (WT)NT_MIN > x || x >= (WT)NT_MAX            \
      ? x < 0 ? NT_MIN : NT_MAX                         \
      : trunc;                                          \
  }

The following test passes with this patch.
* The rv64gcv full regression test.

This is a test-only patch and obvious up to a point; will commit it
directly if there are no comments in the next 48H.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_s_trunc-8-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-8-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-8-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-8-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-8-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-8-i64-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-8-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-8-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-8-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-8-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-8-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-run-8-i64-to-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months ago RISC-V: Add testcases for form 7 of scalar signed SAT_TRUNC
Pan Li [Thu, 10 Oct 2024 08:08:40 +0000 (16:08 +0800)] 
RISC-V: Add testcases for form 7 of scalar signed SAT_TRUNC

Form 7:
  #define DEF_SAT_S_TRUNC_FMT_7(NT, WT, NT_MIN, NT_MAX) \
  NT __attribute__((noinline))                          \
  sat_s_trunc_##WT##_to_##NT##_fmt_7 (WT x)             \
  {                                                     \
    NT trunc = (NT)x;                                   \
    return (WT)NT_MIN >= x || x >= (WT)NT_MAX           \
      ? x < 0 ? NT_MIN : NT_MAX                         \
      : trunc;                                          \
  }

The following test passes with this patch.
* The rv64gcv full regression test.

This is a test-only patch and obvious up to a point; will commit it
directly if there are no comments in the next 48H.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_s_trunc-7-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-7-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-7-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-7-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-7-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-7-i64-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-7-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-7-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-7-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-7-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-7-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-run-7-i64-to-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months ago RISC-V: Add testcases for form 6 of scalar signed SAT_TRUNC
Pan Li [Thu, 10 Oct 2024 07:53:45 +0000 (15:53 +0800)] 
RISC-V: Add testcases for form 6 of scalar signed SAT_TRUNC

Form 6:
  #define DEF_SAT_S_TRUNC_FMT_6(NT, WT, NT_MIN, NT_MAX) \
  NT __attribute__((noinline))                          \
  sat_s_trunc_##WT##_to_##NT##_fmt_6 (WT x)             \
  {                                                     \
    NT trunc = (NT)x;                                   \
    return (WT)NT_MIN >= x || x > (WT)NT_MAX            \
      ? x < 0 ? NT_MIN : NT_MAX                         \
      : trunc;                                          \
  }

The following test passes with this patch.
* The rv64gcv full regression test.

This is a test-only patch and obvious up to a point; will commit it
directly if there are no comments in the next 48H.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_s_trunc-6-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-6-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-6-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-6-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-6-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-6-i64-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-6-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-6-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-6-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-6-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-6-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-run-6-i64-to-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months ago RISC-V: Add testcases for form 5 of scalar signed SAT_TRUNC
Pan Li [Thu, 10 Oct 2024 07:35:33 +0000 (15:35 +0800)] 
RISC-V: Add testcases for form 5 of scalar signed SAT_TRUNC

Form 5:
  #define DEF_SAT_S_TRUNC_FMT_5(NT, WT, NT_MIN, NT_MAX) \
  NT __attribute__((noinline))                          \
  sat_s_trunc_##WT##_to_##NT##_fmt_5 (WT x)             \
  {                                                     \
    NT trunc = (NT)x;                                   \
    return (WT)NT_MIN > x || x > (WT)NT_MAX             \
      ? x < 0 ? NT_MIN : NT_MAX                         \
      : trunc;                                          \
  }
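
Forms 5 to 8 differ only in whether each bound is compared strictly.
That makes no difference to the result, because at x == NT_MIN or
x == NT_MAX the truncated value already equals the saturated one.  The
sketch below checks this exhaustively for int16_t to int8_t; the
DEF_SAT_S_TRUNC helper macro and the function names are hypothetical
and are not taken from sat_arith.h.

```c
#include <assert.h>
#include <stdint.h>

/* LO_CMP and HI_CMP select the strict or non-strict comparison.
   The narrowing cast assumes wraparound conversion, as GCC provides. */
#define DEF_SAT_S_TRUNC(NAME, LO_CMP, HI_CMP)                          \
  int8_t NAME (int16_t x)                                              \
  {                                                                    \
    int8_t trunc = (int8_t)x;                                          \
    return (int16_t)INT8_MIN LO_CMP x || x HI_CMP (int16_t)INT8_MAX    \
      ? x < 0 ? INT8_MIN : INT8_MAX                                    \
      : trunc;                                                         \
  }

DEF_SAT_S_TRUNC(sat_trunc_fmt_5, >,  >)
DEF_SAT_S_TRUNC(sat_trunc_fmt_6, >=, >)
DEF_SAT_S_TRUNC(sat_trunc_fmt_7, >=, >=)
DEF_SAT_S_TRUNC(sat_trunc_fmt_8, >,  >=)

/* Reference: clamp to the int8_t range. */
int8_t sat_trunc_ref(int16_t x)
{
    if (x < INT8_MIN) return INT8_MIN;
    if (x > INT8_MAX) return INT8_MAX;
    return (int8_t)x;
}

/* Count inputs for which any of the four forms disagrees with the clamp. */
int sat_trunc_form_mismatches(void)
{
    int bad = 0;
    for (int v = INT16_MIN; v <= INT16_MAX; v++) {
        int16_t x = (int16_t)v;
        int8_t r = sat_trunc_ref(x);
        if (sat_trunc_fmt_5(x) != r || sat_trunc_fmt_6(x) != r
            || sat_trunc_fmt_7(x) != r || sat_trunc_fmt_8(x) != r)
            bad++;
    }
    return bad;
}
```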

The following test passes with this patch.
* The rv64gcv full regression test.

This is a test-only patch and obvious up to a point; will commit it
directly if there are no comments in the next 48H.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_s_trunc-5-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-5-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-5-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-5-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-5-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-5-i64-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-5-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-5-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-5-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-5-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-5-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-run-5-i64-to-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months ago RISC-V: Add testcases for form 4 of scalar signed SAT_TRUNC
Pan Li [Thu, 10 Oct 2024 06:52:04 +0000 (14:52 +0800)] 
RISC-V: Add testcases for form 4 of scalar signed SAT_TRUNC

Form 4:
  #define DEF_SAT_S_TRUNC_FMT_4(NT, WT, NT_MIN, NT_MAX) \
  NT __attribute__((noinline))                          \
  sat_s_trunc_##WT##_to_##NT##_fmt_4 (WT x)             \
  {                                                     \
    NT trunc = (NT)x;                                   \
    return (WT)NT_MIN <= x && x < (WT)NT_MAX            \
      ? trunc                                           \
      : x < 0 ? NT_MIN : NT_MAX;                        \
  }

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_s_trunc-4-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-4-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-4-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-4-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-4-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-4-i64-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-4-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-4-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-4-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-4-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-4-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-run-4-i64-to-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months ago Match: Support form 4 for scalar signed integer SAT_TRUNC
Pan Li [Thu, 10 Oct 2024 06:47:34 +0000 (14:47 +0800)] 
Match: Support form 4 for scalar signed integer SAT_TRUNC

This patch supports form 4 of the scalar signed integer SAT_TRUNC,
as in the example below:

Form 4:
  #define DEF_SAT_S_TRUNC_FMT_4(NT, WT, NT_MIN, NT_MAX) \
  NT __attribute__((noinline))                          \
  sat_s_trunc_##WT##_to_##NT##_fmt_4 (WT x)             \
  {                                                     \
    NT trunc = (NT)x;                                   \
    return (WT)NT_MIN <= x && x < (WT)NT_MAX            \
      ? trunc                                           \
      : x < 0 ? NT_MIN : NT_MAX;                        \
  }

DEF_SAT_S_TRUNC_FMT_4(int8_t, int16_t, INT8_MIN, INT8_MAX)

Before this patch:
  __attribute__((noinline))
  int8_t sat_s_trunc_int16_t_to_int8_t_fmt_4 (int16_t x)
  {
    int8_t trunc;
    unsigned short x.0_1;
    unsigned short _2;
    int8_t _3;
    _Bool _7;
    signed char _8;
    signed char _9;
    signed char _10;

  ;;   basic block 2, loop depth 0
  ;;    pred:       ENTRY
    x.0_1 = (unsigned short) x_4(D);
    _2 = x.0_1 + 128;
    if (_2 > 254)
      goto <bb 4>; [50.00%]
    else
      goto <bb 3>; [50.00%]
  ;;    succ:       4
  ;;                3

  ;;   basic block 3, loop depth 0
  ;;    pred:       2
    trunc_5 = (int8_t) x_4(D);
    goto <bb 5>; [100.00%]
  ;;    succ:       5

  ;;   basic block 4, loop depth 0
  ;;    pred:       2
    _7 = x_4(D) < 0;
    _8 = (signed char) _7;
    _9 = -_8;
    _10 = _9 ^ 127;
  ;;    succ:       5

  ;;   basic block 5, loop depth 0
  ;;    pred:       3
  ;;                4
    # _3 = PHI <trunc_5(3), _10(4)>
    return _3;
  ;;    succ:       EXIT

  }

After this patch:
  __attribute__((noinline))
  int8_t sat_s_trunc_int16_t_to_int8_t_fmt_4 (int16_t x)
  {
    int8_t _3;

  ;;   basic block 2, loop depth 0
  ;;    pred:       ENTRY
    _3 = .SAT_TRUNC (x_4(D)); [tail call]
    return _3;
  ;;    succ:       EXIT

  }
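
The "before" dump relies on two bit tricks: a biased unsigned compare
for the range test, and negate-then-XOR to pick the saturation bound.
The C sketch below mirrors those steps for int16_t to int8_t and checks
them against form 4 and a plain clamp.  It is an illustration with
hypothetical function names, and the narrowing casts assume GCC's
wraparound conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Form 4, written out by hand for NT = int8_t and WT = int16_t. */
int8_t sat_trunc_fmt_4(int16_t x)
{
    int8_t trunc = (int8_t)x;
    return (int16_t)INT8_MIN <= x && x < (int16_t)INT8_MAX
        ? trunc
        : x < 0 ? INT8_MIN : INT8_MAX;
}

/* Mirrors the lowered GIMPLE step by step. */
int8_t sat_trunc_lowered(int16_t x)
{
    /* x in [-128, 126] maps to [0, 254]; everything else exceeds 254. */
    uint16_t biased = (uint16_t)((uint16_t)x + 128u);
    if (biased > 254u) {
        int8_t neg = (int8_t)(x < 0);    /* 1 if negative, else 0 */
        int8_t mask = (int8_t)-neg;      /* -1 if negative, else 0 */
        return (int8_t)(mask ^ 127);     /* -128 if negative, else 127 */
    }
    return (int8_t)x;
}

/* Reference: clamp to the int8_t range. */
int8_t sat_trunc_clamp(int16_t x)
{
    if (x < INT8_MIN) return INT8_MIN;
    if (x > INT8_MAX) return INT8_MAX;
    return (int8_t)x;
}

/* Count inputs where the three versions disagree, over all of int16_t. */
int sat_trunc_lowered_mismatches(void)
{
    int bad = 0;
    for (int v = INT16_MIN; v <= INT16_MAX; v++) {
        int16_t x = (int16_t)v;
        int8_t r = sat_trunc_clamp(x);
        if (sat_trunc_fmt_4(x) != r || sat_trunc_lowered(x) != r)
            bad++;
    }
    return bad;
}
```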

The following test suites pass with this patch.
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:

* match.pd: Add case 4 matching pattern for signed SAT_TRUNC.

Signed-off-by: Pan Li <pan2.li@intel.com>
9 months ago RISC-V: Add testcases for form 3 of scalar signed SAT_TRUNC
Pan Li [Wed, 9 Oct 2024 14:37:00 +0000 (22:37 +0800)] 
RISC-V: Add testcases for form 3 of scalar signed SAT_TRUNC

Form 3:
  #define DEF_SAT_S_TRUNC_FMT_3(NT, WT, NT_MIN, NT_MAX) \
  NT __attribute__((noinline))                          \
  sat_s_trunc_##WT##_to_##NT##_fmt_3 (WT x)             \
  {                                                     \
    NT trunc = (NT)x;                                   \
    return (WT)NT_MIN < x && x <= (WT)NT_MAX            \
      ? trunc                                           \
      : x < 0 ? NT_MIN : NT_MAX;                        \
  }

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_s_trunc-3-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-3-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-3-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-3-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-3-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-3-i64-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-3-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-3-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-3-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-3-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-3-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-run-3-i64-to-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>