git.ipfire.org Git - thirdparty/gcc.git/log

GCC Administrator [Sun, 13 Oct 2024 00:18:21 +0000 (00:18 +0000)]

Daily bump.

Thomas Koenig [Sat, 12 Oct 2024 17:09:14 +0000 (19:09 +0200)]

Unsigned constants for ISO_FORTRAN_ENV and ISO_C_BINDING.

gcc/fortran/ChangeLog:

* dump-parse-tree.cc (get_c_type_name): Also handle BT_UNSIGNED.
* gfortran.h (NAMED_UINTCST): Define before inclusion
of iso-c-binding.def and iso-fortran-env.def.
(gfc_get_uint_kind_from_width_isofortranenv): Prototype.
* gfortran.texi: Mention new constants in iso_c_binding and
iso_fortran_env.
* iso-c-binding.def: Handle NAMED_UINTCST. Add c_unsigned,
c_unsigned_short, c_unsigned_char, c_unsigned_long,
c_unsigned_long_long, c_uintmax_t, c_uint8_t, c_uint16_t,
c_uint32_t, c_uint64_t, c_uint128_t, c_uint_least8_t,
c_uint_least16_t, c_uint_least32_t, c_uint_least64_t,
c_uint_least128_t, c_uint_fast8_t, c_uint_fast16_t,
c_uint_fast32_t, c_uint_fast64_t and c_uint_fast128_t.
* iso-fortran-env.def: Handle NAMED_UINTCST. Add uint8, uint16,
uint32 and uint64.
* module.cc (parse_integer): Whitespace fix.
(write_module): Whitespace fix.
(NAMED_UINTCST): Define before inclusion of iso-c-binding.def
and iso-fortran-env.def.
* symbol.cc: Likewise.
* trans-types.cc (get_unsigned_kind_from_node): New function.
(get_uint_kind_from_name): New function.
(gfc_get_uint_kind_from_width_isofortranenv): New function.
(get_uint_kind_from_width): New function.
(gfc_init_kinds): Initialize gfc_c_uint_kind.

gcc/testsuite/ChangeLog:

* gfortran.dg/unsigned_36.f90: New test.

Feng Xue [Fri, 11 Oct 2024 06:55:05 +0000 (14:55 +0800)]

vect: Fix inconsistency in fully-masked lane-reducing op generation [PR116985]

To align vectorized def/use when lane-reducing op is present in loop reduction,
we may need to insert extra trivial pass-through copies, which would cause
mismatch between lane-reducing vector copy and loop mask index. This could be
fixed by computing the right index around a new counter on effective lane-
reducing vector copies.

2024-10-11 Feng Xue <fxue@os.amperecomputing.com>

gcc/
PR tree-optimization/116985
* tree-vect-loop.cc (vect_transform_reduction): Compute loop mask
index based on effective vector copies for reduction op.

gcc/testsuite/
PR tree-optimization/116985
* gcc.dg/vect/pr116985.c: New testcase.

Richard Biener [Sat, 12 Oct 2024 12:51:37 +0000 (14:51 +0200)]

tree-optimization/117104 - add missed guards to max(a,b) != a simplification

For vector types we have to make sure the comparison result is a vector
type and the resulting compare operation is supported. As the resulting
compare is never an equality compare I didn't bother to check for the
cbranch case.
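
As a scalar illustration of the identity behind this simplification (a sketch only, not the match.pd code): for integers, max(a, b) != a holds exactly when a < b, and min(a, b) != a holds exactly when a > b, so the result is an ordering compare rather than an equality compare. The helper names below are hypothetical, and the vector-type guards this commit adds have no scalar counterpart and are not modeled.

```cpp
#include <algorithm>

// Hypothetical helpers: the unsimplified and simplified forms of the test.
bool max_ne_original(int a, int b) { return std::max(a, b) != a; }
bool max_ne_simplified(int a, int b) { return a < b; }

bool min_ne_original(int a, int b) { return std::min(a, b) != a; }
bool min_ne_simplified(int a, int b) { return a > b; }

// Exhaustively compare both forms over a small range of integers.
bool forms_agree(int lo, int hi) {
  for (int a = lo; a <= hi; ++a)
    for (int b = lo; b <= hi; ++b)
      if (max_ne_original(a, b) != max_ne_simplified(a, b)
          || min_ne_original(a, b) != min_ne_simplified(a, b))
        return false;
  return true;
}
```

Calling forms_agree(-8, 8) checks every pair in that range and returns true only if the two forms never disagree.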

PR tree-optimization/117104
* match.pd ((cmp:c (minmax:c @0 @1) @0) -> (out @0 @1)): Properly
guard the vector case.

* gcc.dg/pr117104.c: New testcase.

Jeff Law [Sat, 12 Oct 2024 13:12:53 +0000 (07:12 -0600)]

[RISC-V] Slightly improve broadcasting small constants into vectors

I probably spent way more time on this than it's worth...

I was looking at the code we generate for vector SAD and noticed that we were
being a bit silly.  Specifically:

        li      a4,0            # 272   [c=4 l=4]  *movsi_internal/1

Followed shortly by:

        vmv.s.x v3,a4   # 261   [c=4 l=4]  *pred_broadcastrvvm1si/6

And no other uses of a4.  We could have used x0 trivially.

First we adjust the expander so that it doesn't force the constant into a
register.  In the matching pattern we change the appropriate source constraints
from "r" to "rJ" and the output template is changed to use %z for the operand.
The net result is that we drop the li completely and emit vmv.s.x v3,x0.

But wait, there's more.  If we're broadcasting a constant in the range
[-16..15] into a vector, we currently load the constant into a register and use
vmv.v.x.  We can instead use vmv.v.i, which avoids loading the constant into a
GPR.  For that case we again avoid forcing the constant into a register in the
expander and adjust the output template to emit vmv.v.x or vmv.v.i based on
whether or not the appropriate operand is a constant or general purpose
register.  So again, we'll drop a load immediate into a scalar for this case.
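
The choice described above can be sketched as a small decision function. This is a toy model under stated assumptions, not the vector.md output template: the Operand struct and both helper names are hypothetical. The only facts relied on are that vmv.v.i takes a 5-bit signed immediate, that vmv.v.x takes its source from a GPR, and that x0 always reads as zero.

```cpp
#include <string>

// Hypothetical model of the operand being broadcast: either a compile-time
// constant or a value already living in a general purpose register.
struct Operand {
  bool is_constant;
  long value;  // only meaningful when is_constant is true
};

// vmv.v.i takes a 5-bit signed immediate, i.e. a value in [-16, 15].
bool fits_simm5(long v) { return v >= -16 && v <= 15; }

// Pick the mnemonic for splatting OP across a whole vector register.
// Constants outside the immediate range still need a GPR (li + vmv.v.x).
std::string splat_mnemonic(const Operand &op) {
  if (op.is_constant && fits_simm5(op.value))
    return "vmv.v.i";
  return "vmv.v.x";
}

// Pick the scalar source for writing OP into element 0 with vmv.s.x.
// A constant zero can use the hard-wired zero register x0 directly.
std::string element0_source(const Operand &op) {
  if (op.is_constant && op.value == 0)
    return "x0";
  return "a GPR";
}
```

Both helpers fall back to the register form whenever the operand is not a suitable constant, which mirrors the "constant or general purpose register" split in the text.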

Whether or not we should use vmv.v.i vs vmv.s.x for loading [-16..15] into the
0th element is probably uarch dependent.  The tradeoff is loading the GPR vs
the broadcast in the vector unit.  I didn't bother with this case.

Tested in my tester (which tests rv64gcv as a default codegen option). Will
wait for the pre-commit tester to render a verdict.

gcc/
* config/riscv/constraints.md (P): New constraint.
* config/riscv/vector.md (pred_broadcast<mode> expander): Do
not force small integers into GPRs so aggressively.
(pred_broadcast<mode> insn & splitter): Allow splatting small
constants across the vector register directly.  Allow splatting
(const_int 0) into element 0 directly.

Tobias Burnus [Sat, 12 Oct 2024 12:55:22 +0000 (14:55 +0200)]

Fortran/OpenMP: Warn when mapping polymorphic variables

OpenMP (TR13) states for Fortran:
* For map: "If a list item has polymorphic type, the behavior is unspecified."
* "If the firstprivate clause is on a target construct and a variable is of
polymorphic type, the behavior is unspecified."
which this commit now warns for.

gcc/fortran/ChangeLog:

* openmp.cc (resolve_omp_clauses): Diagnose polymorphic mapping.
* trans-openmp.cc (gfc_omp_finish_clause): Warn when
polymorphic variable is implicitly mapped.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/polymorphic-mapping.f90: New test.
* gfortran.dg/gomp/polymorphic-mapping-2.f90: New test.

Jakub Jelinek [Sat, 12 Oct 2024 11:47:45 +0000 (13:47 +0200)]

bootstrap: Fix genmatch build where system gcc defaults to -fPIE -pie

Seems our buildbot is unhappy about my latest commit to link genmatch with
libcommon.a in order to support gcc_diag diagnostics in libcpp.

We have in gcc/configure.ac:
if test x$enable_host_shared = xyes; then
  PICFLAG=-fPIC
elif test x$enable_host_pie = xyes; then
  PICFLAG=-fPIE
elif test x$gcc_cv_c_no_fpie = xyes; then
  PICFLAG=-fno-PIE
else
  PICFLAG=
fi

if test x$enable_host_pie = xyes; then
  LD_PICFLAG=-pie
elif test x$gcc_cv_no_pie = xyes; then
  LD_PICFLAG=-no-pie
else
  LD_PICFLAG=
fi

if test x$enable_host_bind_now = xyes; then
  LD_PICFLAG="$LD_PICFLAG -Wl,-z,now"
fi

Now, for object files linked into cc1, cc1plus, xgcc etc. we carefully
arrange for them to be compiled with $(PICFLAG) and do the link with
$(LD_PICFLAG).
For the generator programs, we don't do anything like that, we simply
compile their objects without $(PICFLAG) and link without $(LD_PICFLAG).
It isn't that big a deal; the generator programs run once or a couple of
times during the build and that is it, we don't ship them and don't
care much if they are PIE or not.
Except that after my changes to link in libcommon.a into build/genmatch,
we now link -fno-PIE compiled objects into a binary which is linked with
default flags.  Our distro compiler just links a normal executable and
everything works fine (-fPIE/-pie is added through spec file snippet and
just added in rpm default flags), but seems the buildbot system gcc
defaults to -fPIE -pie instead and so building build/genmatch fails.

The following patch is a minimal fix for that, just add -no-pie when
linking build/genmatch, but don't add -pie.

If we wanted to start building even the build/gen* tools with $(PICFLAG)
and $(LD_PICFLAG), that would be much larger change.

2024-10-12  Jakub Jelinek  <jakub@redhat.com>

* Makefile.in (LINKER_FOR_BUILD): Append -no-pie if it is in
$(LD_PICFLAG) when building build/genmatch.

H.J. Lu [Fri, 11 Oct 2024 22:15:28 +0000 (06:15 +0800)]

gcc.target/i386/pr55583.c: Use long long for 64-bit integer

Since long is 32-bit for x32, use long long for 64-bit integer.
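
A minimal sketch of why the type matters (the helper names are hypothetical and this is not the contents of pr55583.c): long long is at least 64 bits on every conforming implementation, while long is only guaranteed to be at least 32 bits and is 32 bits wide under x32.

```cpp
#include <climits>

// long long is at least 64 bits on every conforming implementation,
// whereas long is only guaranteed to be at least 32 bits.
static_assert(sizeof(long long) * CHAR_BIT >= 64, "long long is 64-bit");

// Hypothetical helper: set bit 40, which does not fit in a 32-bit long.
unsigned long long bit40() { return 1ULL << 40; }

// Width of long in bits: 64 under LP64, 32 under x32 and ILP32.
int long_bits() { return static_cast<int>(sizeof(long) * CHAR_BIT); }
```

A test that needs a 64-bit value therefore stays portable across -m64, -mx32 and -m32 when it is written in terms of long long.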

* gcc.target/i386/pr55583.c: Use long long for 64-bit integer.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

H.J. Lu [Fri, 11 Oct 2024 21:22:52 +0000 (05:22 +0800)]

gcc.target/i386/pr115749.c: Use word_mode integer

Use word_mode integer with func so that 64-bit integer is used with
x32.

* gcc.target/i386/pr115749.c (uword): New.
(func): Replace unsigned long with uword.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

H.J. Lu [Fri, 11 Oct 2024 21:04:33 +0000 (05:04 +0800)]

gcc.target/i386/invariant-ternlog-1.c: Also scan (%edx)

Since x32 uses (%edx), instead of (%rdx), also scan (%edx).

* gcc.target/i386/invariant-ternlog-1.c: Also scan (%edx).

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

Jakub Jelinek [Sat, 12 Oct 2024 08:44:17 +0000 (10:44 +0200)]

libcpp, genmatch: Use gcc_diag instead of printf for libcpp diagnostics

When working on #embed support, or -Wheader-guard or other recent libcpp
changes, I've been annoyed by the libcpp diagnostics being visually
different from normal gcc diagnostics, especially in the area of quoting
stuff in the diagnostic messages.
Normal GCC diagnostics are gcc_diag/gcc_tdiag, where one can use
%</%>, %qs etc., while libcpp diagnostics were marked as printf
and in libcpp we've been very creative with quoting stuff, either
no quotes at all, or "something" quoting, or 'something' quoting, or
`something' quoting (but in none of the cases it used colors consistently
with the rest of the compiler).

Now, libcpp diagnostics is always emitted using a callback,
pfile->cb.diagnostic.  On the gcc/ side, this callback is initialized with
genmatch.cc:  cb->diagnostic = diagnostic_cb;
c-family/c-opts.cc:  cb->diagnostic = c_cpp_diagnostic;
fortran/cpp.cc:  cb->diagnostic = cb_cpp_diagnostic;
where the latter two just use diagnostic_report_diagnostic, so actually
support all the gcc_diag stuff, only the genmatch.cc case didn't.

So, the following patch changes genmatch.cc to use pp_format* instead
of vfprintf so that it supports the gcc_diag formatting (pretty-print.o
unfortunately has various dependencies, so had to link genmatch with
libcommon.a libbacktrace.a and tweak Makefile.in so that there are no
circular dependencies) and marks the libcpp diagnostic routines as
gcc_diag rather than printf.  That change resulted in hundreds of
-Wformat-diag new warnings (most of them useful and resulting IMHO in
better diagnostics), so the rest of the patch is changing the format
strings to make -Wformat-diag happy and adjusting the testsuite for
the differences in how the diagnostic is reformatted.

Dunno if some out of GCC tree projects use libcpp, that case would
make it harder because one couldn't use vfprintf in the diagnostic
callback anymore, but there is always David's libdiagnostic which could
be used for that purpose IMHO.

2024-10-12  Jakub Jelinek  <jakub@redhat.com>

libcpp/
* include/cpplib.h (ATTRIBUTE_CPP_PPDIAG): Define.
(struct cpp_callbacks): Use ATTRIBUTE_CPP_PPDIAG instead of
ATTRIBUTE_FPTR_PRINTF on diagnostic callback.
(cpp_error, cpp_warning, cpp_pedwarning, cpp_warning_syshdr): Use
ATTRIBUTE_CPP_PPDIAG (3, 4) instead of ATTRIBUTE_PRINTF_3.
(cpp_warning_at, cpp_pedwarning_at): Use ATTRIBUTE_CPP_PPDIAG (4, 5)
instead of ATTRIBUTE_PRINTF_4.
(cpp_error_with_line, cpp_warning_with_line, cpp_pedwarning_with_line,
cpp_warning_with_line_syshdr): Use ATTRIBUTE_CPP_PPDIAG (5, 6)
instead of ATTRIBUTE_PRINTF_5.
(cpp_error_at): Use ATTRIBUTE_CPP_PPDIAG (4, 5) instead of
ATTRIBUTE_PRINTF_4.
* Makefile.in (po/$(PACKAGE).pot): Use --language=GCC-source rather
than --language=c.
* errors.cc (cpp_diagnostic_at, cpp_diagnostic,
cpp_diagnostic_with_line): Use ATTRIBUTE_CPP_PPDIAG instead of
ATTRIBUTE_FPTR_PRINTF.
* charset.cc (cpp_host_to_exec_charset, _cpp_valid_ucn, convert_hex,
convert_oct, convert_escape): Fix up -Wformat-diag warnings.
(cpp_interpret_string_ranges, count_source_chars): Use
ATTRIBUTE_CPP_PPDIAG instead of ATTRIBUTE_FPTR_PRINTF.
(narrow_str_to_charconst): Fix up -Wformat-diag warnings.
* directives.cc (check_eol_1, directive_diagnostics, lex_macro_node,
do_undef, glue_header_name, parse_include, do_include_common,
do_include_next, _cpp_parse_embed_params, do_embed, read_flag,
do_line, do_linemarker, register_pragma_1, do_pragma_once,
do_pragma_push_macro, do_pragma_pop_macro, do_pragma_poison,
do_pragma_system_header, do_pragma_warning_or_error, _cpp_do__Pragma,
do_else, do_elif, do_endif, parse_answer, do_assert,
cpp_define_unused): Likewise.
* expr.cc (cpp_classify_number, parse_defined, eval_token,
_cpp_parse_expr, reduce, check_promotion): Likewise.
* files.cc (_cpp_find_file, finish_base64_embed,
_cpp_pop_file_buffer): Likewise.
* init.cc (sanity_checks): Likewise.
* lex.cc (_cpp_process_line_notes, maybe_warn_bidi_on_char,
_cpp_warn_invalid_utf8, _cpp_skip_block_comment,
warn_about_normalization, forms_identifier_p, maybe_va_opt_error,
identifier_diagnostics_on_lex, cpp_maybe_module_directive): Likewise.
* macro.cc (class vaopt_state, builtin_has_include_1,
builtin_has_include, builtin_has_embed, _cpp_warn_if_unused_macro,
_cpp_builtin_macro_text, builtin_macro, stringify_arg,
_cpp_arguments_ok, collect_args, enter_macro_context,
_cpp_save_parameter, parse_params, create_iso_definition,
_cpp_create_definition, check_trad_stringification): Likewise.
* pch.cc (cpp_valid_state): Likewise.
* traditional.cc (_cpp_scan_out_logical_line, recursive_macro):
Likewise.
gcc/
* Makefile.in (generated_files): Remove {gimple,generic}-match*.
(generated_match_files): New variable.  Add a dependency of
$(filter-out $(OBJS-libcommon),$(ALL_HOST_OBJS)) files on those.
(build/genmatch$(build_exeext)): Depend on and link against
libcommon.a and $(LIBBACKTRACE).
* genmatch.cc: Include pretty-print.h and input.h.
(ggc_internal_cleared_alloc, ggc_free): Remove.
(fatal): New function.
(line_table): Remove.
(linemap_client_expand_location_to_spelling_point): Remove.
(diagnostic_cb): Use gcc_diag rather than printf format.  Use
pp_format_verbatim on a temporary pretty_printer instead of
vfprintf.
(fatal_at, warning_at): Use gcc_diag rather than printf format.
(output_line_directive): Rename location_hash to loc_hash.
(parser::eat_ident, parser::parse_operation, parser::parse_expr,
parser::parse_pattern, parser::finish_match_operand): Fix up
-Wformat-diag warnings.
gcc/c-family/
* c-lex.cc (c_common_has_attribute,
c_common_lex_availability_macro): Fix up -Wformat-diag warnings.
gcc/testsuite/
* c-c++-common/cpp/counter-2.c: Adjust expected diagnostics for
libcpp diagnostic formatting changes.
* c-c++-common/cpp/embed-3.c: Likewise.
* c-c++-common/cpp/embed-4.c: Likewise.
* c-c++-common/cpp/embed-16.c: Likewise.
* c-c++-common/cpp/embed-18.c: Likewise.
* c-c++-common/cpp/eof-2.c: Likewise.
* c-c++-common/cpp/eof-3.c: Likewise.
* c-c++-common/cpp/fmax-include-depth.c: Likewise.
* c-c++-common/cpp/has-builtin.c: Likewise.
* c-c++-common/cpp/line-2.c: Likewise.
* c-c++-common/cpp/line-3.c: Likewise.
* c-c++-common/cpp/macro-arg-count-1.c: Likewise.
* c-c++-common/cpp/macro-arg-count-2.c: Likewise.
* c-c++-common/cpp/macro-ranges.c: Likewise.
* c-c++-common/cpp/named-universal-char-escape-4.c: Likewise.
* c-c++-common/cpp/named-universal-char-escape-5.c: Likewise.
* c-c++-common/cpp/pr88974.c: Likewise.
* c-c++-common/cpp/va-opt-error.c: Likewise.
* c-c++-common/cpp/va-opt-pedantic.c: Likewise.
* c-c++-common/cpp/Wheader-guard-2.c: Likewise.
* c-c++-common/cpp/Wheader-guard-3.c: Likewise.
* c-c++-common/cpp/Winvalid-utf8-1.c: Likewise.
* c-c++-common/cpp/Winvalid-utf8-2.c: Likewise.
* c-c++-common/cpp/Winvalid-utf8-3.c: Likewise.
* c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-1.c:
Likewise.
* c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-3.c:
Likewise.
* c-c++-common/pr68833-3.c: Likewise.
* c-c++-common/raw-string-directive-1.c: Likewise.
* gcc.dg/analyzer/named-constants-Wunused-macros.c: Likewise.
* gcc.dg/binary-constants-4.c: Likewise.
* gcc.dg/builtin-redefine.c: Likewise.
* gcc.dg/cpp/19951025-1.c: Likewise.
* gcc.dg/cpp/c11-warning-1.c: Likewise.
* gcc.dg/cpp/c11-warning-2.c: Likewise.
* gcc.dg/cpp/c11-warning-3.c: Likewise.
* gcc.dg/cpp/c23-elifdef-2.c: Likewise.
* gcc.dg/cpp/c23-warning-2.c: Likewise.
* gcc.dg/cpp/embed-2.c: Likewise.
* gcc.dg/cpp/embed-3.c: Likewise.
* gcc.dg/cpp/embed-4.c: Likewise.
* gcc.dg/cpp/expr.c: Likewise.
* gcc.dg/cpp/gnu11-elifdef-2.c: Likewise.
* gcc.dg/cpp/gnu11-elifdef-3.c: Likewise.
* gcc.dg/cpp/gnu11-elifdef-4.c: Likewise.
* gcc.dg/cpp/gnu11-warning-1.c: Likewise.
* gcc.dg/cpp/gnu11-warning-2.c: Likewise.
* gcc.dg/cpp/gnu11-warning-3.c: Likewise.
* gcc.dg/cpp/gnu23-warning-2.c: Likewise.
* gcc.dg/cpp/include6.c: Likewise.
* gcc.dg/cpp/pr35322.c: Likewise.
* gcc.dg/cpp/tr-warn6.c: Likewise.
* gcc.dg/cpp/undef2.c: Likewise.
* gcc.dg/cpp/warn-comments.c: Likewise.
* gcc.dg/cpp/warn-comments-2.c: Likewise.
* gcc.dg/cpp/warn-comments-3.c: Likewise.
* gcc.dg/cpp/warn-cxx-compat.c: Likewise.
* gcc.dg/cpp/warn-cxx-compat-2.c: Likewise.
* gcc.dg/cpp/warn-deprecated.c: Likewise.
* gcc.dg/cpp/warn-deprecated-2.c: Likewise.
* gcc.dg/cpp/warn-long-long.c: Likewise.
* gcc.dg/cpp/warn-long-long-2.c: Likewise.
* gcc.dg/cpp/warn-normalized-1.c: Likewise.
* gcc.dg/cpp/warn-normalized-2.c: Likewise.
* gcc.dg/cpp/warn-normalized-3.c: Likewise.
* gcc.dg/cpp/warn-normalized-4-bytes.c: Likewise.
* gcc.dg/cpp/warn-normalized-4-unicode.c: Likewise.
* gcc.dg/cpp/warn-redefined.c: Likewise.
* gcc.dg/cpp/warn-redefined-2.c: Likewise.
* gcc.dg/cpp/warn-traditional.c: Likewise.
* gcc.dg/cpp/warn-traditional-2.c: Likewise.
* gcc.dg/cpp/warn-trigraphs-1.c: Likewise.
* gcc.dg/cpp/warn-trigraphs-2.c: Likewise.
* gcc.dg/cpp/warn-trigraphs-3.c: Likewise.
* gcc.dg/cpp/warn-trigraphs-4.c: Likewise.
* gcc.dg/cpp/warn-undef.c: Likewise.
* gcc.dg/cpp/warn-undef-2.c: Likewise.
* gcc.dg/cpp/warn-unused-macros.c: Likewise.
* gcc.dg/cpp/warn-unused-macros-2.c: Likewise.
* gcc.dg/pch/counter-2.c: Likewise.
* g++.dg/cpp0x/udlit-error1.C: Likewise.
* g++.dg/cpp23/named-universal-char-escape1.C: Likewise.
* g++.dg/cpp23/named-universal-char-escape2.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-1.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-2.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-3.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-4.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-5.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-6.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-7.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-8.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-9.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-10.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-11.C: Likewise.
* g++.dg/cpp23/Winvalid-utf8-12.C: Likewise.
* g++.dg/cpp/elifdef-3.C: Likewise.
* g++.dg/cpp/elifdef-5.C: Likewise.
* g++.dg/cpp/elifdef-6.C: Likewise.
* g++.dg/cpp/elifdef-7.C: Likewise.
* g++.dg/cpp/embed-1.C: Likewise.
* g++.dg/cpp/embed-2.C: Likewise.
* g++.dg/cpp/pedantic-errors.C: Likewise.
* g++.dg/cpp/warning-1.C: Likewise.
* g++.dg/cpp/warning-2.C: Likewise.
* g++.dg/ext/bitint1.C: Likewise.
* g++.dg/ext/bitint2.C: Likewise.

Tobias Burnus [Sat, 12 Oct 2024 08:48:41 +0000 (10:48 +0200)]

Fortran: Unify gfc_get_location handling; fix expr->ts bug

This commit reduces code duplication by moving gfc_get_location
from trans.cc to error.cc. The gcc_assert is now used more often
and revealed a bug in gfc_match_array_constructor where the union
expr->ts.u.derived of a derived type is partially overwritten by
the assignment expr->ts.u.cl->... as a ts.type == BT_CHARACTER check
was missing.
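
The class of bug described above can be sketched with a tagged union. The structures below are hypothetical, heavily simplified stand-ins for the gfortran ones (only the type tag and the two union members named in the text are modeled), and set_char_length is an illustrative helper, not the code in array.cc.

```cpp
// Hypothetical, heavily simplified stand-ins for the gfortran structures.
enum BasicType { BT_INTEGER, BT_CHARACTER, BT_DERIVED };

struct CharLen { int length; };
struct DerivedSym { const char *name; };

struct TypeSpec {
  BasicType type;
  union {
    CharLen *cl;          // valid only when type == BT_CHARACTER
    DerivedSym *derived;  // valid only when type == BT_DERIVED
  } u;
};

// Update the character length only when the union really holds a charlen.
// Returns true if the length was written.
bool set_char_length(TypeSpec &ts, int length) {
  if (ts.type != BT_CHARACTER)
    return false;  // u.cl would alias u.derived here; leave it alone
  ts.u.cl->length = length;
  return true;
}
```

Without the type check, the store through u.cl would land in whatever object u.derived points to, which is the partial overwrite the commit message describes.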

gcc/fortran/ChangeLog:

* array.cc (gfc_match_array_constructor): Only update the
character length if the expression is of character type.
* error.cc (gfc_get_location_with_offset): New; split off
from ...
(gfc_format_decoder): ... here; call it.
* gfortran.h (gfc_get_location_with_offset): New prototype.
(gfc_get_location): New inline function.
* trans.cc (gfc_get_location): Remove function definition.
* trans.h (gfc_get_location): Remove declaration.

Uros Bizjak [Sat, 12 Oct 2024 08:04:03 +0000 (10:04 +0200)]

testsuite/i386: Add vector sat_sub testcases [PR112600]

PR middle-end/112600

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr112600-4a.c: New test.
* gcc.target/i386/pr112600-4b.c: New test.

Feng Xue [Sat, 12 Oct 2024 07:45:58 +0000 (15:45 +0800)]

MAINTAINERS: Add myself to write after approval

ChangeLog:

* MAINTAINERS: Add myself to write after approval.

Simon Martin [Fri, 11 Oct 2024 08:16:26 +0000 (10:16 +0200)]

c++: Fix overeager Woverloaded-virtual with conversion operators [PR109918]

We currently emit an incorrect -Woverloaded-virtual warning upon the
following test case

=== cut here ===
struct A {
  virtual operator int() { return 42; }
  virtual operator char() = 0;
};
struct B : public A {
  operator char() { return 'A'; }
};
=== cut here ===

The problem is that when iterating over ovl_range (fns), warn_hidden
gets confused by the conversion operator marker, concludes that
seen_non_override is true and therefore emits a warning for all
conversion operators in A that do not convert to char, even if
-Woverloaded-virtual is 1 (e.g. with -Wall, the case reported).

A second set of problems is highlighted when -Woverloaded-virtual is 2.

First, with the same test case, since base_fndecls contains all
conversion operators in A (except the one to char, that's been removed
when iterating over ovl_range (fns)), we emit a spurious warning for
the conversion operator to int, even though it's unrelated.

Second, in case there are several conversion operators with different
cv-qualifiers to the same type in A, we rightfully emit a warning,
however the note uses the location of the conversion operator marker
instead of the right one; location_of should go over conv_op_marker.

This patch fixes all these by explicitly keeping track of (1) base
methods that are overridden, as well as (2) base methods that are hidden
but not overridden (and by what), and warning about methods that are in
(2) but not (1). It also ignores non virtual base methods, per
"definition" of -Woverloaded-virtual.

PR c++/109918
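
The bookkeeping described above, warning about base methods that are in set (2) but not in set (1), can be sketched as a toy model. Everything here is hypothetical: a method is reduced to a name plus a signature string, a conversion operator's target type is treated as part of its name, and the two -Woverloaded-virtual levels are not modeled. It is not the warn_hidden implementation.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical toy model: a method is a name plus a signature string.
struct Method {
  std::string name;
  std::string signature;
  bool is_virtual;
};

// Return the base methods that should be warned about: virtual, hidden by a
// same-named derived method, and not overridden by any derived method.
std::vector<std::string> hidden_not_overridden(
    const std::vector<Method> &base, const std::vector<Method> &derived) {
  std::set<std::string> overridden;  // set (1) in the text above
  std::set<std::string> hidden;      // set (2) in the text above
  for (const Method &b : base) {
    if (!b.is_virtual)
      continue;  // non-virtual base methods are ignored
    for (const Method &d : derived) {
      if (d.name != b.name)
        continue;  // unrelated name: neither hides nor overrides
      if (d.signature == b.signature)
        overridden.insert(b.name + b.signature);
      else
        hidden.insert(b.name + b.signature);
    }
  }
  std::vector<std::string> warn;
  for (const std::string &key : hidden)
    if (!overridden.count(key))
      warn.push_back(key);
  return warn;
}
```

With the test case from the commit message, operator int in A has no same-named method in B and operator char is overridden, so the toy model reports nothing.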

gcc/cp/ChangeLog:

* class.cc (warn_hidden): Keep track of overridden and of hidden
base methods. Mention the actual hiding function in the warning,
not the first overload.
* error.cc (location_of): Skip over conv_op_marker.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Woverloaded-virt1.C: Check that no warning is
emitted for non virtual base methods.
* g++.dg/warn/Woverloaded-virt5.C: New test.
* g++.dg/warn/Woverloaded-virt6.C: New test.
* g++.dg/warn/Woverloaded-virt7.C: New test.
* g++.dg/warn/Woverloaded-virt8.C: New test.
* g++.dg/warn/Woverloaded-virt9.C: New test.

Pan Li [Fri, 11 Oct 2024 04:12:03 +0000 (12:12 +0800)]

RISC-V: Add testcases for form 1 of vector signed SAT_SUB

Form 1:
  #define DEF_VEC_SAT_S_SUB_FMT_1(T, UT, MIN, MAX)                     \
  void __attribute__((noinline))                                       \
  vec_sat_s_sub_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \
  {                                                                    \
    unsigned i;                                                        \
    for (i = 0; i < limit; i++)                                        \
      {                                                                \
        T x = op_1[i];                                                 \
        T y = op_2[i];                                                 \
        T minus = (UT)x - (UT)y;                                       \
        out[i] = (x ^ y) >= 0                                          \
          ? minus                                                      \
          : (minus ^ x) >= 0                                           \
            ? minus                                                    \
            : x < 0 ? MIN : MAX;                                       \
      }                                                                \
  }

DEF_VEC_SAT_S_SUB_FMT_1(int8_t, uint8_t, INT8_MIN, INT8_MAX)
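
For reference, the per-element expression in form 1 can be written as a scalar function and checked against a widen-and-clamp reference. This is a sketch for T = int8_t and UT = uint8_t only; the three function names are hypothetical and are not part of the testsuite helpers.

```cpp
#include <cstdint>

// Scalar version of the form 1 expression for T = int8_t, UT = uint8_t.
// The operands are converted to uint8_t and subtracted (in int, after
// integer promotion); narrowing back to int8_t keeps the low 8 bits, which
// is what GCC does and what C++20 requires.
int8_t sat_s_sub_i8(int8_t x, int8_t y) {
  int8_t minus = static_cast<int8_t>(
      static_cast<uint8_t>(x) - static_cast<uint8_t>(y));
  if ((x ^ y) >= 0)
    return minus;  // same sign: the difference cannot overflow
  if ((minus ^ x) >= 0)
    return minus;  // result kept the sign of x: no overflow
  return x < 0 ? INT8_MIN : INT8_MAX;
}

// Reference: compute in a wider type and clamp.
int8_t sat_s_sub_i8_ref(int8_t x, int8_t y) {
  int wide = static_cast<int>(x) - static_cast<int>(y);
  if (wide < INT8_MIN) return INT8_MIN;
  if (wide > INT8_MAX) return INT8_MAX;
  return static_cast<int8_t>(wide);
}

// Compare both versions over every pair of int8_t inputs.
bool sat_s_sub_matches_reference() {
  for (int x = INT8_MIN; x <= INT8_MAX; ++x)
    for (int y = INT8_MIN; y <= INT8_MAX; ++y)
      if (sat_s_sub_i8(static_cast<int8_t>(x), static_cast<int8_t>(y))
          != sat_s_sub_i8_ref(static_cast<int8_t>(x), static_cast<int8_t>(y)))
        return false;
  return true;
}
```

For example, 127 - (-1) saturates to INT8_MAX and -128 - 1 saturates to INT8_MIN, while 100 - (-20) is representable and yields 120.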

The following test passes with this patch:
* The rv64gcv full regression test.

This is a test-only patch and fairly obvious; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/autovec/binop/vec_sat_data.h: Add test
data for run test.
* gcc.target/riscv/rvv/autovec/vec_sat_arith.h: Add test helper
macros.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-1-i8.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-1-i16.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-1-i32.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-1-i64.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_s_sub-run-1-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Fri, 11 Oct 2024 04:05:10 +0000 (12:05 +0800)]

RISC-V: Implement vector SAT_SUB for signed integer

This patch implements sssub for vector signed integers.

Form 1:
  #define DEF_VEC_SAT_S_SUB_FMT_1(T, UT, MIN, MAX)                     \
  void __attribute__((noinline))                                       \
  vec_sat_s_sub_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \
  {                                                                    \
    unsigned i;                                                        \
    for (i = 0; i < limit; i++)                                        \
      {                                                                \
        T x = op_1[i];                                                 \
        T y = op_2[i];                                                 \
        T minus = (UT)x - (UT)y;                                       \
        out[i] = (x ^ y) >= 0                                          \
          ? minus                                                      \
          : (minus ^ x) >= 0                                           \
            ? minus                                                    \
            : x < 0 ? MIN : MAX;                                       \
      }                                                                \
  }

DEF_VEC_SAT_S_SUB_FMT_1(int8_t, uint8_t, INT8_MIN, INT8_MAX)

Before this patch:
  vle8.v  v1,0(a1)
  vle8.v  v2,0(a2)
  sub a3,a3,a5
  add a1,a1,a5
  add a2,a2,a5
  vsra.vi v4,v1,7
  vsub.vv v3,v1,v2
  vxor.vv v2,v1,v2
  vxor.vv v0,v1,v3
  vmslt.vi    v2,v2,0
  vmslt.vi    v0,v0,0
  vmand.mm    v0,v0,v2
  vxor.vv v3,v4,v5,v0.t
  vse8.v  v3,0(a0)
  add a0,a0,a5

After this patch:
  vle8.v  v1,0(a1)
  vle8.v  v2,0(a2)
  sub a3,a3,a5
  add a1,a1,a5
  add a2,a2,a5
  vssub.vv    v1,v1,v2
  vse8.v  v1,0(a0)
  add a0,a0,a5

The following test suite passes with this patch:
* The rv64gcv full regression test.

gcc/ChangeLog:

* config/riscv/autovec.md (sssub<mode>3): Add new pattern for
signed SAT_SUB.
* config/riscv/riscv-protos.h (expand_vec_sssub): Add new func
decl to expand sssub to vssub.
* config/riscv/riscv-v.cc (expand_vec_sssub): Add new func
impl to expand sssub to vssub.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Fri, 11 Oct 2024 03:58:30 +0000 (11:58 +0800)]

Vect: Try the pattern of vector signed integer SAT_SUB

Almost the same as vector unsigned integer SAT_SUB, try to match
the signed version during the vector pattern matching.

The following test suites pass with this patch:
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:

* tree-vect-patterns.cc (gimple_signed_integer_sat_sub): Add new
func decl for signed SAT_SUB.
(vect_recog_sat_sub_pattern_transform): Update comments.
(vect_recog_sat_sub_pattern): Try the vector signed SAT_SUB
pattern.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Fri, 11 Oct 2024 03:51:52 +0000 (11:51 +0800)]

Match: Support form 1 for vector signed integer SAT_SUB

This patch adds support for form 1 of the vector signed
integer SAT_SUB, as in the example below:

Form 1:
  #define DEF_VEC_SAT_S_SUB_FMT_1(T, UT, MIN, MAX)                     \
  void __attribute__((noinline))                                       \
  vec_sat_s_sub_##T##_fmt_1 (T *out, T *op_1, T *op_2, unsigned limit) \
  {                                                                    \
    unsigned i;                                                        \
    for (i = 0; i < limit; i++)                                        \
      {                                                                \
        T x = op_1[i];                                                 \
        T y = op_2[i];                                                 \
        T minus = (UT)x - (UT)y;                                       \
        out[i] = (x ^ y) >= 0                                          \
          ? minus                                                      \
          : (minus ^ x) >= 0                                           \
            ? minus                                                    \
            : x < 0 ? MIN : MAX;                                       \
      }                                                                \
  }

DEF_VEC_SAT_S_SUB_FMT_1(int8_t, uint8_t, INT8_MIN, INT8_MAX)

Before this patch:
  _108 = .SELECT_VL (ivtmp_106, POLY_INT_CST [16, 16]);
  vect_x_16.11_80 = .MASK_LEN_LOAD (vectp_op_1.9_78, 8B, { -1, ... }, _108, 0);
  _69 = vect_x_16.11_80 >> 7;
  vect_x.12_81 = VIEW_CONVERT_EXPR<vector([16,16]) unsigned char>(vect_x_16.11_80);
  vect_y_18.15_85 = .MASK_LEN_LOAD (vectp_op_2.13_83, 8B, { -1, ... }, _108, 0);
  vect__7.21_91 = vect_x_16.11_80 ^ vect_y_18.15_85;
  mask__44.22_92 = vect__7.21_91 < { 0, ... };
  vect_y.16_86 = VIEW_CONVERT_EXPR<vector([16,16]) unsigned char>(vect_y_18.15_85);
  vect__6.17_87 = vect_x.12_81 - vect_y.16_86;
  vect_minus_19.18_88 = VIEW_CONVERT_EXPR<vector([16,16]) signed char>(vect__6.17_87);
  vect__8.19_89 = vect_x_16.11_80 ^ vect_minus_19.18_88;
  mask__42.20_90 = vect__8.19_89 < { 0, ... };
  mask__41.23_93 = mask__42.20_90 & mask__44.22_92;
  _4 = .COND_XOR (mask__41.23_93, _69, { 127, ... }, vect_minus_19.18_88);
  .MASK_LEN_STORE (vectp_out.31_102, 8B, { -1, ... }, _108, 0, _4);
  vectp_op_1.9_79 = vectp_op_1.9_78 + _108;
  vectp_op_2.13_84 = vectp_op_2.13_83 + _108;
  vectp_out.31_103 = vectp_out.31_102 + _108;
  ivtmp_107 = ivtmp_106 - _108;

After this patch:
  _102 = .SELECT_VL (ivtmp_100, POLY_INT_CST [16, 16]);
  vect_x_16.11_89 = .MASK_LEN_LOAD (vectp_op_1.9_87, 8B, { -1, ... }, _102, 0);
  vect_y_18.14_93 = .MASK_LEN_LOAD (vectp_op_2.12_91, 8B, { -1, ... }, _102, 0);
  vect_patt_38.15_94 = .SAT_SUB (vect_x_16.11_89, vect_y_18.14_93);
  .MASK_LEN_STORE (vectp_out.16_96, 8B, { -1, ... }, _102, 0, vect_patt_38.15_94);
  vectp_op_1.9_88 = vectp_op_1.9_87 + _102;
  vectp_op_2.12_92 = vectp_op_2.12_91 + _102;
  vectp_out.16_97 = vectp_out.16_96 + _102;
  ivtmp_101 = ivtmp_100 - _102;

The following test suites pass with this patch:
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:

* match.pd: Add case 1 matching pattern for vector signed SAT_SUB.

Signed-off-by: Pan Li <pan2.li@intel.com>

GCC Administrator [Sat, 12 Oct 2024 00:18:49 +0000 (00:18 +0000)]

Daily bump.

Thomas Koenig [Fri, 11 Oct 2024 20:58:51 +0000 (22:58 +0200)]

Introduce GFC_STD_UNSIGNED.

This patch creates an unsigned "standard" for the
gfc_option.allow_std field.

One of the main reasons why people want UNSIGNED for Fortran is
interfacing with C.

This is preparation for further work on the ISO_C_BINDING constants.
Those are defined via iso-c-binding.def, whose last field is the
standard that a constant belongs to, which is then checked.  I could
try to invent a different method for this, but I'd rather not.
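
The idea of treating UNSIGNED as one more "standard" bit can be
sketched as follows.  This is a minimal illustration that assumes the
allowed standards form a bitmask; the EX_* names and bit values are
hypothetical and do not reproduce the real GFC_STD_* definitions.

```c
#include <stdbool.h>

/* Hypothetical bit values; the real GFC_STD_* constants live in
   libgfortran.h and are not reproduced here.  */
enum
{
  EX_STD_F2008 = 1 << 0,
  EX_STD_UNSIGNED = 1 << 1
};

/* Build the mask of allowed standards (hypothetical helper): the
   unsigned bit is only set when -funsigned was given.  */
int
ex_allowed_standards (bool funsigned)
{
  int allow = EX_STD_F2008;
  if (funsigned)
    allow |= EX_STD_UNSIGNED;
  return allow;
}

/* A constant or intrinsic tagged with STD is available only if its
   bit is present in the mask.  */
bool
ex_is_available (int allow_std, int std)
{
  return (allow_std & std) != 0;
}
```

With this scheme a table entry tagged with the unsigned bit needs no
special casing: the existing per-standard check rejects it unless
-funsigned has set the bit.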

gcc/fortran/ChangeLog:

* intrinsic.cc (add_functions): Convert uint and
selected_unsigned_kind to GFC_STD_UNSIGNED.
(gfc_check_intrinsic_standard): Handle GFC_STD_UNSIGNED.
* libgfortran.h (GFC_STD_UNSIGNED): Add.
* options.cc (gfc_post_options): Set GFC_STD_UNSIGNED
if -funsigned is set.

H.J. Lu [Thu, 10 Oct 2024 09:22:36 +0000 (17:22 +0800)]

gcc.target/i386: Replace long with long long

Since long is 32-bit for x32, replace long with long long for x32.

* gcc.target/i386/bmi2-pr112526.c: Replace long with long long.
* gcc.target/i386/pr105854.c: Likewise.
* gcc.target/i386/pr112943.c: Likewise.
* gcc.target/i386/pr67325.c: Likewise.
* gcc.target/i386/pr97971.c: Likewise.
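
As a small illustration of why the type matters (a sketch, not taken
from the tests above; the function names are hypothetical): x32 is an
ILP32 ABI, so a test that needs a 64-bit integer cannot rely on long.

```c
#include <limits.h>

/* Width of long in bits on the current target: 32 on x32 and ia32,
   64 on x86-64 with the LP64 ABI.  */
unsigned
ex_bits_in_long (void)
{
  return (unsigned) (sizeof (long) * CHAR_BIT);
}

/* long long is at least 64 bits on every conforming target, so this
   shift is well defined under x32 too; with plain long it would not
   be when long is 32 bits wide.  */
unsigned long long
ex_bit40 (void)
{
  return 1ULL << 40;
}
```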

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

H.J. Lu [Thu, 10 Oct 2024 11:00:32 +0000 (19:00 +0800)]

g++.target/i386/pr105953.C: Skip for x32

Since -mabi=ms isn't supported for x32, skip g++.target/i386/pr105953.C
for x32.

* g++.target/i386/pr105953.C: Skip for x32.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

H.J. Lu [Thu, 10 Oct 2024 09:29:27 +0000 (17:29 +0800)]

gcc.target/i386/pr115407.c: Only run for lp64

Since -mcmodel=large is valid only for lp64, run pr115407.c only for
lp64.

* gcc.target/i386/pr115407.c: Only run for lp64.

Signed-off-by: H.J. Lu <hjl.tools@gmail.com>

Eric Botcazou [Fri, 11 Oct 2024 17:29:15 +0000 (19:29 +0200)]

Fix thinko in previous change

gcc/ada/
PR ada/116498
PR ada/117087
* gcc-interface/decl.cc (validate_size): Fix thinko.

Jonathan Wakely [Thu, 10 Oct 2024 21:47:46 +0000 (22:47 +0100)]

libstdc++: Rearrange std::move_iterator helpers in stl_iterator.h

The __niter_base(move_iterator<I>) overload and __is_move_iterator trait
were originally immediately after the definition of move_iterator. The
addition of C++20 features after move_iterator meant that those helpers
were no longer anywhere near move_iterator.

This change puts them back where they used to be, before all the new
C++20 additions.

libstdc++-v3/ChangeLog:

* include/bits/stl_iterator.h (__niter_base(move_iterator<I>))
(__is_move_iterator, __miter_base, _GLIBCXX_MAKE_MOVE_ITERATOR)
(_GLIBCXX_MAKE_MOVE_IF_NOEXCEPT_ITERATOR): Move earlier in the
file.

Kyrylo Tkachov [Wed, 9 Oct 2024 16:40:33 +0000 (09:40 -0700)]

PR target/117048 aarch64: Use more canonical and optimization-friendly representation for XAR instruction

The pattern for the Advanced SIMD XAR instruction isn't very
optimization-friendly at the moment.
In the testcase from the PR, once simplify-rtx has done its work, it
generates the RTL:
(set (reg:V2DI 119 [ _14 ])
    (rotate:V2DI (xor:V2DI (reg:V2DI 114 [ vect__1.12_16 ])
            (reg:V2DI 116 [ *m1_01_8(D) ]))
        (const_vector:V2DI [
                (const_int 32 [0x20]) repeated x2
            ])))

which fails to match our XAR pattern because the pattern expects:
1) A ROTATERT instead of the ROTATE.  However, according to the RTL ops
documentation the preferred form of rotate-by-immediate is ROTATE, which
I take to mean it's the canonical form.
ROTATE (x, C) <-> ROTATERT (x, MODE_WIDTH - C) so it's better to match just
one canonical representation.
2) A CONST_INT shift amount whereas the midend asks for a repeated vector
constant.

These issues are fixed by introducing a dedicated expander for the
aarch64_xarqv2di name, needed by the arm_neon.h intrinsic, that translates
the intrinsic-level CONST_INT immediate (the right-rotate amount) into
a repeated vector constant subtracted from 64 to give the corresponding
left-rotate amount that is fed to the new representation for the XAR
define_insn that uses the ROTATE RTL code.  This is a similar approach
to how we handle the discrepancy between intrinsic-level and RTL-level
vector lane numbers for big-endian.
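
The conversion between the two rotate directions can be sketched per
64-bit lane in C.  This is an illustration under the assumption that
the rotate amount is strictly between 0 and 64; the helper names are
hypothetical and are not the expander's code.

```c
#include <stdint.h>

/* Rotate left by c, 0 < c < 64 (the ROTATE form).  */
uint64_t
ex_rotl64 (uint64_t x, unsigned c)
{
  return (x << c) | (x >> (64 - c));
}

/* Rotate right by c, 0 < c < 64 (the ROTATERT form).  */
uint64_t
ex_rotr64 (uint64_t x, unsigned c)
{
  return (x >> c) | (x << (64 - c));
}

/* One lane of XAR as seen by the intrinsic: XOR the inputs, then
   rotate right by the immediate.  */
uint64_t
ex_xar_lane (uint64_t a, uint64_t b, unsigned imm)
{
  return ex_rotr64 (a ^ b, imm);
}

/* The same lane expressed with a left rotate by 64 - imm, which is
   the amount the expander is described as feeding to ROTATE.  */
uint64_t
ex_xar_lane_via_rotate (uint64_t a, uint64_t b, unsigned imm)
{
  return ex_rotl64 (a ^ b, 64 - imm);
}
```

Both helpers return the same value for any imm in the range 1..63, so
matching only the ROTATE form loses nothing.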

With this patch and [1/2] the arithmetic parts of the testcase now simplify
to just one XAR instruction.

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
gcc/
PR target/117048
* config/aarch64/aarch64-simd.md (aarch64_xarqv2di): Redefine into a
define_expand.
(*aarch64_xarqv2di_insn): Define.

gcc/testsuite/
PR target/117048
* g++.target/aarch64/pr117048.C: New test.

Kyrylo Tkachov [Wed, 9 Oct 2024 16:39:55 +0000 (09:39 -0700)]

PR 117048: simplify-rtx: Extend (x << C1) | (X >> C2) --> ROTATE transformation to vector operands

In the testcase from patch [2/2] we want to match a vector rotate operation from
an IOR of left and right shifts by immediate.  simplify-rtx has code for just
that but it looks like it's prepared to handle only scalar operands.
In practice most of the code works for vector modes as well except the shift
amounts are checked to be CONST_INT rather than vector constants that we have
here.  This is easily extended by using unwrap_const_vec_duplicate to extract
the repeating constant shift amount.  With this change combine now tries
matching the simpler and expected:
(set (reg:V2DI 119 [ _14 ])
    (rotate:V2DI (xor:V2DI (reg:V2DI 114 [ vect__1.12_16 ])
            (reg:V2DI 116 [ *m1_01_8(D) ]))
        (const_vector:V2DI [
                (const_int 32 [0x20]) repeated x2
            ])))
instead of the previous:
(set (reg:V2DI 119 [ _14 ])
    (ior:V2DI (ashift:V2DI (xor:V2DI (reg:V2DI 114 [ vect__1.12_16 ])
                (reg:V2DI 116 [ *m1_01_8(D) ]))
            (const_vector:V2DI [
                    (const_int 32 [0x20]) repeated x2
                ]))
        (lshiftrt:V2DI (xor:V2DI (reg:V2DI 114 [ vect__1.12_16 ])
                (reg:V2DI 116 [ *m1_01_8(D) ]))
            (const_vector:V2DI [
                    (const_int 32 [0x20]) repeated x2
                ]))))

To actually fix the PR the aarch64 backend needs some adjustment as well
which is done in patch [2/2], which adds the testcase as well.
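
The identity behind the simplification can be sketched on a two-lane
vector in C.  This is an illustration only: the names are hypothetical,
the shift amounts are assumed to be the same in every lane (a duplicated
vector constant) and to satisfy 0 < c1, c2 < 64 with c1 + c2 == 64.

```c
#include <stdint.h>

#define EX_LANES 2 /* V2DI: two 64-bit lanes.  */

/* The IOR-of-shifts form, applied to every lane with the same
   amounts c1 and c2.  */
void
ex_ior_of_shifts (uint64_t out[EX_LANES], const uint64_t x[EX_LANES],
                  unsigned c1, unsigned c2)
{
  for (int i = 0; i < EX_LANES; i++)
    out[i] = (x[i] << c1) | (x[i] >> c2);
}

/* The ROTATE form: rotate every lane left by c, 0 < c < 64.  */
void
ex_rotate (uint64_t out[EX_LANES], const uint64_t x[EX_LANES], unsigned c)
{
  for (int i = 0; i < EX_LANES; i++)
    out[i] = (x[i] << c) | (x[i] >> (64 - c));
}

/* Return 1 if both forms agree on lanes {a, b} for amount c.  */
int
ex_forms_agree (uint64_t a, uint64_t b, unsigned c)
{
  const uint64_t x[EX_LANES] = { a, b };
  uint64_t via_shifts[EX_LANES], via_rotate[EX_LANES];

  ex_ior_of_shifts (via_shifts, x, c, 64 - c);
  ex_rotate (via_rotate, x, c);
  return via_shifts[0] == via_rotate[0] && via_shifts[1] == via_rotate[1];
}
```

The only vector-specific step is recovering the single repeated shift
amount from the constant vector; once that is done the scalar reasoning
applies lane by lane.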

Bootstrapped and tested on aarch64-none-linux-gnu.

Signed-off-by: Kyrylo Tkachov <ktkachov@nvidia.com>
PR target/117048
* simplify-rtx.cc (simplify_context::simplify_binary_operation_1):
Handle vector constants in (x << C1) | (x >> C2) -> ROTATE
simplification.

Tobias Burnus [Fri, 11 Oct 2024 15:05:37 +0000 (17:05 +0200)]

Fortran: Dead-function removal in error.cc (shrinking by 40%)

This patch removes a large number of unused static functions from error.cc,
which were previously used for diagnostics but have been replaced by the common
diagnostic code.

gcc/fortran/ChangeLog:

* error.cc (error_char, error_string, error_uinteger, error_integer,
error_hwuint, error_hwint, gfc_widechar_display_length,
gfc_wide_display_length, error_printf, show_locus, show_loci):
Remove unused static functions.
(IBUF_LEN, MAX_ARGS): Remove now unused #define.

Jennifer Schmitz [Wed, 25 Sep 2024 10:21:22 +0000 (03:21 -0700)]

match.pd: Fold logarithmic identities.

This patch implements 4 rules for logarithmic identities in match.pd
under -funsafe-math-optimizations:
1) logN(1.0/a) -> -logN(a). This avoids the division instruction.
2) logN(C/a) -> logN(C) - logN(a), where C is a real constant. Same as 1).
3) logN(a) + logN(b) -> logN(a*b). This reduces the number of calls to
the log function.
4) logN(a) - logN(b) -> logN(a/b). Same as 3).
Tests were added for float, double, and long double.
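
For reference, the four rules correspond to the following real-number
identities, written here for a, b, C > 0.  They are exact over the
reals only; in floating point the two sides can round differently, and
a*b or a/b can overflow or underflow where the individual logarithms
would not, which is consistent with the rules being restricted to
-funsafe-math-optimizations.

```latex
\begin{align*}
\log_N\!\left(\frac{1}{a}\right) &= -\log_N(a) \\
\log_N\!\left(\frac{C}{a}\right) &= \log_N(C) - \log_N(a) \\
\log_N(a) + \log_N(b) &= \log_N(a \cdot b) \\
\log_N(a) - \log_N(b) &= \log_N\!\left(\frac{a}{b}\right)
\end{align*}
```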

The patch was bootstrapped and regtested on aarch64-linux-gnu and
x86_64-linux-gnu, no regression.
Additionally, SPEC 2017 fprate was run. While the transform does not seem
to be triggered, we also see no non-noise impact on performance.
OK for mainline?

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
PR tree-optimization/116826
PR tree-optimization/86710
* match.pd: Fold logN(1.0/a) -> -logN(a),
logN(C/a) -> logN(C) - logN(a), logN(a) + logN(b) -> logN(a*b),
and logN(a) - logN(b) -> logN(a/b).

gcc/testsuite/
PR tree-optimization/116826
PR tree-optimization/86710
* gcc.dg/tree-ssa/log_ident.c: New test.

Jonathan Wakely [Fri, 11 Oct 2024 12:29:06 +0000 (13:29 +0100)]

libstdc++: Use appropriate feature test macro for std::byte

libstdc++-v3/ChangeLog:

* include/bits/cpp_type_traits.h (__is_byte<byte>): Guard with
__glibcxx_byte macro instead of checking __cplusplus.

Jonathan Wakely [Fri, 11 Oct 2024 08:40:38 +0000 (09:40 +0100)]

libstdc++: Fix localized %c formatting for <chrono> [PR117085]

When formatting a time point with %c we call std::vformat_to using the
formatting locale's D_T_FMT string, but we weren't adding the L option
to the format string. This meant we always interpreted D_T_FMT in the C
locale, instead of using the formatting locale as obviously intended
when %c is used.

libstdc++-v3/ChangeLog:

PR libstdc++/117085
* include/bits/chrono_io.h (__formatter_chrono::_M_c): Add L
option to format string.
* testsuite/std/time/format.cc: Move to...
* testsuite/std/time/format/format.cc: ...here.
* testsuite/std/time/format_localized.cc: Move to...
* testsuite/std/time/format/localized.cc: ...here.
* testsuite/std/time/format/pr117085.cc: New test.

Jonathan Wakely [Fri, 11 Oct 2024 14:42:10 +0000 (15:42 +0100)]

libstdc++: Add missing whitespace in dg-do directives

libstdc++-v3/ChangeLog:

* testsuite/22_locale/time_get/get/char/5.cc: Fix dg-do
directive.
* testsuite/22_locale/time_get/get/wchar_t/5.cc: Likewise.

Richard Biener [Thu, 6 Jun 2024 13:52:02 +0000 (15:52 +0200)]

tree-optimization/117080 - Add SLP_TREE_MEMORY_ACCESS_TYPE

It turns out target costing code looks at STMT_VINFO_MEMORY_ACCESS_TYPE
to identify operations from (emulated) gathers for example.  This
doesn't work for SLP loads since we do not set STMT_VINFO_MEMORY_ACCESS_TYPE
there as the vectorization strategy might differ between different
stmt uses.  It seems we got away with setting it for stores though.
The following adds a memory_access_type field to slp_tree and sets it
from load and store vectorization code.  All the costing doesn't record
the SLP node (that was only done selectively for some corner case).  The
costing is really in need of a big overhaul, the following just massages
the two relevant ops to fix gcc.dg/target/pr88531-2[bc].c FAILs when
switching on SLP for non-grouped stores.  In particular currently
we either have a SLP node or a stmt_info in the cost hook but not both.

So the following mitigates this, postponing a rewrite of costing to
next stage1.  Other targets look possibly affected as well but are
left to respective maintainers to update.

PR tree-optimization/117080
* tree-vectorizer.h (_slp_tree::memory_access_type): Add.
(SLP_TREE_MEMORY_ACCESS_TYPE): New.
(record_stmt_cost): Add another overload.
* tree-vect-slp.cc (_slp_tree::_slp_tree): Initialize
memory_access_type.
* tree-vect-stmts.cc (vectorizable_store): Set
SLP_TREE_MEMORY_ACCESS_TYPE.
(vectorizable_load): Likewise.  Also record the SLP node
when costing emulated gather offset decompose and vector
composition.
* config/i386/i386.cc (ix86_vector_costs::add_stmt_cost): Also
recognize SLP emulated gather/scatter.

Saurabh Jha [Mon, 30 Sep 2024 14:38:32 +0000 (14:38 +0000)]

aarch64: Add codegen support for SVE2 faminmax

The AArch64 FEAT_FAMINMAX extension introduces instructions for
computing the floating point absolute maximum and minimum of the
two vectors element-wise.

This patch adds code generation for famax and famin in terms of existing
unspecs. With this patch:
1. famax can be expressed as taking the absolute values of the two
operands and then taking UNSPEC_COND_SMAX of the results.
2. famin can be expressed as taking the absolute values of the two
operands and then taking UNSPEC_COND_SMIN of the results.

This fusion of operators is only possible when the
-march=armv9-a+faminmax+sve flags are passed. We also need to pass the
-ffast-math flag; this is what enables the compiler to use
UNSPEC_COND_SMAX and UNSPEC_COND_SMIN.

This code generation is only available at -O2 or -O3, as that is when
auto-vectorization is enabled.
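
A scalar reference for what one lane of famax and famin computes, and
the kind of loop the fused pattern is aimed at, might look like the
sketch below.  The names are hypothetical, this is not the new test
code, and NaN and signed-zero handling are deliberately ignored, in
line with the -ffast-math requirement.

```c
/* Absolute value without libm (hypothetical helper).  */
static float
ex_absf (float v)
{
  return v < 0.0f ? -v : v;
}

/* One lane of famax: the larger of |a| and |b|.  */
float
ex_famax_ref (float a, float b)
{
  float aa = ex_absf (a), ab = ex_absf (b);
  return aa > ab ? aa : ab;
}

/* One lane of famin: the smaller of |a| and |b|.  */
float
ex_famin_ref (float a, float b)
{
  float aa = ex_absf (a), ab = ex_absf (b);
  return aa < ab ? aa : ab;
}

/* Element-wise loop of the shape an auto-vectorizer could turn into
   an absolute-value step followed by a max step.  */
void
ex_famax_loop (float *restrict out, const float *a, const float *b, int n)
{
  for (int i = 0; i < n; i++)
    out[i] = ex_famax_ref (a[i], b[i]);
}
```

Note that the absolute values are taken before the comparison:
ex_famax_ref (-3.0f, 2.0f) is 3.0f, whereas taking the maximum first
and the absolute value afterwards would give 2.0f.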

gcc/ChangeLog:

* config/aarch64/aarch64-sve2.md
(*aarch64_pred_faminmax_fused): Instruction pattern for faminmax
codegen.
* config/aarch64/iterators.md: Iterator and attribute for
faminmax codegen.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve/faminmax_1.c: New test.
* gcc.target/aarch64/sve/faminmax_2.c: New test.

Saurabh Jha [Wed, 25 Sep 2024 22:08:33 +0000 (22:08 +0000)]

aarch64: Add SVE2 faminmax intrinsics

The AArch64 FEAT_FAMINMAX extension introduces instructions for
computing the floating point absolute maximum and minimum of the
two vectors element-wise.

This patch introduces SVE2 faminmax intrinsics. The intrinsics of this
extension are implemented as the following builtin functions:
* sva[max|min]_[m|x|z]
* sva[max|min]_[f16|f32|f64]_[m|x|z]
* sva[max|min]_n_[f16|f32|f64]_[m|x|z]

gcc/ChangeLog:

* config/aarch64/aarch64-sve-builtins-base.cc
(svamax): Absolute maximum declaration.
(svamin): Absolute minimum declaration.
* config/aarch64/aarch64-sve-builtins-base.def
(REQUIRED_EXTENSIONS): Add faminmax intrinsics behind a flag.
(svamax): Absolute maximum declaration.
(svamin): Absolute minimum declaration.
* config/aarch64/aarch64-sve-builtins-base.h: Declaring function
bases for the new intrinsics.
* config/aarch64/aarch64.h
(TARGET_SVE_FAMINMAX): New flag for SVE2 faminmax.
* config/aarch64/iterators.md: New unspecs, iterators, and attrs
for the new intrinsics.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/sve2/acle/asm/amax_f16.c: New test.
* gcc.target/aarch64/sve2/acle/asm/amax_f32.c: New test.
* gcc.target/aarch64/sve2/acle/asm/amax_f64.c: New test.
* gcc.target/aarch64/sve2/acle/asm/amin_f16.c: New test.
* gcc.target/aarch64/sve2/acle/asm/amin_f32.c: New test.
* gcc.target/aarch64/sve2/acle/asm/amin_f64.c: New test.

Richard Biener [Fri, 11 Oct 2024 09:46:45 +0000 (11:46 +0200)]

middle-end/117086 - fixup vec_cond simplifications

The following adds missing checks for a vector result type
to simplifications that end up creating a vec_cond.

PR middle-end/117086
* match.pd ((op (vec_cond ...) ..) -> (vec_cond ...)): Add
missing checks for VECTOR_TYPE_P (type).

* gcc.dg/torture/pr117086.c: New testcase.

Pan Li [Thu, 10 Oct 2024 08:24:08 +0000 (16:24 +0800)]

RISC-V: Add testcases for form 8 of scalar signed SAT_TRUNC

Form 8:
  #define DEF_SAT_S_TRUNC_FMT_8(NT, WT, NT_MIN, NT_MAX) \
  NT __attribute__((noinline))                          \
  sat_s_trunc_##WT##_to_##NT##_fmt_8 (WT x)             \
  {                                                     \
    NT trunc = (NT)x;                                   \
    return (WT)NT_MIN > x || x >= (WT)NT_MAX            \
      ? x < 0 ? NT_MIN : NT_MAX                         \
      : trunc;                                          \
  }
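
The scalar signed SAT_TRUNC forms 5 to 8 differ only in whether each
bound comparison is strict.  The sketch below instantiates the form 5
and form 8 macros for int16_t to int8_t and counts inputs on which they
disagree; the helper ex_count_form_mismatches is hypothetical and is
not part of the test suite.

```c
#include <stdint.h>

#define DEF_SAT_S_TRUNC_FMT_5(NT, WT, NT_MIN, NT_MAX) \
  NT __attribute__((noinline))                          \
  sat_s_trunc_##WT##_to_##NT##_fmt_5 (WT x)             \
  {                                                     \
    NT trunc = (NT)x;                                   \
    return (WT)NT_MIN > x || x > (WT)NT_MAX             \
      ? x < 0 ? NT_MIN : NT_MAX                         \
      : trunc;                                          \
  }

#define DEF_SAT_S_TRUNC_FMT_8(NT, WT, NT_MIN, NT_MAX) \
  NT __attribute__((noinline))                          \
  sat_s_trunc_##WT##_to_##NT##_fmt_8 (WT x)             \
  {                                                     \
    NT trunc = (NT)x;                                   \
    return (WT)NT_MIN > x || x >= (WT)NT_MAX            \
      ? x < 0 ? NT_MIN : NT_MAX                         \
      : trunc;                                          \
  }

DEF_SAT_S_TRUNC_FMT_5(int8_t, int16_t, INT8_MIN, INT8_MAX)
DEF_SAT_S_TRUNC_FMT_8(int8_t, int16_t, INT8_MIN, INT8_MAX)

/* Count int16_t inputs on which the two forms differ
   (hypothetical helper).  */
int
ex_count_form_mismatches (void)
{
  int mismatches = 0;
  for (int v = INT16_MIN; v <= INT16_MAX; v++)
    if (sat_s_trunc_int16_t_to_int8_t_fmt_5 ((int16_t) v)
        != sat_s_trunc_int16_t_to_int8_t_fmt_8 ((int16_t) v))
      mismatches++;
  return mismatches;
}
```

The forms agree because at x == NT_MAX (and likewise x == NT_MIN) the
truncated value already equals the saturated value, so it does not
matter which arm of the conditional handles the boundary.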

The following test passes with this patch:
* The rv64gcv full regression test.

This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_s_trunc-8-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-8-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-8-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-8-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-8-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-8-i64-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-8-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-8-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-8-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-8-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-8-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-run-8-i64-to-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Thu, 10 Oct 2024 08:08:40 +0000 (16:08 +0800)]

RISC-V: Add testcases for form 7 of scalar signed SAT_TRUNC

Form 7:
  #define DEF_SAT_S_TRUNC_FMT_7(NT, WT, NT_MIN, NT_MAX) \
  NT __attribute__((noinline))                          \
  sat_s_trunc_##WT##_to_##NT##_fmt_7 (WT x)             \
  {                                                     \
    NT trunc = (NT)x;                                   \
    return (WT)NT_MIN >= x || x >= (WT)NT_MAX           \
      ? x < 0 ? NT_MIN : NT_MAX                         \
      : trunc;                                          \
  }

The following test passes with this patch:
* The rv64gcv full regression test.

This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_s_trunc-7-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-7-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-7-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-7-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-7-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-7-i64-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-7-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-7-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-7-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-7-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-7-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-run-7-i64-to-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Thu, 10 Oct 2024 07:53:45 +0000 (15:53 +0800)]

RISC-V: Add testcases for form 6 of scalar signed SAT_TRUNC

Form 6:
  #define DEF_SAT_S_TRUNC_FMT_6(NT, WT, NT_MIN, NT_MAX) \
  NT __attribute__((noinline))                          \
  sat_s_trunc_##WT##_to_##NT##_fmt_6 (WT x)             \
  {                                                     \
    NT trunc = (NT)x;                                   \
    return (WT)NT_MIN >= x || x > (WT)NT_MAX            \
      ? x < 0 ? NT_MIN : NT_MAX                         \
      : trunc;                                          \
  }

The following test passes with this patch:
* The rv64gcv full regression test.

This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_s_trunc-6-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-6-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-6-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-6-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-6-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-6-i64-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-6-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-6-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-6-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-6-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-6-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-run-6-i64-to-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Thu, 10 Oct 2024 07:35:33 +0000 (15:35 +0800)]

RISC-V: Add testcases for form 5 of scalar signed SAT_TRUNC

Form 5:
  #define DEF_SAT_S_TRUNC_FMT_5(NT, WT, NT_MIN, NT_MAX) \
  NT __attribute__((noinline))                          \
  sat_s_trunc_##WT##_to_##NT##_fmt_5 (WT x)             \
  {                                                     \
    NT trunc = (NT)x;                                   \
    return (WT)NT_MIN > x || x > (WT)NT_MAX             \
      ? x < 0 ? NT_MIN : NT_MAX                         \
      : trunc;                                          \
  }

The following test passes with this patch:
* The rv64gcv full regression test.

This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_s_trunc-5-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-5-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-5-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-5-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-5-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-5-i64-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-5-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-5-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-5-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-5-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-5-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-run-5-i64-to-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Thu, 10 Oct 2024 06:52:04 +0000 (14:52 +0800)]

RISC-V: Add testcases for form 4 of scalar signed SAT_TRUNC

Form 4:
  #define DEF_SAT_S_TRUNC_FMT_4(NT, WT, NT_MIN, NT_MAX) \
  NT __attribute__((noinline))                          \
  sat_s_trunc_##WT##_to_##NT##_fmt_4 (WT x)             \
  {                                                     \
    NT trunc = (NT)x;                                   \
    return (WT)NT_MIN <= x && x < (WT)NT_MAX            \
      ? trunc                                           \
      : x < 0 ? NT_MIN : NT_MAX;                        \
  }

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_s_trunc-4-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-4-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-4-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-4-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-4-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-4-i64-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-4-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-4-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-4-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-4-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-4-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-run-4-i64-to-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Thu, 10 Oct 2024 06:47:34 +0000 (14:47 +0800)]

Match: Support form 4 for scalar signed integer SAT_TRUNC

This patch adds support for form 4 of the scalar signed integer
SAT_TRUNC, as in the example below:

Form 4:
  #define DEF_SAT_S_TRUNC_FMT_4(NT, WT, NT_MIN, NT_MAX) \
  NT __attribute__((noinline))                          \
  sat_s_trunc_##WT##_to_##NT##_fmt_4 (WT x)             \
  {                                                     \
    NT trunc = (NT)x;                                   \
    return (WT)NT_MIN <= x && x < (WT)NT_MAX            \
      ? trunc                                           \
      : x < 0 ? NT_MIN : NT_MAX;                        \
  }

DEF_SAT_S_TRUNC_FMT_4(int8_t, int16_t, INT8_MIN, INT8_MAX)

Before this patch:
  __attribute__((noinline))
  int8_t sat_s_trunc_int16_t_to_int8_t_fmt_4 (int16_t x)
  {
    int8_t trunc;
    unsigned short x.0_1;
    unsigned short _2;
    int8_t _3;
    _Bool _7;
    signed char _8;
    signed char _9;
    signed char _10;

  ;;   basic block 2, loop depth 0
  ;;    pred:       ENTRY
    x.0_1 = (unsigned short) x_4(D);
    _2 = x.0_1 + 128;
    if (_2 > 254)
      goto <bb 4>; [50.00%]
    else
      goto <bb 3>; [50.00%]
  ;;    succ:       4
  ;;                3

  ;;   basic block 3, loop depth 0
  ;;    pred:       2
    trunc_5 = (int8_t) x_4(D);
    goto <bb 5>; [100.00%]
  ;;    succ:       5

  ;;   basic block 4, loop depth 0
  ;;    pred:       2
    _7 = x_4(D) < 0;
    _8 = (signed char) _7;
    _9 = -_8;
    _10 = _9 ^ 127;
  ;;    succ:       5

  ;;   basic block 5, loop depth 0
  ;;    pred:       3
  ;;                4
    # _3 = PHI <trunc_5(3), _10(4)>
    return _3;
  ;;    succ:       EXIT

  }

After this patch:
  __attribute__((noinline))
  int8_t sat_s_trunc_int16_t_to_int8_t_fmt_4 (int16_t x)
  {
    int8_t _3;

  ;;   basic block 2, loop depth 0
  ;;    pred:       ENTRY
    _3 = .SAT_TRUNC (x_4(D)); [tail call]
    return _3;
  ;;    succ:       EXIT

  }
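
The "Before this patch" dump for form 4 uses two small tricks that can
be spelled out in C: a biased unsigned compare for the range test and a
branchless choice between INT8_MIN and INT8_MAX.  The sketch below is a
reconstruction from the dump for the int16_t to int8_t case; all names
are hypothetical.

```c
#include <stdint.h>

/* Range test from basic block 2: x lies in [-128, 126] exactly when
   (unsigned short) x + 128 does not exceed 254.  */
int
ex_in_range_fmt_4 (int16_t x)
{
  uint16_t biased = (uint16_t) ((uint16_t) x + 128u);
  return biased <= 254;
}

/* Basic block 4: -(x < 0) is -1 for negative x and 0 otherwise, so
   XOR with 127 yields -128 or 127 without a branch.  */
int8_t
ex_sat_value (int16_t x)
{
  return (int8_t) (-(x < 0) ^ 127);
}

/* The whole "before" function: truncate in range, saturate outside.  */
int8_t
ex_sat_trunc_before (int16_t x)
{
  return ex_in_range_fmt_4 (x) ? (int8_t) x : ex_sat_value (x);
}

/* Plain clamp used as the reference semantics of SAT_TRUNC.  */
int8_t
ex_clamp_i16_to_i8 (int16_t x)
{
  return x < INT8_MIN ? INT8_MIN : x > INT8_MAX ? INT8_MAX : (int8_t) x;
}

/* Count int16_t inputs where the two disagree (hypothetical helper).  */
int
ex_count_fmt_4_mismatches (void)
{
  int mismatches = 0;
  for (int v = INT16_MIN; v <= INT16_MAX; v++)
    if (ex_sat_trunc_before ((int16_t) v) != ex_clamp_i16_to_i8 ((int16_t) v))
      mismatches++;
  return mismatches;
}
```

The input 127 falls outside the tested range and takes the saturating
path, but the result is still 127, so the function matches the clamp on
every input.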

The following test suites pass with this patch:
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:

* match.pd: Add case 4 matching pattern for signed SAT_TRUNC.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Wed, 9 Oct 2024 14:37:00 +0000 (22:37 +0800)]

RISC-V: Add testcases for form 3 of scalar signed SAT_TRUNC

Form 3:
  #define DEF_SAT_S_TRUNC_FMT_3(NT, WT, NT_MIN, NT_MAX) \
  NT __attribute__((noinline))                          \
  sat_s_trunc_##WT##_to_##NT##_fmt_3 (WT x)             \
  {                                                     \
    NT trunc = (NT)x;                                   \
    return (WT)NT_MIN < x && x <= (WT)NT_MAX            \
      ? trunc                                           \
      : x < 0 ? NT_MIN : NT_MAX;                        \
  }

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_s_trunc-3-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-3-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-3-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-3-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-3-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-3-i64-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-3-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-3-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-3-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-3-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-3-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-run-3-i64-to-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Wed, 9 Oct 2024 14:33:10 +0000 (22:33 +0800)]

Match: Support form 3 for scalar signed integer SAT_TRUNC

This patch adds support for form 3 of the scalar signed integer
SAT_TRUNC, as in the example below:

Form 3:
  #define DEF_SAT_S_TRUNC_FMT_3(NT, WT, NT_MIN, NT_MAX) \
  NT __attribute__((noinline))                          \
  sat_s_trunc_##WT##_to_##NT##_fmt_3 (WT x)             \
  {                                                     \
    NT trunc = (NT)x;                                   \
    return (WT)NT_MIN < x && x <= (WT)NT_MAX            \
      ? trunc                                           \
      : x < 0 ? NT_MIN : NT_MAX;                        \
  }

DEF_SAT_S_TRUNC_FMT_3(int8_t, int16_t, INT8_MIN, INT8_MAX)

Before this patch:
  __attribute__((noinline))
  int8_t sat_s_sub_int8_t_fmt_3 (int8_t x, int8_t y)
  {
    signed char _1;
    signed char _2;
    int8_t _3;
    __complex__ signed char _6;
    _Bool _8;
    signed char _9;
    signed char _10;
    signed char _11;

  ;;   basic block 2, loop depth 0
  ;;    pred:       ENTRY
    _6 = .SUB_OVERFLOW (x_4(D), y_5(D));
    _2 = IMAGPART_EXPR <_6>;
    if (_2 != 0)
      goto <bb 4>; [50.00%]
    else
      goto <bb 3>; [50.00%]
  ;;    succ:       4
  ;;                3

  ;;   basic block 3, loop depth 0
  ;;    pred:       2
    _1 = REALPART_EXPR <_6>;
    goto <bb 5>; [100.00%]
  ;;    succ:       5

  ;;   basic block 4, loop depth 0
  ;;    pred:       2
    _8 = x_4(D) < 0;
    _9 = (signed char) _8;
    _10 = -_9;
    _11 = _10 ^ 127;
  ;;    succ:       5

  ;;   basic block 5, loop depth 0
  ;;    pred:       3
  ;;                4
    # _3 = PHI <_1(3), _11(4)>
    return _3;
  ;;    succ:       EXIT

  }

After this patch:
  __attribute__((noinline))
  int8_t sat_s_trunc_int16_t_to_int8_t_fmt_3 (int16_t x)
  {
    int8_t _3;

  ;;   basic block 2, loop depth 0
  ;;    pred:       ENTRY
    _3 = .SAT_TRUNC (x_4(D)); [tail call]
    return _3;
  ;;    succ:       EXIT

  }

The following test suites pass with this patch:
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:

* match.pd: Add case 3 matching pattern for signed SAT_TRUNC.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Wed, 9 Oct 2024 02:33:31 +0000 (10:33 +0800)]

RISC-V: Add testcases for form 2 of scalar signed SAT_TRUNC

Form 2:
  #define DEF_SAT_S_TRUNC_FMT_2(NT, WT, NT_MIN, NT_MAX) \
  NT __attribute__((noinline))                          \
  sat_s_trunc_##WT##_to_##NT##_fmt_2 (WT x)             \
  {                                                     \
    NT trunc = (NT)x;                                   \
    return (WT)NT_MIN < x && x < (WT)NT_MAX             \
      ? trunc                                           \
      : x < 0 ? NT_MIN : NT_MAX;                        \
  }

The following test passes with this patch:
* The rv64gcv full regression test.

This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/sat_arith.h: Add test helper macros.
* gcc.target/riscv/sat_s_trunc-2-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-2-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-2-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-2-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-2-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-2-i64-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-2-i16-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-2-i32-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-2-i32-to-i8.c: New test.
* gcc.target/riscv/sat_s_trunc-run-2-i64-to-i16.c: New test.
* gcc.target/riscv/sat_s_trunc-run-2-i64-to-i32.c: New test.
* gcc.target/riscv/sat_s_trunc-run-2-i64-to-i8.c: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>

Pan Li [Wed, 9 Oct 2024 02:28:55 +0000 (10:28 +0800)]

Match: Support form 2 for scalar signed integer SAT_TRUNC

This patch adds support for form 2 of the scalar signed integer
SAT_TRUNC, as in the example below:

Form 2:
  #define DEF_SAT_S_TRUNC_FMT_2(NT, WT, NT_MIN, NT_MAX) \
  NT __attribute__((noinline))                          \
  sat_s_trunc_##WT##_to_##NT##_fmt_2 (WT x)             \
  {                                                     \
    NT trunc = (NT)x;                                   \
    return (WT)NT_MIN < x && x < (WT)NT_MAX             \
      ? trunc                                           \
      : x < 0 ? NT_MIN : NT_MAX;                        \
  }

DEF_SAT_S_TRUNC_FMT_2(int8_t, int16_t, INT8_MIN, INT8_MAX)
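
As a quick sanity check of what form 2 computes, the sketch below instantiates the macro exactly as above and compares it against a plain clamp over every int16_t input. The helper names clamp_ref and sat_trunc_mismatches are hypothetical and not part of the patch or its testsuite.

```c
#include <stdint.h>

/* Form 2, reproduced from the commit message. */
#define DEF_SAT_S_TRUNC_FMT_2(NT, WT, NT_MIN, NT_MAX) \
NT __attribute__((noinline))                          \
sat_s_trunc_##WT##_to_##NT##_fmt_2 (WT x)             \
{                                                     \
  NT trunc = (NT)x;                                   \
  return (WT)NT_MIN < x && x < (WT)NT_MAX             \
    ? trunc                                           \
    : x < 0 ? NT_MIN : NT_MAX;                        \
}

DEF_SAT_S_TRUNC_FMT_2(int8_t, int16_t, INT8_MIN, INT8_MAX)

/* Hypothetical reference: saturate with explicit comparisons.  */
static int8_t
clamp_ref (int16_t x)
{
  if (x < INT8_MIN)
    return INT8_MIN;
  if (x > INT8_MAX)
    return INT8_MAX;
  return (int8_t) x;
}

/* Count inputs where form 2 and the reference clamp disagree.  */
int
sat_trunc_mismatches (void)
{
  int bad = 0;
  for (int v = INT16_MIN; v <= INT16_MAX; v++)
    bad += sat_s_trunc_int16_t_to_int8_t_fmt_2 ((int16_t) v)
	   != clamp_ref ((int16_t) v);
  return bad;
}
```

The strict comparisons in form 2 send x == NT_MIN and x == NT_MAX down the saturating arm, which still yields the same value as a plain truncation.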

Before this patch:
  __attribute__((noinline))
  int8_t sat_s_trunc_int16_t_to_int8_t_fmt_2 (int16_t x)
  {
    int8_t trunc;
    unsigned short x.0_1;
    unsigned short _2;
    int8_t _3;
    _Bool _7;
    signed char _8;
    signed char _9;
    signed char _10;

  ;;   basic block 2, loop depth 0
  ;;    pred:       ENTRY
    x.0_1 = (unsigned short) x_4(D);
    _2 = x.0_1 + 127;
    if (_2 > 253)
      goto <bb 4>; [50.00%]
    else
      goto <bb 3>; [50.00%]
  ;;    succ:       4
  ;;                3

  ;;   basic block 3, loop depth 0
  ;;    pred:       2
    trunc_5 = (int8_t) x_4(D);
    goto <bb 5>; [100.00%]
  ;;    succ:       5

  ;;   basic block 4, loop depth 0
  ;;    pred:       2
    _7 = x_4(D) < 0;
    _8 = (signed char) _7;
    _9 = -_8;
    _10 = _9 ^ 127;
  ;;    succ:       5

  ;;   basic block 5, loop depth 0
  ;;    pred:       3
  ;;                4
    # _3 = PHI <trunc_5(3), _10(4)>
    return _3;
  ;;    succ:       EXIT

  }
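
The dump before the patch folds the two-sided range test into one unsigned comparison and builds the saturated value from the sign of x. The sketch below is a hand translation of that GIMPLE into C, written only for illustration; the function names are hypothetical.

```c
#include <stdint.h>

/* Hand-written C equivalent of the pre-patch GIMPLE.  */
int8_t
sat_trunc_like_gimple (int16_t x)
{
  /* _2 = (unsigned short) x + 127, wrapping modulo 65536.  */
  uint16_t biased = (uint16_t) ((uint16_t) x + 127u);
  if (biased > 253u)
    {
      /* _10 = -(x < 0) ^ 127: gives -128 for negative x, 127 otherwise.  */
      int8_t is_neg = (int8_t) (x < 0);
      return (int8_t) (-is_neg ^ 127);
    }
  return (int8_t) x;
}

/* Count inputs where the biased unsigned test disagrees with the
   source-level test INT8_MIN < x && x < INT8_MAX.  */
int
range_check_mismatches (void)
{
  int bad = 0;
  for (int v = INT16_MIN; v <= INT16_MAX; v++)
    {
      int direct = (INT8_MIN < v && v < INT8_MAX);
      int biased = ((uint16_t) ((uint16_t) v + 127u) <= 253u);
      bad += (direct != biased);
    }
  return bad;
}
```

Adding 127 maps the open interval (-128, 127) onto 0..253, so a single unsigned compare against 253 covers both bounds.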

After this patch:
  __attribute__((noinline))
  int8_t sat_s_trunc_int16_t_to_int8_t_fmt_2 (int16_t x)
  {
    int8_t _3;

  ;;   basic block 2, loop depth 0
  ;;    pred:       ENTRY
    _3 = .SAT_TRUNC (x_4(D)); [tail call]
    return _3;
  ;;    succ:       EXIT

  }

The test suites below passed for this patch.
* The rv64gcv full regression test.
* The x86 bootstrap test.
* The x86 full regression test.

gcc/ChangeLog:

* match.pd: Add case 2 matching pattern for signed SAT_TRUNC.

Signed-off-by: Pan Li <pan2.li@intel.com>

commit | commitdiff | tree

Jakub Jelinek [Fri, 11 Oct 2024 09:41:53 +0000 (11:41 +0200)]

i386: Fix up spaceship expanders for -mtune=i[45]86 [PR117053]

The adjusted and new spaceship expanders ICE with -mtune=i486 or
-mtune=i586.
The problem is that in that case TARGET_ZERO_EXTEND_WITH_AND is true
and zero_extendqisi2 isn't allowed in that case, and we can't use
the replacement AND, because that clobbers flags and we want to use them
again.

The following patch fixes that by using in those cases roughly what
we want to expand it to after peephole2 optimizations, i.e. xor
before the comparison, *setcc_qi_slp and sbbl $0 (or for signed
int case xoring of 2 regs, two *setcc_qi_slp, subl).
For *setcc_qi_slp, it uses the setcc_si_slp hacks with UNSPEC that
were in use for the floating point jp case (so such code is IMHO
undesirable for the !TARGET_ZERO_EXTEND_WITH_AND case as we want to
give combiner more liberty in that case).
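
For orientation only, the signed-int sequence described above (two setcc results combined with a subtract) corresponds roughly to the usual branchless three-way comparison. This is a sketch of that idiom, not code from the patch, and the function name is hypothetical.

```c
/* Branchless three-way comparison of two signed ints.
   Returns -1, 0 or 1, like operator<=> mapped to an integer.  */
int
spaceship_int (int a, int b)
{
  int gt = a > b;	/* would be one setcc */
  int lt = a < b;	/* would be the other setcc */
  return gt - lt;	/* would be the subl */
}
```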

2024-10-11 Jakub Jelinek <jakub@redhat.com>

PR target/117053
* config/i386/i386-expand.cc (ix86_expand_fp_spaceship): Handle
TARGET_ZERO_EXTEND_WITH_AND differently.
(ix86_expand_int_spaceship): Likewise.

* g++.target/i386/pr116896-3.C: New test.

commit | commitdiff | tree

Richard Biener [Thu, 10 Oct 2024 09:02:47 +0000 (11:02 +0200)]

tree-optimization/117050 - fix ICE with non-grouped .MASK_LOAD SLP

The following temporarily reverts the support of permuted .MASK_LOAD for the
case of non-grouped accesses.

PR tree-optimization/117050
* tree-vect-slp.cc (vect_build_slp_tree_2): Do not support
permutes of non-grouped .MASK_LOAD.

* gcc.dg/vect/pr117050.c: New testcase.

commit | commitdiff | tree

Jonathan Wakely [Wed, 9 Oct 2024 13:24:19 +0000 (14:24 +0100)]

libstdc++: Fix some test failures with -fno-char8_t

libstdc++-v3/ChangeLog:

* testsuite/20_util/duration/io.cc [!__cpp_lib_char8_t]: Define
char8_t as a typedef for unsigned char.
* testsuite/std/format/parse_ctx_neg.cc: Skip for -fno-char8_t.

commit | commitdiff | tree

Richard Biener [Thu, 10 Oct 2024 12:00:11 +0000 (14:00 +0200)]

Fix possible wrong-code with masked store-lanes

When we're doing masked store-lanes one mask element applies to all
loads of one struct element.  This requires uniform masks for all
of the SLP lanes, something we already compute into STMT_VINFO_SLP_VECT_ONLY
but fail to check when doing SLP store-lanes.  The following corrects
this.  The following also adjusts the store-lane heuristic to properly
check for masked or non-masked optab support.

* tree-vect-slp.cc (vect_slp_prefer_store_lanes_p): Allow
passing in of vectype, pass in whether the stores are masked
and query the correct optab.
(vect_build_slp_instance): Guard store-lanes query with
! STMT_VINFO_SLP_VECT_ONLY, guaranteeing a uniform mask.

commit | commitdiff | tree

Hu, Lin1 [Wed, 9 Oct 2024 02:20:05 +0000 (10:20 +0800)]

i386: Fix some patterns' mem attribute.

Hi, all

This is another patch that changes some patterns' type attr from ssemov to
ssemov2.

The mem attr of some ssemov patterns should be load when their operand 2 is a
memory operand.

Bootstrapped and regtested on x86-64-linux-pc, OK for trunk?

BRs,
Lin

gcc/ChangeLog:

* config/i386/sse.md
(sse_movhlps): Change type attr from ssemov to ssemov2.
(sse_loadhps): Ditto.
(*vec_concat<mode>): Ditto.
(vec_setv2df_0): Ditto.
(sse_loadlps): Change attr from ssemov to ssemov2 except for 2, 3.
(sse2_loadhps): Change attr from ssemov to ssemov2 except for 0, 1.
(sse2_loadlpd): Change attr from ssemov to ssemov2 except for 0, 1,
2.
(sse2_movsd_<mode>): Change attr from ssemov to ssemov2 except for 5.
(vec_concatv2df): Change attr from ssemov to ssemov2 except for 0, 1,
2.
(*vec_concat<mode>): Change attr from ssemov to ssemov2 for 3, 4.
(vec_concatv2di): Change attr from ssemov to ssemov2 except for 0, 1,
2, 3, 4, 5.

commit | commitdiff | tree

GCC Administrator [Fri, 11 Oct 2024 00:17:48 +0000 (00:17 +0000)]

Daily bump.

commit | commitdiff | tree

Richard Ball [Thu, 10 Oct 2024 18:16:39 +0000 (19:16 +0100)]

aarch64: Alter pr116258.c test to correct for big endian.

The pr116258.c test fails on big-endian targets because it checks
that the index of a floating-point multiply is 0, which is correct
only for little endian.

gcc/testsuite/ChangeLog:

PR tree-optimization/116258
* gcc.target/aarch64/pr116258.c:
Alter test to add big-endian support.

commit | commitdiff | tree

Michael Matz [Thu, 10 Oct 2024 14:36:51 +0000 (16:36 +0200)]

Fix PR116650: check all regs in regrename targets

(this came up for m68k vs. LRA, but is a generic problem)

Regrename wants to use new registers for certain def-use chains.
For validity of replacements it needs to check that the selected
candidates are unused up to then.  That's done in check_new_reg_p.
But if it so happens that the new register needs more hardregs
than the old register (which happens if the target allows inter-bank
moves and the mode is something like a DFmode that needs to be placed
into a SImode reg-pair), then check_new_reg_p only checks the
first of those registers for free-ness.

This is caused by that function looking up the number of necessary
hardregs only in terms of the old hardreg number.  It of course needs
to do that in terms of the new candidate regnumber.  The symptom is that
regrename sometimes clobbers the higher numbered registers of such a
regrename target pair.  This patch fixes that problem.
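
The failure mode can be shown with a toy model. Everything in the sketch below is hypothetical (the register layout, the helper names, the in_use array) and is not the regrename.cc code; it only illustrates why the register count must be computed for the new candidate.

```c
#include <stdbool.h>

enum { NUM_REGS = 16, FIRST_NARROW_REG = 8 };

/* Hypothetical target: regs 0-7 are 8 bytes wide, regs 8-15 are 4 bytes
   wide, so an 8-byte value needs one wide reg or a pair of narrow ones.  */
static int
nregs_model (int regno, int mode_bytes)
{
  int reg_bytes = regno < FIRST_NARROW_REG ? 8 : 4;
  return (mode_bytes + reg_bytes - 1) / reg_bytes;
}

/* Old behaviour: register count taken from the OLD register.  */
bool
candidate_free_buggy (const bool in_use[NUM_REGS], int old_reg,
		      int new_reg, int mode_bytes)
{
  int nregs = nregs_model (old_reg, mode_bytes);
  for (int i = nregs - 1; i >= 0; --i)
    if (new_reg + i >= NUM_REGS || in_use[new_reg + i])
      return false;
  return true;
}

/* Fixed behaviour: register count taken from the NEW candidate.  */
bool
candidate_free_fixed (const bool in_use[NUM_REGS], int old_reg,
		      int new_reg, int mode_bytes)
{
  (void) old_reg;
  int nregs = nregs_model (new_reg, mode_bytes);
  for (int i = nregs - 1; i >= 0; --i)
    if (new_reg + i >= NUM_REGS || in_use[new_reg + i])
      return false;
  return true;
}

/* Scenario: an 8-byte value in wide reg 0 is renamed to narrow pair
   8/9 while reg 9 is still live.  */
bool
demo_accepts (bool use_fixed)
{
  bool in_use[NUM_REGS] = { false };
  in_use[9] = true;
  return use_fixed ? candidate_free_fixed (in_use, 0, 8, 8)
		   : candidate_free_buggy (in_use, 0, 8, 8);
}
```

In this model the buggy check inspects only register 8 and accepts the candidate, while the fixed check also inspects register 9 and rejects it.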

(In the particular case of the bug report it was LRA that left over an
inter-bank move instruction that triggers regrename, ultimately causing
the mis-compile.  Reload didn't do that, but in general we of course
can't rely on such moves not happening if the target allows them.)

This also shows a general confusion in that function and the target hook
interface here:

  for (i = nregs - 1; i >= 0; --i)
    ...
    || ! HARD_REGNO_RENAME_OK (reg + i, new_reg + i))

it uses nregs in a way that requires it to be the same between old and
new register.  The problem is that the target hook only gets register
numbers, when it instead should get a mode and register numbers and
would be called only for the first but not for subsequent registers.
I've looked at a number of definitions of that target hook and I think
that this is currently harmless in the sense that it would merely rule
out some potential reg-renames that would in fact be okay to do.  So I'm
not changing the target hook interface here and hence that problem
remains unfixed.

PR rtl-optimization/116650
* regrename.cc (check_new_reg_p): Calculate nregs in terms of
the new candidate register.

commit | commitdiff | tree

Andrew Pinski [Thu, 10 Oct 2024 04:44:23 +0000 (04:44 +0000)]

phiopt: Remove candorest variable, return instead

After r15-3560-gb081e6c860eb9688d24365d39, the setting of candorest
with the break can just change to a return since this is inside a lambda now.

Bootstrapped and tested on x86_64-linux-gnu.

gcc/ChangeLog:

* tree-ssa-phiopt.cc (pass_phiopt::execute): Remove candorest
and return instead of setting candorest.

Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>

commit | commitdiff | tree

Li Xu [Thu, 10 Oct 2024 14:51:19 +0000 (08:51 -0600)]

RISC-V: Bugfix for C++ code compilation failure with rv32imafc_zve32f [PR116883]

From: xuli <xuli1@eswincomputing.com>

Example as follows:

int main()
{
  unsigned long arraya[128], arrayb[128], arrayc[128];
  for (int i = 0; i < 128; i++)
   {
      arraya[i] = arrayb[i] + arrayc[i];
   }
  return 0;
}

Compiled with -march=rv32imafc_zve32f -mabi=ilp32f, it will cause a compilation issue:

riscv_vector.h:40:25: error: ambiguating new declaration of 'vint64m4_t __riscv_vle64(vbool16_t, const long long int*, unsigned int)'
   40 | #pragma riscv intrinsic "vector"
      |                         ^~~~~~~~
riscv_vector.h:40:25: note: old declaration 'vint64m1_t __riscv_vle64(vbool64_t, const long long int*, unsigned int)'

With zvl=32b, vbool16_t is registered in init_builtins() with
type_common.precision=0x101 (nunits=2), mode_nunits[E_RVVMF16BI]=[2,2].

Normally, vbool64_t is only valid when TARGET_MIN_VLEN > 32, so vbool64_t
is not registered in init_builtins(), meaning vbool64_t=null.

In order to implement __attribute__((target("arch=+v"))), we must register
all vector types and all RVV intrinsics. Therefore, vbool64_t will be registered
by default with zvl=128b in reinit_builtins(), resulting in
type_common.precision=0x101 (nunits=2) and mode_nunits[E_RVVMF64BI]=[2,2].

We then get TYPE_VECTOR_SUBPARTS(vbool16_t) == TYPE_VECTOR_SUBPARTS(vbool64_t),
calculated using type_common.precision, resulting in 2. Since vbool16_t and
vbool64_t have the same element type (boolean_type), the compiler treats them
as the same type, leading to a re-declaration conflict.

After all types and intrinsics have been registered, processing
__attribute__((target("arch=+v"))) will update the parameters option and
init_adjust_machine_modes. Therefore, to avoid conflicts, we can choose
zvl=4096b for the null type in reinit_builtins().

command option zvl=32b
  type         nunits
  vbool64_t => null
  vbool32_t => [1,1]
  vbool16_t => [2,2]
  vbool8_t  => [4,4]
  vbool4_t  => [8,8]
  vbool2_t  => [16,16]
  vbool1_t  => [32,32]

reinit zvl=128b
  vbool64_t => [2,2] conflict with zvl32b vbool16_t => [2,2]
reinit zvl=256b
  vbool64_t => [4,4] conflict with zvl32b vbool8_t => [4,4]
reinit zvl=512b
  vbool64_t => [8,8] conflict with zvl32b vbool4_t => [8,8]
reinit zvl=1024b
  vbool64_t => [16,16] conflict with zvl32b vbool2_t => [16,16]
reinit zvl=2048b
  vbool64_t => [32,32] conflict with zvl32b vbool1_t => [32,32]
reinit zvl=4096b
  vbool64_t => [64,64] zvl=4096b is ok
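
The table can be reproduced with a small model. The relation nunits = zvl / N for vboolN_t is inferred from the numbers in the table, not taken from the RISC-V back end, and the function names are hypothetical.

```c
/* Inferred from the table: vboolN_t has zvl / N units.  */
static unsigned
vbool_nunits (unsigned zvl_bits, unsigned n)
{
  return zvl_bits / n;
}

/* Return the N of the zvl32b vboolN_t whose nunits equals that of
   vbool64_t at REINIT_ZVL_BITS, or 0 if there is no such type.  */
unsigned
vbool64_conflict_with_zvl32 (unsigned reinit_zvl_bits)
{
  unsigned target = vbool_nunits (reinit_zvl_bits, 64);
  for (unsigned n = 32; n >= 1; n /= 2)
    if (vbool_nunits (32, n) == target)
      return n;
  return 0;
}

/* Smallest power-of-two zvl, starting from the 128b default, whose
   vbool64_t does not collide with any zvl32b mask type.  */
unsigned
smallest_safe_reinit_zvl (void)
{
  for (unsigned zvl = 128; zvl <= 65536; zvl *= 2)
    if (vbool64_conflict_with_zvl32 (zvl) == 0)
      return zvl;
  return 0;
}
```

Under this model 4096b is the first value at which vbool64_t gets 64 units, one step past the 32 units of the zvl32b vbool1_t.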

Signed-off-by: xuli <xuli1@eswincomputing.com>
PR target/116883

gcc/ChangeLog:

* config/riscv/riscv-c.cc (riscv_pragma_intrinsic_flags_pollute): Choose zvl4096b
to initialize null type.

gcc/testsuite/ChangeLog:

* g++.target/riscv/rvv/base/pr116883.C: New test.

commit | commitdiff | tree

Richard Sandiford [Thu, 10 Oct 2024 14:15:26 +0000 (15:15 +0100)]

vect: Avoid divide by zero for permutes of extern VLA vectors

My recent VLA SLP patches caused a regression with cross compilers
in gcc.dg/torture/neon-sve-bridge.c.  There we have a VEC_PERM_EXPR
created from two BIT_FIELD_REFs, with the child node being an
external VLA vector:

note:   node 0x3704a70 (max_nunits=1, refcnt=2) vector(2) long int
note:   op: VEC_PERM_EXPR
note:          stmt 0 val1Return_9 = BIT_FIELD_REF <sveReturn_8, 64, 0>;
note:          stmt 1 val2Return_10 = BIT_FIELD_REF <sveReturn_8, 64, 64>;
note:          lane permutation { 0[0] 0[1] }
note:          children 0x3704b08
note:   node (external) 0x3704b08 (max_nunits=1, refcnt=1) svint64_t
note:          { }

For this kind of external node, the SLP_TREE_LANES is normally
the total number of lanes in the vector, but it is zero if the
vector has variable length:

      auto nunits = TYPE_VECTOR_SUBPARTS (SLP_TREE_VECTYPE (vnode));
      unsigned HOST_WIDE_INT const_nunits;
      if (nunits.is_constant (&const_nunits))
        SLP_TREE_LANES (vnode) = const_nunits;

This led to division by zero in:

      /* Check whether the output has N times as many lanes per vector.  */
      else if (constant_multiple_p (SLP_TREE_LANES (node) * op_nunits,
                                    SLP_TREE_LANES (child) * nunits,
                                    &this_unpack_factor)
               && (i == 0 || unpack_factor == this_unpack_factor))
        unpack_factor = this_unpack_factor;

No repetition takes place for this kind of external node, so this
patch goes with Richard's suggestion to check for external nodes
that have no scalar statements.

This didn't show up for my native testing since division by zero
doesn't trap on AArch64.

gcc/
* tree-vect-slp.cc (vectorizable_slp_permutation_1): Set repeating_p
to false if we have an external node for a pre-existing vector.

commit | commitdiff | tree

Simon Martin [Thu, 10 Oct 2024 13:29:32 +0000 (15:29 +0200)]

libiberty: Restore build with CP_DEMANGLE_DEBUG defined

cp-demangle.c does not build when CP_DEMANGLE_DEBUG is defined since
r13-2887-gb04208895fed34. This trivial patch fixes the issue.

libiberty/ChangeLog:

* cp-demangle.c (d_dump): Fix compilation when CP_DEMANGLE_DEBUG
is defined.

commit | commitdiff | tree

Richard Biener [Thu, 10 Oct 2024 12:15:13 +0000 (14:15 +0200)]

tree-optimization/117060 - fix oversight in vect_build_slp_tree_1

We are failing to match call vs. non-call when dealing with matching
loads or stores.

PR tree-optimization/117060
* tree-vect-slp.cc (vect_build_slp_tree_1): When comparing
calls also fail if the first isn't a call.

* gfortran.dg/pr117060.f90: New testcase.

commit | commitdiff | tree

Jennifer Schmitz [Thu, 3 Oct 2024 11:46:51 +0000 (04:46 -0700)]

match.pd: Check trunc_mod vector optab before folding.

This patch guards the simplification x / y * y == x -> x % y == 0 in
match.pd by a check for:
1) Non-vector mode of x OR
2) Lack of support for vector division OR
3) Support of vector modulo
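
For scalars the two forms are interchangeable. The sketch below checks that over a small range of operands, skipping y == 0; the function names are hypothetical and the check says nothing about the vector case that the guard is concerned with.

```c
#include <stdbool.h>

/* Left-hand side of the simplification.  */
bool
divides_via_divmul (int x, int y)
{
  return x / y * y == x;
}

/* Right-hand side of the simplification.  */
bool
divides_via_mod (int x, int y)
{
  return x % y == 0;
}

/* Count operand pairs on which the two forms disagree.  */
int
divisibility_mismatches (void)
{
  int bad = 0;
  for (int x = -200; x <= 200; x++)
    for (int y = -20; y <= 20; y++)
      {
	if (y == 0)
	  continue;
	bad += divides_via_divmul (x, y) != divides_via_mod (x, y);
      }
  return bad;
}
```

The guard matters because a target may provide a vector division but no vector modulo, in which case the folded form could be more expensive to expand.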

The patch was bootstrapped and tested with no regression on
aarch64-linux-gnu and x86_64-linux-gnu.
OK for mainline?

Signed-off-by: Jennifer Schmitz <jschmitz@nvidia.com>
gcc/
PR tree-optimization/116831
* match.pd: Guard simplification to trunc_mod with check for
mod optab support.

gcc/testsuite/
PR tree-optimization/116831
* gcc.dg/torture/pr116831.c: New test.

commit | commitdiff | tree

Richard Biener [Wed, 9 Oct 2024 13:31:59 +0000 (15:31 +0200)]

Allow SLP store of mixed external and constant

vect_build_slp_tree_1 rejected this during SLP discovery because it
ran into the rhs code comparison code for stores.  The following
skips that completely for loads and stores as those are handled
later anyway.

This needs a heuristic adjustment in vect_get_and_check_slp_defs
to avoid fallout with regard to BB vectorization and splitting
of a store group vs. demoting one operand to external.

gcc.dg/Wstringop-overflow-47.c needs adjustment given we now have
vast improvements for code generation.  gcc.dg/strlenopt-32.c
needs adjustment because the strlen pass doesn't handle

  _11 = {0, b_6(D)};
  __builtin_memcpy (&a, "foo.bar", 8);
  MEM <vector(2) char> [(char *)&a + 3B] = _11;
  _9 = strlen (&a);

I have opened PR117057 for this.

* tree-vect-slp.cc (vect_build_slp_tree_1): Do not compare
RHS codes for loads or stores.
(vect_get_and_check_slp_defs): Only demote operand to external
in case there is more than one operand.

* gcc.dg/vect/slp-57.c: New testcase.
* gcc.dg/Wstringop-overflow-47.c: Adjust.
* gcc.dg/strlenopt-32.c: XFAIL parts.

commit | commitdiff | tree

liuhongt [Wed, 25 Sep 2024 05:11:11 +0000 (13:11 +0800)]

Add a new tune avx256_avoid_vec_perm for SRF.

According to the Intel SOM [1], for Crestmont most 256-bit Intel AVX2
instructions can be decomposed into two independent 128-bit
micro-operations. The exception is a subset of Intel AVX2 instructions,
known as cross-lane operations, which can only compute the result for an
element by utilizing one or more sources belonging to other elements.

The 256-bit instructions listed below use more operand sources than
can be natively supported by a single reservation station within these
microarchitectures. They are decomposed into two μops, where the first
μop resolves a subset of operand dependencies across two cycles. The
dependent second μop executes the 256-bit operation by using a single
128-bit execution port for two consecutive cycles with a five-cycle
latency for a total latency of seven cycles.

VPERM2I128 ymm1, ymm2, ymm3/m256, imm8
VPERM2F128 ymm1, ymm2, ymm3/m256, imm8
VPERMPD ymm1, ymm2/m256, imm8
VPERMPS ymm1, ymm2, ymm3/m256
VPERMD ymm1, ymm2, ymm3/m256
VPERMQ ymm1, ymm2/m256, imm8

Instead of setting tune avx128_optimal for SRF, the patch adds a new
tune avx256_avoid_vec_perm for it. By default, the vectorizer still
uses a 256-bit VF if the cost is profitable, but lowers to 128-bit whenever
a 256-bit vec_perm is needed for auto-vectorization. Without vec_perm,
the performance of 256-bit vectorization should be similar to that of
128-bit vectorization (some benchmark results show it is even better,
since it enables more parallelism for convert cases).

[1] https://www.intel.com/content/www/us/en/content-details/814198/intel-64-and-ia-32-architectures-optimization-reference-manual-volume-1.html

gcc/ChangeLog:

* config/i386/i386.cc (ix86_vector_costs::ix86_vector_costs):
Add new member m_num_avx256_vec_perm.
(ix86_vector_costs::add_stmt_cost): Record 256-bit vec_perm.
(ix86_vector_costs::finish_cost): Prevent vectorization for
TARGET_AVX256_AVOID_VEC_PERM when there's 256-bit vec_perm
instruction.
* config/i386/i386.h (TARGET_AVX256_AVOID_VEC_PERM): New
Macro.
* config/i386/x86-tune.def (X86_TUNE_AVX256_SPLIT_REGS): Add
m_CORE_ATOM.
(X86_TUNE_AVX256_AVOID_VEC_PERM): New tune.

gcc/testsuite/ChangeLog:

* gcc.target/i386/avx256_avoid_vec_perm.c: New test.

commit | commitdiff | tree

liuhongt [Tue, 24 Sep 2024 07:53:14 +0000 (15:53 +0800)]

Add new microarchitecture tune for SRF/GRR/CWF.

For Crestmont, 4-operand VEX blendv instructions come from MSROM and
are slower than the 3-instruction sequence (op1 & mask) | (op2 & ~mask).
The legacy blendv instruction can still be handled by the decoder.

The patch adds a new tune which is enabled for all processors except
SRF/CWF. For SRF/CWF it uses vpand + vpandn + vpor instead of
vpblendvb (and similarly for vblendvps/vblendvpd).
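
The replacement sequence is the classic bitwise select. A scalar sketch, with a hypothetical function name, assuming each mask element is all-ones or all-zeros as a vector comparison would produce:

```c
#include <stdint.h>

/* Pick bits from OP1 where MASK is set and from OP2 where it is clear:
   one AND, one AND-NOT and one OR.  */
uint32_t
select_bits (uint32_t mask, uint32_t op1, uint32_t op2)
{
  return (op1 & mask) | (op2 & ~mask);
}
```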

gcc/ChangeLog:

* config/i386/i386-expand.cc (ix86_expand_sse_movcc): Guard
instruction blendv generation under new tune.
* config/i386/i386.h (TARGET_SSE_MOVCC_USE_BLENDV): New Macro.
* config/i386/x86-tune.def (X86_TUNE_SSE_MOVCC_USE_BLENDV):
New tune.

commit | commitdiff | tree

Levy Hsu [Wed, 25 Sep 2024 03:32:35 +0000 (14:32 +1100)]

x86: Implement Fast-Math Float Truncation to BF16 via PSRLD Instruction

gcc/ChangeLog:

* config/i386/i386.md: Rewrite insn truncsfbf2.

gcc/testsuite/ChangeLog:

* gcc.target/i386/truncsfbf-1.c: New test.
* gcc.target/i386/truncsfbf-2.c: New test.
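
The commit message gives no detail beyond the title. Assuming the fast-math path simply drops the low 16 bits of the IEEE binary32 encoding (a logical right shift by 16, which is what PSRLD does per 32-bit lane), the scalar equivalent would look like the sketch below. The function name is hypothetical, and this is truncation, not round-to-nearest.

```c
#include <stdint.h>
#include <string.h>

/* Truncate a binary32 float to BF16 by keeping the top 16 bits.
   Assumes float is IEEE binary32.  */
uint16_t
float_to_bf16_trunc (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);	/* well-defined type pun */
  return (uint16_t) (bits >> 16);
}
```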

commit | commitdiff | tree

David Malcolm [Thu, 10 Oct 2024 01:26:09 +0000 (21:26 -0400)]

diagnostics: move text output member functions to correct file

No functional change intended.

gcc/ChangeLog:
* diagnostic-format-text.cc
(diagnostic_text_output_format::after_diagnostic): Replace call to
show_any_path with body, taken from diagnostic.cc.
(diagnostic_text_output_format::build_prefix): Move here from
diagnostic.cc, updating to use get_diagnostic_kind_text and
diagnostic_get_color_for_kind.
(diagnostic_text_output_format::file_name_as_prefix): Move here
from diagnostic.cc
(diagnostic_text_output_format::append_note): Likewise.
* diagnostic-format-text.h
(diagnostic_text_output_format::show_any_path): Drop decl.
* diagnostic.cc
(diagnostic_text_output_format::file_name_as_prefix): Move to
diagnostic-format-text.cc.
(diagnostic_text_output_format::build_prefix): Likewise.
(diagnostic_text_output_format::show_any_path): Move to body of
diagnostic_text_output_format::after_diagnostic.
(diagnostic_text_output_format::append_note): Move to
diagnostic-format-text.cc.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

commit | commitdiff | tree

David Malcolm [Thu, 10 Oct 2024 01:26:09 +0000 (21:26 -0400)]

diagnostics: mark the JSON output format as deprecated

The bulk of the documentation for -fdiagnostics-format= is taken up
by a description of the "json" format added in r9-4156-g478dd60ddcf177.

I don't plan to add any extra features to the "json" format; all my
future work on machine-readable GCC diagnostics is likely to be on the
SARIF output format (https://gcc.gnu.org/wiki/SARIF).

Hence users seeking machine-readable output from GCC should use SARIF.

This patch removes the long documentation of the format and describes it
as deprecated.

gcc/ChangeLog:
* doc/invoke.texi (fdiagnostics-format): Describe "json" et al as
deprecated, and remove the long description of the output format.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

commit | commitdiff | tree

David Malcolm [Thu, 10 Oct 2024 01:26:08 +0000 (21:26 -0400)]

lto: reimplement print_lto_docs_link [PR116613]

gcc/ChangeLog:
PR other/116613
* lto-wrapper.cc (print_lto_docs_link): Use a format string rather
than building the string manually. Fix memory leak of "url" by
using label_text.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

commit | commitdiff | tree

Sébastien Michelland [Thu, 10 Oct 2024 00:24:39 +0000 (09:24 +0900)]

SH: Use softfp for sh-elf

libgcc/ChangeLog:

PR target/29845
* config.host (sh-*-elf*): Replace fdpbit with softfp.
* config/sh/sfp-machine.h: New file.

Signed-off-by: Sébastien Michelland <sebastien.michelland@lcis.grenoble-inp.fr>

commit | commitdiff | tree

GCC Administrator [Thu, 10 Oct 2024 00:19:03 +0000 (00:19 +0000)]

Daily bump.

commit | commitdiff | tree

liuhongt [Thu, 19 Sep 2024 05:38:34 +0000 (13:38 +0800)]

Adjust testcase after relax O2 vectorization.

gcc/testsuite/ChangeLog:

* gcc.dg/fstack-protector-strong.c: Adjust
scan-assembler-times.
* gcc.dg/graphite/scop-6.c: Refine the testcase to avoid array
out of bounds.
* gcc.dg/graphite/scop-9.c: Ditto.
* gcc.dg/tree-ssa/ivopts-lt-2.c: Add -fno-tree-vectorize.
* gcc.dg/tree-ssa/ivopts-lt.c: Ditto.
* gcc.dg/tree-ssa/loop-16.c: Ditto.
* gcc.dg/tree-ssa/loop-28.c: Ditto.
* gcc.dg/tree-ssa/loop-bound-2.c: Ditto.
* gcc.dg/tree-ssa/loop-bound-4.c: Ditto.
* gcc.dg/tree-ssa/loop-bound-6.c: Ditto.
* gcc.dg/tree-ssa/predcom-4.c: Ditto.
* gcc.dg/tree-ssa/predcom-5.c: Ditto.
* gcc.dg/tree-ssa/scev-11.c: Ditto.
* gcc.dg/tree-ssa/scev-9.c: Ditto.
* gcc.dg/tree-ssa/split-path-11.c: Ditto.
* gcc.dg/unroll-8.c: Ditto.
* gcc.dg/var-expand1.c: Ditto.
* gcc.dg/vect/vect-cost-model-6.c: Removed.
* gcc.target/i386/pr86270.c: Ditto.
* gcc.target/i386/pr86722.c: Ditto.
* gcc.target/x86_64/abi/callabi/leaf-2.c: Ditto.

commit | commitdiff | tree

liuhongt [Tue, 26 Mar 2024 04:28:14 +0000 (21:28 -0700)]

Enable vectorization for unknown tripcount in very cheap cost model but disable epilog vectorization.

gcc/ChangeLog:

* tree-vect-loop.cc (vect_analyze_loop_costing): Enable
vectorization for LOOP_VINFO_PEELING_FOR_NITER in very cheap
cost model.
(vect_analyze_loop): Disable epilogue vectorization in very
cheap cost model.
* doc/invoke.texi: Adjust documents for very-cheap cost model.

commit | commitdiff | tree

Jovan Vukic [Wed, 9 Oct 2024 22:53:38 +0000 (16:53 -0600)]

RISC-V: Optimize branches with shifted immediate operands

After the valuable feedback I received, it’s clear to me that the
oversight was in the tests showing the benefits of the patch. In the
test file, I added functions f5 and f6, which now generate more
efficient code with fewer instructions.

Before the patch:

f5:
        li      a4,2097152
        addi    a4,a4,-2048
        li      a5,1167360
        and     a0,a0,a4
        addi    a5,a5,-2048
        beq     a0,a5,.L4

f6:
        li      a5,3407872
        addi    a5,a5,-2048
        and     a0,a0,a5
        li      a5,1114112
        beq     a0,a5,.L7

After the patch:

f5:
        srli    a5,a0,11
        andi    a5,a5,1023
        li      a4,569
        beq     a5,a4,.L5

f6:
        srli    a5,a0,11
        andi    a5,a5,1663
        li      a4,544
        beq     a5,a4,.L9
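
The constants in the two listings are related by a common shift of 11 bits: the mask and the comparison value both have 11 trailing zeros, so both can be shifted down to fit in 12-bit immediates. The sketch below reconstructs the conditions from the assembly only; the C source of f5 and f6 in branch-1.c is not shown here, so the function names and shapes are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* f5 before: (a & (2097152 - 2048)) == (1167360 - 2048).  */
bool f5_masked (uint64_t a) { return (a & 2095104u) == 1165312u; }
/* f5 after: ((a >> 11) & 1023) == 569.  */
bool f5_shifted (uint64_t a) { return ((a >> 11) & 1023u) == 569u; }

/* f6 before: (a & (3407872 - 2048)) == 1114112.  */
bool f6_masked (uint64_t a) { return (a & 3405824u) == 1114112u; }
/* f6 after: ((a >> 11) & 1663) == 544.  */
bool f6_shifted (uint64_t a) { return ((a >> 11) & 1663u) == 544u; }

/* Count inputs below 2^22 on which either pair disagrees.  */
int
shifted_branch_mismatches (void)
{
  int bad = 0;
  for (uint64_t a = 0; a < (UINT64_C (1) << 22); a++)
    {
      bad += f5_masked (a) != f5_shifted (a);
      bad += f6_masked (a) != f6_shifted (a);
    }
  return bad;
}
```

Since 2095104 = 1023 * 2048 and 1165312 = 569 * 2048 (and likewise 3405824 = 1663 * 2048, 1114112 = 544 * 2048), shifting both sides right by 11 preserves the equality test.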

PR target/115921

gcc/ChangeLog:

* config/riscv/iterators.md (any_eq): New code iterator.
* config/riscv/riscv.h (COMMON_TRAILING_ZEROS): New macro.
(SMALL_AFTER_COMMON_TRAILING_SHIFT): Ditto.
* config/riscv/riscv.md (*branch<ANYI:mode>_shiftedarith_<optab>_shifted):
New pattern.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/branch-1.c: Additional tests.

commit | commitdiff | tree

Jeff Law [Wed, 9 Oct 2024 22:22:06 +0000 (16:22 -0600)]

Revert "RISC-V: Add implication for M extension."

This reverts commit 0a193466f2e87acef9b86e0d086bc6f6017518b0.

commit | commitdiff | tree

Jeff Law [Wed, 9 Oct 2024 22:21:56 +0000 (16:21 -0600)]

Revert "RISC-V: Enable builtin __riscv_mul with Zmmul extension."

This reverts commit 2990f5802a727cbd717587c3a345fa940193049f.

commit | commitdiff | tree

Eric Botcazou [Wed, 9 Oct 2024 19:31:13 +0000 (21:31 +0200)]

Fix LTO bootstrap failure with -Werror=lto-type-mismatch

In GNAT's implementation model, using convention C (or C_Pass_By_Copy) has
no effect on the internal representation of types since the representation
is identical to that of C by default. It's even counter-productive given
the implementation advice listed in B.3(63-71) so the interface between the
front-end and gigi does not use it and instead uses structurally identical
types on both sides.

gcc/ada
PR ada/117038
* fe.h (struct c_array): Add 'const' to declaration of pointer.
(C_Source_Buffer): Use consistent formatting.
* par-ch3.adb (P_Component_Items): Properly set Aliased_Present on
access definition.
* sinput.ads: Remove clause for Interfaces.C.
(C_Array): Change type of Length to Integer and make both components
aliased. Remove Convention aspect.
(C_Source_Buffer): Remove all aspects.
* sinput.adb (C_Source_Buffer): Adjust to above change.

commit | commitdiff | tree

Eric Botcazou [Wed, 9 Oct 2024 19:21:36 +0000 (21:21 +0200)]

Remove support for HP-UX 10

gcc/ada
* Makefile.rtl: Remove HP-UX 10 section.
* libgnarl/s-osinte__hpux-dce.ads: Delete.
* libgnarl/s-osinte__hpux-dce.adb: Likewise.
* libgnarl/s-taprop__hpux-dce.adb: Likewise.
* libgnarl/s-taspri__hpux-dce.ads: Likewise.
* libgnat/s-oslock__hpux-dce.ads: Likewise.

commit | commitdiff | tree

Jason Merrill [Wed, 9 Oct 2024 16:28:46 +0000 (12:28 -0400)]

c++: more modules and -M

In r15-4119-gc877a27f04f648 I told preprocess_file to use the
directives-only scan with modules, but it seems that I also need to set the
cpp_option so that communication between _cpp_handle_directive and
scan_translation_unit_directives_only works properly in
c-c++-common/cpp/embed-6.c.

gcc/c-family/ChangeLog:

* c-ppoutput.cc (preprocess_file): Set directives_only flag.

commit | commitdiff | tree

Jason Merrill [Wed, 9 Oct 2024 16:31:57 +0000 (12:31 -0400)]

libcpp: fix typo

libcpp/ChangeLog:

* macro.cc (_cpp_pop_context): Fix typo.

commit | commitdiff | tree

Torbjörn SVENSSON [Wed, 9 Oct 2024 20:02:58 +0000 (22:02 +0200)]

testsuite: arm: use effective-target for mod* tests

This fixes a typo introduced in r15-4200-gcf08dd297ca that was reported
at https://linaro.atlassian.net/browse/GNU-1369.

gcc/testsuite/ChangeLog

* gcc.target/arm/mod_2.c: Corrected effective-target to
arm_cpu_cortex_a57_ok.
* gcc.target/arm/mod_256.c: Likewise.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

commit | commitdiff | tree

Jonathan Wakely [Fri, 4 Oct 2024 11:40:47 +0000 (12:40 +0100)]

libstdc++: Test 17_intro/names.cc with -D_FORTIFY_SOURCE=2 [PR116210]

Add a new testcase that repeats 17_intro/names.cc but with
_FORTIFY_SOURCE defined, to find problems in Glibc fortify wrappers like
https://sourceware.org/bugzilla/show_bug.cgi?id=32052 (which is fixed
now).

libstdc++-v3/ChangeLog:

PR libstdc++/116210
* testsuite/17_intro/names.cc (sz): Undef for versions of Glibc
that use it in the fortify wrappers.
* testsuite/17_intro/names_fortify.cc: New test.

commit | commitdiff | tree

Jonathan Wakely [Fri, 4 Oct 2024 11:08:12 +0000 (12:08 +0100)]

libstdc++: Drop format attribute from snprintf wrapper [PR116969]

When __LONG_DOUBLE_IEEE128__ is defined we need to declare a wrapper for
Glibc's 'snprintf' symbol, so we can call the original definition that
works with the IBM128 format of long double. Because we were declaring
the wrapper using __typeof__(__builtin_snprintf) it inherited the
__attribute__((format(printf, 3, 4))) decoration, and then we got a
warning for calling that wrapper with an __ibm128 argument for a %Lf
conversion specifier. The warning is bogus, because the function we're
calling really does want __ibm128 for %Lf, but there's no "printf but
with a different long double format" archetype for the attribute.

In r15-4039-g28911f626864e7 I added a diagnostic pragma to suppress the
warning, but it would be better to just declare the wrapper without the
attribute, and not have to suppress a warning for code that we know is
actually correct.

libstdc++-v3/ChangeLog:

PR libstdc++/116969
* include/bits/locale_facets_nonio.tcc (money_put::__do_put):
Remove diagnostic pragmas.
(__glibcxx_snprintfibm128): Declare type manually, instead of
using __typeof__(__builtin_snprintf).

commit | commitdiff | tree

Frank Scheiner [Tue, 8 Oct 2024 18:48:09 +0000 (19:48 +0100)]

libstdc++: Workaround glibc headers on ia64-linux

We see:

```
FAIL: 17_intro/names.cc  -std=gnu++17 (test for excess errors)
FAIL: 17_intro/names_pstl.cc  -std=gnu++17 (test for excess errors)
FAIL: experimental/names.cc  -std=gnu++17 (test for excess errors)
```

...on ia64-linux.

This is due to:

* /usr/include/bits/sigcontext.h:32-38:
```
32 struct __ia64_fpreg
33   {
34     union
35       {
36         unsigned long bits[2];
37       } u;
38   } __attribute__ ((__aligned__ (16)));
```

* /usr/include/sys/ucontext.h:39-45:
```
  39 struct __ia64_fpreg_mcontext
  40   {
  41     union
  42       {
  43         unsigned long __ctx(bits)[2];
  44       } __ctx(u);
  45   } __attribute__ ((__aligned__ (16)));
```

...from glibc 2.39 (w/ia64 support re-added). See the discussion
starting on [1].

[1]: https://gcc.gnu.org/pipermail/gcc-patches/2024-June/654487.html

Signed-off-by: Frank Scheiner <frank.scheiner@web.de>
libstdc++-v3/ChangeLog:

* testsuite/17_intro/names.cc [__linux__ && __ia64__]: Undefine
'u' as used in glibc headers.

commit | commitdiff | tree

Richard Sandiford [Wed, 9 Oct 2024 12:57:36 +0000 (13:57 +0100)]

aarch64: Fix SVE ACLE gimple folds for C++ LTO [PR116629]

The SVE ACLE code has two ways of handling overloaded functions.
One, used by C, is to define a single dummy function for each unique
overloaded name, with resolve_overloaded_builtin then resolving calls
to real non-overloaded functions.  The other, used by C++, is to
define a separate function for each individual overload.

The builtins harness assigns integer function codes programmatically.
However, LTO requires it to use the same assignment for every
translation unit, regardless of language.  This means that C++ TUs
need to create (unused) slots for the C overloads and that C TUs
need to create (unused) slots for the C++ overloads.

In many ways, it doesn't matter whether the LTO frontend itself
uses the C approach or the C++ approach to defining overloaded
functions, since the LTO frontend never has to resolve source-level
overloading.  However, the C++ approach of defining a separate
function for each overload means that C++ calls never need to
be redirected to a different function.  Calls to an overload
can appear in the LTO dump and survive until expand.  In contrast,
calls to C's dummy overload functions are resolved by the front
end and never survive to LTO (or expand).

Some optimisations work by moving between sibling functions, such as _m
to _x.  If the source function is an overload, the expected destination
function is too.  The LTO frontend needs to define C++ overloads if it
wants to do this optimisation properly for C++.

The PR is about a tree checking failure caused by trying to use a
stubbed-out C++ overload in LTO.  Dealing with that by detecting the
stub (rather than changing which overloads are defined) would have
turned this from an ice-on-valid to a missed optimisation.

In future, it would probably make sense to redirect overloads to
non-overloaded functions during gimple folding, in case that exposes
more CSE opportunities.  But it'd probably be of limited benefit, since
it should be rare for code to mix overloaded and non-overloaded uses of
the same operation.  It also wouldn't be suitable for backports.

gcc/
PR target/116629
* config/aarch64/aarch64-sve-builtins.cc
(function_builder::function_builder): Use direct overloads for LTO.

gcc/testsuite/
PR target/116629
* gcc.target/aarch64/sve/acle/general/pr106326_2.c: New test.

commit | commitdiff | tree

Richard Sandiford [Wed, 9 Oct 2024 12:57:36 +0000 (13:57 +0100)]

testsuite: Make check-function-bodies work with LTO

This patch tries to make check-function-bodies automatically
choose between reading the regular assembly file and reading the
LTO assembly file. There should only ever be one right answer,
since check-function-bodies doesn't make sense on slim LTO output.

Maybe this will turn out to be impossible to get right, but I'd like
to try at least.

gcc/testsuite/
* lib/scanasm.exp (check-function-bodies): Look in ltrans0.ltrans.s
if the test appears to be using LTO.

commit | commitdiff | tree

Jonathan Wakely [Mon, 7 Oct 2024 09:22:24 +0000 (10:22 +0100)]

libstdc++: Ignore _GLIBCXX_USE_POSIX_SEMAPHORE if not supported [PR116992]

If _GLIBCXX_HAVE_POSIX_SEMAPHORE is undefined then users get an error
when defining _GLIBCXX_USE_POSIX_SEMAPHORE. We can just ignore it
instead (and warn them it's being ignored).

This fixes a testsuite failure on hppa64-hp-hpux11.11 (and probably some
other targets):

FAIL: 30_threads/semaphore/platform_try_acquire_for.cc -std=gnu++20 (test for excess errors)
Excess errors:
semaphore:49: error: '__semaphore_impl' has not been declared

libstdc++-v3/ChangeLog:

PR libstdc++/116992
* include/bits/semaphore_base.h (_GLIBCXX_USE_POSIX_SEMAPHORE):
Undefine and issue a warning if POSIX sem_t is not supported.
* testsuite/30_threads/semaphore/platform_try_acquire_for.cc:
Prune new warning.

commit | commitdiff | tree

Jonathan Wakely [Mon, 7 Oct 2024 09:19:29 +0000 (10:19 +0100)]

libstdc++: Fix -Wnarrowing in <complex> [PR116991]

When _GLIBCXX_USE_C99_COMPLEX_ARC is undefined we use the generic
__complex_acos function template for _Float32 etc. and that gives a
-Wnarrowing warning:

complex:2043: warning: ISO C++ does not allow converting to '_Float32' from 'long double' with greater conversion rank [-Wnarrowing]

Use a cast to do the conversion so that it doesn't warn.
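
As a rough illustration of the fix, the sketch below initializes a constant of the destination type from a long double literal through an explicit cast. The function name `half_pi_sketch` and the choice of pi/2 as the constant are assumptions made for this example, and plain `float` stands in for `_Float32` so that it compiles everywhere.

```cpp
// Hypothetical helper, not libstdc++ code.
template <typename T>
constexpr T half_pi_sketch() {
  // The literal has type long double. Converting it implicitly to a type of
  // lower conversion rank (e.g. _Float32) is what -Wnarrowing complains
  // about; the explicit cast states that the conversion is intended.
  return static_cast<T>(1.5707963267948966192313216916397514L);
}

static_assert(half_pi_sketch<double>() > 1.57, "sanity check on the constant");
```

The cast does not change the resulting value, it only makes the narrowing explicit at the point where the literal is used.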

libstdc++-v3/ChangeLog:

PR libstdc++/116991
* include/std/complex (__complex_acos): Cast literal to
destination type.

commit | commitdiff | tree

Jonathan Wakely [Fri, 4 Oct 2024 17:11:06 +0000 (18:11 +0100)]

libstdc++: Fix -Wsign-compare in std::latch::count_down

Also add assertions for the precondition on the parameter's value.

libstdc++-v3/ChangeLog:

* include/std/latch (latch::count_down): Add assertions for
preconditions. Cast parameter to avoid -Wsign-compare on some
targets.

commit | commitdiff | tree

Jonathan Wakely [Thu, 26 Sep 2024 15:55:07 +0000 (16:55 +0100)]

libstdc++: Enable _GLIBCXX_ASSERTIONS by default for -O0 [PR112808]

Too many users don't know about -D_GLIBCXX_ASSERTIONS and so are missing
valuable checks for C++ standard library preconditions. This change
enables libstdc++ assertions by default when compiling with -O0 so that
we diagnose more bugs by default.

When users enable optimization we don't add the assertions by default
(because they have non-zero overhead) so they still need to enable them
manually.

For users who really don't want the assertions even in unoptimized
builds, defining _GLIBCXX_NO_ASSERTIONS will prevent them from being
enabled automatically.
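
The rule described above can be modelled as a small decision function. This is only a sketch: `assertions_enabled_sketch` and `assertions_active_here` are hypothetical names, and the assumption that an explicit -D_GLIBCXX_ASSERTIONS wins over _GLIBCXX_NO_ASSERTIONS is not stated in this message.

```cpp
#include <cstddef>  // any standard header pulls in the library's config macros

// Hypothetical model of the rule; not libstdc++ source.
constexpr bool assertions_enabled_sketch(bool optimizing,
                                         bool user_defined_assertions,
                                         bool user_defined_no_assertions) {
  if (user_defined_assertions)
    return true;        // explicit -D_GLIBCXX_ASSERTIONS (assumed to win)
  if (user_defined_no_assertions)
    return false;       // opt-out suppresses the automatic definition
  return !optimizing;   // automatic: on for -O0, off when optimizing
}

// Reports what this translation unit actually got, using the real macro.
constexpr bool assertions_active_here() {
#ifdef _GLIBCXX_ASSERTIONS
  return true;
#else
  return false;
#endif
}
```

Calling `assertions_active_here()` in a program built with and without -O2 is a quick way to check which behaviour a given toolchain provides.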

libstdc++-v3/ChangeLog:

PR libstdc++/112808
* doc/xml/manual/using.xml (_GLIBCXX_ASSERTIONS): Document
implicit definition for -O0 compilation.
(_GLIBCXX_NO_ASSERTIONS): Document.
* doc/html/manual/using_macros.html: Regenerate.
* include/bits/c++config [!__OPTIMIZE__] (_GLIBCXX_ASSERTIONS):
Define for unoptimized builds.

commit | commitdiff | tree

Jonathan Wakely [Thu, 26 Sep 2024 15:42:27 +0000 (16:42 +0100)]

libstdc++: Simplify std::aligned_storage and fix for versioned namespace [PR61458]

This simplifies the implementation of std::aligned_storage. For the
unstable ABI it also fixes the bug where its size is too large when the
default alignment is used. We can't fix that for the stable ABI though,
so just add a comment about the bug.
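
A minimal sketch of the "struct containing an aligned buffer" shape follows. All names ending in `_sketch` are hypothetical, and the default-alignment rule used here (largest power of two not exceeding the length, capped at the alignment of `std::max_align_t`) is an assumption based on the standard's wording, not necessarily the formula libstdc++ uses.

```cpp
#include <cstddef>

// Assumed rule: the strictest alignment any object of size <= len can have.
constexpr std::size_t default_alignment_sketch(std::size_t len) {
  std::size_t align = 1;
  while (align * 2 <= len && align * 2 <= alignof(std::max_align_t))
    align *= 2;
  return align;
}

template <std::size_t Len, std::size_t Align = default_alignment_sketch(Len)>
struct aligned_storage_sketch {
  // A plain struct with an aligned byte buffer, instead of a union.
  struct type {
    alignas(Align) unsigned char data[Len];
  };
};

// With a length-dependent default, one byte of storage stays one byte.
static_assert(sizeof(aligned_storage_sketch<1>::type) == 1, "");
static_assert(alignof(aligned_storage_sketch<64>::type)
                  == alignof(std::max_align_t), "");
```

Using a fixed maximum alignment as the default instead would round `sizeof` for small lengths up to that alignment, which is the "too large" behaviour the stable ABI has to keep.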

libstdc++-v3/ChangeLog:

PR libstdc++/61458
* doc/doxygen/user.cfg.in (GENERATE_BUGLIST): Set to NO.
* include/std/type_traits (__aligned_storage_msa): Remove.
(__aligned_storage_max_align_t): New struct.
(__aligned_storage_default_alignment): New function.
(aligned_storage): Use __aligned_storage_default_alignment for
default alignment. Replace union with a struct containing an
aligned buffer. Improve Doxygen comment.
(aligned_storage_t): Use __aligned_storage_default_alignment for
default alignment.

commit | commitdiff | tree

Jonathan Wakely [Thu, 11 Jul 2024 19:38:05 +0000 (20:38 +0100)]

libstdc++: Do not cast away const-ness in std::construct_at (LWG 3870)

This change also requires implementing the proposed resolution of LWG
3216 so that std::make_shared and std::allocate_shared still work, and
the proposed resolution of LWG 3891 so that std::expected still works.
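
To show why callers that pass a pointer to a cv-qualified type are affected, here is a simplified sketch of a construct_at-like function that does not cast away const. `construct_at_sketch` is a hypothetical name and this is not the libstdc++ implementation.

```cpp
#include <cassert>
#include <new>
#include <utility>

template <typename T, typename... Args>
T* construct_at_sketch(T* location, Args&&... args) {
  assert(location != nullptr);
  // static_cast cannot remove const, so this line fails to compile when
  // T is const-qualified instead of silently writing through a const pointer.
  return ::new (static_cast<void*>(location)) T(std::forward<Args>(args)...);
}
```

With this shape, `construct_at_sketch(&x, 42)` works for `int x`, while the same call with `const int x` is rejected at compile time. That is why code wrapping such a function has to strip cv-qualifiers from the managed type itself.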

libstdc++-v3/ChangeLog:

* include/bits/shared_ptr_base.h: Remove cv-qualifiers from
type managed by _Sp_counted_ptr_inplace, as per LWG 3210.
* include/bits/stl_construct.h: Do not cast away cv-qualifiers
when passing pointer to placement new.
* include/std/expected: Use remove_cv_t for union member, as per
LWG 3891.
* testsuite/20_util/allocator/void.cc: Do not test construction
via const pointer.

commit | commitdiff | tree

Jonathan Wakely [Mon, 18 Mar 2024 16:59:50 +0000 (16:59 +0000)]

libstdc++: Make std::construct_at support arrays (LWG 3436)

The issue was approved at the recent St. Louis meeting, requiring
support for bounded arrays, but only without arguments to initialize the
array elements.
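
A simplified sketch of the array case is shown below, assuming C++17 for `if constexpr`. `construct_at_array_sketch` is a hypothetical name and the static_assert messages are illustrative only.

```cpp
#include <new>
#include <type_traits>
#include <utility>

template <typename T, typename... Args>
T* construct_at_array_sketch(T* location, Args&&... args) {
  void* raw = location;
  if constexpr (std::is_array_v<T>) {
    static_assert(std::extent_v<T> != 0, "only bounded arrays are supported");
    static_assert(sizeof...(Args) == 0,
                  "array elements cannot be given initializers");
    // new T[1]() creates one T (itself an array), value-initializes its
    // elements, and yields a pointer to that T.
    return ::new (raw) T[1]();
  } else {
    return ::new (raw) T(std::forward<Args>(args)...);
  }
}
```

For `int arr[3]`, calling `construct_at_array_sketch(&arr)` returns an `int (*)[3]` whose elements are all zero, and passing any extra argument trips the static assertion.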

libstdc++-v3/ChangeLog:

* include/bits/stl_construct.h (construct_at): Support array
types (LWG 3436).
* testsuite/20_util/specialized_algorithms/construct_at/array.cc:
New test.
* testsuite/20_util/specialized_algorithms/construct_at/array_neg.cc:
New test.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist-opt1.C: Adjust for different diagnostics
from std::construct_at by adding -fconcepts-diagnostics-depth=2.

commit | commitdiff | tree

Jonathan Wakely [Fri, 27 Sep 2024 15:54:31 +0000 (16:54 +0100)]

libstdc++: Tweak %c formatting for chrono types

libstdc++-v3/ChangeLog:

* include/bits/chrono_io.h (__formatter_chrono::_M_c): Add
[[unlikely]] attribute to condition for missing %c format in
locale. Use %T instead of %H:%M:%S in fallback.

commit | commitdiff | tree

Jonathan Wakely [Wed, 18 Sep 2024 16:20:29 +0000 (17:20 +0100)]

libstdc++: Fix formatting of chrono::duration with character rep [PR116755]

Implement Peter Dimov's suggestion for resolving LWG 4118, which is to
use +d.count() so that character types are promoted to an integer type
before formatting them. This didn't have unanimous consensus in the
committee as Howard Hinnant proposed that we should format the rep
consistently with std::format("{}", d.count()) instead. That ends up
being more complicated, because it makes std::formattable a precondition
of operator<< which was not previously the case, and it means that
ios_base::fmtflags from the stream would be ignored because std::format
doesn't use them.
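
The effect of the unary plus can be seen with a duration whose rep is `char`. The two helper functions below are hypothetical and only contrast streaming `d.count()` with streaming `+d.count()`.

```cpp
#include <chrono>
#include <sstream>
#include <string>

// Streams the count as-is: a char rep is written as a character.
template <typename Rep, typename Period>
std::string count_raw(const std::chrono::duration<Rep, Period>& d) {
  std::ostringstream os;
  os << d.count();
  return os.str();
}

// Streams +d.count(): integral promotion turns a char rep into an int,
// so the numeric value is written instead.
template <typename Rep, typename Period>
std::string count_promoted(const std::chrono::duration<Rep, Period>& d) {
  std::ostringstream os;
  os << +d.count();
  return os.str();
}
```

For `std::chrono::duration<char>(65)`, the first helper produces the single character with code 65 and the second produces the text "65". Because the stream is still used directly, its formatting flags continue to apply.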

libstdc++-v3/ChangeLog:

PR libstdc++/116755
* include/bits/chrono_io.h (operator<<): Use +d.count() for
duration inserter.
(__formatter_chrono::_M_format): Likewise for %Q format.
* testsuite/20_util/duration/io.cc: Test durations with
character types as reps.

commit | commitdiff | tree

Richard Biener [Wed, 9 Oct 2024 09:47:08 +0000 (11:47 +0200)]

Clear DR_GROUP_NEXT_ELEMENT upon group dissolving

I've tried to sanitize DR_GROUP_NEXT_ELEMENT accesses but there are too
many so the following instead makes sure DR_GROUP_NEXT_ELEMENT is never
non-NULL for !STMT_VINFO_GROUPED_ACCESS.

* tree-vect-data-refs.cc (vect_analyze_data_ref_access): When
cancelling a DR group also clear DR_GROUP_NEXT_ELEMENT.

commit | commitdiff | tree

Richard Biener [Wed, 9 Oct 2024 09:42:59 +0000 (11:42 +0200)]

tree-optimization/117041 - fix load classification of former grouped load

When we first detect a grouped load but later dis-associate it we
only set DR_GROUP_FIRST_ELEMENT to NULL, indicating it is not a
STMT_VINFO_GROUPED_ACCESS but leave DR_GROUP_NEXT_ELEMENT set. This
causes a stray DR_GROUP_NEXT_ELEMENT access in get_group_load_store_type
to go wrong, indicating a load isn't single_element_p when it actually
is, leading to wrong classification and an ICE.

PR tree-optimization/117041
* tree-vect-stmts.cc (get_group_load_store_type): Only
check DR_GROUP_NEXT_ELEMENT for STMT_VINFO_GROUPED_ACCESS.

* gcc.dg/torture/pr117041.c: New testcase.

commit | commitdiff | tree

Torbjörn SVENSSON [Mon, 7 Oct 2024 07:06:37 +0000 (09:06 +0200)]

testsuite: arm: use effective-target for vsel*, mod* and pr65647.c tests

Update test cases to use -mcpu=unset/-march=unset feature introduced in
r15-3606-g7d6c6a0d15c.

gcc/testsuite/ChangeLog:

* gcc.target/arm/pr65647.c: Use effective-target arm_arch_v6m.
Remove unneeded dg-skip-if.
* gcc.target/arm/mod_2.c: Use effective-target arm_cpu_cortex_a57.
* gcc.target/arm/mod_256.c: Likewise.
* gcc.target/arm/vseleqdf.c: Likewise.
* gcc.target/arm/vseleqsf.c: Likewise.
* gcc.target/arm/vselgedf.c: Likewise.
* gcc.target/arm/vselgesf.c: Likewise.
* gcc.target/arm/vselgtdf.c: Likewise.
* gcc.target/arm/vselgtsf.c: Likewise.
* gcc.target/arm/vselledf.c: Likewise.
* gcc.target/arm/vsellesf.c: Likewise.
* gcc.target/arm/vselltdf.c: Likewise.
* gcc.target/arm/vselltsf.c: Likewise.
* gcc.target/arm/vselnedf.c: Likewise.
* gcc.target/arm/vselnesf.c: Likewise.
* gcc.target/arm/vselvcdf.c: Likewise.
* gcc.target/arm/vselvcsf.c: Likewise.
* gcc.target/arm/vselvsdf.c: Likewise.
* gcc.target/arm/vselvssf.c: Likewise.
* lib/target-supports.exp: Define effective-target arm_cpu_cortex_a57.
Update effective-target arm_v8_1_lob_ok to use -mcpu=unset.

Signed-off-by: Torbjörn SVENSSON <torbjorn.svensson@foss.st.com>

commit | commitdiff | tree

Ken Matsui [Wed, 9 Oct 2024 11:32:20 +0000 (07:32 -0400)]

libcpp: Use ' instead of %< and %> [PR117039]

PR bootstrap/117039

libcpp/ChangeLog:

* directives.cc (do_pragma_once): Use ' instead of %< and %>.

Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>

commit | commitdiff | tree

René Rebe [Wed, 12 Jun 2024 10:42:00 +0000 (12:42 +0200)]

Enable LRA for ia64

This was tested by bootstrapping GCC natively on ia64-t2-linux-gnu and
running the testsuite (based on
236116068151bbc72aaaf53d0f223fe06f7e3bac):

https://gcc.gnu.org/pipermail/gcc-testresults/2024-June/817268.html

For comparison, the same with just
236116068151bbc72aaaf53d0f223fe06f7e3bac:

https://gcc.gnu.org/pipermail/gcc-testresults/2024-June/817267.html

gcc/
* config/ia64/ia64.cc: Enable LRA for ia64.
* config/ia64/ia64.md: Likewise.
* config/ia64/predicates.md: Likewise.

Signed-off-by: René Rebe <rene@exactcode.de>

commit | commitdiff | tree

René Rebe [Wed, 12 Jun 2024 10:42:00 +0000 (12:42 +0200)]

Remove ia64*-*-linux from the list of obsolete targets

The following un-deprecates ia64*-*-linux for GCC 15, since we plan to
support this for some years to come.

gcc/
* config.gcc: Only list ia64*-*-(hpux|vms|elf) in the list of
obsoleted targets.

contrib/
* config-list.mk (LIST): No --enable-obsolete for ia64-linux.

Signed-off-by: René Rebe <rene@exactcode.de>

Mirror of https://gcc.gnu.org/git/gcc.git
