git.ipfire.org Git - thirdparty/gcc.git/log

Makefile.def: drop remnants of unused libelf

Use of libelf was removed from gcc in r0-104274-g48215350c24d52 ("re PR
lto/46273 (Failed to bootstrap)") around 2010, before gcc-4.6.0.

This change removes unused references to libelf from top-level configure
and Makefile.

/
* Makefile.def: Drop libelf module and gcc-configure dependency
on it.
* Makefile.in: Regenerate with 'autogen Makefile.def'.
* Makefile.tpl (HOST_EXPORTS): Drop unused LIBELFLIBS and
LIBELFINC.
* configure: Regenerate.
* configure.ac (host_libs): Drop unused libelf.

Add libgo dependency on libbacktrace.

Noticed missing dependency when regenerated Makefile.in for unrelated
change with 'autogen Makefile.def'.

The change was lost in r12-6861-gaeac414923aa1e ("Revert "Fix PR 67102:
Add libstdc++ dependancy to libffi" [PR67102]").

/
* Makefile.in: Regenerate.

rs6000: Add expand pattern for multiply-add (PR103109)

gcc/
PR target/103109
* config/rs6000/rs6000.md (<u>maddditi4): New pattern for multiply-add.
(<u>madddi4_highpart): New.
(<u>madddi4_highpart_le): New.

gcc/testsuite/
PR target/103109
* gcc.target/powerpc/pr103109.h: New.
* gcc.target/powerpc/pr103109-1.c: New.
* gcc.target/powerpc/pr103109-2.c: New.

Use gimple_range_ssa_names in path_range_query.

gcc/ChangeLog:

* gimple-range-path.cc
(path_range_query::compute_exit_dependencies): Use
gimple_range_ssa_names.

RISC-V: Add runtime invariant support

The RISC-V 'V' extension supports scalable vectors, like ARM SVE.
To support RVV, we need to introduce runtime invariants.

- For zve32*, the runtime invariant uses 32-bit chunk.
- For zve64*, the runtime invariant uses 64-bit chunk.

[1] https://github.com/riscv/riscv-v-spec/blob/master/v-spec.adoc#sec-vector-extensions

This patch is a preparatory patch for RVV support.
Because we haven't introduced vector machine_modes yet, it is safe to just change HOST_WIDE_INT into poly_int.
It is also safe to use the "to_constant()" function for scalar operations.
This patch has been tested with a full DejaGnu regression run.
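
As a rough illustration of what a runtime invariant means here, the sketch
below models a value of the form c0 + c1 * x, where x is only known at run
time. The type toy_poly and all of its members are hypothetical and much
simpler than GCC's poly_int; the only idea taken from the message above is
that to_constant() is safe when the value has no runtime part.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for a two-coefficient polynomial integer: c0 + c1 * x,
// where x is a quantity only known at run time.  Hypothetical model,
// not GCC's poly_int.
struct toy_poly {
  std::int64_t c0;  // compile-time constant part
  std::int64_t c1;  // multiplier of the runtime quantity x

  // True when the value does not depend on the runtime quantity.
  bool is_constant() const { return c1 == 0; }

  // Only meaningful for values without a runtime part (scalar code).
  std::int64_t to_constant() const {
    assert(is_constant());
    return c0;
  }

  // Evaluate once the runtime quantity is known.
  std::int64_t eval(std::int64_t x) const { return c0 + c1 * x; }
};

// Adding two such values adds the coefficients pairwise.
inline toy_poly operator+(toy_poly a, toy_poly b) {
  return {a.c0 + b.c0, a.c1 + b.c1};
}
```

A scalar frame offset such as toy_poly{16, 0} stays constant, while adding a
vector-sized component such as toy_poly{0, 8} makes the sum depend on x.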

gcc/ChangeLog:

* config/riscv/predicates.md: Adjust runtime invariant.
* config/riscv/riscv-modes.def (MAX_BITSIZE_MODE_ANY_MODE): New.
(NUM_POLY_INT_COEFFS): New.
* config/riscv/riscv-protos.h (riscv_initial_elimination_offset): Adjust
runtime invariant.
* config/riscv/riscv-sr.cc (riscv_remove_unneeded_save_restore_calls):
Adjust runtime invariant.
* config/riscv/riscv.cc (struct riscv_frame_info): Adjust runtime
invariant.
(enum riscv_microarchitecture_type): Ditto.
(riscv_valid_offset_p): Ditto.
(riscv_valid_lo_sum_p): Ditto.
(riscv_address_insns): Ditto.
(riscv_load_store_insns): Ditto.
(riscv_legitimize_move): Ditto.
(riscv_binary_cost): Ditto.
(riscv_rtx_costs): Ditto.
(riscv_output_move): Ditto.
(riscv_extend_comparands): Ditto.
(riscv_flatten_aggregate_field): Ditto.
(riscv_get_arg_info): Ditto.
(riscv_pass_by_reference): Ditto.
(riscv_elf_select_rtx_section): Ditto.
(riscv_stack_align): Ditto.
(riscv_compute_frame_info): Ditto.
(riscv_initial_elimination_offset): Ditto.
(riscv_set_return_address): Ditto.
(riscv_for_each_saved_reg): Ditto.
(riscv_first_stack_step): Ditto.
(riscv_expand_prologue): Ditto.
(riscv_expand_epilogue): Ditto.
(riscv_can_use_return_insn): Ditto.
(riscv_secondary_memory_needed): Ditto.
(riscv_hard_regno_nregs): Ditto.
(riscv_convert_vector_bits): New.
(riscv_option_override): Adjust runtime invariant.
(riscv_promote_function_mode): Ditto.
* config/riscv/riscv.h (POLY_SMALL_OPERAND_P): New.
(BITS_PER_RISCV_VECTOR): New.
(BYTES_PER_RISCV_VECTOR): New.
* config/riscv/riscv.md: Adjust runtime invariant.

LoongArch: Get __tls_get_addr address through got table when disable plt.

Fix bug, ICE with tls gd/ld var with -fno-plt.

gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_call_tls_get_addr):
Get __tls_get_addr address through got table when disable plt.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/tls-gd-noplt.c: New test.

xtensa: Optimize stack pointer updates in function pro/epilogue under certain conditions

This patch enforces the use of the "addmi" machine instruction instead of
addition/subtraction with two source registers for adjusting the stack
pointer, if the adjustment fits into a signed 16-bit value and is also a
multiple of 256.

    /* example */
    void test(void) {
      char buffer[4096];
      __asm__(""::"m"(buffer));
    }

    ;; before
    test:
        movi.n  a9, 1
        slli    a9, a9, 12
        sub     sp, sp, a9
        movi.n  a9, 1
        slli    a9, a9, 12
        add.n   sp, sp, a9
        addi    sp, sp, 0
        ret.n

    ;; after
    test:
        addmi   sp, sp, -0x1000
        addmi   sp, sp, 0x1000
        ret.n
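
The condition described above can be sketched as a small predicate. The
helper name fits_addmi is made up for illustration and this is not the code
in xtensa.cc; it only encodes "fits into a signed 16-bit value and is a
multiple of 256".

```cpp
#include <cassert>

// Returns true when `adjust` satisfies the condition quoted above:
// it fits into a signed 16-bit value and is a multiple of 256.
// Hypothetical helper, not taken from xtensa.cc.
constexpr bool fits_addmi(long adjust) {
  return adjust >= -32768 && adjust <= 32767 && adjust % 256 == 0;
}

// The 4096-byte adjustment from the example qualifies in both directions.
static_assert(fits_addmi(4096), "prologue adjustment");
static_assert(fits_addmi(-4096), "epilogue adjustment");
```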

gcc/ChangeLog:

* config/xtensa/xtensa.cc (xtensa_expand_prologue):
Use an "addmi" machine instruction for updating the stack pointer
rather than addition/subtraction via hard register A9, if the amount
of change satisfies the literal value conditions of that instruction
when the CALL0 ABI is used.
(xtensa_expand_epilogue): Ditto.
And also inhibit the stack pointer addition of constant zero.

Daily bump.

RISC-V/testsuite: Restrict remaining `fmin'/`fmax' tests to hard float

Complement commit 7915f6551343 ("RISC-V/testsuite: constraint some of
tests to hard_float") and also restrict the remaining `fmin'/`fmax'
tests to hard-float test configurations.

gcc/testsuite/
* gcc.target/riscv/fmax-snan.c: Add `dg-require-effective-target
hard_float'.
* gcc.target/riscv/fmaxf-snan.c: Likewise.
* gcc.target/riscv/fmin-snan.c: Likewise.
* gcc.target/riscv/fminf-snan.c: Likewise.

[Committed] PR target/106640: Fix use of XINT in TImode compute_convert_gain.

Thanks to Zdenek Sojka for reporting PR target/106640 where an RTL checking
build reveals a thinko in my recent patch to support TImode shifts/rotates
in STV. My "senior moment" was to inappropriately use XINT where I should
be using INTVAL of XEXP.

2022-08-17 Roger Sayle <roger@nextmovesoftware.com>

gcc/ChangeLog
PR target/106640
* config/i386/i386-features.cc
(timode_scalar_chain::compute_convert_gain): Replace incorrect use
of XINT with INTVAL (XEXP (src, 1)).

c++: Add new std::move test [PR67906]

As discussed in 67906, let's make sure we don't warn about a std::move
when initializing when there's a T(const T&&) ctor.

PR c++/67906

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/Wredundant-move11.C: New test.

Reset root oracle from path_oracle::reset_path.

When we cross a backedge in the path solver, we reset the path
relations and nuke the root oracle.  However, we forget to reset it
for the next path.  This is causing us to miss threads because
subsequent paths will have no root oracle to use.

With this patch we get 201 more threads in the threadfull passes in my
.ii files and 118 more overall (DOM gets fewer because threadfull runs
before).

Normally, I'd recommend this for the GCC 12 branch, but considering
how sensitive other passes are to jump threading, and that there is no
PR associated with this, perhaps we should leave this out.  Up to the
release maintainers of course.

gcc/ChangeLog:

* gimple-range-path.cc
(path_range_query::compute_ranges_in_block): Remove
set_root_oracle call.
(path_range_query::compute_ranges): Pass ranger oracle to
reset_path.
* value-relation.cc (path_oracle::reset_path): Set root oracle.
* value-relation.h (path_oracle::reset_path): Add root oracle
argument.

c++: Extend -Wredundant-move for const-qual objects [PR90428]

In this PR, Jon suggested extending the -Wredundant-move warning
to warn when the user is moving a const object as in:

  struct T { };

  T f(const T& t)
  {
    return std::move(t);
  }

where the std::move is redundant, because T does not have
a T(const T&&) constructor (which is very unlikely).  Even with
the std::move, T(T&&) would not be used because it would mean
losing the const.  Instead, T(const T&) will be called.
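
The overload resolution described above can be observed with a small
stand-alone program. This is only a sketch: the counters are added for
illustration and are not part of the testsuite files listed below.

```cpp
#include <cassert>
#include <utility>

// Records which constructor was selected (illustrative type).
struct T {
  int copies = 0;
  int moves = 0;
  T() = default;
  T(const T& o) : copies(o.copies + 1), moves(o.moves) {}
  T(T&& o) noexcept : copies(o.copies), moves(o.moves + 1) {}
};

// std::move on a const object yields const T&&, which cannot bind to
// T(T&&); overload resolution falls back to T(const T&).
T f(const T& t) {
  return std::move(t);
}
```

Calling f on any T produces an object whose copies counter is 1, which shows
that the std::move did not select the move constructor.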

I had to restructure the function a bit, but it's better now.  This patch
depends on my other recent patches to maybe_warn_pessimizing_move.

PR c++/90428

gcc/cp/ChangeLog:

* typeck.cc (can_do_rvo_p): Rename to ...
(can_elide_copy_prvalue_p): ... this.
(maybe_warn_pessimizing_move): Extend the
-Wredundant-move warning to warn about std::move on a
const-qualified object.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/Wredundant-move1.C: Adjust dg-warning.
* g++.dg/cpp0x/Wredundant-move9.C: Likewise.
* g++.dg/cpp0x/Wredundant-move10.C: New test.

c++: Tweak for -Wpessimizing-move in templates [PR89780]

In my previous patches I've been extending our std::move warnings,
but this tweak actually dials it down a little bit.  As reported in
bug 89780, it's questionable to warn about expressions in templates
that were type-dependent, but aren't anymore because we're instantiating
the template.  As in,

  template <typename T>
  Dest withMove() {
    T x;
    return std::move(x);
  }

  template Dest withMove<Dest>(); // #1
  template Dest withMove<Source>(); // #2

Saying that the std::move is pessimizing for #1 is not incorrect, but
it's not useful, because removing the std::move would then pessimize #2.
So the user can't really win.  At the same time, disabling the warning
just because we're in a template would be going too far; I still want to
warn for

  template <typename>
  Dest withMove() {
    Dest x;
    return std::move(x);
  }

because the std::move therein will be pessimizing for any instantiation.

So I'm using the suppress_warning machinery to that effect.
Problem: I had to add a new group to nowarn_spec_t, otherwise
suppressing the -Wpessimizing-move warning would disable a whole bunch
of other warnings, which we really don't want.

PR c++/89780

gcc/cp/ChangeLog:

* pt.cc (tsubst_copy_and_build) <case CALL_EXPR>: Maybe suppress
-Wpessimizing-move.
* typeck.cc (maybe_warn_pessimizing_move): Don't issue warnings
if they are suppressed.
(check_return_expr): Disable -Wpessimizing-move when returning
a dependent expression.

gcc/ChangeLog:

* diagnostic-spec.cc (nowarn_spec_t::nowarn_spec_t): Handle
OPT_Wpessimizing_move and OPT_Wredundant_move.
* diagnostic-spec.h (nowarn_spec_t): Add NW_REDUNDANT enumerator.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/Wpessimizing-move3.C: Remove dg-warning.
* g++.dg/cpp0x/Wredundant-move2.C: Likewise.

c++: Extend -Wpessimizing-move to other contexts

In my recent patch which enhanced -Wpessimizing-move so that it warns
about class prvalues too I said that I'd like to extend it so that it
warns in more contexts where a std::move can prevent copy elision, such
as:

  T t = std::move(T());
  T t(std::move(T()));
  T t{std::move(T())};
  T t = {std::move(T())};
  void foo (T);
  foo (std::move(T()));

This patch does that by adding two maybe_warn_pessimizing_move calls.
These must happen before we've converted the initializers otherwise the
std::move will be buried in a TARGET_EXPR.
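
To make the cost of such a std::move visible, here is a minimal sketch that
counts move constructions. The counter and the two helper functions are
assumptions added for illustration; the zero-move result relies on copy
elision, which C++17 guarantees for this form of initialization.

```cpp
#include <cassert>
#include <utility>

// Counts move constructions (illustrative type).
struct T {
  static int moves;
  T() = default;
  T(const T&) = default;
  T(T&&) noexcept { ++moves; }
};
int T::moves = 0;

// Initializing directly from a prvalue: with copy elision no move
// constructor runs.
int moves_without_std_move() {
  T::moves = 0;
  T t = T();
  (void)t;
  return T::moves;
}

// Wrapping the prvalue in std::move turns it into an xvalue, so the
// temporary is materialized and then moved from.
int moves_with_std_move() {
  T::moves = 0;
  T t = std::move(T());
  (void)t;
  return T::moves;
}
```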

PR c++/106276

gcc/cp/ChangeLog:

* call.cc (build_over_call): Call maybe_warn_pessimizing_move.
* cp-tree.h (maybe_warn_pessimizing_move): Declare.
* decl.cc (build_aggr_init_full_exprs): Call
maybe_warn_pessimizing_move.
* typeck.cc (maybe_warn_pessimizing_move): Handle TREE_LIST and
CONSTRUCTOR.  Add a bool parameter and use it.  Adjust a diagnostic
message.
(check_return_expr): Adjust the call to maybe_warn_pessimizing_move.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/Wpessimizing-move7.C: Add dg-warning.
* g++.dg/cpp0x/Wpessimizing-move8.C: New test.

fortran: Add -static-libquadmath support [PR46539]

The following patch is a revival of the
https://gcc.gnu.org/legacy-ml/gcc-patches/2014-10/msg00771.html
patch.  While trunk configured against recent glibc and with linker
--as-needed support doesn't really need to link against -lquadmath
anymore, there are still other targets where libquadmath is still in
use.
As has been discussed, making -static-libgfortran imply statically
linking both libgfortran and libquadmath is undesirable because of
the significant licensing differences between the 2 libraries.
Compared to the 2014 patch, this one doesn't handle -lquadmath
addition in the driver, which to me looks incorrect, libgfortran
configure determines where in libgfortran.spec -lquadmath should
be present if at all and with what it should be wrapped, but
analyzes gfortran -### -static-libgfortran stderr and based on
that figures out what gcc/configure.ac determined.

2022-08-17  Francois-Xavier Coudert  <fxcoudert@gcc.gnu.org>
    Jakub Jelinek  <jakub@redhat.com>

PR fortran/46539
gcc/
* common.opt (static-libquadmath): New option.
* gcc.cc (driver_handle_option): Always accept -static-libquadmath.
* config/darwin.h (LINK_SPEC): Handle -static-libquadmath.
gcc/fortran/
* lang.opt (static-libquadmath): New option.
* invoke.texi (-static-libquadmath): Document it.
* options.cc (gfc_handle_option): Error out if -static-libquadmath
is passed but we do not support it.
libgfortran/
* acinclude.m4 (LIBQUADSPEC): From $FC -static-libgfortran -###
output determine -Bstatic/-Bdynamic, -bstatic/-bdynamic,
-aarchive_shared/-adefault linker support or Darwin remapping
of -lgfortran to libgfortran.a%s and use that around or instead
of -lquadmath in LIBQUADSPEC.
* configure: Regenerated.

Co-Authored-By: Francois-Xavier Coudert <fxcoudert@gcc.gnu.org>

Fortran: OpenMP fix declare simd inside modules and absent linear step [PR106566]

gcc/fortran/ChangeLog:

PR fortran/106566
* openmp.cc (gfc_match_omp_clauses): Fix setting linear-step value
to 1 when not specified.
(gfc_match_omp_declare_simd): Accept module procedures.

gcc/testsuite/ChangeLog:

PR fortran/106566
* gfortran.dg/gomp/declare-simd-4.f90: New test.
* gfortran.dg/gomp/declare-simd-5.f90: New test.
* gfortran.dg/gomp/declare-simd-6.f90: New test.

OpenMP requires: Fix diagnostic filename corner case

The issue occurs when there is, e.g., main._omp_fn.0 in two files with
different OpenMP requires clauses. The function entries in the offload
table end up having the same decl tree and, hence, the diagnostic showed
the same filename for both. Solution: Use the .o filename in this case.

Note that the issue does not occur with same-named 'static' functions and
without the fatal error from the requires diagnostic, there would later
be a linker error due to having two 'main'.

gcc/
* lto-cgraph.cc (input_offload_tables): Improve requires diagnostic
when filenames come out identically.

OpenMP: Fix var replacement with 'simd' and linear-step vars [PR106548]

gcc/ChangeLog:

PR middle-end/106548
* omp-low.cc (lower_rec_input_clauses): Use build_outer_var_ref
for 'simd' linear-step values that are variable.

libgomp/ChangeLog:

PR middle-end/106548
* testsuite/libgomp.c/linear-2.c: New test.

libgomp/splay-tree.h: Fix splay_tree_prefix handling

When splay_tree_prefix is defined, the .h file
defines splay_* macros to add the prefix. However,
before those were only unset when additionally
splay_tree_c was defined.
Additionally, for consistency undefine splay_tree_c
also when no splay_tree_prefix is defined - there
is no interdependence either.

libgomp/ChangeLog:

* splay-tree.h: Fix splay_* macro unsetting if
splay_tree_prefix is defined.

OpenMP/C++: Allow classes with static members to be mappable [PR104493]

As this is the last lang-specific user of the omp_mappable_type hook,
the hook is removed, keeping only a generic omp_mappable_type for
incomplete types (or error_node).

PR c++/104493

gcc/c/ChangeLog:

* c-decl.cc (c_decl_attributes, finish_decl): Call omp_mappable_type
instead of removed langhook.
* c-typeck.cc (c_finish_omp_clauses): Likewise.

gcc/cp/ChangeLog:

* cp-objcp-common.h (LANG_HOOKS_OMP_MAPPABLE_TYPE): Remove.
* cp-tree.h (cp_omp_mappable_type, cp_omp_emit_unmappable_type_notes):
Remove.
* decl2.cc (cp_omp_mappable_type_1, cp_omp_mappable_type,
cp_omp_emit_unmappable_type_notes): Remove.
(cplus_decl_attributes): Call omp_mappable_type instead of
removed langhook.
* decl.cc (cp_finish_decl): Likewise; call cxx_incomplete_type_inform
in lieu of cp_omp_emit_unmappable_type_notes.
* semantics.cc (finish_omp_clauses): Likewise.

gcc/ChangeLog:

* gimplify.cc (omp_notice_variable): Call omp_mappable_type
instead of removed langhook.
* omp-general.h (omp_mappable_type): New prototype.
* omp-general.cc (omp_mappable_type): New; moved from ...
* langhooks.cc (lhd_omp_mappable_type): ... here.
* langhooks-def.h (lhd_omp_mappable_type,
LANG_HOOKS_OMP_MAPPABLE_TYPE): Remove.
(LANG_HOOKS_FOR_TYPES_INITIALIZER): Remove the latter.
* langhooks.h (struct lang_hooks_for_types): Remove
omp_mappable_type.

gcc/testsuite/ChangeLog:

* g++.dg/gomp/unmappable-1.C: Remove dg-error; remove dg-note no
longer shown as TYPE_MAIN_DECL is NULL.
* c-c++-common/gomp/map-incomplete-type.c: New test.

Co-authored-by: Chung-Lin Tang <cltang@codesourcery.com>

arm: Define with_float to hard when target name ends with hf

On arm, the --with-float= configure option is used to define include
files search path (among other things).  However, when targeting
arm-linux-gnueabihf, one would expect to automatically default to the
hard-float ABI, but this is not the case. As a consequence, GCC
bootstrap fails on an arm-linux-gnueabihf target if --with-float=hard
is not used.

This patch checks if the target name ends with 'hf' and defines
with_float to hard if not already defined.  This is achieved in
gcc/config.gcc, just before selecting the default CPU depending on the
$with_float value.

2022-08-17  Christophe Lyon  <christophe.lyon@arm.com>

gcc/
* config.gcc (arm): Define with_float to hard if target name ends
with 'hf'.

Refactor back_threader_profitability

The following refactors profitable_path_p in the backward threader,
splitting out parts that can be computed once the exit block is known,
parts that continuously update and that can be checked allowing
for the path to be later identified as FSM with larger limits,
possibly_profitable_path_p, and final checks done when the whole
path is known, profitable_path_p.

I've removed the back_threader_profitability instance from the
back_threader class and instead instantiate it once per path
discovery.  I've kept the size compute non-incremental to simplify
the patch and not worry about unwinding.

There are key changes to previous behavior - namely we apply
the param_max_jump_thread_duplication_stmts early only when
we know the path cannot become an FSM one (multiway + thread through
latch) but make sure to elide the path query when we didn't
yet discover that but are over this limit.  Similarly the
speed limit is now used even when we did not yet discover a
hot BB on the path.  Basically the idea is to only stop path
discovery when we know the path will never become profitable
but avoid the expensive path range query when we know it's
currently not.

I've done a few cleanups, merging functions, on the way.

* tree-ssa-threadbackward.cc
(back_threader_profitability): Split profitable_path_p
into possibly_profitable_path_p and itself, keep state
as new members.
(back_threader::m_profit): Remove.
(back_threader::find_paths): Likewise.
(back_threader::maybe_register_path): Take profitability
instance as parameter.
(back_threader::find_paths_to_names): Likewise.  Use
possibly_profitable_path_p and avoid the path range query
when the path is currently too large.
(back_threader::find_paths): Fold into ...
(back_threader::maybe_thread_block): ... this.
(get_gimple_control_stmt): Remove.
(back_threader_profitability::possibly_profitable_path_p):
Split out from profitable_path_p, do early profitability
checks.
(back_threader_profitability::profitable_path_p): Do final
profitability path after the taken edge has been determined.

Fix bug in emergency cxa pool free

This probably has never actually affected anyone in practice. The normal
ABI implementation just uses malloc and only falls back to the pool on
malloc failure. But if that happens a bunch of times the freelist gets out
of order which violates some of the invariants of the freelist (as well as
the comments that follow the bug). The bug is just a comparison reversal
when traversing the freelist in the case where the pointer being returned
to the pool is after the existing freelist.
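
For readers unfamiliar with the invariant, the following is a minimal sketch
of inserting into a free list that is kept sorted by ascending address. The
names free_entry and insert_sorted are hypothetical, and the sketch leaves
out sizes and the merging of adjacent blocks; it is not the code in
eh_alloc.cc.

```cpp
#include <cassert>
#include <functional>

// Minimal free-list node (hypothetical; a real pool entry also
// carries a size).
struct free_entry {
  free_entry* next;
};

// Insert `e` into a singly linked free list kept sorted by ascending
// address and return the new head.
inline free_entry* insert_sorted(free_entry* head, free_entry* e) {
  std::less<free_entry*> before;  // total order on pointers
  free_entry** link = &head;
  // Advance while the current entry lies before `e`; reversing this
  // comparison is the kind of mistake that breaks the ordering.
  while (*link && before(*link, e))
    link = &(*link)->next;
  e->next = *link;
  *link = e;
  return head;
}
```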

libstdc++-v3/
* libsupc++/eh_alloc.cc (pool::free): Invert comparison.

LoongArch: Provide fmin/fmax RTL pattern

We already had smin/smax RTL patterns using the fmin/fmax instructions. But
for smin/smax, it's unspecified what will happen if either operand is
NaN. So we would generate calls to libc fmin/fmax functions with
-fno-finite-math-only (the default for all optimization levels except
-Ofast).

But the LoongArch fmin/fmax instructions are IEEE-754-2008 conformant, so we
can also use them for the fmin/fmax patterns and avoid the library
function call.
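
The difference in NaN handling can be seen at the source level. The sketch
below contrasts a comparison-based maximum with std::fmax; the helper names
are made up, and naive_max only approximates the freedom an smax-style
operation has, it is not a definition of the RTL semantics.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// A plain comparison-based maximum: the result for NaN inputs depends
// on operand order.
inline double naive_max(double a, double b) { return a > b ? a : b; }

// std::fmax: if exactly one operand is a NaN, the other operand is
// returned.
inline double ieee_max(double a, double b) { return std::fmax(a, b); }
```

With a quiet NaN as one operand, ieee_max returns the numeric operand in
either position, while naive_max returns whichever operand is second.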

gcc/ChangeLog:

* config/loongarch/loongarch.md (fmax<mode>3): New RTL pattern.
(fmin<mode>3): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/fmax-fmin.c: New test.

Abstract interesting ssa-names from GORI.

Provide a routine to pick out the ssa-names from interesting statements.

* gimple-range-fold.cc (gimple_range_ssa_names): New.
* gimple-range-fold.h (gimple_range_ssa_names): New prototype.
* gimple-range-gori.cc (range_def_chain::get_def_chain): Move
code to new routine.

Daily bump.

c++: remove some xfails

These tests are now passing.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wstringop-overflow-4.C: Only xfail for C++98.
* g++.target/i386/bfloat_cpp_typecheck.C: Remove xfail.

c++: Fix pragma suppression of -Wc++20-compat diagnostics [PR106423]

GCC's '#pragma GCC diagnostic' directives are processed in "early mode"
(see handle_pragma_diagnostic_early) for the C++ frontend and, as such,
require that the target diagnostic option be enabled for the preprocessor
(see c_option_is_from_cpp_diagnostics).  This change modifies the
-Wc++20-compat option definition to register it as a preprocessor option
so that its associated diagnostics can be suppressed.  The changes also
implicitly disable the option in C++20 and later modes.  These changes
are consistent with the definition of the -Wc++11-compat option.

This support is motivated by the need to suppress the following diagnostic
otherwise issued in C++17 and earlier modes due to the char8_t typedef
present in the uchar.h header file in glibc 2.36.
  warning: identifier ‘char8_t’ is a keyword in C++20 [-Wc++20-compat]

Tests are added to validate suppression of both -Wc++11-compat and
-Wc++20-compat related diagnostics (fixes were only needed for the C++20
case).

PR c++/106423

gcc/c-family/ChangeLog:
* c-opts.cc (c_common_post_options): Disable -Wc++20-compat
diagnostics in C++20 and later.
* c.opt (Wc++20-compat): Enable hooks for the preprocessor.

gcc/cp/ChangeLog:
* parser.cc (cp_lexer_saving_tokens): Add comment regarding
diagnostic requirements.

gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/keywords2.C: New test.
* g++.dg/cpp2a/keywords2.C: New test.

libcpp/ChangeLog:
* include/cpplib.h (cpp_warning_reason): Add CPP_W_CXX20_COMPAT.
* init.cc (cpp_create_reader): Add cpp_warn_cxx20_compat.

docs: remove link to www.bullfreeware.com from install

As mentioned at https://gcc.gnu.org/PR106637#c2, they discontinued
providing binaries.

PR target/106637

gcc/ChangeLog:

* doc/install.texi: Remove link to www.bullfreeware.com

RISC-V: Support zfh and zfhmin extension

Zfh and Zfhmin are extensions for IEEE half precision; both were ratified
in Jan. 2022 [1]:

- Zfh has a full set of operations, like F or D for single or double precision.
- Zfhmin provides only minimal support for half-precision operations,
  such as conversion, load, store and move instructions.

[1] https://github.com/riscv/riscv-isa-manual/commit/b35a54079e0da11740ce5b1e6db999d1d5172768

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc (riscv_implied_info): Add
zfh and zfhmin.
(riscv_ext_version_table): Ditto.
(riscv_ext_flag_table): Ditto.
* config/riscv/riscv-opts.h (MASK_ZFHMIN): New.
(MASK_ZFH): Ditto.
(TARGET_ZFHMIN): Ditto.
(TARGET_ZFH): Ditto.
* config/riscv/riscv.cc (riscv_output_move): Handle HFmode move
for zfh and zfhmin.
(riscv_emit_float_compare): Handle HFmode.
* config/riscv/riscv.md (ANYF): Add HF.
(SOFTF): Add HF.
(load): Ditto.
(store): Ditto.
(truncsfhf2): New.
(truncdfhf2): Ditto.
(extendhfsf2): Ditto.
(extendhfdf2): Ditto.
(*movhf_hardfloat): Ditto.
(*movhf_softfloat): Make sure not ZFHMIN.
* config/riscv/riscv.opt (riscv_zf_subext): New.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/_Float16-zfh-1.c: New.
* gcc.target/riscv/_Float16-zfh-2.c: Ditto.
* gcc.target/riscv/_Float16-zfh-3.c: Ditto.
* gcc.target/riscv/_Float16-zfhmin-1.c: Ditto.
* gcc.target/riscv/_Float16-zfhmin-2.c: Ditto.
* gcc.target/riscv/_Float16-zfhmin-3.c: Ditto.
* gcc.target/riscv/arch-16.c: Ditto.
* gcc.target/riscv/arch-17.c: Ditto.
* gcc.target/riscv/predef-21.c: Ditto.
* gcc.target/riscv/predef-22.c: Ditto.

RISC-V: Support _Float16 type.

RISC-V has decided to use _Float16 as the primary IEEE half-precision type,
and this has already become part of the psABI. This patch adds the following
support for _Float16:

- Soft-float support for _Float16.
- Make sure _Float16 is available in C++ mode.
- Name mangling for _Float16 in C++ mode.

gcc/ChangeLog

* config/riscv/riscv-builtins.cc: Include stringpool.h.
(riscv_float16_type_node): New.
(riscv_init_builtin_types): Ditto.
(riscv_init_builtins): Call riscv_init_builtin_types.
* config/riscv/riscv-modes.def (HF): New.
* config/riscv/riscv.cc (riscv_output_move): Handle HFmode.
(riscv_mangle_type): New.
(riscv_scalar_mode_supported_p): Ditto.
(riscv_libgcc_floating_mode_supported_p): Ditto.
(riscv_excess_precision): Ditto.
(riscv_floatn_mode): Ditto.
(riscv_init_libfuncs): Ditto.
(TARGET_MANGLE_TYPE): Ditto.
(TARGET_SCALAR_MODE_SUPPORTED_P): Ditto.
(TARGET_LIBGCC_FLOATING_MODE_SUPPORTED_P): Ditto.
(TARGET_INIT_LIBFUNCS): Ditto.
(TARGET_C_EXCESS_PRECISION): Ditto.
(TARGET_FLOATN_MODE): Ditto.
* config/riscv/riscv.md (mode): Add HF.
(softload): Add HF.
(softstore): Ditto.
(fmt): Ditto.
(UNITMODE): Ditto.
(movhf): New.
(*movhf_softfloat): New.

libgcc/ChangeLog:

* config/riscv/sfp-machine.h (_FP_NANFRAC_H): New.
(_FP_NANFRAC_H): Ditto.
(_FP_NANSIGN_H): Ditto.
* config/riscv/t-softfp32 (softfp_extensions): Add HF related
routines.
(softfp_truncations): Ditto.
(softfp_extras): Ditto.
* config/riscv/t-softfp64 (softfp_extras): Add HF related routines.

gcc/testsuite/ChangeLog:

* g++.target/riscv/_Float16.C: New.
* gcc.target/riscv/_Float16-soft-1.c: Ditto.
* gcc.target/riscv/_Float16-soft-2.c: Ditto.
* gcc.target/riscv/_Float16-soft-3.c: Ditto.
* gcc.target/riscv/_Float16-soft-4.c: Ditto.
* gcc.target/riscv/_Float16.c: Ditto.

soft-fp: Update soft-fp from glibc

This patch updates all of soft-fp from glibc. Most changes are copyright
year updates, removal of "Contributed by" lines and license URL updates.
The remaining changes add conversion functions between IEEE half and
32-bit/64-bit integers; those functions are required by RISC-V _Float16
support.

libgcc/ChangeLog:

* soft-fp/fixhfdi.c: New.
* soft-fp/fixhfsi.c: Likewise.
* soft-fp/fixunshfdi.c: Likewise.
* soft-fp/fixunshfsi.c: Likewise.
* soft-fp/floatdihf.c: Likewise.
* soft-fp/floatsihf.c: Likewise.
* soft-fp/floatundihf.c: Likewise.
* soft-fp/floatunsihf.c: Likewise.
* soft-fp/adddf3.c: Updating copyright years, removing "Contributed by"
lines and update URL for license.
* soft-fp/addsf3.c: Likewise.
* soft-fp/addtf3.c: Likewise.
* soft-fp/divdf3.c: Likewise.
* soft-fp/divsf3.c: Likewise.
* soft-fp/divtf3.c: Likewise.
* soft-fp/double.h: Likewise.
* soft-fp/eqdf2.c: Likewise.
* soft-fp/eqhf2.c: Likewise.
* soft-fp/eqsf2.c: Likewise.
* soft-fp/eqtf2.c: Likewise.
* soft-fp/extenddftf2.c: Likewise.
* soft-fp/extended.h: Likewise.
* soft-fp/extendhfdf2.c: Likewise.
* soft-fp/extendhfsf2.c: Likewise.
* soft-fp/extendhftf2.c: Likewise.
* soft-fp/extendhfxf2.c: Likewise.
* soft-fp/extendsfdf2.c: Likewise.
* soft-fp/extendsftf2.c: Likewise.
* soft-fp/extendxftf2.c: Likewise.
* soft-fp/fixdfdi.c: Likewise.
* soft-fp/fixdfsi.c: Likewise.
* soft-fp/fixdfti.c: Likewise.
* soft-fp/fixhfti.c: Likewise.
* soft-fp/fixsfdi.c: Likewise.
* soft-fp/fixsfsi.c: Likewise.
* soft-fp/fixsfti.c: Likewise.
* soft-fp/fixtfdi.c: Likewise.
* soft-fp/fixtfsi.c: Likewise.
* soft-fp/fixtfti.c: Likewise.
* soft-fp/fixunsdfdi.c: Likewise.
* soft-fp/fixunsdfsi.c: Likewise.
* soft-fp/fixunsdfti.c: Likewise.
* soft-fp/fixunshfti.c: Likewise.
* soft-fp/fixunssfdi.c: Likewise.
* soft-fp/fixunssfsi.c: Likewise.
* soft-fp/fixunssfti.c: Likewise.
* soft-fp/fixunstfdi.c: Likewise.
* soft-fp/fixunstfsi.c: Likewise.
* soft-fp/fixunstfti.c: Likewise.
* soft-fp/floatdidf.c: Likewise.
* soft-fp/floatdisf.c: Likewise.
* soft-fp/floatditf.c: Likewise.
* soft-fp/floatsidf.c: Likewise.
* soft-fp/floatsisf.c: Likewise.
* soft-fp/floatsitf.c: Likewise.
* soft-fp/floattidf.c: Likewise.
* soft-fp/floattihf.c: Likewise.
* soft-fp/floattisf.c: Likewise.
* soft-fp/floattitf.c: Likewise.
* soft-fp/floatundidf.c: Likewise.
* soft-fp/floatundisf.c: Likewise.
* soft-fp/floatunditf.c: Likewise.
* soft-fp/floatunsidf.c: Likewise.
* soft-fp/floatunsisf.c: Likewise.
* soft-fp/floatunsitf.c: Likewise.
* soft-fp/floatuntidf.c: Likewise.
* soft-fp/floatuntihf.c: Likewise.
* soft-fp/floatuntisf.c: Likewise.
* soft-fp/floatuntitf.c: Likewise.
* soft-fp/gedf2.c: Likewise.
* soft-fp/gesf2.c: Likewise.
* soft-fp/getf2.c: Likewise.
* soft-fp/half.h: Likewise.
* soft-fp/ledf2.c: Likewise.
* soft-fp/lesf2.c: Likewise.
* soft-fp/letf2.c: Likewise.
* soft-fp/muldf3.c: Likewise.
* soft-fp/mulsf3.c: Likewise.
* soft-fp/multf3.c: Likewise.
* soft-fp/negdf2.c: Likewise.
* soft-fp/negsf2.c: Likewise.
* soft-fp/negtf2.c: Likewise.
* soft-fp/op-1.h: Likewise.
* soft-fp/op-2.h: Likewise.
* soft-fp/op-4.h: Likewise.
* soft-fp/op-8.h: Likewise.
* soft-fp/op-common.h: Likewise.
* soft-fp/quad.h: Likewise.
* soft-fp/single.h: Likewise.
* soft-fp/soft-fp.h: Likewise.
* soft-fp/subdf3.c: Likewise.
* soft-fp/subsf3.c: Likewise.
* soft-fp/subtf3.c: Likewise.
* soft-fp/truncdfhf2.c: Likewise.
* soft-fp/truncdfsf2.c: Likewise.
* soft-fp/truncsfhf2.c: Likewise.
* soft-fp/trunctfdf2.c: Likewise.
* soft-fp/trunctfhf2.c: Likewise.
* soft-fp/trunctfsf2.c: Likewise.
* soft-fp/trunctfxf2.c: Likewise.
* soft-fp/truncxfhf2.c: Likewise.
* soft-fp/unorddf2.c: Likewise.
* soft-fp/unordsf2.c: Likewise.
* soft-fp/unordtf2.c: Likewise.

Stop backwards thread discovery when leaving a loop

The backward threader copier cannot deal with the situation of
copying blocks belonging to different loops and will reject those
paths late. The following uses this to prune path discovery,
saving on compile-time. Note the off-loop block is still considered
as entry edge origin.

* tree-ssa-threadbackward.cc (back_threader::find_paths_to_names):
Do not walk further if we are leaving the current loop.

driver: fix environ corruption after putenv() [PR106624]

The bug appeared after r13-2010-g1270ccda70ca09 "Factor out
jobserver_active_p" slightly changed `putenv()` use from allocating
to non-allocating:

    -xputenv (concat ("MAKEFLAGS=", dup, NULL));
    +xputenv (jinfo.skipped_makeflags.c_str ());

`xputenv()` (and `putenv()`) don't copy strings and only store the
pointer in the `environ` global table. As a result `environ` got
corrupted as soon as the `jinfo.skipped_makeflags` storage was deallocated.

This started causing bootstrap crashes in `execv()` calls:

    xgcc: fatal error: cannot execute '/build/build/./prev-gcc/collect2': execv: Bad address

The change restores memory allocation for `xputenv()` argument.
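
The aliasing behind this bug can be reproduced in isolation. The sketch
below assumes a POSIX `putenv()`; the variable name GR_DEMO_FLAGS and the
helper function are made up for illustration and do not appear in gcc.cc.

```cpp
#include <cassert>
#include <cstring>
#include <stdlib.h>
#include <string>

// putenv() keeps the pointer it is given, so the buffer must outlive
// the environment entry.  GR_DEMO_FLAGS is a made-up variable name.
static char demo_entry[] = "GR_DEMO_FLAGS=first";

// Returns the value seen through getenv() after the caller-owned
// buffer has been modified behind putenv()'s back.
inline std::string value_after_buffer_change() {
  ::putenv(demo_entry);
  // Overwrite the value part in place; the environment sees the change
  // because it aliases demo_entry.
  std::memcpy(demo_entry + std::strlen("GR_DEMO_FLAGS="), "other", 5);
  const char* v = ::getenv("GR_DEMO_FLAGS");
  return v ? v : "";
}
```

Here the buffer has static storage, so the entry stays valid. With a
temporary `std::string` buffer the same aliasing leaves `environ` pointing
at freed memory, which is why the fix passes a heap copy that is never
freed.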

gcc/

PR driver/106624
* gcc.cc (driver::detect_jobserver): Allocate storage for xputenv()
argument using xstrdup().

c++: Implement P2327R1 - De-deprecating volatile compound operations

From what I can see, this has been voted in as a DR and as it means
we warn less often than before in -std={gnu,c}++2{0,3} modes or with
-Wvolatile, I wonder if it shouldn't be backported to affected release
branches as well.

2022-08-16 Jakub Jelinek <jakub@redhat.com>

* typeck.cc (cp_build_modify_expr): Implement
P2327R1 - De-deprecating volatile compound operations. Don't warn
for |=, &= or ^= with volatile lhs.
* expr.cc (mark_use) <case MODIFY_EXPR>: Adjust warning wording,
leave out simple.

* g++.dg/cpp2a/volatile1.C: Adjust for de-deprecation of volatile
compound |=, &= and ^= operations.
* g++.dg/cpp2a/volatile3.C: Likewise.
* g++.dg/cpp2a/volatile5.C: Likewise.

d: Update DIP links in gdc documentation to point at upstream repository

The wiki links probably worked at some point in the distant past, but
now the official location of tracking all D Improvement Proposals is on
the upstream dlang/DIPs GitHub repository.

PR d/106638

gcc/d/ChangeLog:

* gdc.texi: Update DIP links to point at upstream dlang/DIPs
repository.

Rename imports nomenclature in path_range_query to exit_dependencies.

The purpose of this change is to disambiguate the imports name with
its use in GORI.

gcc/ChangeLog:

* gimple-range-path.cc (path_range_query::import_p): Rename to...
(path_range_query::exit_dependency_p): ...this.
(path_range_query::dump): Rename imports to exit dependencies.
(path_range_query::compute_ranges_in_phis): Same.
(path_range_query::compute_ranges_in_block): Same.
(path_range_query::adjust_for_non_null_uses): Same.
(path_range_query::compute_ranges): Same.
(path_range_query::compute_phi_relations): Same.
(path_range_query::add_to_imports): Rename to...
(path_range_query::add_to_exit_dependencies): ...this.
(path_range_query::compute_imports): Rename to...
(path_range_query::compute_exit_dependencies): ...this.
* gimple-range-path.h (class path_range_query): Rename imports to
exit dependencies.

VR: mitigate -Wfinal-dtor-non-final-class clang warnings

Fixes:

gcc/value-range-storage.h:129:40: warning: class with destructor marked 'final' cannot be inherited from [-Wfinal-dtor-non-final-class]
gcc/value-range-storage.h:146:36: warning: class with destructor marked 'final' cannot be inherited from [-Wfinal-dtor-non-final-class]

gcc/ChangeLog:

* value-range-storage.h (class obstack_vrange_allocator): Mark
the class as final.
(class ggc_vrange_allocator): Likewise.
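
A sketch of the pattern behind the warning, with hypothetical class names
standing in for the allocator hierarchy: when a destructor is marked
'final', marking the class itself 'final' states the same restriction
directly.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical base class; not the GCC vrange_allocator.
class allocator_base
{
public:
  virtual ~allocator_base () {}
  virtual std::size_t bytes_for (std::size_t n) const = 0;
};

// The destructor is 'final', so no class could usefully derive from
// this one.  Declaring the class 'final' as well is what the clang
// warning asks for.
class word_allocator final : public allocator_base
{
public:
  ~word_allocator () final {}

  // Returns the number of bytes needed for N long-sized words.
  std::size_t bytes_for (std::size_t n) const final
  {
    assert (n <= static_cast<std::size_t> (-1) / sizeof (long));
    return n * sizeof (long);
  }
};
```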

VR: add more virtual dtors

Add 2 virtual destructors in order to address:

gcc/alloc-pool.h:522:5: warning: destructor called on non-final 'value_range_equiv' that has virtual functions but non-virtual destructor [-Wdelete-non-abstract-non-virtual-dtor]
gcc/ggc.h:166:3: warning: destructor called on non-final 'int_range<1>' that has virtual functions but non-virtual destructor [-Wdelete-non-abstract-non-virtual-dtor]

gcc/ChangeLog:

* value-range-equiv.h (class value_range_equiv): Add virtual
destructor.
* value-range.h: Likewise.
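
For readers unfamiliar with the warning, a minimal sketch of why a class
with virtual functions wants a virtual destructor when objects are
destroyed through a base pointer.  All names here are hypothetical and
do not mirror the value-range classes:

```cpp
#include <cassert>

// Counts how many derived objects have been destroyed.
static int destroyed_derived = 0;

class range_base
{
public:
  // Virtual, so that deleting through a range_base pointer runs the
  // destructor of the most derived class.
  virtual ~range_base () {}
  virtual bool varying_p () const { return false; }
};

class range_with_equiv : public range_base
{
public:
  ~range_with_equiv () override { ++destroyed_derived; }
  bool varying_p () const override { return true; }
};

// Destroys an object through a pointer to its base class.
void destroy (range_base *r)
{
  assert (r != nullptr);
  delete r;
}
```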

middle-end/106630 - avoid ping-pong between extract_muldiv and match.pd

The following avoids ping-pong between the match.pd pattern changing
(sizetype) ((a_9 + 1) * 48) to (sizetype)(a_9 + 1) * 48 and
extract_muldiv performing the reverse transform by restricting the
match.pd pattern to narrowing conversions as the comment indicates.
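
A small sketch of the narrowing case the pattern is now restricted to,
assuming unsigned (wrapping) arithmetic and hypothetical function names:
truncating after the multiplication and multiplying after the truncation
agree modulo 2^16.

```cpp
#include <cassert>
#include <cstdint>

// (T)(x * CST), with T narrower than the type of x.
std::uint16_t narrow_after_mul (std::uint32_t x)
{
  return static_cast<std::uint16_t> (x * 48u);
}

// (T)x * CST: narrow first, then multiply and keep the low 16 bits.
std::uint16_t narrow_before_mul (std::uint32_t x)
{
  std::uint16_t nx = static_cast<std::uint16_t> (x);
  return static_cast<std::uint16_t> (nx * 48u);
}

// Checks that both forms agree for one input.
void check_narrowing (std::uint32_t x)
{
  assert (narrow_after_mul (x) == narrow_before_mul (x));
}
```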

PR middle-end/106630
* match.pd ((T)(x * CST) -> (T)x * CST): Restrict to
narrowing conversions.

* gcc.dg/torture/pr106630.c: New testcase.

VR: add missing override keywords

Address:

gcc/value-range-equiv.h:57:8: warning: 'set_undefined' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
gcc/value-range-equiv.h:58:8: warning: 'set_varying' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]

gcc/ChangeLog:

* value-range-equiv.h (class value_range_equiv):

analyzer: add more final override keywords

gcc/analyzer/ChangeLog:

* region-model.cc: Fix -Winconsistent-missing-override clang
warning.
* region.h: Likewise.

i386: add 'final' and 'override' to scalar_chain

In c3ed9e0d6e96d8697e4bab994f8acbc5506240ee, David added some
"final override" keywords, and since then there are 2 new warnings that
need the same treatment:

gcc/config/i386/i386-features.h:186:8: warning: 'convert_op' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
gcc/config/i386/i386-features.h:199:8: warning: 'convert_op' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]

gcc/ChangeLog:

* config/i386/i386-features.h (class general_scalar_chain): Add
final override for a method.
(class timode_scalar_chain): Likewise.

docs: fix link destination

gcc/fortran/ChangeLog:

* gfortran.texi: Fix link destination to a valid URL.

Adjust max-jump-thread-paths docs

The following fixes spelling and changes edge degree for number of
incoming edges.

* doc/invoke.texi (max-jump-thread-paths): Adjust.

jobserver: fix fifo mode by opening pipe in proper mode

The current jobserver_info relies on non-blocking FDs,
thus open the pipe in such mode.
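
A rough sketch of the idea using POSIX calls directly.  The helper names
and the token byte are hypothetical, and opening the FIFO with O_RDWR is
an assumption that holds on Linux (POSIX leaves it unspecified); this is
not the code in opts-common.cc:

```cpp
#include <cassert>
#include <fcntl.h>      // open, O_RDWR, O_NONBLOCK
#include <string>
#include <sys/stat.h>   // mkfifo
#include <unistd.h>     // read, write, close, unlink, getpid

// Creates a named pipe readable and writable by the owner only.
bool create_fifo (const std::string &path)
{
  return mkfifo (path.c_str (), 0600) == 0;
}

// Opens the FIFO so that later reads return immediately instead of
// blocking when no token is available.
int open_fifo_nonblocking (const std::string &path)
{
  return open (path.c_str (), O_RDWR | O_NONBLOCK);
}

// Tries to take one token.  Returns false when none is available
// (read fails with EAGAIN) or on any other error.
bool try_acquire_token (int fd)
{
  assert (fd >= 0);
  char token;
  return read (fd, &token, 1) == 1;
}

// Puts one token back into the pipe.
bool release_token (int fd)
{
  assert (fd >= 0);
  const char token = '+';
  return write (fd, &token, 1) == 1;
}
```

Without O_NONBLOCK, the read in `try_acquire_token` would wait until
some other process released a token.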

gcc/ChangeLog:

* opts-common.cc (jobserver_info::connect): Open fifo
in non-blocking mode.

rs6000: Adjust mov optabs for opaque modes [PR103353]

As PR103353 shows, we may want to continue to expand built-in
function __builtin_vsx_lxvp, even if we have already emitted
error messages about some missing required conditions. As
shown in that PR, without one explicit mov optab on OOmode
provided, it would call emit_move_insn recursively.

So this patch is to allow the mov pattern to be generated during
expanding phase if compiler has already seen errors.

PR target/103353

gcc/ChangeLog:

* config/rs6000/mma.md (define_expand movoo): Move TARGET_MMA condition
check to preparation statements and add handlings for !TARGET_MMA.
(define_expand movxo): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr103353.c: New test.

vect: Don't allow vect_emulated_vector_p type in vectorizable_call [PR106322]

As PR106322 shows, in some cases for some vector type whose
TYPE_MODE is a scalar integral mode instead of a vector mode,
it's possible to obtain wrong target support information when
querying with the scalar integral mode.  For example, for the
test case in PR106322, on ppc64 32bit vectorizer gets vector
type "vector(2) short unsigned int" for scalar type "short
unsigned int", its mode is SImode instead of V2HImode.  The
target support querying checks umul_highpart optab with SImode
and considers it's supported, then vectorizer further generates
.MULH IFN call for that vector type.  Unfortunately it's wrong
to use SImode support for that vector type multiply highpart
here.

This patch is to teach vectorizable_call analysis not to allow
vect_emulated_vector_p type for both vectype_in and vectype_out
as Richi suggested.

PR tree-optimization/106322

gcc/ChangeLog:

* tree-vect-stmts.cc (vectorizable_call): Don't allow
vect_emulated_vector_p type for both vectype_in and vectype_out.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr106322.c: New test.
* gcc.target/powerpc/pr106322.c: New test.

xtensa: Turn on -fsplit-wide-types-early by default

Since GCC10, the "subreg2" optimization pass was no longer tied to enabling
"subreg1" unless -fsplit-wide-types-early was turned on (PR88233). However
on the Xtensa port, the lack of "subreg2" can degrade the quality of the
output code, especially for those that produce many D[FC]mode pseudos.

This patch turns on -fsplit-wide-types-early by default in order to restore
the previous behavior.

gcc/ChangeLog:

* common/config/xtensa/xtensa-common.cc
(xtensa_option_optimization_table): Add OPT_fsplit_wide_types_early
for OPT_LEVELS_ALL in order to restore pre-GCC10 behavior.

Daily bump.

d: Defer compiling inline definitions until after the module has finished.

This prevents the case where, when generating the methods of a struct
type, we accidentally emit an inline function that references it while
the outer struct itself is still incomplete.

gcc/d/ChangeLog:

* d-tree.h (d_defer_declaration): Declare.
* decl.cc (function_needs_inline_definition_p): Defer checking
DECL_UNINLINABLE and DECL_DECLARED_INLINE_P.
(maybe_build_decl_tree): Call d_defer_declaration instead of
build_decl_tree.
* modules.cc (deferred_inline_declarations): New variable.
(build_module_tree): Set deferred_inline_declarations and handle
declarations pushed to it.
(d_defer_declaration): New function.

d: Fix internal compiler error: Segmentation fault at gimple-expr.cc:88

Because complex types are deprecated in the language, the new way to
expose native complex types is by defining an enum with a basetype of a
library-defined struct that is implicitly treated as-if it is native.
As casts are not implicitly added by the front-end when downcasting from
enum to its underlying type, we must insert an explicit cast during the
code generation pass.

PR d/106623

gcc/d/ChangeLog:

* d-codegen.cc (underlying_complex_expr): New function.
(d_build_call): Handle passing native complex objects as the
library-defined equivalent.
* d-tree.h (underlying_complex_expr): Declare.
* expr.cc (ExprVisitor::visit (DotVarExp *)): Call
underlying_complex_expr instead of build_vconvert.

gcc/testsuite/ChangeLog:

* gdc.dg/torture/pr106623.d: New test.

d: Build internal TypeInfo types when module name is "object"

If for whatever reason the module declaration doesn't exist in the
object file, ensure that the internal definitions for TypeInfo and
TypeInfo_Class are still created, otherwise an ICE could occur later if
they are required for a run-time helper call.

gcc/d/ChangeLog:

* d-compiler.cc (Compiler::onParseModule): Call create_tinfo_types
when module name is object.
* typeinfo.cc (create_tinfo_types): Add guard for multiple
invocations.

d: Field names of anonymous delegates should be same as regular delegate types.

Doesn't change anything in the code generation or ABI, but makes it
consistent with regular delegates as names would match up when
inspecting tree dumps.

gcc/d/ChangeLog:

* d-codegen.cc (build_delegate_cst): Give anonymous delegate field
names same as per ABI spec.

analyzer: fix direction of -Wanalyzer-out-of-bounds note [PR106626]

Fix a read/write typo.

Also, add more test coverage of -Wanalyzer-out-of-bounds to help
establish a baseline for experiments on tweaking the wording of
the warning (PR analyzer/106626).

gcc/analyzer/ChangeLog:
PR analyzer/106626
* region-model.cc (buffer_overread::emit): Fix copy&paste error in
direction of the access in the note.

gcc/testsuite/ChangeLog:
PR analyzer/106626
* gcc.dg/analyzer/out-of-bounds-read-char-arr.c: New test.
* gcc.dg/analyzer/out-of-bounds-read-int-arr.c: New test.
* gcc.dg/analyzer/out-of-bounds-write-char-arr.c: New test.
* gcc.dg/analyzer/out-of-bounds-write-int-arr.c: New test.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

analyzer: better fix for -Wanalyzer-use-of-uninitialized-value [PR106573]

gcc/analyzer/ChangeLog:
PR analyzer/106573
* region-model.cc (region_model::on_call_pre): Use check_call_args
when ensuring that we call get_arg_svalue on all args. Remove
redundant call from handling for stdio builtins.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>

Check for undefined and varying first.

Rearrange order in irange::set to ensure all POLY_INTs map to varying.

PR tree-optimization/106621
gcc/
* value-range.cc (irange::set): Check for POLY_INT_CST early.

gcc/testsuite/
* gcc.dg/pr106621.c

analyzer: fix for ICE in sm-fd.cc [PR106551]

This patch fixes the ICE caused by valid_to_unchecked_state
in sm-fd.cc by exiting early if first argument of any "dup"
functions is invalid.

gcc/analyzer/ChangeLog:
PR analyzer/106551
* sm-fd.cc (check_for_dup): exit early if first
argument is invalid for all dup functions.

gcc/testsuite/ChangeLog:
PR analyzer/106551
* gcc.dg/analyzer/fd-dup-1.c: New testcase.

Signed-off-by: Immad Mir <mirimmad@outlook.com>

Support shifts and rotates by integer constants in TImode STV on x86_64.

This patch adds support for converting 128-bit TImode shifts and rotates
to SSE equivalents using V1TImode during the TImode STV pass.
Previously, only logical shifts by multiples of 8 were handled
(from my patch earlier this month).

As an example of the benefits, the following rotate by 32-bits:

unsigned __int128 a, b;
void rot32() { a = (b >> 32) | (b << 96); }

when compiled on x86_64 with -O2 previously generated:

        movq    b(%rip), %rax
        movq    b+8(%rip), %rdx
        movq    %rax, %rcx
        shrdq   $32, %rdx, %rax
        shrdq   $32, %rcx, %rdx
        movq    %rax, a(%rip)
        movq    %rdx, a+8(%rip)
        ret

with this patch, now generates:

        movdqa  b(%rip), %xmm0
        pshufd  $57, %xmm0, %xmm0
        movaps  %xmm0, a(%rip)
        ret

[which uses a V4SI permutation for those that don't read SSE].
This should help 128-bit cryptography codes, that interleave XORs
with rotations (but that don't use additions or subtractions).

2022-08-15  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* config/i386/i386-features.cc
(timode_scalar_chain::compute_convert_gain): Provide costs for
shifts and rotates.
(timode_scalar_chain::convert_insn): Handle ASHIFTRT, ROTATERT
and ROTATE just like existing ASHIFT and LSHIFTRT cases.
(timode_scalar_to_vector_candidate_p): Handle all shifts and
rotates by integer constants between 0 and 127.

gcc/testsuite/ChangeLog
* gcc.target/i386/sse4_1-stv-9.c: New test case.

Improved gain calculation for COMPARE to 0 or -1 in TImode STV on x86_64.

This patch tweaks timode_scalar_chain::compute_convert_gain to provide
more accurate costs for converting TImode comparisons against zero or
minus 1 to V1TImode equivalents.

2022-08-15 Roger Sayle <roger@nextmovesoftware.com>

gcc/ChangeLog
* config/i386/i386-features.cc
(timode_scalar_chain::compute_convert_gain): Provide gains for
comparisons against 0/-1, including "*testti" patterns.

PR tree-optimization/64992: (B << 2) != 0 is B when B is Boolean.

This patch resolves both PR tree-optimization/64992 and PR
tree-optimization/98956 which are missed optimization enhancement
requests, for which Andrew Pinski already has a proposed solution
(related to a fix for PR tree-optimization/98954).  Yesterday,
I proposed an alternate improved patch for PR98954, which although
superior in most respects, alas didn't address this case [which
doesn't include a BIT_AND_EXPR], hence this follow-up fix.

For many functions, F(B), of a (zero-one) Boolean value B, the
expression F(B) != 0 can often be simplified to just B.  Hence
"(B * 5) != 0" is B, "-B != 0" is B, "bswap(B) != 0" is B,
"(B >>r 3) != 0" is B.  These are all currently optimized by GCC,
with the strange exception of left shifts by a constant (possibly
due to the undefined/implementation defined behaviour when the
shift constant is larger than the first operand's precision).
This patch adds support for this particular case, when the shift
constant is valid.
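
The identity can be checked at the source level with a short sketch.
The function names are hypothetical, and the shift count is required to
be smaller than the precision of the shifted type, matching the "valid
shift constant" condition:

```cpp
#include <cassert>
#include <climits>

// (B << C) != 0, which the new pattern simplifies to B.
bool shifted_nonzero (bool b, unsigned c)
{
  assert (c < sizeof (unsigned) * CHAR_BIT);   // shift count must be valid
  return (static_cast<unsigned> (b) << c) != 0;
}

// (B << C) == 0, which the new pattern simplifies to !B.
bool shifted_zero (bool b, unsigned c)
{
  assert (c < sizeof (unsigned) * CHAR_BIT);
  return (static_cast<unsigned> (b) << c) == 0;
}
```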

2022-08-15  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
PR tree-optimization/64992
PR tree-optimization/98956
* match.pd (ne (lshift @0 @1) 0): Simplify (X << C) != 0 to X
when X is zero_one_valued_p and the shift constant C is valid.
(eq (lshift @0 @1) 0): Likewise, simplify (X << C) == 0 to !X
when X is zero_one_valued_p and the shift constant C is valid.

gcc/testsuite/ChangeLog
PR tree-optimization/64992
* gcc.dg/pr64992.c: New test case.

PR tree-optimization/71343: Optimize (X<<C)&(Y<<C) as (X&Y)<<C.

This patch is the first part of a solution to PR tree-optimization/71343,
a missed-optimization enhancement request where GCC fails to see that
(a<<2)+(b<<2) == a*4+b*4.

This piece is that (X<<C) op (Y<<C) can be simplified to (X op Y) << C,
for many binary operators, including AND, IOR, XOR, and (if overflow
isn't an issue) PLUS and MINUS.  Likewise, the right shifts (both logical
and arithmetic) and bit-wise logical operators can be simplified in a
similar fashion.  These all reduce the number of GIMPLE binary operations
from 3 to 2, by combining/eliminating a shift operation.
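
A source-level sketch of the equivalences, assuming 32-bit unsigned
operands so that PLUS wraps and overflow is not an issue.  The function
names are hypothetical:

```cpp
#include <cassert>
#include <cstdint>

using u32 = std::uint32_t;

// (X << C) & (Y << C): two shifts and one AND.
u32 and_of_shifts (u32 x, u32 y, unsigned c)
{
  assert (c < 32);
  return (x << c) & (y << c);
}

// (X & Y) << C: one AND and one shift.
u32 shift_of_and (u32 x, u32 y, unsigned c)
{
  assert (c < 32);
  return (x & y) << c;
}

// (X << C) + (Y << C), with wrapping unsigned addition.
u32 sum_of_shifts (u32 x, u32 y, unsigned c)
{
  assert (c < 32);
  return (x << c) + (y << c);
}

// (X + Y) << C.
u32 shift_of_sum (u32 x, u32 y, unsigned c)
{
  assert (c < 32);
  return (x + y) << c;
}

// (X >> C) ^ (Y >> C), using logical right shifts.
u32 xor_of_rshifts (u32 x, u32 y, unsigned c)
{
  assert (c < 32);
  return (x >> c) ^ (y >> c);
}

// (X ^ Y) >> C.
u32 rshift_of_xor (u32 x, u32 y, unsigned c)
{
  assert (c < 32);
  return (x ^ y) >> c;
}
```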

2022-08-15  Roger Sayle  <roger@nextmovesoftware.com>
    Richard Biener  <rguenther@suse.de>

gcc/ChangeLog
PR tree-optimization/71343
* match.pd (op (lshift @0 @1) (lshift @2 @1)): Optimize the
expression (X<<C) + (Y<<C) to (X+Y)<<C for multiple operators.
(op (rshift @0 @1) (rshift @2 @1)): Likewise, simplify (X>>C)^(Y>>C)
to (X^Y)>>C for binary logical operators, AND, IOR and XOR.

gcc/testsuite/ChangeLog
PR tree-optimization/71343
* gcc.dg/pr71343-1.c: New test case.

c++: Fix module line no testcase

Not all systems have the same injected headers, leading to line
location table differences that are immaterial to the test. Fix the
regexp more robustly.

gcc/testsuite/
* g++.dg/modules/loc-prune-4.C: Adjust regexp

c++: Extend -Wpessimizing-move for class prvalues [PR106276]

We already have a warning that warns about pessimizing std::move
in a return statement, when it prevents the NRVO:

  T fn()
  {
    T t;
    return std::move (t); // warning \o/
  }

However, the warning doesn't warn when what we are returning is a class
prvalue, that is, when std::move prevents the RVO:

  T fn()
  {
    T t;
    return std::move (T{}); // no warning :-(
  }

This came up recently in GCC:
<https://gcc.gnu.org/pipermail/gcc-patches/2022-July/598177.html>.

This patch fixes that.  I would like to extend the warning further, so
that it warns in more contexts, e.g.:

  T t = std::move(T());

or

  void foo (T);
  foo (std::move(T()));

PR c++/106276

gcc/cp/ChangeLog:

* typeck.cc (can_do_rvo_p): New.
(maybe_warn_pessimizing_move): Warn when moving a temporary object
in a return statement prevents copy elision.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/Wpessimizing-move7.C: New test.

Simplify range_on_path_entry

I've noticed that range_on_path_entry does mighty complicated things
that don't make sense to me and the commentary might just be
out of date. For the sake of it I replaced it with range_on_entry
and statistics show we thread _more_ jumps with that, so better
not do magic there.

* gimple-range-path.cc (range_on_path_entry): Just
call range_on_entry.

i386 PIE: testsuite: cope with default pie on ia32

This patch continues the effort of cleaning up the testsuite for
--enable-default-pie; the focus herein is mostly 32-bit x86.

As much as I tried to avoid it, most of the changes to the testsuite
simply disable PIC/PIE, for reasons I'm going to detail below.

static-cdtor1.C gets new patterns to match PIE output.  Some
avx512fp16 tests change only in register allocation, because of the
register used to hold the GOT base address.  Interrupt tests changed
in this regard as well, but here it also affected register saving and
restoring.

The previous patch modified cet-sjlj tests, mentioning a single regexp
covering PIC and nonPIC got incorrect match counts.  I found out that
adding ?: to parenthesized subpatterns avoids miscounting matches.
Other tests that count certain kinds of insns needed adjustment over
insns in get_pc_thunk, extra loads from the GOT, or extra adds to
compute addresses.  In one case, namely stack-check-12, it is nonPIC
that had extra insns, that PIC gets rid of, or rather, pushing and
popping the PIC register obviates the dummy push and matching pop used
for stack probing in nonpic.

pr95126 tests were supposed to optimize loads into known constants,
but the @GOTOFF addresses prevent that for reasons I have not
investigated, but that would be clearly desirable, so I've XFAILed
these.  pr95852 is another case of missed optimization: sibcalls are
not possible when the PIC register needs to be set up for the call,
which prevents the expected constant propagation to the return block;
I have adjusted the codegen expectations of these tests.

As for tests that disable PIE...  Some are judgment calls, that fail
for similar reasons as tests described above, but I chose not to
adjust their expectations; others are just not possible with PIC, or
not worth the effort of adjusting.

anon[14].C check for no global or comdat symbols, respectively, but
-fPIE outputs get_pc_thunk, as global hidden comdat.
initlist-const1.C wants .rodata and checks for no .data, but PIC
outputs constant data that needs relocations in .data.rel.ro.local.
no-stack-protector-attr-3.C and stackprotectexplicit2.C count
stack_check_fail matches; -fPIE calls stack_check_fail_local instead,
which matches the pattern, but this symbol is also marked as .hidden,
so the match count needs to be adjusted.

pr71694.C checks for no movl, but get_pc_thunk contains one.
pr102892-1.c is a missed optimization, ivopts creates an induction
variable because the array address can't be part of an indexing base
address with PIE, and that ends up stopping a load from being resolved
to a constant as expected.  sibcall-11.c needs @PLT for the call,
which requires the PIC register, which makes sibcalling impossible.
builtin-self.c, in turn, expects no calls, but PIC calls get_pc_thunk.

avx* vector tests that had PIE disabled were affected in that the need
for GOT-based addressing modes changed instruction selection in ways
that deviated from the expectations of the tests.  Ditto other vector
tests: pr100865*, pr101796-1, pr101846, pr101989-broadcast-1, and
pr102021, pr54855-[37], and pr90773-17.

pr15184* tests need a PIC register to access global variables, which
affects register allocation, so the patterns would have to be
adjusted.  pr27971 can't use the expected addressing mode to
dereference the array with PIC, so it ends up selecting an indexed
addressing mode, obviating the expected separate shift insn.

pr70263-2 is another case that implicitly expects a sibcall,
impossible because of the need for the PIC register; without a
sibcall, the expected REG_EQUIV for the reuse of the stack slot of an
incoming argument does not occur.  pr78035 duplicates the final
compare in both then and else blocks with PIE, which deviates from the
expected cmp count.  pr81736-[57] test for no frame pointer, but the
PIC register assignment to a call-saved register forces a frame; the
former ends up not using the PIC register, but it's only optimized out
after committing to a stack frame to preserve it.  pr85620-6 also
expects a tail call in a situation that is impossible on ia32 PIC.

pr85667-6 doesn't expect the movl in get_pc_thunk.  pr93492-5 tests
-mfentry, not available with PIC on ia32.  pr96539 expects a
tail-call, to avoid copying a large-ish struct argument, but the call
requires the PIC register, so no tail-call.  stack-prot-sym.c expects
a nonpic addressing mode.

for  gcc/testsuite/ChangeLog

* g++.dg/abi/anon1.C: Disable pie on ia32.
* g++.dg/abi/anon4.C: Likewise.
* g++.dg/cpp0x/initlist-const1.C: Likewise.
* g++.dg/no-stack-protector-attr-3.C: Likewise.
* g++.dg/stackprotectexplicit2.C: Likewise.
* g++.dg/pr71694.C: Likewise.
* gcc.dg/pr102892-1.c: Likewise.
* gcc.dg/sibcall-11.c: Likewise.
* gcc.dg/torture/builtin-self.c: Likewise.
* gcc.target/i386/avx2-dest-false-dep-for-glc.c: Likewise.
* gcc.target/i386/avx512bf16-cvtsbh2ss-1.c: Likewise.
* gcc.target/i386/avx512f-broadcast-pr87767-1.c: Likewise.
* gcc.target/i386/avx512f-broadcast-pr87767-3.c: Likewise.
* gcc.target/i386/avx512f-broadcast-pr87767-5.c: Likewise.
* gcc.target/i386/avx512f-broadcast-pr87767-7.c: Likewise.
* gcc.target/i386/avx512fp16-broadcast-1.c: Likewise.
* gcc.target/i386/avx512fp16-pr101846.c: Likewise.
* gcc.target/i386/avx512vl-broadcast-pr87767-1.c: Likewise.
* gcc.target/i386/avx512vl-broadcast-pr87767-3.c: Likewise.
* gcc.target/i386/avx512vl-broadcast-pr87767-5.c: Likewise.
* gcc.target/i386/pr100865-2.c: Likewise.
* gcc.target/i386/pr100865-3.c: Likewise.
* gcc.target/i386/pr100865-4a.c: Likewise.
* gcc.target/i386/pr100865-4b.c: Likewise.
* gcc.target/i386/pr100865-5a.c: Likewise.
* gcc.target/i386/pr100865-5b.c: Likewise.
* gcc.target/i386/pr100865-6a.c: Likewise.
* gcc.target/i386/pr100865-6b.c: Likewise.
* gcc.target/i386/pr100865-6c.c: Likewise.
* gcc.target/i386/pr100865-7b.c: Likewise.
* gcc.target/i386/pr101796-1.c: Likewise.
* gcc.target/i386/pr101846-2.c: Likewise.
* gcc.target/i386/pr101989-broadcast-1.c: Likewise.
* gcc.target/i386/pr102021.c: Likewise.
* gcc.target/i386/pr90773-17.c: Likewise.
* gcc.target/i386/pr54855-3.c: Likewise.
* gcc.target/i386/pr54855-7.c: Likewise.
* gcc.target/i386/pr15184-1.c: Likewise.
* gcc.target/i386/pr15184-2.c: Likewise.
* gcc.target/i386/pr27971.c: Likewise.
* gcc.target/i386/pr70263-2.c: Likewise.
* gcc.target/i386/pr78035.c: Likewise.
* gcc.target/i386/pr81736-5.c: Likewise.
* gcc.target/i386/pr81736-7.c: Likewise.
* gcc.target/i386/pr85620-6.c: Likewise.
* gcc.target/i386/pr85667-6.c: Likewise.
* gcc.target/i386/pr93492-5.c: Likewise.
* gcc.target/i386/pr96539.c: Likewise.
PR target/81708 (%gs:my_guard)
* gcc.target/i386/stack-prot-sym.c: Likewise.
* g++.dg/init/static-cdtor1.C: Add alternate patterns for PIC.
* gcc.target/i386/avx512fp16-vcvtsh2si-1a.c: Extend patterns
for PIC/PIE register allocation.
* gcc.target/i386/pr100704-3.c: Likewise.
* gcc.target/i386/avx512fp16-vcvtsh2usi-1a.c: Likewise.
* gcc.target/i386/avx512fp16-vcvttsh2si-1a.c: Likewise.
* gcc.target/i386/avx512fp16-vcvttsh2usi-1a.c: Likewise.
* gcc.target/i386/avx512fp16-vmovsh-1a.c: Likewise.
* gcc.target/i386/interrupt-11.c: Likewise, allowing for
preservation of the PIC register.
* gcc.target/i386/interrupt-12.c: Likewise.
* gcc.target/i386/interrupt-13.c: Likewise.
* gcc.target/i386/interrupt-15.c: Likewise.
* gcc.target/i386/interrupt-16.c: Likewise.
* gcc.target/i386/interrupt-17.c: Likewise.
* gcc.target/i386/interrupt-8.c: Likewise.
* gcc.target/i386/cet-sjlj-6a.c: Combine patterns from
previous change.
* gcc.target/i386/cet-sjlj-6b.c: Likewise.
* gcc.target/i386/pad-10.c: Accept insns in get_pc_thunk.
* gcc.target/i386/pr70321.c: Likewise.
* gcc.target/i386/pr81563.c: Likewise.
* gcc.target/i386/pr84278.c: Likewise.
* gcc.target/i386/pr90773-2.c: Likewise, plus extra loads from
the GOT.
* gcc.target/i386/pr90773-3.c: Likewise.
* gcc.target/i386/pr94913-2.c: Accept additional PIC insns.
* gcc.target/i386/stack-check-17.c: Likewise.
* gcc.target/i386/stack-check-12.c: Do not require dummy stack
probing obviated with PIC.
* gcc.target/i386/pr95126-m32-1.c: Expect missed optimization
with PIC.
* gcc.target/i386/pr95126-m32-2.c: Likewise.
* gcc.target/i386/pr95852-2.c: Accept different optimization
with PIC.
* gcc.target/i386/pr95852-4.c: Likewise.

ifcvt: Fix up noce_convert_multiple_sets [PR106590]

The following testcase is miscompiled on x86_64-linux.
The problem is in the noce_convert_multiple_sets optimization.
We essentially have:
if (g == 1)
  {
    g = 1;
    f = 23;
  }
else
  {
    g = 2;
    f = 20;
  }
and for each insn try to create a conditional move sequence.
There is code to detect overlap with the regs used in the condition
and the destinations, so we actually try to construct:
tmp_g = g == 1 ? 1 : 2;
f = g == 1 ? 23 : 20;
g = tmp_g;
which is fine.  But, we actually try to create two different
conditional move sequences in each case, seq1 with the whole
(eq (reg/v:HI 82 [ g ]) (const_int 1 [0x1]))
condition and seq2 with cc_cmp
(eq (reg:CCZ 17 flags) (const_int 0 [0]))
to rely on the earlier present comparison.  In each case, we
compare the rtx costs and choose the cheaper sequence (seq1 if both
have the same cost).
The problem is that with the skylake tuning,
tmp_g = g == 1 ? 1 : 2;
is actually expanded as
tmp_g = (g == 1) + 1;
in seq1 (which clobbers (reg 17 flags)) and as a cmov in seq2
(which doesn't).  The tuning says both have the same cost, so we
pick seq1.  Next we check sequences for
f = g == 1 ? 23 : 20; and here the seq2 cmov is cheaper, but it
uses (reg 17 flags) which has been clobbered earlier.

The following patch fixes that by detecting if we in the chosen
sequence clobber some register mentioned in cc_cmp or rev_cc_cmp,
and if yes, arranges for only seq1 (i.e. sequences that emit the
comparison itself) to be used after that.

2022-08-15  Jakub Jelinek  <jakub@redhat.com>

PR rtl-optimization/106590
* ifcvt.cc (check_for_cc_cmp_clobbers): New function.
(noce_convert_multiple_sets_1): If SEQ sets or clobbers any regs
mentioned in cc_cmp or rev_cc_cmp, don't consider seq2 for any
further conditional moves.

* gcc.dg/torture/pr106590.c: New test.

x86: Enable __bf16 type for TARGET_SSE2 and above

gcc/ChangeLog:

* config/i386/i386-builtin-types.def (BFLOAT16): New primitive type.
* config/i386/i386-builtins.cc : Support __bf16 type for i386 backend.
(ix86_register_bf16_builtin_type): New function.
(ix86_bf16_type_node): New.
(ix86_bf16_ptr_type_node): Ditto.
(ix86_init_builtin_types): Add ix86_register_bf16_builtin_type function call.
* config/i386/i386-modes.def (FLOAT_MODE): Add BFmode.
(ADJUST_FLOAT_FORMAT): Ditto.
* config/i386/i386.cc (classify_argument): Handle BFmode.
(construct_container): Ditto.
(function_value_32): Return __bf16 by %xmm0.
(function_value_64): Return __bf16 by SSE register.
(ix86_output_ssemov): Handle BFmode.
(ix86_legitimate_constant_p): Disable BFmode constant double.
(ix86_secondary_reload): Require gpr as intermediate register
to store __bf16 from sse register when sse4 is not available.
(ix86_scalar_mode_supported_p): Enable __bf16 under sse2.
(ix86_mangle_type): Add mangling for __bf16 type.
(ix86_invalid_conversion): New function for target hook.
(ix86_invalid_unary_op): Ditto.
(ix86_invalid_binary_op): Ditto.
(TARGET_INVALID_CONVERSION): New define for target hook.
(TARGET_INVALID_UNARY_OP): Ditto.
(TARGET_INVALID_BINARY_OP): Ditto.
* config/i386/i386.h (host_detect_local_cpu): Add BFmode.
* config/i386/i386.md ("mode"): Add BFmode.
(MODE_SIZE): Ditto.
(X87MODEFH): Ditto.
(HFBF): Add new define_mode_iterator.
(*pushhf_rex64): Change for BFmode.
(*push<mode>_rex64): Ditto.
(*pushhf): Ditto.
(*push<mode>): Ditto.
(MODESH): Ditto.
(hfbfconstf): Add new define_mode_attr.
(*mov<mode>_internal): Add BFmode.

gcc/testsuite/ChangeLog:

* g++.target/i386/bfloat_cpp_typecheck.C: New test.
* gcc.target/i386/bfloat16-1.c: Ditto.
* gcc.target/i386/sse2-bfloat16-1.c: Ditto.
* gcc.target/i386/sse2-bfloat16-2.c: Ditto.
* gcc.target/i386/sse2-bfloat16-scalar-typecheck.c: Ditto.

Daily bump.

Move V1TI shift/rotate lowering from expand to pre-reload split on x86_64.

This patch moves the lowering of 128-bit V1TImode shifts and rotations by
constant bit counts to sequences of SSE operations from the RTL expansion
pass to the pre-reload split pass.  Postponing this splitting of shifts
and rotates enables (will enable) the TImode equivalents of these operations/
instructions to be considered as candidates by the (TImode) STV pass.
Technically, this patch changes the existing expanders to continue to
lower shifts by variable amounts, but constant operands become RTL
instructions, specified by define_insn_and_split that are triggered by
x86_pre_reload_split.  The one minor complication is that logical shifts
by multiples of eight, don't get split, but are handled by existing insn
patterns, such as sse2_ashlv1ti3 and sse2_lshrv1ti3.  There should be no
changes in generated code with this patch, which just adjusts the pass
in which transformations get applied.

2022-08-13  Roger Sayle  <roger@nextmovesoftware.com>
    Uroš Bizjak  <ubizjak@gmail.com>

gcc/ChangeLog
* config/i386/predicates.md (const_0_to_255_not_mul_8_operand):
New predicate for values between 0/1 and 255, not multiples of 8.
* config/i386/sse.md (ashlv1ti3): Delay lowering of logical left
shifts by constant bit counts.
(*ashlv1ti3_internal): New define_insn_and_split that lowers
logical left shifts by constant bit counts, that aren't multiples
of 8, before reload.
(lshrv1ti3): Delay lowering of logical right shifts by constant.
(*lshrv1ti3_internal): New define_insn_and_split that lowers
logical right shifts by constant bit counts, that aren't multiples
of 8, before reload.
(ashrv1ti3): Delay lowering of arithmetic right shifts by
constant bit counts.
(*ashrv1ti3_internal): New define_insn_and_split that lowers
arithmetic right shifts by constant bit counts before reload.
(rotlv1ti3): Delay lowering of rotate left by constant.
(*rotlv1ti3_internal): New define_insn_and_split that lowers
rotate left by constant bit counts before reload.
(rotrv1ti3): Delay lowering of rotate right by constant.
(*rotrv1ti3_internal): New define_insn_and_split that lowers
rotate right by constant bit counts before reload.

testsuite: Disable out-of-bounds checker in analyzer/torture/pr93451.c

This patch disables Wanalyzer-out-of-bounds for analyzer/torture/pr93451.c
and makes the test case pass when compiled with -m32.

The emitted warning is a true positive but only occurs if
sizeof (long int) is less than sizeof (double). I've already discussed a
similar case with Dave in the context of pr96764.c and we came to the
conclusion that we just disable the checker in such cases.

Committed under the "obvious fix" rule.

gcc/testsuite/ChangeLog:

* gcc.dg/analyzer/torture/pr93451.c:
Disable Wanalyzer-out-of-bounds.

Daily bump.

[Committed] arm: Document +no options for Cortex-M55 CPU.

This patch documents the following options for Arm Cortex-M55 CPU
under -mcpu= list.

+nomve.fp (disables MVE single precision floating point instructions)
+nomve (disables MVE integer and single precision floating point instructions)
+nodsp (disables dsp, MVE integer and single precision floating point instructions)
+nofp (disables floating point instructions)
Committed as obvious to master.

gcc/ChangeLog:

2022-08-12 Srinath Parvathaneni <srinath.parvathaneni@arm.com>

* doc/invoke.texi (Arm Options): Document -mcpu=cortex-m55 options.

Fix invalid devirtualization when combining final keyword and anonymous types

This patch fixes a wrong-code issue where we incorrectly devirtualize to
__builtin_unreachable.  The problem occurs when anonymous namespaces are
combined with the final keyword used on methods.  We do two optimizations here:
1) when reaching a final method we cut the search for possible new targets
2) if the type is anonymous we detect whether it is ever instantiated by
    looking if its vtable is referred to.
Now this goes wrong when there is an anonymous type with a final method that
is not instantiated while its derived type is.  So if 1 triggers we need
to make 2 look for vtables of all derived types, as done by this patch.
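
The situation described above can be sketched as follows. This is a hypothetical reduction written for illustration, not the contents of g++.dg/tree-ssa/pr101839.C; all type and function names are assumptions.

```cpp
namespace {

struct Base {
  virtual ~Base() = default;
  virtual int value() { return 1; }
};

// Declares a final override, but no object of exactly this type is created.
struct Middle : Base {
  int value() final { return 2; }
};

// The only type that is instantiated; it inherits Middle::value.
struct Leaf : Middle {};

}  // namespace

// A polymorphic call whose only real target is Middle::value.
int call_value(Base *b) { return b->value(); }

int run_example() {
  Leaf leaf;
  return call_value(&leaf);
}
```

Here `Middle::value` is final, so the search for further targets stops at `Middle`, yet `Middle` itself is never instantiated. The call is still reachable through `Leaf`, which is why the instantiation check has to consider derived types.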

Bootstrapped/regtested x86_64-linux

Honza

gcc/ChangeLog:

2022-08-10  Jan Hubicka  <hubicka@ucw.cz>

PR middle-end/106057
* ipa-devirt.cc (type_or_derived_type_possibly_instantiated_p): New
function.
(possible_polymorphic_call_targets): Use it.

gcc/testsuite/ChangeLog:

2022-08-10  Jan Hubicka  <hubicka@ucw.cz>

PR middle-end/106057
* g++.dg/tree-ssa/pr101839.C: New test.

Improve comment for tree_niter_desc.{control,bound,cmp}

Fix typos and explain ERROR_MARK usage.

gcc/ChangeLog:

* tree-ssa-loop.h: Improve comment

phiopt: Remove unnecessary checks from spaceship_replacement [PR106506]

Those 2 checks were just me trying to be extra careful, the
(phires & 1) == phires and variants it is folded to of course make only sense
for the -1/0/1/2 result spaceship, for -1/0/1 one can just use comparisons of
phires.  We only handle floating point spaceship if nans aren't honored, so the
2 case is ignored, and if it is, with Aldy's changes we can simplify the
2 case away from the phi but the (phires & 1) == phires stayed.  It is safe
to treat the phires comparison as phires >= 0 even then.

2022-08-12  Jakub Jelinek  <jakub@redhat.com>

PR tree-optimization/106506
* tree-ssa-phiopt.cc (spaceship_replacement): Don't punt for
is_cast or orig_use_lhs cases if phi_bb has 3 predecessors.

* g++.dg/opt/pr94589-2.C: New test.

tree-optimization/106593 - fix ICE with backward threading

With the last re-org I failed to make sure not to add SSA names
not supported by ranger into m_imports which then triggers an
ICE in range_on_path_entry because range_of_expr returns false.

PR tree-optimization/106593
* tree-ssa-threadbackward.cc (back_threader::find_paths):
If the imports from the conditional do not satisfy
gimple_range_ssa_p don't try to thread anything.

sve: Fix fcmuo combine patterns [PR106524]

There's no encoding for fcmuo with zero. This restricts the combine patterns
from accepting zero registers.

gcc/ChangeLog:

PR target/106524
* config/aarch64/aarch64-sve.md (*fcmuo<mode>_nor_combine,
*fcmuo<mode>_bic_combine): Don't accept comparisons against zero.

gcc/testsuite/ChangeLog:

PR target/106524
* gcc.target/aarch64/sve/pr106524.c: New test.

analyzer: out-of-bounds checker [PR106000]

This patch adds an experimental out-of-bounds checker to the analyzer.

The checker was tested on coreutils, curl, httpd and openssh. It is mostly
accurate but does produce false-positives on yacc-generated files and
sometimes when the analyzer misses an invariant. These cases will be
documented in bugzilla.
Regression-tested on Linux x86-64, further ran the analyzer tests with
the -m32 option.

2022-08-11 Tim Lange <mail@tim-lange.me>

gcc/analyzer/ChangeLog:

PR analyzer/106000
* analyzer.opt: Add Wanalyzer-out-of-bounds.
* region-model.cc (class out_of_bounds): Diagnostics base class
for all out-of-bounds diagnostics.
(class past_the_end): Base class derived from out_of_bounds for
the buffer_overflow and buffer_overread diagnostics.
(class buffer_overflow): Buffer overflow diagnostics.
(class buffer_overread): Buffer overread diagnostics.
(class buffer_underflow): Buffer underflow diagnostics.
(class buffer_underread): Buffer underread diagnostics.
(region_model::check_region_bounds): New function to check region
bounds for out-of-bounds accesses.
(region_model::check_region_access):
Add call to check_region_bounds.
(region_model::get_representative_tree): New function that accepts
a region instead of an svalue.
* region-model.h (class region_model):
Add region_model::check_region_bounds.
* region.cc (region::symbolic_p): New predicate.
(offset_region::get_byte_size_sval): Only return the remaining
byte size on offset_regions.
* region.h: Add region::symbolic_p.
* store.cc (byte_range::intersects_p):
Add new function equivalent to bit_range::intersects_p.
(byte_range::exceeds_p): New function.
(byte_range::falls_short_of_p): New function.
* store.h (struct byte_range): Add byte_range::intersects_p,
byte_range::exceeds_p and byte_range::falls_short_of_p.

gcc/ChangeLog:

PR analyzer/106000
* doc/invoke.texi: Add Wanalyzer-out-of-bounds.

gcc/testsuite/ChangeLog:

PR analyzer/106000
* g++.dg/analyzer/pr100244.C: Disable out-of-bounds warning.
* gcc.dg/analyzer/allocation-size-3.c:
Disable out-of-bounds warning.
* gcc.dg/analyzer/memcpy-2.c: Disable out-of-bounds warning.
* gcc.dg/analyzer/pr101962.c: Add dg-warning.
* gcc.dg/analyzer/pr96764.c: Disable out-of-bounds warning.
* gcc.dg/analyzer/pr97029.c:
Add dummy buffer to prevent an out-of-bounds warning.
* gcc.dg/analyzer/realloc-5.c: Add dg-warning.
* gcc.dg/analyzer/test-setjmp.h:
Add dummy buffer to prevent an out-of-bounds warning.
* gcc.dg/analyzer/zlib-3.c: Add dg-bogus.
* g++.dg/analyzer/out-of-bounds-placement-new.C: New test.
* gcc.dg/analyzer/out-of-bounds-1.c: New test.
* gcc.dg/analyzer/out-of-bounds-2.c: New test.
* gcc.dg/analyzer/out-of-bounds-3.c: New test.
* gcc.dg/analyzer/out-of-bounds-container_of.c: New test.
* gcc.dg/analyzer/out-of-bounds-coreutils.c: New test.
* gcc.dg/analyzer/out-of-bounds-curl.c: New test.

analyzer: consider that realloc could shrink the buffer [PR106539]

This patch adds the "shrinks buffer" case to the success_with_move
modelling of realloc.
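
A minimal sketch of the "shrinks buffer" case, assuming that the number of bytes carried over to the new block is the smaller of the old and new sizes. The helper name `copied_size_sketch` is hypothetical and is not the analyzer's `get_copied_size`.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: bytes that remain valid after a resizing realloc.
std::size_t copied_size_sketch(std::size_t old_size, std::size_t new_size) {
  return std::min(old_size, new_size);
}

// Shrinks an 8-byte buffer to 4 bytes and reports whether the first
// 4 bytes kept their contents.
bool shrink_preserves_prefix() {
  char *buf = static_cast<char *>(std::malloc(8));
  if (!buf)
    return false;
  std::memcpy(buf, "abcdefgh", 8);
  char *smaller = static_cast<char *>(std::realloc(buf, 4));
  if (!smaller) {
    std::free(buf);  // on failure the original block is still owned by us
    return false;
  }
  bool ok = std::memcmp(smaller, "abcd", copied_size_sketch(8, 4)) == 0;
  std::free(smaller);
  return ok;
}
```

Reading `smaller[4]` and beyond after the shrink would be out of bounds, which is the kind of access a model that assumes the old size could miss.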

Regression-tested on Linux x86-64, further ran the analyzer tests with
the -m32 option.

2022-08-11 Tim Lange <mail@tim-lange.me>

gcc/analyzer/ChangeLog:

PR analyzer/106539
* region-model-impl-calls.cc (region_model::impl_call_realloc):
Use the result of get_copied_size as the size for the
sized_regions in realloc.
(success_with_move::get_copied_size): New function.

gcc/testsuite/ChangeLog:

PR analyzer/106539
* gcc.dg/analyzer/pr106539.c: New test.
* gcc.dg/analyzer/realloc-5.c: New test.

[AARCH64] Remove reference to MD_INCLUDES

The comment reference to MD_INCLUDES is not needed
as it has been auto generated for a long time now, even before
the aarch64 target was added.

MD_INCLUDES has been auto generated since r0-64489.
Note some targets still manually set MD_INCLUDES and
I suspect those can be changed but I don't have access
to those targets.

Committed as obvious.

Thanks,
Andrew Pinski

gcc/ChangeLog:

* config/aarch64/aarch64.md: Remove comment
about MD_INCLUDES as it is out of date and not needed.

Daily bump.

testsuite: fd-4.c redefines mode_t on AIX.

AIX stdio.h includes sys/types.h, which defines mode_t. The
analyzer/fd-4.c testcase provides a definition of mode_t for creat()
call, which conflicts with the AIX definition. This patch defines an
AIX macro to prevent multiple-definition of the type.

gcc/testsuite/ChangeLog:

* gcc.dg/analyzer/fd-4.c: Define _MODE_T on AIX.

testcase: Fix AIX testsuite failures

Recent testsuite additions trip over AIX-specific features.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/initlist-const1.C: XFAIL on AIX.

analyzer: fix ICE caused by dup2 in sm-fd.cc [PR106551]

This patch fixes the ICE caused by valid_to_unchecked_state,
at analyzer/sm-fd.cc by handling the m_start state in
check_for_dup.

Tested lightly on x86_64.

gcc/analyzer/ChangeLog:
PR analyzer/106551
* sm-fd.cc (check_for_dup): Handle the m_start
state when transitioning the state of LHS
of dup, dup2 and dup3 call.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/fd-dup-1.c: New testcases.
* gcc.dg/analyzer/fd-uninit-1.c: Remove bogus
warning.
Signed-off-by: Immad Mir <mirimmad@outlook.com>

c-family: Honor -Wno-init-self for cv-qual vars [PR102633]

Since r11-5188-g32934a4f45a721, we drop qualifiers during l-to-r
conversion by creating a NOP_EXPR.  For e.g.

  const int i = i;

that means that the DECL_INITIAL is '(int) i' and not 'i' anymore.
Consequently, we don't suppress_warning here:

711     case DECL_EXPR:
715       if (VAR_P (DECL_EXPR_DECL (*expr_p))
716           && !DECL_EXTERNAL (DECL_EXPR_DECL (*expr_p))
717           && !TREE_STATIC (DECL_EXPR_DECL (*expr_p))
718           && (DECL_INITIAL (DECL_EXPR_DECL (*expr_p)) == DECL_EXPR_DECL (*expr_p))
719           && !warn_init_self)
720         suppress_warning (DECL_EXPR_DECL (*expr_p), OPT_Winit_self);

because of the check on line 718 -- (int) i is not i.  So -Wno-init-self
doesn't disable the warning as it's supposed to.

The following patch fixes it by moving the suppress_warning call from
c_gimplify_expr to the front ends, at points where we haven't created
the NOP_EXPR yet.
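
The failing identity check described above can be modelled with a toy tree. This is only a sketch: `Node`, `Kind` and the helper functions are illustrative names and have nothing to do with GCC's real tree API.

```cpp
// Toy stand-in for GCC trees; names here are illustrative, not GCC's API.
enum class Kind { VarDecl, NopExpr };

struct Node {
  Kind kind;
  const Node *operand;  // operand of a conversion, or nullptr for a declaration
  const Node *initial;  // initializer of a declaration, or nullptr
};

// Mirrors the identity test on line 718: the initializer must be the
// declaration itself for the check to succeed.
bool is_self_init_by_identity(const Node &decl) { return decl.initial == &decl; }

bool demo_plain_self_init() {
  Node decl{Kind::VarDecl, nullptr, nullptr};
  decl.initial = &decl;  // models 'int i = i;'
  return is_self_init_by_identity(decl);
}

bool demo_converted_self_init() {
  Node decl{Kind::VarDecl, nullptr, nullptr};
  Node conv{Kind::NopExpr, &decl, nullptr};  // models '(int) i'
  decl.initial = &conv;                      // models 'const int i = i;'
  return is_self_init_by_identity(decl);
}
```

In the second case the initializer is a wrapper around the declaration rather than the declaration itself, so a pointer comparison no longer recognizes the self-initialization.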

PR middle-end/102633

gcc/c-family/ChangeLog:

* c-gimplify.cc (c_gimplify_expr) <case DECL_EXPR>: Don't call
suppress_warning here.

gcc/c/ChangeLog:

* c-parser.cc (c_parser_initializer): Add new tree parameter.  Use it.
Call suppress_warning.
(c_parser_declaration_or_fndef): Pass d down to c_parser_initializer.
(c_parser_omp_declare_reduction): Pass omp_priv down to
c_parser_initializer.

gcc/cp/ChangeLog:

* decl.cc (cp_finish_decl): Call suppress_warning.

gcc/testsuite/ChangeLog:

* c-c++-common/Winit-self1.c: New test.
* c-c++-common/Winit-self2.c: New test.

Tame path_range_query::compute_imports

This avoids going to BBs outside of the path when adding def chains
to the set of imports. It also syncs the code with
range_def_chain::get_def_chain to not miss out on some imports
this function would identify.

* gimple-range-path.cc (path_range_query::compute_imports):
Restrict walking SSA defs to blocks inside the path. Track
the same operands as range_def_chain::get_def_chain does.

tree-optimization/106514 - revisit m_import compute in backward threading

This revisits how we compute imports later used for the ranger path
query during backwards threading.  The compute_imports function
of the path solver ends up pulling the SSA def chain of regular
stmts without limit and since it starts with just the gori imports
of the path exit it misses some interesting names to translate
during path discovery.  In fact with a still empty path this
compute_imports function does not look like the correct tool.

The following instead implements what it does during the path discovery
and since we add the exit block we seed the initial imports and
interesting names from just the exit conditional.  When we then
process interesting names (aka imports we did not yet see the definition
of) we prune local defs but add their uses in a similar way as
compute_imports would have done.

compute_imports also is lacking in its walking of the def chain
compared to range_def_chain::get_def_chain which for example
handles &_1->x specially through range_op_handler and
gimple_range_operand1, so the code copies this.  A fix for
compute_imports will be done separately, also fixing the unbound
walk there.

The patch also properly unwinds m_imports during the path discovery
backtracking and from a debugging session I have verified the two
sets evolve as expected now while previously behaving slightly erratically.

Fortunately the m_imports set is now also significantly smaller for
the PR69592 testcase (aka PR106514), so there's an overall speedup
when increasing --param max-jump-thread-duplication-stmts as
15 -> 30 -> 60 -> 120: from 1s -> 2s -> 13s -> 27s before to
1s -> 2s -> 4s -> 8s with the patch.

This runs into a latent issue in X which doesn't seem to expect
any PHI nodes with a constant argument on an edge inside the path.
But we now have those as interesting, for example for the ICEing
g++.dg/torture/pr100925.C which just has something like

  if (i)
    x = 1;
  else
    x = 5;
  if (x == 1)
    ...

where we now have the path from if (i) to if (x) and the PHI for x
in the set of imports to consider for resolving x == 1 which IMHO
looks exactly like what we want.  The path_range_query::ssa_range_in_phi
papers over the issue and drops the range to varying instead of
crashing.  I didn't want to mess with this any further in this patch
(but I couldn't resist replacing the loop over PHI args with
PHI_ARG_DEF_FROM_EDGE, so mind the re-indenting).
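
To make the snippet above concrete, the following sketch shows the same control flow as a function, next to a hand-written version of what threading the path from `if (i)` to `if (x == 1)` amounts to. The function names and return values are assumptions chosen for illustration; the compiler performs this on GIMPLE, not on source.

```cpp
// Original shape: the second test re-derives what the first branch decided.
int unthreaded(int i) {
  int x;
  if (i)
    x = 1;
  else
    x = 5;
  if (x == 1)
    return 10;
  return 20;
}

// Hand-threaded equivalent: each arm of 'if (i)' goes straight to the
// outcome that the constant PHI argument for x implies.
int threaded(int i) {
  if (i)
    return 10;
  return 20;
}
```

The constants 1 and 5 reaching the PHI for `x` are exactly the "constant argument on an edge inside the path" case mentioned above.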

PR tree-optimization/106514
* tree-ssa-threadbackward.cc (back_threader::find_paths_to_names):
Compute and unwind both m_imports and interesting on the fly during
path discovery.
(back_threader::find_paths): Compute the original m_imports
from just the SSA uses of the exit conditional.  Drop
handling single_succ_to_potentially_threadable_block.
* gimple-range-path.cc (path_range_query::ssa_range_in_phi): Handle
constant PHI arguments without crashing.  Use PHI_ARG_DEF_FROM_EDGE.

* gcc.dg/tree-ssa/ssa-thread-19.c: Un-XFAIL.
* gcc.dg/tree-ssa/ssa-thread-20.c: New testcase.

testsuite: Fix up pr106243* tests on i686-linux [PR106243]

These 2 tests were FAILing on i686-linux or e.g. with
--target_board=unix/-m32/-mno-sse on x86_64-linux due to
-Wpsabi warnings.

2022-08-11 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/106243
* gcc.dg/pr106243.c: Add -Wno-psabi to dg-options.
* gcc.dg/pr106243-1.c: Likewise.

testsuite: Fix up pr104992* tests on i686-linux [PR104992]

These 2 tests were FAILing on i686-linux or e.g. with
--target_board=unix/-m32/-mno-sse on x86_64-linux due to
-Wpsabi warnings and also because dg-options in the latter
test has been ignored due to missing space, so even -O2
wasn't passed at all.

2022-08-11 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/104992
* gcc.dg/pr104992.c: Add -Wno-psabi to dg-options.
* g++.dg/pr104992-1.C: Likewise. Add space between " and } in
dg-options.

Fix path query compute_imports for external path

The following fixes the use of compute_imports from the backwards
threader which ends up accessing stale m_path from a previous
threading attempt.  The fix is to pass in the path explicitly
(and not the exit), and initializing it with the exit around this
call from the backwards threader.  That unfortunately exposed that
we rely on this broken behavior as the new testcase shows.  The
missed threading can be restored by registering all relations
from conditions on the path during solving, for the testcase the
particular important case is for relations provided by the path
entry conditional.

I've verified that the GORI query for imported ranges on edges
is not restricted this way.

This regresses the new ssa-thread-19.c testcase which is exactly
a case for the other patch re-doing how we compute imports since
this misses imports for defs that are not on the dominating path
from the exit.

That's one of the cases this regresses (it also progresses a few
due to more or the correct relations added).  Overall it
reduces the number of threads from 98649 to 98620 on my set of
cc1files.  I think it's a reasonable intermediate step to find
a stable, less random ground to compare stats to.

* gimple-range-path.h (path_range_query::compute_imports):
Take path as argument, not the exit block.
* gimple-range-path.cc (path_range_query::compute_imports):
Likewise, and adjust, avoiding possibly stale m_path.
(path_range_query::compute_outgoing_relations): Register
relations for all conditionals.
* tree-ssa-threadbackward.cc (back_threader::find_paths):
Adjust.

* gcc.dg/tree-ssa/ssa-thread-18.c: New testcase.
* gcc.dg/tree-ssa/ssa-thread-19.c: Likewise, but XFAILed.

rs6000: Simplify some code with rs6000_builtin_is_supported

In function rs6000_init_builtins, there is an oversight: in
one target debugging hunk guarded by TARGET_DEBUG_BUILTIN we
failed to handle enum bif_enable ENB_CELL.  It's easy to
fix it by adding another if case.  But considering the long
term maintainability, this patch updates it with the existing
function rs6000_builtin_is_supported, which centralizes the
related conditions for different enum bif_enable, we only
need to update that function once some condition needs to
be changed later.  This also simplifies another usage in
function rs6000_expand_builtin.

gcc/ChangeLog:

* config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Fix the
oversight on ENB_CELL by simplifying with rs6000_builtin_is_supported.
(rs6000_expand_builtin): Simplify with rs6000_builtin_is_supported.

rs6000: Remove stale rs6000_global_entry_point_needed_p

r10-631 had renamed rs6000_global_entry_point_needed_p to
rs6000_global_entry_point_prologue_needed_p. This is to
remove the stale function declaration.

gcc/ChangeLog:

* config/rs6000/rs6000-internal.h (rs6000_global_entry_point_needed_p):
Remove function declaration.

Daily bump.

tree-optimization/106513 - fix mistake in bswap symbolic number shifts

This fixes a mistake in typing a local variable in the symbolic
shift routine.

PR tree-optimization/106513
* gimple-ssa-store-merging.cc (do_shift_rotate): Use uint64_t
for head_marker.

* gcc.dg/torture/pr106513.c: New testcase.
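
Why the type of such a local matters can be shown with a small sketch. It assumes, purely for illustration, that the marker of interest sits in the top byte of a 64-bit value; `kHeadMask` and both function names are hypothetical and are not taken from gimple-ssa-store-merging.cc.

```cpp
#include <cstdint>

// Assumed layout for illustration: the head marker is the top byte of a
// 64-bit value.
constexpr std::uint64_t kHeadMask = std::uint64_t{0xff} << 56;

// Keeping the masked value in a 64-bit variable preserves the marker.
std::uint64_t head_marker_wide(std::uint64_t n) { return n & kHeadMask; }

// Storing the same masked value in a 32-bit variable silently drops
// bits 32..63, so the marker always reads as zero.
std::uint32_t head_marker_narrow(std::uint64_t n) {
  return static_cast<std::uint32_t>(n & kHeadMask);
}
```

A later test of the form "is the head marker non-zero" then gives different answers depending only on the width of the variable that held it.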

lto: respect jobserver in parallel WPA streaming

PR lto/106328

gcc/ChangeLog:

* opts-jobserver.h (struct jobserver_info): Add pipefd.
(jobserver_info::connect): New.
(jobserver_info::disconnect): Likewise.
(jobserver_info::get_token): Likewise.
(jobserver_info::return_token): Likewise.
* opts-common.cc: Implement the new functions.

gcc/lto/ChangeLog:

* lto.cc (wait_for_child): Decrement nruns once a process
finishes.
(stream_out_partitions): Use job server if active.
(do_whole_program_analysis): Likewise.

lto: support --jobserver-style=fifo for recent GNU make

gcc/ChangeLog:

* opts-jobserver.h: Add one member.
* opts-common.cc (jobserver_info::jobserver_info): Parse FIFO
format of --jobserver-auth.
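
For context, GNU make passes the jobserver to child processes through MAKEFLAGS either as a pair of file descriptors (`--jobserver-auth=R,W`) or, with the fifo style, as a named pipe (`--jobserver-auth=fifo:PATH`). A minimal sketch of telling the two apart is shown below. The `JobserverAuth` struct and `parse_jobserver_auth` function are hypothetical and are not the `jobserver_info` implementation in opts-common.cc; the sketch takes the last occurrence of the option.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical result type for this sketch.
struct JobserverAuth {
  bool valid = false;
  bool is_fifo = false;
  std::string fifo_path;
  int read_fd = -1;
  int write_fd = -1;
};

// Extracts the last --jobserver-auth= value from a MAKEFLAGS-like string
// and classifies it as either "fifo:PATH" or "R,W".
JobserverAuth parse_jobserver_auth(const std::string &makeflags) {
  JobserverAuth auth;
  const std::string key = "--jobserver-auth=";
  std::size_t pos = makeflags.rfind(key);
  if (pos == std::string::npos)
    return auth;

  std::size_t begin = pos + key.size();
  std::size_t end = makeflags.find(' ', begin);
  std::string value = makeflags.substr(
      begin, end == std::string::npos ? std::string::npos : end - begin);

  const std::string fifo = "fifo:";
  if (value.compare(0, fifo.size(), fifo) == 0) {
    auth.fifo_path = value.substr(fifo.size());
    auth.valid = !auth.fifo_path.empty();
    auth.is_fifo = auth.valid;
    return auth;
  }

  int r = -1, w = -1;
  if (std::sscanf(value.c_str(), "%d,%d", &r, &w) == 2 && r >= 0 && w >= 0) {
    auth.read_fd = r;
    auth.write_fd = w;
    auth.valid = true;
  }
  return auth;
}
```

A caller would open `fifo_path` itself in the fifo case, whereas in the descriptor case it relies on the two descriptors having been inherited from make.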

Factor out jobserver_active_p.

gcc/ChangeLog:

* gcc.cc (driver::detect_jobserver): Remove and move to
jobserver.h.
* lto-wrapper.cc (jobserver_active_p): Likewise.
(run_gcc): Likewise.
* opts-jobserver.h: New file.
* opts-common.cc (jobserver_info::jobserver_info): New function.