Andrew Pinski [Tue, 11 Jul 2023 04:53:24 +0000 (21:53 -0700)]
analyzer: Remove check of unsigned_char in maybe_undo_optimize_bit_field_compare.
The check for the type seems unnecessary and gets in the way sometimes.
Also with a patch I am working on for match.pd, it causes a failure to happen.
Before my patch the IR was:
_1 = BIT_FIELD_REF <s, 8, 16>;
_2 = _1 & 1;
_3 = _2 != 0;
_4 = (int) _3;
__analyzer_eval (_4);
Where _2 was an unsigned char type.
And After my patch we have:
_1 = BIT_FIELD_REF <s, 8, 16>;
_2 = (int) _1;
_3 = _2 & 1;
__analyzer_eval (_3);
But in this case, the BIT_AND_EXPR is in an int type.
OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.
gcc/analyzer/ChangeLog:
* region-model-manager.cc (maybe_undo_optimize_bit_field_compare): Remove
the check for type being unsigned_char_type_node.
Andrew Pinski [Sat, 9 Dec 2023 21:43:23 +0000 (13:43 -0800)]
expr: catch more `a*bool` while expanding [PR 112935]
After r14-1655-g52c92fb3f40050 (and the other commits
which touch zero_one_valued_p), we end up with a with
`bool * a` but where the bool is an SSA name that might not
have non-zero bits set on it (to 0x1) even though it
does the non-zero bits would be 0x1.
The case of coremarks, it is only phiopt4 which adds the new
ssa name and nothing afterwards updates the nonzero bits on it.
This fixes the regression by using gimple_zero_one_valued_p
rather than tree_nonzero_bits to match the cases where the
SSA_NAME didn't have the non-zero bits set.
gimple_zero_one_valued_p handles one level of cast and also
and an `&`.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
PR middle-end/112935
* expr.cc (expand_expr_real_2): Use
gimple_zero_one_valued_p instead of tree_nonzero_bits
to find boolean defined expressions.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
[PATCH] wrong code on m68k with -mlong-jump-table-offsets and -malign-int (PR target/112413)
On m68k the compiler assumes that the PC-relative jump-via-jump-table
instruction and the jump table are adjacent with no padding in between.
When -mlong-jump-table-offsets is combined with -malign-int, a 2-byte
nop may be inserted before the jump table, causing the jump to add the
fetched offset to the wrong PC base and thus jump to the wrong address.
Fixed by referencing the jump table via its label. On the test case
in the PR the object code change is (the moveal at 16 is the nop):
Bootstrapped and tested on m68k-linux-gnu, no regressions.
Note: I don't have commit rights to I would need assistance applying this.
PR target/112413
gcc/
* config/m68k/linux.h (ASM_RETURN_CASE_JUMP): For
TARGET_LONG_JUMP_TABLE_OFFSETS, reference the jump table
via its label.
* config/m68k/m68kelf.h (ASM_RETURN_CASE_JUMP): Likewise.
* config/m68k/netbsd-elf.h (ASM_RETURN_CASE_JUMP): Likewise.
Andre Vieira [Mon, 11 Dec 2023 14:24:41 +0000 (14:24 +0000)]
aarch64: enable mixed-types for aarch64 simdclones
This patch enables the use of mixed-types for simd clones for AArch64, adds
aarch64 as a target_vect_simd_clones and corrects the way the simdlen is chosen
for non-specified simdlen clauses according to the 'Vector Function Application
Binary Interface Specification for AArch64'.
Additionally this patch also restricts combinations of simdlen and
return/argument types that map to vectors larger than 128 bits as we currently
do not have a way to represent these types in a way that is consistent
internally and externally.
gcc/ChangeLog:
* config/aarch64/aarch64.cc (lane_size): New function.
(aarch64_simd_clone_compute_vecsize_and_simdlen): Determine simdlen according to NDS rule
and reject combination of simdlen and types that lead to vectors larger than 128bits.
Patrick Palka [Mon, 11 Dec 2023 14:48:04 +0000 (09:48 -0500)]
c++: alias CTAD and specializations table
A rewritten guide for alias CTAD isn't really a specialization of the
original guide, so we shouldn't register it as such. This avoids an ICE
in the below modules testcase for which we otherwise crash due to the
guide's empty DECL_CONTEXT when walking the specializations table. It
also preemptively avoids the same ICE in modules/concept-6 in C++23 mode
with the inherited CTAD patch.
gcc/cp/ChangeLog:
* pt.cc (alias_ctad_tweaks): Pass use_spec_table=false to
tsubst_decl.
gcc/testsuite/ChangeLog:
* g++.dg/modules/concept-8.h: New test.
* g++.dg/modules/concept-8_a.H: New test.
* g++.dg/modules/concept-8_b.C: New test.
Robin Dapp [Mon, 11 Dec 2023 13:16:04 +0000 (14:16 +0100)]
RISC-V: testsuite: Fix strcmp-run.c test.
This fixes expectations in the strcmp-run test which would sometimes
fail with newlib. The test expects libc strcmp return values and
asserts the vectorized result is similar to those. Therefore hard-code
the expected results instead of relying on a strcmp call.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/builtin/strcmp-run.c: Adjust test
expectation and target selector.
* gcc.target/riscv/rvv/autovec/builtin/strlen-run.c: Adjust
target selector.
* gcc.target/riscv/rvv/autovec/builtin/strncmp-run.c: Ditto.
Tobias Burnus [Mon, 11 Dec 2023 14:19:02 +0000 (15:19 +0100)]
OpenMP: Support acquires/release in 'omp require atomic_default_mem_order'
This is an OpenMP 5.2 feature.
gcc/c/ChangeLog:
* c-parser.cc (c_parser_omp_requires): Handle acquires/release
in atomic_default_mem_order clause.
(c_parser_omp_atomic): Update.
gcc/cp/ChangeLog:
* parser.cc (cp_parser_omp_requires): Handle acquires/release
in atomic_default_mem_order clause.
(cp_parser_omp_atomic): Update.
gcc/fortran/ChangeLog:
* gfortran.h (enum gfc_omp_requires_kind): Add
OMP_REQ_ATOMIC_MEM_ORDER_ACQUIRE and OMP_REQ_ATOMIC_MEM_ORDER_RELEASE.
(gfc_namespace): Add a 7th bit to omp_requires.
* module.cc (enum ab_attribute): Add AB_OMP_REQ_MEM_ORDER_ACQUIRE
and AB_OMP_REQ_MEM_ORDER_RELEASE
(mio_symbol_attribute): Handle it.
* openmp.cc (gfc_omp_requires_add_clause): Update for acquire/release.
(gfc_match_omp_requires): Likewise.
(gfc_match_omp_atomic): Handle them for atomic_default_mem_order.
* parse.cc: Likewise.
gcc/testsuite/ChangeLog:
* c-c++-common/gomp/requires-3.c: Update for now valid code.
* gfortran.dg/gomp/requires-3.f90: Likewise.
* gfortran.dg/gomp/requires-2.f90: Update dg-error.
* gfortran.dg/gomp/requires-5.f90: Likewise.
* c-c++-common/gomp/requires-5.c: New test.
* c-c++-common/gomp/requires-6.c: New test.
* c-c++-common/gomp/requires-7.c: New test.
* c-c++-common/gomp/requires-8.c: New test.
* gfortran.dg/gomp/requires-10.f90: New test.
* gfortran.dg/gomp/requires-11.f90: New test.
Rainer Orth [Mon, 11 Dec 2023 12:38:19 +0000 (13:38 +0100)]
ada: Fix Ada bootstrap on FreeBSD
Ada bootstrap on FreeBSD/amd64 was also broken by the recent warning
changes:
terminals.c: In function 'allocate_pty_desc':
terminals.c:1200:12: error: implicit declaration of function 'openpty'; did you
mean 'openat'? [-Wimplicit-function-declaration]
1200 | status = openpty (&master_fd, &slave_fd, NULL, NULL, NULL);
| ^~~~~~~
| openat
terminals.c: At top level:
terminals.c:1268:9: warning: "TABDLY" redefined
1268 | #define TABDLY 0
| ^~~~~~
In file included from /usr/include/termios.h:38,
from terminals.c:1109:
/usr/include/sys/_termios.h:111:9: note: this is the location of the previous definition
111 | #define TABDLY 0x00000004 /* tab delay mask */
| ^~~~~~
make[7]: *** [../gcc-interface/Makefile:302: terminals.o] Error 1
Fixed by including the necessary header and guarding the fallback
definition of TABDLY.
This allowed a 64-bit-only bootstrap on x86_64-unknown-freebsd14.0 to
complete successfully.
if (HARD_REGISTER_NUM_P (regno)
&& partial_subreg_p (use->mode (), mode))
Assertion failed on partial_subreg_p which is:
inline bool
partial_subreg_p (machine_mode outermode, machine_mode innermode)
{
/* Modes involved in a subreg must be ordered. In particular, we must
always know at compile time whether the subreg is paradoxical. */
poly_int64 outer_prec = GET_MODE_PRECISION (outermode);
poly_int64 inner_prec = GET_MODE_PRECISION (innermode);
gcc_checking_assert (ordered_p (outer_prec, inner_prec)); -----> cause ICE.
return maybe_lt (outer_prec, inner_prec);
}
RISC-V VSETVL PASS is an advanced lazy vsetvl insertion PASS after RA (register allocation).
The rootcause is that we have a pattern (reduction instruction) that includes both VLA (length-agnostic) and VLS (fixed-length) modes.
In the Linux kernel, u64/s64 are [un]signed long long, not [un]signed
long. This means that when the `arm_neon.h' header is used by the
kernel, any use of the `uint64_t' / `in64_t' types needs to be
correctly cast to the correct `__builtin_aarch64_simd_di' /
`__builtin_aarch64_simd_df' types when calling the relevant ACLE
builtins.
This patch adds the necessary fixes to ensure that `vstl1_*' and
`vldap1_*' intrinsics are correctly defined for use by the kernel.
gcc/ChangeLog:
* config/aarch64/arm_neon.h (vldap1_lane_u64): Add
`const' to `__builtin_aarch64_simd_di *' cast.
(vldap1q_lane_u64): Likewise.
(vldap1_lane_s64): Cast __src to `const __builtin_aarch64_simd_di *'.
(vldap1q_lane_s64): Likewise.
(vldap1_lane_f64): Cast __src to `const __builtin_aarch64_simd_df *'.
(vldap1q_lane_f64): Cast __src to `const __builtin_aarch64_simd_df *'.
(vldap1_lane_p64): Add `const' to `__builtin_aarch64_simd_di *' cast.
(vldap1q_lane_p64): Add `const' to `__builtin_aarch64_simd_di *' cast.
(vstl1_lane_u64): remove stray `const'.
(vstl1_lane_s64): Cast __src to `__builtin_aarch64_simd_di *'.
(vstl1q_lane_s64): Likewise.
(vstl1_lane_f64): Cast __src to `const __builtin_aarch64_simd_df *'.
(vstl1q_lane_f64): Likewise.
Robin Dapp [Fri, 8 Dec 2023 11:50:01 +0000 (12:50 +0100)]
RISC-V: Recognize stepped series in expand_vec_perm_const.
We currently try to recognize various forms of stepped (const_vector)
sequence variants in expand_const_vector. Because of complications with
canonicalization and encoding it is easier to identify such patterns
in expand_vec_perm_const_1 already where perm.series_p () is available.
This patch introduces shuffle_series as new permutation pattern and
tries to recognize series like [base0 base1 base1 + step ...]. If such
a series is found the series is expanded by expand_vec_series and a
gather is emitted.
On top the patch fixes the step recognition in expand_const_vector
for stepped series where such a series would end up before.
This fixes several execution failures when running code compiled for a
scalable vector size of 128 on a target with vlen = 256 or higher.
The problem was only noticed there because the encoding for a reversed
[2 2]-element vector ("3 2 1 0") is { [1 2], [0 2], [1 4] }.
Some testcases that failed were:
vect-alias-check-18.c
vect-alias-check-1.F90
pr64365.c
On a 128-bit target, only the first two elements are used. The
third element causing the complications only comes into effect at
vlen = 256.
With this patch the testsuite results are similar with vlen = 128,
vlen = 256 as well as vlen = 512 (apart from the fixed-vlmax tests of
course).
gcc/ChangeLog:
PR target/112853
* config/riscv/riscv-v.cc (expand_const_vector): Fix step
calculation.
(modulo_sel_indices): Also perform modulo for variable-length
constants.
(shuffle_series): Recognize series permutations.
(expand_vec_perm_const_1): Add shuffle_series.
liuhongt [Thu, 9 Nov 2023 08:03:11 +0000 (16:03 +0800)]
Simplify vector ((VCE (a cmp b ? -1 : 0)) < 0) ? c : d to just (VCE ((a cmp b) ? (VCE c) : (VCE d))).
When I'm working on PR112443, I notice there's some misoptimizations:
after we fold _mm{,256}_blendv_epi8/pd/ps into gimple, the backend
fails to combine it back to v{,p}blendv{v,ps,pd} since the pattern is
too complicated, so I think maybe we should hanlde it in the gimple
level.
since _7 is either -1 or 0, the selection of _7 < 0 ? _8 : _9 should
be euqal to _1 ? b : a as long as TYPE_PRECISION of the component type
of the second VEC_COND_EXPR is less equal to the first one.
The patch add a gimple pattern to handle that.
gcc/ChangeLog:
* match.pd (VCE (a cmp b ? -1 : 0) < 0) ? c : d ---> (VCE ((a
cmp b) ? (VCE:c) : (VCE:d))): New gimple simplication.
gcc/testsuite/ChangeLog:
* gcc.target/i386/avx512vl-blendv-3.c: New test.
* gcc.target/i386/blendv-3.c: New test.
* config/riscv/vector.md: Support highest overlap for wv instructions.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/pr112431-39.c: New test.
* gcc.target/riscv/rvv/base/pr112431-40.c: New test.
* gcc.target/riscv/rvv/base/pr112431-41.c: New test.
Jakub Jelinek [Mon, 11 Dec 2023 07:34:15 +0000 (08:34 +0100)]
extend.texi: Mark builtin arguments with @var{...}
In many cases we just specify types for the builtin arguments, in other cases
types and names with @var{name} syntax, and in other case with just name.
Shall we tweak that somehow? If the argument names are unimportant, perhaps
it is fine to leave that out, but shouldn't we always use @var{...} around
the parameter names when specified?
On Fri, Dec 01, 2023 at 10:43:57AM -0700, Sandra Loosemore wrote:
> Yup. The Texinfo manual says: "When using @deftypefn command and
> variations, you should mark parameter names with @var to distinguish these
> from data type names, keywords, and other parts of the literal syntax of the
> programming language."
Here is a patch which does that (but not adding types to where they were
missing, that will be harder to search for).
For example, a poly value [15, 16] is computed by csrr vlen + multiple scalar integer instructions.
However, such compile-time unknown value need to be computed when it is scalable vector, that is !BYTES_PER_RISCV_VECTOR.is_constant (),
since csrr vlenb = [16, 0] when -march=rv64gcv --param=riscv-autovec-preference=fixed-vlmax and we have no chance to compute compile-time POLY value.
Also, we never reach the situation to compute a compile time unknown value when it is FIXED-VLMAX vector. So disable POLY selftest for FIXED-VLMAX.
gcc/ChangeLog:
* config/riscv/riscv-selftests.cc (riscv_run_selftests):
Remove poly self test when FIXED-VLMAX.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/base/poly-selftest-1.c: New test.
Nathaniel Shead [Sat, 28 Oct 2023 05:04:52 +0000 (16:04 +1100)]
c++: Fix noexcept checking for trivial operations [PR96090]
This patch stops eager folding of trivial operations (construction and
assignment) from occurring when checking for noexceptness. This was
previously done in PR c++/53025, but only for copy/move construction,
and the __is_nothrow_xible builtins did not receive the same treatment
when they were added.
To handle `is_nothrow_default_constructible`, the patch also ensures
that when no parameters are passed we do value initialisation instead of
just building the constructor call: in particular, value-initialisation
doesn't necessarily actually invoke the constructor for trivial default
constructors, and so we need to handle this case as well.
This is contrary to the proposed resolution of CWG2820; for now we just
ensure it matches the behaviour of the `noexcept` operator and create
testcases formalising this, and if that issue gets accepted we can
revisit.
PR c++/96090
PR c++/100470
gcc/cp/ChangeLog:
* call.cc (build_over_call): Prevent folding of trivial special
members when checking for noexcept.
* method.cc (constructible_expr): Perform value-initialisation
for empty parameter lists.
(is_nothrow_xible): Treat as noexcept operator.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/noexcept81.C: New test.
* g++.dg/ext/is_nothrow_constructible7.C: New test.
* g++.dg/ext/is_nothrow_constructible8.C: New test.
Nathaniel Shead [Thu, 23 Nov 2023 12:15:19 +0000 (23:15 +1100)]
c++: Clear uninstantiated template friend when instantiating [PR104234]
Otherwise attempting to get the originating module declaration ICEs
because the DECL_CHAIN of an instantiated friend template is no longer
its context.
gcc/testsuite/
PR target/112707
* gcc.target/powerpc/pr112707.h: New.
* gcc.target/powerpc/pr112707-2.c: New.
* gcc.target/powerpc/pr112707-3.c: New.
* gcc.target/powerpc/pr88558-p7.c: Check fctid on ilp32 and
has_arch_ppc64 as it's now guarded by powerpc64.
* gcc.target/powerpc/pr88558-p8.c: Likewise.
* gfortran.dg/nint_p7.f90: Add powerpc64 target requirement as
lround<mode>di2 is now guarded by powerpc64.
Haochen Gui [Mon, 11 Dec 2023 00:40:34 +0000 (08:40 +0800)]
rs6000: Enable lrint<mode>si2 on old archs with stfiwx enabled
The powerpc 32-bit processors (e.g. 5470) supports "fctiw" instruction,
but the instruction can't be generated on such platforms as the insn is
guard by TARGET_POPCNTD. The root cause is SImode in float register is
supported from Power7. Actually implementation of "fctiw" only needs
stfiwx which is supported by the old 32-bit processors. This patch
enables "fctiw" expand for these processors.
Tom Tromey [Sat, 9 Dec 2023 16:19:30 +0000 (09:19 -0700)]
Add some new DW_IDX_* constants
I've reimplemented the .debug_names code in GDB -- it was quite far
from being correct, and the new implementation is much closer to what
is specified by DWARF.
However, the new writer in GDB needs to emit some symbol properties,
so that the reader can be fully functional. This patch adds a few new
DW_IDX_* constants, and tries to document the existing extensions as
well. (My patch series add more documentation of these to the GDB
manual as well.)
include/ChangeLog
2023-12-10 Tom Tromey <tom@tromey.com>
* dwarf2.def (DW_IDX_GNU_internal, DW_IDX_GNU_external): Comment.
(DW_IDX_GNU_main, DW_IDX_GNU_language, DW_IDX_GNU_linkage_name):
New constants.
aarch64: Fix invalid subregs for BE svread/write_za
Multi-register svread_za and svwrite_za are implemented using one
pattern per register count, with the register contents being bitcast
on entry (for writes) or return (for reads). Previously we relied
on subregs for this, with the subreg for reads being handled by
target-independent code. But using subregs isn't correct for many
big-endian cases, where following subreg rules often requires actual
instructions. The semantics are instead supposed to be those of
svreinterpret.
gcc/
PR target/112931
PR target/112933
* config/aarch64/aarch64-protos.h (aarch64_sve_reinterpret): Declare.
* config/aarch64/aarch64.cc (aarch64_sve_reinterpret): New function.
* config/aarch64/aarch64-sve-builtins-sme.cc (svread_za_impl::expand)
(svwrite_za_impl::expand): Use it to cast the SVE register to the
right mode.
VNx16QI (the SVE register byte mode) is the only SVE mode for which
LD1 and LDR result in the same register layout for big-endian. It is
therefore the only mode for which we allow LDR and STR to be used for
big-endian SVE moves.
The SME support sometimes needs to use LDR and STR to save and restore
Z register contents around an SMSTART/SMSTOP SM. It therefore needs
to use VNx16QI regardless of the type of value that is stored in the
Z registers.
gcc/
PR target/112930
* config/aarch64/aarch64.cc (aarch64_sme_mode_switch_regs::add_reg):
Force specific SVE modes for single registers as well as structures.
Big-endian targets need to save Z8-Z15 in the same order as
the registers would appear for D8-D15, because the layout is
mandated by the EH ABI. BE targets therefore use ST1D instead
of the normal STR for those registers (but not for others).
That difference is already tested elsewhere and isn't important
for the SME tests. This patch therefore restricts the affected
tests to LE.
gcc/testsuite/
* gcc.target/aarch64/sme/call_sm_switch_5.c: Restrict tests that
contain Z8-Z23 saves to little-endian.
* gcc.target/aarch64/sme/call_sm_switch_8.c: Likewise.
* gcc.target/aarch64/sme/locally_streaming_1.c: Likewise.
Harald Anlauf [Wed, 6 Dec 2023 19:42:27 +0000 (20:42 +0100)]
Fortran: function returning contiguous class array [PR105543]
gcc/fortran/ChangeLog:
PR fortran/105543
* resolve.cc (resolve_symbol): For a CLASS-valued function having a
RESULT clause, ensure that attr.class_ok is set for its symbol as
well as for its resolved result variable.
gcc/testsuite/ChangeLog:
PR fortran/105543
* gfortran.dg/contiguous_13.f90: New test.
Ken Matsui [Thu, 7 Dec 2023 05:32:58 +0000 (21:32 -0800)]
c++: Accept the use of built-in trait identifiers
This patch accepts the use of built-in trait identifiers when they are
actually not used as traits. Specifically, we check if the subsequent
token is '(' for ordinary built-in traits or is '<' only for the special
__type_pack_element built-in trait. If those identifiers are used
differently, the parser treats them as normal identifiers. This allows
us to accept code like: struct __is_pointer {};.
gcc/cp/ChangeLog:
* parser.cc (cp_lexer_lookup_trait): Rename to ...
(cp_lexer_peek_trait): ... this. Handle a subsequent token for
the corresponding built-in trait.
(cp_lexer_lookup_trait_expr): Rename to ...
(cp_lexer_peek_trait_expr): ... this.
(cp_lexer_lookup_trait_type): Rename to ...
(cp_lexer_peek_trait_type): ... this.
(cp_lexer_next_token_is_decl_specifier_keyword): Call
cp_lexer_peek_trait_type.
(cp_parser_simple_type_specifier): Likewise.
(cp_parser_primary_expression): Call cp_lexer_peek_trait_expr.
Ken Matsui [Thu, 7 Dec 2023 05:32:57 +0000 (21:32 -0800)]
c-family, c++: Look up built-in traits via identifier node
Since RID_MAX soon reaches 255 and all built-in traits are used
approximately once in a C++ translation unit, this patch removes
all RID values for built-in traits and uses the identifier node to
look up the specific trait. Rather than holding traits as keywords,
we set all trait identifiers as cik_trait, which is a new
cp_identifier_kind. As cik_reserved_for_udlit was unused and
cp_identifier_kind is 3 bits, we replaced the unused field with the new
cik_trait. Also, the later patch handles a subsequent token to the
built-in identifier so that we accept the use of non-function-like
built-in trait identifiers.
gcc/c-family/ChangeLog:
* c-common.cc (c_common_reswords): Remove all mappings of
built-in traits.
* c-common.h (enum rid): Remove all RID values for built-in
traits.
gcc/cp/ChangeLog:
* cp-objcp-common.cc (names_builtin_p): Remove all RID value
cases for built-in traits. Check for built-in traits via
the new cik_trait kind.
* cp-tree.h (enum cp_trait_kind): Set its underlying type to
addr_space_t.
(struct cp_trait): New struct to hold trait information.
(cp_traits): New array to hold a mapping to all traits.
(cik_reserved_for_udlit): Rename to ...
(cik_trait): ... this.
(IDENTIFIER_ANY_OP_P): Exclude cik_trait.
(IDENTIFIER_TRAIT_P): New macro to detect cik_trait.
* lex.cc (cp_traits): Define its values, declared in cp-tree.h.
(init_cp_traits): New function to set cik_trait and
IDENTIFIER_CP_INDEX for all built-in trait identifiers.
(cxx_init): Call init_cp_traits function.
* parser.cc (cp_lexer_lookup_trait): New function to look up a
built-in trait by IDENTIFIER_CP_INDEX.
(cp_lexer_lookup_trait_expr): Likewise, look up an
expression-yielding built-in trait.
(cp_lexer_lookup_trait_type): Likewise, look up a type-yielding
built-in trait.
(cp_keyword_starts_decl_specifier_p): Remove all RID value cases
for built-in traits.
(cp_lexer_next_token_is_decl_specifier_keyword): Handle
type-yielding built-in traits.
(cp_parser_primary_expression): Remove all RID value cases for
built-in traits. Handle expression-yielding built-in traits.
(cp_parser_trait): Handle cp_trait instead of enum rid.
(cp_parser_simple_type_specifier): Remove all RID value cases
for built-in traits. Handle type-yielding built-in traits.
Co-authored-by: Patrick Palka <ppalka@redhat.com> Signed-off-by: Ken Matsui <kmatsui@gcc.gnu.org>
Jeff Law [Sun, 10 Dec 2023 17:41:05 +0000 (10:41 -0700)]
[committed] Support uaddv and usubv on the H8
This patch adds uaddv/usubv support on the H8 port to speed up those pesky
builtin-overflow tests. It's a variant of something I'd been running for a
while -- the major change between the old approach I'd been using and this
patch is this version does not expose the CC register until after reload to be
consistent with the rest of the H8 port.
The general approach is to first clear the GPR that's going to hold the
overflow status, perform the arithmetic operation (add/sub), then use addx to
move the overflow indicator (in the C bit) into the GPR holding the overflow
status.
That's a significant improvement over the mess of logicals that's generated by
the generic code.
Handling signed overflow is possible and something I'll probably port to this
scheme at some point. It's a bit more complex because we can't trivially move
the bit from CCR into the right position in a GPR and other quirks of the H8.
This has been regression tested on the H8 without problems. Pushing to the
trunk.
gcc/
* config/h8300/addsub.md (uaddv<mode>4, usubv<mode>4): New expanders.
(uaddv): New define_insn_and_split plus post-reload pattern.
Jeff Law [Sun, 10 Dec 2023 17:29:23 +0000 (10:29 -0700)]
[committed] Provide patterns for signed bitfield extractions on H8
Inspired by Roger's work on the ARC port, this patch provides a
define_and_split pattern to optimize sign extended bitfields starting at
position 0 using an approach that doesn't require shifting.
It then builds on that to provide another define_and_split pattern to support
arbitrary signed bitfield extractions -- it uses a right logical shift to move
the bitfield into position 0, then the specialized pattern above to sign extend
the MSB of the field through the rest of the register.
This is often, but certainly not always, better than a two shift approach. The
code uses the sizes of the sequences to select between the two shift approach
and single shift with extension from an arbitrary location approach.
There's certainly further improvements that could be made here, but I think
we're getting the bulk of the improvements already.
Regression tested on the H8 port without errors. Installing on the trunk.
gcc/
* config/h8300/h8300-protos.h (use_extvsi): Prototype.
* config/h8300/combiner.md: Two new define_insn_and_split patterns
to implement signed bitfield extractions.
* config/h8300/h8300.cc (use_extvsi): New function.
Jeff Law [Sun, 10 Dec 2023 17:05:18 +0000 (10:05 -0700)]
[committed] Fix length computation of single bit bitfield extraction on H8
Various approaches are used to optimize extracting a sign extended single bit
bitfield. The length computation of 10 bytes was conservatively correct, but
inaccurate.
In particular when the bit we want is in the low half word we don't need the
move high half to low half instruction. Account for that in the length
computation.
This was spotted when looking at regressions in the generalized signed bitfield
extraction pattern.
This has been regression tested on the H8 port.
gcc/
* config/h8300/combiner.md (single bit signed bitfield extraction): Fix
length computation when the bit we want is in the low half word.
Jeff Law [Sun, 10 Dec 2023 16:32:55 +0000 (09:32 -0700)]
[committed] Fix length computation for logical shifts on H8
This fixes the length computation for logical shifts on the H8/SX.
The H8/SX has a richer set of logical shifts compared to early parts in the H8
family. It has special 2 byte instructions for shifts by power of two
immediate values as well as a special 4 byte shift by other immediate values.
These were never accounted for (AFIACT) in the length computation for shifts.
Until now that's mostly just affected branch shortening. But an upcoming patch
uses instruction lengths to select between two potential sequences and getting
these lengths wrong will cause it to miss optimization opportunities on the
H8/SX.
gcc
* config/h8300/h8300.cc (compute_a_shift_length): Fix computation
of logical shifts on the H8/SX.
Jakub Jelinek [Sat, 9 Dec 2023 20:41:00 +0000 (21:41 +0100)]
phiopt: Fix ICE with large --param l1-cache-line-size= [PR112887]
This function is never called when param_l1_cache_line_size is 0,
but it uses int and unsigned int variables to hold alignment in
bits, so for large param_l1_cache_line_size it is zero and e.g.
DECL_ALIGN () % param_align_bits can divide by zero.
Looking at the code, the function uses tree_fits_uhwi_p on the trees
before converting them using tree_to_uhwi to int variables, which
looks just wrong, either it would need to punt if it doesn't fit
into those and also check for overflows during the computation,
or use unsigned HOST_WIDE_INT for all of this. That also fixes
the division by zero, as param_l1_cache_line_size maximum is INT_MAX,
that multiplied by 8 will always fit.
2023-12-09 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/112887
* tree-ssa-phiopt.cc (hoist_adjacent_loads): Change type of
param_align, param_align_bits, offset1, offset2, size2 and align1
variables from int or unsigned int to unsigned HOST_WIDE_INT.
Jonathan Wakely [Fri, 8 Dec 2023 13:47:04 +0000 (13:47 +0000)]
libstdc++: Fix resolution of LWG 4016 for std::ranges::to [PR112876]
What I implemented in r14-6199-g45630fbcf7875b does not match what I
proposed for LWG 4016, and it imposes additional, unwanted requirements
on the emplace and insert member functions of the container being
populated.
libstdc++-v3/ChangeLog:
PR libstdc++/112876
* include/std/ranges (ranges::to): Do not try to use an iterator
returned by the container's emplace or insert member functions.
* testsuite/std/ranges/conv/1.cc (Cont4::emplace, Cont4::insert):
Use the iterator parameter. Do not return an iterator.
Jakub Jelinek [Sat, 9 Dec 2023 09:20:05 +0000 (10:20 +0100)]
c++: Don't diagnose ignoring of attributes if all ignored attributes are attribute_ignored_p
There is another thing I wonder about: with -Wno-attributes= we are
supposed to ignore the attributes altogether, but we are actually still
warning about them when we emit these generic warnings about ignoring
all attributes which appertain to this and that (perhaps with some
exceptions we first remove from the attribute chain), like:
void foo () { [[foo::bar]]; }
with -Wattributes -Wno-attributes=foo::bar
Shouldn't we call some helper function in cases like this and warn
not when std_attrs (or how the attribute chain var is called) is non-NULL,
but if it is non-NULL and contains at least one non-attribute_ignored_p
attribute?
I've kept warnings for cases where the C++ standard says explicitly any
attributes aren't ok -
"If an attribute-specifier-seq appertains to a friend declaration, that
declaration shall be a definition."
or
https://eel.is/c++draft/dcl.type.elab#3
or
https://eel.is/c++draft/temp.spec#temp.explicit-3
For some changes I haven't figured out how could I cover it in the
testsuite.
Note, C uses a different strategy, it has c_warn_unused_attributes
function which warns about all the attributes one by one unless they
are ignored (or allowed in certain position).
Though that is just a single diagnostic wording, while C++ FE just warns
that there are some ignored attributes and doesn't name them individually
(except for namespace and using namespace) and uses different wordings in
different spots.
testsuite: Remove gcc.dg/tree-ssa/scev-3.c -4.c and 5.c
These tests were recently xfailed on ilp32 targets though
passing on almost all ilp32 targets (known exceptions: ia32
and some arm subtargets). They've been changed around too
much to remain useful.
Alexandre Oliva [Sat, 9 Dec 2023 00:41:33 +0000 (21:41 -0300)]
strub: skip emutls after strubm errors
The emutls pass requires PROP_ssa, but if the strubm pass (or any
other pre-SSA pass) issues errors, all of the build_ssa_passes are
skipped, so the property is not set, but emutls still attempts to run,
on targets that use it, despite earlier errors, so it hits the
unsatisfied requirement.
Adjust emutls to be skipped in case of earlier errors.
for gcc/ChangeLog
* tree-emutls.cc: Include diagnostic-core.h.
(pass_ipa_lower_emutls::gate): Skip if errors were seen.
Patrick Palka [Fri, 8 Dec 2023 21:57:13 +0000 (16:57 -0500)]
c++: decltype of (non-captured variable) [PR83167]
For decltype((x)) within a lambda where x is not captured, we dubiously
require that the lambda has a capture default, unlike for decltype(x).
But according to [expr.prim.id.unqual]/3 we should just ignore the lambda
in this case. This patch narrowly fixes this issue by disabling the
capture_decltype handling and falling back to the ordinary handling when
the innermost lambda has no capture-default. In fact, we can restrict
the special handling to only by-copy lambdas since that's what
[expr.prim.id.unqual]/3 is concerned with; for by-ref implicit captures
both code paths should give the same result anyway.
During review some other issues were discovered which are documented in
a new FIXME.
PR c++/83167
gcc/cp/ChangeLog:
* semantics.cc (capture_decltype): Inline into its only caller ...
(finish_decltype_type): ... here. Update nearby comment to refer
to recent standard. Add FIXME. Restrict uncaptured variable type
transformation to happen only for lambdas with a by-copy
capture-default.
gcc/testsuite/ChangeLog:
* g++.dg/cpp0x/lambda/lambda-decltype4.C: New test.
David Malcolm [Fri, 8 Dec 2023 20:59:43 +0000 (15:59 -0500)]
analyzer: fix ICE on infoleak with poisoned size
gcc/analyzer/ChangeLog:
* region-model.cc (contains_uninit_p): Only check for
svalues that the infoleak warning can handle.
gcc/testsuite/ChangeLog:
* gcc.dg/plugin/infoleak-uninit-size-1.c: New test.
* gcc.dg/plugin/infoleak-uninit-size-2.c: New test.
* gcc.dg/plugin/plugin.exp: Add the new tests.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
[PR112875][LRA]: Fix an assert in lra elimination code
PR112875 test ran into a wrong assert (gcc_unreachable) in elimination
in a debug insn. The insn seems ok. So I change the assertion.
To be more accurate I made it the same as analogous reload pass code.
Jakub Jelinek [Fri, 8 Dec 2023 19:58:38 +0000 (20:58 +0100)]
c++: Fix parsing [[]][[]];
When working on the previous patch I put [[]] [[]] asm (""); into a
testcase, but was surprised it wasn't parsed.
The problem is that when cp_parser_std_attribute_spec returns NULL, it
can mean 2 different things, one is that the next token(s) are neither
[[ nor alignas (in that case the caller should break from the loop),
or when we parsed something like [[]] - it was valid attribute specifier,
but didn't specify any attributes in it.
The following patch fixes that by using a magic value of void_list_node
for the case where the first tokens are neither [[ nor alignas and so
where cp_parser_std_attribute_spec_seq should stop iterating to differentiate
it from NULL_TREE which stands for some attribute specifier has been parsed,
but it didn't contain any (or any valid) attributes.
2023-12-08 Jakub Jelinek <jakub@redhat.com>
* parser.cc (cp_parser_std_attribute_spec): Return void_list_node
rather than NULL_TREE if token is neither CPP_OPEN_SQUARE nor
RID_ALIGNAS CPP_KEYWORD.
(cp_parser_std_attribute_spec_seq): For attr_spec == void_list_node
break, for attr_spec == NULL_TREE continue.
Jakub Jelinek [Fri, 8 Dec 2023 19:56:48 +0000 (20:56 +0100)]
c++: Unshare folded SAVE_EXPR arguments during cp_fold [PR112727]
The following testcase is miscompiled because two ubsan instrumentations
run into each other.
The first one is the shift instrumentation. Before the C++ FE calls
it, it wraps the 2 shift arguments with cp_save_expr, so that side-effects
in them aren't evaluated multiple times. And, ubsan_instrument_shift
itself uses unshare_expr on any uses of the operands to make sure further
modifications in them don't affect other copies of them (the only not
unshared ones are the one the caller then uses for the actual operation
after the instrumentation, which means there is no tree sharing).
Now, if there are side-effects in the first operand like say function
call, cp_save_expr wraps it into a SAVE_EXPR, and ubsan_instrument_shift
in this mode emits something like
if (..., SAVE_EXPR <foo ()>, SAVE_EXPR <op1> > const)
__ubsan_handle_shift_out_of_bounds (..., SAVE_EXPR <foo ()>, ...);
and caller adds
SAVE_EXPR <foo ()> << SAVE_EXPR <op1>
after it in a COMPOUND_EXPR. So far so good.
If there are no side-effects and cp_save_expr doesn't create SAVE_EXPR,
everything is ok as well because of the unshare_expr.
We have
if (..., SAVE_EXPR <op1> > const)
__ubsan_handle_shift_out_of_bounds (..., ptr->something[i], ...);
and
ptr->something[i] << SAVE_EXPR <op1>
where ptr->something[i] is unshared.
In the testcase below, the !x->s[j] ? 1 : 0 expression is wrapped initially
into a SAVE_EXPR though, and unshare_expr doesn't unshare SAVE_EXPRs nor
anything used in them for obvious reasons, so we end up with:
if (..., SAVE_EXPR <!(bool) VIEW_CONVERT_EXPR<const struct S *>(x)->s[j] ? 1 : 0>, SAVE_EXPR <op1> > const)
__ubsan_handle_shift_out_of_bounds (..., SAVE_EXPR <!(bool) VIEW_CONVERT_EXPR<const struct S *>(x)->s[j] ? 1 : 0>, ...);
and
SAVE_EXPR <!(bool) VIEW_CONVERT_EXPR<const struct S *>(x)->s[j] ? 1 : 0> << SAVE_EXPR <op1>
So far good as well. But later during cp_fold of the SAVE_EXPR we find
out that VIEW_CONVERT_EXPR<const struct S *>(x)->s[j] ? 0 : 1 is actually
invariant (has TREE_READONLY set) and so cp_fold simplifies the above to
if (..., SAVE_EXPR <op1> > const)
__ubsan_handle_shift_out_of_bounds (..., (bool) VIEW_CONVERT_EXPR<const struct S *>(x)->s[j] ? 0 : 1, ...);
and
((bool) VIEW_CONVERT_EXPR<const struct S *>(x)->s[j] ? 0 : 1) << SAVE_EXPR <op1>
with the s[j] ARRAY_REFs and other expressions shared in between the two
uses (and obviously the expression optimized away from the COMPOUND_EXPR in
the if condition.
Then comes another ubsan instrumentation at genericization time,
this time to instrument the ARRAY_REFs with strict bounds checking,
and replaces the s[j] in there with s[.UBSAN_BOUNDS (0B, SAVE_EXPR<j>, 8), SAVE_EXPR<j>]
As the trees are shared, it does that just once though.
And as the if body is gimplified first, the SAVE_EXPR<j> is evaluated inside
of the if body and when it is used again after the if, it uses a potentially
uninitialized value of j.1 (always uninitialized if the shift count isn't
out of bounds).
The following patch fixes that by unshare_expr unsharing the folded argument
of a SAVE_EXPR if we've folded the SAVE_EXPR into an invariant and it is
used more than once.
2023-12-08 Jakub Jelinek <jakub@redhat.com>
PR sanitizer/112727
* cp-gimplify.cc (cp_fold): If SAVE_EXPR has been previously
folded, unshare_expr what is returned.
Patrick Palka [Fri, 8 Dec 2023 18:34:04 +0000 (13:34 -0500)]
c++: guard more against undiagnosed error_mark_node [PR112658]
This adds a sanity check to cp_parser_expression_statement similar to
the one in finish_expr_stmt added by r6-6795-g0fd9d4921f7ba2, which
effectively downgrades accepts-invalid/wrong-code bugs like this one
into ice-on-invalid/ice-on-valid ones.
PR c++/112658
gcc/cp/ChangeLog:
* parser.cc (cp_parser_expression_statement): If the statement
is error_mark_node, make sure we've seen_error().
Patrick Palka [Fri, 8 Dec 2023 18:33:55 +0000 (13:33 -0500)]
c++: undiagnosed error_mark_node from cp_build_c_cast [PR112658]
When cp_build_c_cast commits to an erroneous const_cast, we neglect to
replay errors from build_const_cast_1 which can lead to us incorrectly
accepting (and "miscompiling") the cast, or triggering the assert in
finish_expr_stmt.
This patch fixes this oversight. This was the original fix for the ICE
in PR112658 before r14-5941-g305a2686c99bf9 made us accept the testcase
there after all. I wasn't able to come up with an alternate testcase for
which this fix has an effect anymore, but below is a reduced version of
the PR112658 testcase (accepted ever since r14-5941) for good measure.
PR c++/112658
PR c++/94264
gcc/cp/ChangeLog:
* typeck.cc (cp_build_c_cast): If we're committed to a const_cast
and the result is erroneous, call build_const_cast_1 a second
time to issue errors. Use complain=tf_none instead of =false.
Robin Dapp [Fri, 1 Dec 2023 09:07:23 +0000 (10:07 +0100)]
RISC-V: Add vectorized strcmp and strncmp.
This patch adds vectorized strcmp and strncmp implementations and
tests. Similar to strlen, expansion is still guarded by
-minline-str(n)cmp.
gcc/ChangeLog:
PR target/112109
* config/riscv/riscv-protos.h (expand_strcmp): Declare.
* config/riscv/riscv-string.cc (riscv_expand_strcmp): Add
strategy handling and delegation to scalar and vector expanders.
(expand_strcmp): Vectorized implementation.
* config/riscv/riscv.md: Add TARGET_VECTOR to strcmp and strncmp
expander.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/builtin/strcmp-run.c: New test.
* gcc.target/riscv/rvv/autovec/builtin/strcmp.c: New test.
* gcc.target/riscv/rvv/autovec/builtin/strncmp-run.c: New test.
* gcc.target/riscv/rvv/autovec/builtin/strncmp.c: New test.
Robin Dapp [Fri, 1 Dec 2023 08:57:15 +0000 (09:57 +0100)]
RISC-V: Add vectorized strlen.
This patch implements a vectorized strlen by re-using and slightly
adjusting the rawmemchr implementation. Rawmemchr returns the address
of the needle while strlen returns the difference between needle address
and start address.
As before, strlen expansion is guarded by -minline-strlen.
While testing with -minline-strlen I encountered a vsetvl problem in
memcpy-chk.c where we didn't insert a vsetvl at the proper spot (after
a setjmp). This needs to be fixed separately and I figured I'd post
this patch as-is.
early-ra's likely_operand_match_p didn't handle relaxed and special
memory constraints, which meant that the pass wasn't able to match
LD1RQ instructions to their constraints, and so backed out of
trying to allocate. This patch fixes that by switching the sense
of the match: does the rtx seem appropriate for the constraint?,
rather than: does the constraint seem appropriate for the rtx?
Also, I came across a case that needed more general equivalence
detection. Previously we would only record equivalences after
the last definition of the source register, but it's worth trying
to handle cases where the destination register's live range is
restricted to a block, and the next definition of the source
occurs only after the end of the destination register's live range.
The patch also fixes a cut-&-pasto that Alex noticed (thanks).
gcc/
* config/aarch64/aarch64-early-ra.cc (allocno_info::chain_next):
Put into an enum with...
(allocno_info::last_def_point): ...new member variable.
(allocno_info::m_current_bb_point): New member variable.
(likely_operand_match_p): Switch based on get_constraint_type,
rather than based on rtx code. Handle relaxed and special memory
constraints.
(early_ra::record_copy): Allow the source of an equivalence to be
assigned to more than once.
(early_ra::record_allocno_use): Invalidate any previous equivalence.
Initialize last_def_point.
(early_ra::record_allocno_def): Set last_def_point.
(early_ra::valid_equivalence_p): New function, split out from...
(early_ra::record_copy): ...here. Use last_def_point to handle
source registers that have a later definition.
(make_pass_aarch64_early_ra): Fix comment.
gcc/testsuite/
* gcc.target/aarch64/sme/strided_2.c: New test.
Tobias Burnus [Fri, 8 Dec 2023 14:18:25 +0000 (15:18 +0100)]
OpenMP/Fortran: Implement omp allocators/allocate for ptr/allocatables
This commit adds -fopenmp-allocators which enables support for
'omp allocators' and 'omp allocate' that are associated with a Fortran
allocate-stmt. If such a construct is encountered, an error is shown,
unless the -fopenmp-allocators flag is present.
With -fopenmp -fopenmp-allocators, those constructs get turned into
GOMP_alloc allocations, while -fopenmp-allocators (also without -fopenmp)
ensures deallocation and reallocation (via intrinsic assignments) are
properly directed to GOMP_free/omp_realloc - while normal Fortran
allocations are processed by free/realloc.
In order to distinguish a 'malloc'ed from a 'GOMP_alloc'ed memory, the
version field of the Fortran array discriptor is (mis)used: 0 indicates
the normal Fortran allocation while 1 denotes GOMP_alloc. For scalars,
there is record keeping in libgomp: GOMP_add_alloc(ptr) will add the
pointer address to a splay_tree while GOMP_is_alloc(ptr) will return
true it was previously added but also removes it from the list.
Besides Fortran FE work, BUILT_IN_GOMP_REALLOC is no part of
omp-builtins.def and libgomp gains the mentioned two new function.
Szabolcs Nagy [Fri, 29 Sep 2023 12:55:51 +0000 (13:55 +0100)]
libgcc: aarch64: Add SME unwinder support
To support the ZA lazy save scheme, the PCS requires the unwinder to
reset the SME state to PSTATE.SM=0, PSTATE.ZA=0, TPIDR2_EL0=0 on entry
to an exception handler. We use the __arm_za_disable SME runtime call
unconditionally to achieve this.
https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#exceptions
The hidden alias is used to avoid a PLT and avoid inconsistent VPCS
marking (we don't rely on special PCS at the call site). In case of
static linking the SME runtime init code is linked in code that raises
exceptions.
libgcc/ChangeLog:
* config/aarch64/__arm_za_disable.S: Add hidden alias.
* config/aarch64/aarch64-unwind.h: Reset the SME state before
EH return via the _Unwind_Frames_Extra hook.
Szabolcs Nagy [Tue, 15 Nov 2022 14:08:55 +0000 (14:08 +0000)]
libgcc: aarch64: Add SME runtime support
The call ABI for SME (Scalable Matrix Extension) requires a number of
helper routines which are added to libgcc so they are tied to the
compiler version instead of the libc version. See
https://github.com/ARM-software/abi-aa/blob/main/aapcs64/aapcs64.rst#sme-support-routines
The routines are in shared libgcc and static libgcc eh, even though
they are not related to exception handling. This is to avoid linking
a copy of the routines into dynamic linked binaries, because TPIDR2_EL0
block can be extended in the future which is better to handle in a
single place per process.
The support routines have to decide if SME is accessible or not. Linux
tells userspace if SME is accessible via AT_HWCAP2, otherwise a new
__aarch64_sme_accessible symbol was introduced that a libc can define.
Due to libgcc and libc build order, the symbol availability cannot be
checked so for __aarch64_sme_accessible an unistd.h feature test macro
is used while such detection mechanism is not available for __getauxval
so we rely on configure checks based on the target triplet.
Asm helper code is added to make writing the routines easier.
libgcc/ChangeLog:
* config/aarch64/t-aarch64: Add sources to the build.
* config/aarch64/__aarch64_have_sme.c: New file.
* config/aarch64/__arm_sme_state.S: New file.
* config/aarch64/__arm_tpidr2_restore.S: New file.
* config/aarch64/__arm_tpidr2_save.S: New file.
* config/aarch64/__arm_za_disable.S: New file.
* config/aarch64/aarch64-asm.h: New file.
* config/aarch64/libgcc-sme.ver: New file.
Szabolcs Nagy [Mon, 4 Dec 2023 10:52:52 +0000 (10:52 +0000)]
libgcc: aarch64: Configure check for __getauxval
Add configure check for the __getauxval ABI symbol, which is always
available on aarch64 glibc, and may be available on other linux C
runtimes. For now only enabled on glibc, others have to override it
target_configargs=libgcc_cv_have___getauxval=yes
This is deliberately obscure as it should be auto detected, ideally
via a feature test macro in unistd.h (link time detection is not
possible since the libc may not be installed at libgcc build time),
but currently there is no such feature test mechanism.
Without __getauxval, libgcc cannot do runtime CPU feature detection
and has to assume only the build time known features are available.