Andrew Pinski [Thu, 29 Feb 2024 06:39:32 +0000 (22:39 -0800)]
aarch64: Fix memtag builtins vs GC [PR108174]
The memtag builtins were being GC'ed away so we end up
with a crash sometimes (maybe even wrong code).
This fixes that issue by adding GTY on the variable/struct
aarch64_memtag_builtin_data.
Committed as obvious after a build/test for aarch64-linux-gnu.
PR target/108174
gcc/ChangeLog:
* config/aarch64/aarch64-builtins.cc (aarch64_memtag_builtin_data): Make
static and mark with GTY.
Eric Botcazou [Tue, 27 Feb 2024 17:01:00 +0000 (18:01 +0100)]
Fix internal error on non-byte-aligned reference in GIMPLE DSE
This is a regression present on the mainline, 13 and 12 branches. For the
attached Ada case, it's a tree checking failure on the mainline at -O:
+===========================GNAT BUG DETECTED==============================+
| 14.0.1 20240226 (experimental) [master r14-9171-g4972f97a265] GCC error:|
| tree check: expected tree that contains 'decl common' structure, |
| have 'component_ref' in tree_could_trap_p, at tree-eh.cc:2733 |
| Error detected around /home/eric/cvs/gcc/gcc/testsuite/gnat.dg/opt104.adb:
Time is a 10-byte record and Packed_Rec.T is placed at bit-offset 65 because
of the packing, so tree-ssa-dse.cc:setup_live_bytes_from_ref has computed a
const_size of 88 from ref->offset of 65 and ref->max_size of 80.
Then in tree-ssa-dse.cc:compute_trims:
411 int last_live = bitmap_last_set_bit (live);
(gdb) next
412 if (ref->size.is_constant (&const_size))
(gdb)
414 int last_orig = (const_size / BITS_PER_UNIT) - 1;
(gdb)
418 *trim_tail = last_orig - last_live;
(gdb) call debug_bitmap (live)
n_bits = 256, set = {0 1 2 3 4 5 6 7 8 9 10 }
(gdb) p last_live
$33 = 10
(gdb) p const_size
$34 = 80
(gdb) p last_orig
$35 = 9
(gdb) p *trim_tail
$36 = -1
In other words, compute_trims is overlooking the alignment adjustments that
setup_live_bytes_from_ref applied earlier. Moreover it reads:
/* We use sbitmaps biased such that ref->offset is bit zero and the bitmap
extends through ref->size. So we know that in the original bitmap
bits 0..ref->size were true. We don't actually need the bitmap, just
the REF to compute the trims. */
but setup_live_bytes_from_ref used ref->max_size instead of ref->size.
It appears that all the callers of compute_trims assume that ref->offset is
byte aligned and that the trimmed bytes are relative to ref->size, so the
patch simply adds an early return if either condition is not fulfilled.
gcc/
* tree-ssa-dse.cc (compute_trims): Fix description. Return early
if either ref->offset is not byte aligned or ref->size is not known
to be equal to ref->max_size.
(maybe_trim_complex_store): Fix description.
(maybe_trim_constructor_store): Likewise.
(maybe_trim_partially_dead_store): Likewise.
gcc/testsuite/
* gnat.dg/opt104.ads, gnat.dg/opt104.adb: New test.
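The arithmetic from the gdb session and the shape of the new guard can be sketched in isolation. This is a minimal sketch, not the code in tree-ssa-dse.cc: `tail_trim_unguarded`, `tail_trim_guarded` and `kBitsPerUnit` are hypothetical names, sizes are plain integers rather than the compiler's own size types, and BITS_PER_UNIT is assumed to be 8.

```cpp
#include <cassert>
#include <optional>

// Stand-in for BITS_PER_UNIT; assumed to be 8 in this sketch.
constexpr long kBitsPerUnit = 8;

// Mirrors the computation seen in the gdb session: the last byte of the
// original store is derived from the reference size alone.
int tail_trim_unguarded(long size_bits, int last_live_byte)
{
  assert(size_bits >= kBitsPerUnit);
  int last_orig = static_cast<int>(size_bits / kBitsPerUnit) - 1;
  return last_orig - last_live_byte;
}

// Hypothetical guarded variant: give up (no trim) unless the offset is
// byte aligned and the size is known to equal the maximum size.
std::optional<int> tail_trim_guarded(long offset_bits, long size_bits,
                                     long max_size_bits, int last_live_byte)
{
  if (offset_bits % kBitsPerUnit != 0 || size_bits != max_size_bits)
    return std::nullopt;
  return tail_trim_unguarded(size_bits, last_live_byte);
}
```

With the values from the session (a size of 80 bits and a last live byte of 10) the unguarded form yields -1, while the guarded form declines to compute a trim for the bit offset 65.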
and the memory operand size is 1 byte. As the result, the rest of 511
bytes is ignored by GCC. Implement ldtilecfg and sttilecfg intrinsics
with a pointer to XImode to honor the 512-byte memory block.
gcc/ChangeLog:
PR target/114098
* config/i386/amxtileintrin.h (_tile_loadconfig): Use
__builtin_ia32_ldtilecfg.
(_tile_storeconfig): Use __builtin_ia32_sttilecfg.
* config/i386/i386-builtin.def (BDESC): Add
__builtin_ia32_ldtilecfg and __builtin_ia32_sttilecfg.
* config/i386/i386-expand.cc (ix86_expand_builtin): Handle
IX86_BUILTIN_LDTILECFG and IX86_BUILTIN_STTILECFG.
* config/i386/i386.md (ldtilecfg): New pattern.
(sttilecfg): Likewise.
gcc/testsuite/ChangeLog:
PR target/114098
* gcc.target/i386/amxtile-4.c: New test.
Eric Botcazou [Mon, 26 Feb 2024 12:13:34 +0000 (13:13 +0100)]
Finalization of object allocated by anonymous access designating local type
The finalization of objects dynamically allocated through an anonymous
access type is deferred to the enclosing library unit in the current
implementation and a warning is given on each of them.
However this cannot be done if the designated type is local, because this
would generate dangling references to the local finalization routine, so
the finalization needs to be dropped in this case and the warning adjusted.
gcc/ada/
PR ada/113893
* exp_ch7.adb (Build_Anonymous_Master): Do not build the master
for a local designated type.
* exp_util.adb (Build_Allocate_Deallocate_Proc): Force Needs_Fin
to false if no finalization master is attached to an access type
and assert that it is anonymous in this case.
* sem_res.adb (Resolve_Allocator): Mention that the object might
not be finalized at all in the warning given when the type is an
anonymous access-to-controlled type.
Richard Earnshaw [Thu, 22 Feb 2024 16:47:20 +0000 (16:47 +0000)]
arm: fix ICE with vectorized reciprocal division [PR108120]
The expand pattern for reciprocal division was enabled for all math
optimization modes, but the patterns it was generating were not
enabled unless -funsafe-math-optimizations was enabled; this leads to
an ICE when the pattern we generate cannot be recognized.
Fixed by only enabling vector division when doing unsafe math.
gcc:
PR target/108120
* config/arm/neon.md (div<VCVTF:mode>3): Rename from div<mode>3.
Gate with ARM_HAVE_NEON_<MODE>_ARITH.
gcc/testsuite:
PR target/108120
* gcc.target/arm/neon-recip-div-1.c: New file.
Xi Ruoyao [Thu, 22 Feb 2024 09:58:45 +0000 (17:58 +0800)]
LoongArch: Don't default to -mno-explicit-relocs if -mno-relax
To improve Binutils compatibility we've had to backport relaxation
support. But if a user just updates to GCC 13.3 and sticks with
Binutils 2.41, there is no reason to use -mno-explicit-relocs as the
default because we are turning off relaxation for Binutils 2.41 (it
lacks conditional branch relaxation support) anyway.
So like GCC 14, make the default of -m[no-]explicit-relocs depend on
-m[no-]relax instead of HAVE_AS_MRELAX_OPTION. Also update the doc to
reflect the behavior change.
gcc/ChangeLog:
* config/loongarch/genopts/loongarch.opt.in
(TARGET_EXPLICIT_RELOCS): Init to M_OPTION_NOT_SEEN.
* config/loongarch/loongarch.opt: Regenerate.
* config/loongarch/loongarch.cc
(loongarch_option_override_internal): Set the default of
TARGET_EXPLICIT_RELOCS to HAVE_AS_EXPLICIT_RELOCS
&& !loongarch_mrelax.
* doc/invoke.texi (-m[no-]explicit-relocs): Update for
LoongArch.
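The new default can be pictured as a small decision function. This is only a sketch under assumptions: `kOptionNotSeen` stands in for M_OPTION_NOT_SEEN (its value here is made up), and `resolve_explicit_relocs` is a hypothetical helper, not the code in loongarch_option_override_internal.

```cpp
#include <cassert>

// Stand-in for M_OPTION_NOT_SEEN; the concrete value is an assumption.
constexpr int kOptionNotSeen = -1;

// Hypothetical helper: decide whether explicit relocs are used.
// user_setting is kOptionNotSeen when neither -mexplicit-relocs nor
// -mno-explicit-relocs was given, otherwise 0 or 1.
bool resolve_explicit_relocs(int user_setting, bool as_has_explicit_relocs,
                             bool relax_enabled)
{
  assert(user_setting == kOptionNotSeen || user_setting == 0
         || user_setting == 1);
  if (user_setting != kOptionNotSeen)
    return user_setting == 1;  // an explicit option always wins
  // Default: HAVE_AS_EXPLICIT_RELOCS && !loongarch_mrelax.
  return as_has_explicit_relocs && !relax_enabled;
}
```

Under this model, a build against an assembler for which relaxation is turned off ends up with explicit relocs by default, which matches the motivation given above.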
Andrew Pinski [Thu, 22 Feb 2024 04:12:21 +0000 (20:12 -0800)]
warn-access: Fix handling of unnamed types [PR109804]
This looks like an oversight of handling DEMANGLE_COMPONENT_UNNAMED_TYPE.
DEMANGLE_COMPONENT_UNNAMED_TYPE only has the u.s_number.number set while
the code expected newc.u.s_binary.left would be valid.
So this treats DEMANGLE_COMPONENT_UNNAMED_TYPE like we treat function parameters
(DEMANGLE_COMPONENT_FUNCTION_PARAM) and template parameters (DEMANGLE_COMPONENT_TEMPLATE_PARAM).
Note the code in the demangler does this when it sets DEMANGLE_COMPONENT_UNNAMED_TYPE:
ret->type = DEMANGLE_COMPONENT_UNNAMED_TYPE;
ret->u.s_number.number = num;
Committed as obvious after bootstrap/test on x86_64-linux-gnu
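The underlying hazard is reading the wrong member of a tagged union. A heavily reduced sketch, assuming a made-up `Component` type and `Kind` enum rather than the real demangler structures:

```cpp
#include <cassert>

// Hypothetical, reduced model of a demangler component.
enum class Kind { UnnamedType, FunctionParam, TemplateParam, Binary };

struct Component
{
  Kind kind;
  union
  {
    struct { long number; } s_number;                        // numbered kinds
    struct { Component *left; Component *right; } s_binary;  // Binary only
  } u;
};

// Returns the left child, or nullptr for kinds that only carry a number.
const Component *left_child(const Component &c)
{
  switch (c.kind)
    {
    case Kind::UnnamedType:
    case Kind::FunctionParam:
    case Kind::TemplateParam:
      return nullptr;  // only u.s_number is meaningful for these kinds
    case Kind::Binary:
      return c.u.s_binary.left;
    }
  assert(false && "unknown component kind");
  return nullptr;
}
```

Grouping the unnamed-type case with the two parameter cases keeps every kind that only sets a number on the same path, so the binary member is never read for them.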
Lewis Hyatt [Thu, 22 Feb 2024 12:50:10 +0000 (07:50 -0500)]
testsuite: Remove test that should not have been backported [PR105608]
This test (backported as part of r13-8257, from r14-8465) was not meant to
be backported, since it fails on some platforms without other GCC 14 patches
that will not be backported. Remove it from the 13 branch.
Xi Ruoyao [Fri, 3 Nov 2023 13:19:59 +0000 (21:19 +0800)]
LoongArch: Disable relaxation if the assembler doesn't support conditional branch relaxation [PR112330]
As the commit message of r14-4674 has indicated, if the assembler does
not support conditional branch relaxation, a relocation overflow may
happen on conditional branches when relaxation is enabled because the
number of NOP instructions inserted by the assembler will be more than
the number estimated by GCC.
To work around this issue, disable relaxation by default if the
assembler is detected to be incapable of performing conditional branch relaxation
at GCC build time. We also need to pass -mno-relax to the assembler to
really disable relaxation. But, if the assembler does not support
-mrelax option at all, we should not pass -mno-relax to the assembler or
it will immediately error out. Also handle this with the build time
assembler capability probing, and add a pair of options
-m[no-]pass-mrelax-to-as to allow using a different assembler from the
build-time one.
With this change, if GCC is built with GAS 2.41, relaxation will be
disabled by default. So the default value of -mexplicit-relocs= is also
changed to 'always' if -mno-relax is specified or implied by the
build-time default, because using assembler macros for symbol addresses
produces no benefit when relaxation is disabled.
gcc/ChangeLog:
PR target/112330
* config/loongarch/genopts/loongarch.opt.in: Add
-m[no-]pass-mrelax-to-as. Change the default of -m[no-]relax to
account for conditional branch relaxation support status.
* config/loongarch/loongarch.opt: Regenerate.
* configure.ac (gcc_cv_as_loongarch_cond_branch_relax): Check if
the assembler supports conditional branch relaxation.
* configure: Regenerate.
* config.in: Regenerate. Note that there are some unrelated
changes introduced by r14-5424 (which does not contain a
config.in regeneration).
* config/loongarch/loongarch-opts.h
(HAVE_AS_COND_BRANCH_RELAXATION): Define to 0 if not defined.
* config/loongarch/loongarch.h (ASM_MRELAX_DEFAULT): Define.
(ASM_MRELAX_SPEC): Define.
(ASM_SPEC): Use ASM_MRELAX_SPEC instead of "%{mno-relax}".
* doc/invoke.texi: Document -m[no-]relax and
-m[no-]pass-mrelax-to-as for LoongArch.
LoongArch: Check whether binutils supports the relax function. If supported, explicit relocs are turned off by default.
gcc/ChangeLog:
* config.in: Regenerate.
* config/loongarch/genopts/loongarch.opt.in: Add compilation option
mrelax. And set the initial value of explicit-relocs according to the
detection status.
* config/loongarch/gnu-user.h: When compiling with -mno-relax, pass the
--no-relax option to the linker.
* config/loongarch/loongarch-opts.h (HAVE_AS_MRELAX_OPTION): Define macro.
* config/loongarch/loongarch.opt: Regenerate.
* configure: Regenerate.
* configure.ac: Add detection of support for binutils relax function.
There are two reasons for removing this macro definition:
1. The default in the assembler is to use the nop instruction for filling.
2. For assembly directives: .align [abs-expr[, abs-expr[, abs-expr]]]
The third expression is the maximum number of bytes that should be
skipped by this alignment directive.
Therefore, it will affect the display of the specified alignment rules
and affect the operating efficiency.
This modification relies on binutils commit 1fb3cdd87ec61715a5684925fb6d6a6cf53bb97c.
(Since the assembler will add nop based on the .align information when doing relax,
it will cause the conditional branch to go out of bounds during the assembly process.
This submission of binutils solves this problem.)
Andre Vieira [Wed, 20 Dec 2023 16:41:52 +0000 (16:41 +0000)]
veclower: improve selection of vector mode when lowering [PR 112787]
This patch addresses the issue reported in PR target/112787 by improving the
compute type selection. We do this by not considering types with more elements
than the type we are lowering since we'd reject such types anyway.
gcc/ChangeLog:
PR target/112787
* tree-vect-generic.cc (type_for_widest_vector_mode): Change function to
use original vector type and check widest vector mode has at most the
same number of elements.
(get_compute_type): Pass original vector type rather than the element
type to type_for_widest_vector_mode and remove now obsolete check for
the number of elements.
Fix a typo in __gthr_win32_abs_to_rel_time that caused it to return a
relative time in seconds instead of milliseconds. As a consequence,
__gthr_win32_cond_timedwait called SleepConditionVariableCS with a
1000x shorter timeout; this caused ~1000x more spurious wakeups in
CV timed waits such as std::condition_variable::wait_for or wait_until,
resulting generally in much higher CPU usage.
```
  {
    // timed wait, wake up explicitly after 1 second
    std::thread t(thread_fn, true);
    std::this_thread::sleep_for(std::chrono::seconds(1));
    {
      std::unique_lock<std::mutex> ml(mx);
      pass = true;
    }
    cv.notify_all();
    t.join();
  }
  {
    // non-timed wait, wake up explicitly after 1 second
    std::thread t(thread_fn, false);
    std::this_thread::sleep_for(std::chrono::seconds(1));
    {
      std::unique_lock<std::mutex> ml(mx);
      pass = true;
    }
    cv.notify_all();
    t.join();
  }
  return 0;
}
```
On builds based on non-affected threading models (e.g. POSIX on Linux,
or winpthreads or MCF on Win32) the output is something like
```
pass: 0; wakeups: 2; elapsed: 2000 ms
pass: 1; wakeups: 2; elapsed: 991 ms
pass: 1; wakeups: 2; elapsed: 996 ms
```
while with the Win32 threading model we get
```
pass: 0; wakeups: 1418; elapsed: 2000 ms
pass: 1; wakeups: 479; elapsed: 988 ms
pass: 1; wakeups: 2; elapsed: 992 ms
```
(notice the huge number of wakeups in the timed wait cases only).
This commit fixes the conversion, adjusting the final division by
NSEC100_PER_SEC to use NSEC100_PER_MSEC instead (already defined in the
file and not used in any other place, so probably just a typo).
libgcc/ChangeLog:
PR libgcc/113850
* config/i386/gthr-win32-cond.c (__gthr_win32_abs_to_rel_time):
fix absolute timespec to relative milliseconds count
conversion (it incorrectly returned seconds instead of
milliseconds); this avoids spurious wakeups in
__gthr_win32_cond_timedwait
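The effect of the wrong divisor is easy to see in a standalone sketch. The constants below follow from the unit (100-nanosecond ticks); the function names are hypothetical and this is not the code from gthr-win32-cond.c.

```cpp
#include <cassert>
#include <cstdint>

// Number of 100-nanosecond ticks per second and per millisecond.
constexpr std::int64_t kNsec100PerSec = 10'000'000;
constexpr std::int64_t kNsec100PerMsec = 10'000;

// Buggy shape: dividing by the per-second constant yields seconds.
std::int64_t remaining_as_seconds(std::int64_t ticks_100ns)
{
  assert(ticks_100ns >= 0);
  return ticks_100ns / kNsec100PerSec;
}

// Fixed shape: dividing by the per-millisecond constant yields the
// milliseconds that a millisecond-based timeout parameter expects.
std::int64_t remaining_as_milliseconds(std::int64_t ticks_100ns)
{
  assert(ticks_100ns >= 0);
  return ticks_100ns / kNsec100PerMsec;
}
```

For a remaining time of two seconds (20,000,000 ticks) the first function returns 2 and the second 2000, which is the factor of 1000 described above.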
Paul Keir [Mon, 12 Feb 2024 18:15:49 +0000 (18:15 +0000)]
libstdc++: Fix constexpr basic_string union member [PR113294]
A call to `basic_string::clear()` in the std::string move assignment
operator leads to a constexpr error from an access of inactive union
member `_M_local_buf` in the added test (`test_move()`). Changing
`__str._M_local_buf` to `__str._M_use_local_data()` in
`operator=(basic_string&& __str)` fixes this.
PR libstdc++/113294
libstdc++-v3/ChangeLog:
* include/bits/basic_string.h (basic_string::operator=): Use
_M_use_local_data() instead of _M_local_buf on the moved-from
string.
* testsuite/21_strings/basic_string/modifiers/constexpr.cc
(test_move): New test.
Signed-off-by: Paul Keir <paul.keir@uws.ac.uk> Reviewed-by: Patrick Palka <ppalka@redhat.com> Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
(cherry picked from commit 065dddc6e07a917c57c7955db13b1fe77abbcabc)
Jonathan Wakely [Thu, 8 Feb 2024 13:59:42 +0000 (13:59 +0000)]
libstdc++: Avoid aliasing violation in std::valarray [PR99117]
The call to __valarray_copy constructs an _Array object to refer to
this->_M_data but that means that accesses to this->_M_data are through
a restrict-qualified pointer. This leads to undefined behaviour when
copying from an _Expr object that actually aliases this->_M_data.
Replace the call to __valarray_copy with a plain loop. I think this
removes the only use of that overload of __valarray_copy, so it could
probably be removed. I haven't done that here.
libstdc++-v3/ChangeLog:
PR libstdc++/99117
* include/std/valarray (valarray::operator=(const _Expr&)):
Use loop to copy instead of __valarray_copy with _Array.
* testsuite/26_numerics/valarray/99117.cc: New test.
Jonathan Wakely [Fri, 12 Jan 2024 16:57:41 +0000 (16:57 +0000)]
libstdc++: Update tzdata to 2024a
Import the new 2024a tzdata.zi file. The leapseconds file was also
updated to have a new expiry (no new leap seconds were added).
libstdc++-v3/ChangeLog:
* src/c++20/tzdata.zi: Import new file from 2024a release.
* src/c++20/tzdb.cc (tzdb_list::_Node::_S_read_leap_seconds):
Update expiry date for leap seconds list.
Jakub Jelinek [Thu, 15 Feb 2024 19:04:01 +0000 (20:04 +0100)]
testsuite: Require lra effective target for pr107385.c
Old reload doesn't support asm goto with output operands.
We have lra effective target (though, strangely it returns
0 just for 2 targets out of at least 16 targets with no LRA support),
so this patch uses it, similarly to how it is done in other asm goto
tests with output operands.
Jakub Jelinek [Thu, 15 Feb 2024 14:53:01 +0000 (15:53 +0100)]
expand: Fix handling of asm goto outputs vs. PHI argument adjustments [PR113921]
The Linux kernel and the following testcase distilled from it are
miscompiled, because tree-outof-ssa.cc (eliminate_phi) emits some
fixups on some of the edges (but doesn't commit edge insertions).
Later expand_asm_stmt emits further instructions on the same edge.
Now the problem is that expand_asm_stmt uses insert_insn_on_edge
to add its own fixups, but that function appends to the existing
sequence on the edge if any. And the bug triggers when the
fixup sequence emitted by eliminate_phi uses a pseudo which the
fixup sequence emitted by expand_asm_stmt later on sets.
So, we end up with
(set (reg A) (asm_operands ...))
and on one of the edges queued sequence
(set (reg C) (reg B)) // added by eliminate_phi
(set (reg B) (reg A)) // added by expand_asm_stmt
That is wrong, what we emit by expand_asm_stmt needs to be as close
to the asm_operands as possible (they aren't known until expand_asm_stmt
is called, the PHI fixup code assumes it is reg B which holds the right
value) and the PHI adjustments need to be done after it.
So, the following patch introduces a prepend_insn_to_edge function and
uses it from expand_asm_stmt, so that we queue
(set (reg B) (reg A)) // added by expand_asm_stmt
(set (reg C) (reg B)) // added by eliminate_phi
instead and so the value from the asm_operands output propagates correctly
to the PHI result.
2024-02-15 Jakub Jelinek <jakub@redhat.com>
PR middle-end/113921
* cfgrtl.h (prepend_insn_to_edge): New declaration.
* cfgrtl.cc (insert_insn_on_edge): Clarify behavior in function
comment.
(prepend_insn_to_edge): New function.
* cfgexpand.cc (expand_asm_stmt): Use prepend_insn_to_edge instead of
insert_insn_on_edge.
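The ordering problem can be reproduced with a toy model of an edge's queued sequence. Everything here is an illustrative assumption: `Copy`, `EdgeSeq`, `append_to_edge`, `prepend_to_edge` and `run` are hypothetical names, and registers are modelled as single characters with integer values rather than RTL.

```cpp
#include <cassert>
#include <deque>
#include <map>

// A register-to-register copy "dst = src", standing in for an RTL set.
struct Copy
{
  char dst;
  char src;
};

using EdgeSeq = std::deque<Copy>;

// Appending models the existing behaviour: new insns go after queued ones.
void append_to_edge(EdgeSeq &seq, Copy c) { seq.push_back(c); }

// Prepending models the new behaviour: new insns go before queued ones.
void prepend_to_edge(EdgeSeq &seq, Copy c) { seq.push_front(c); }

// Executes the queued copies in order and returns the final value of 'C'.
int run(const EdgeSeq &seq, std::map<char, int> regs)
{
  for (const Copy &c : seq)
    {
      assert(regs.count(c.src) == 1);
      regs[c.dst] = regs[c.src];
    }
  return regs['C'];
}
```

If the PHI fixup `C = B` is queued first and the asm fixup `B = A` is appended, C receives the stale value of B; prepending `B = A` lets the asm output reach C.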
The -mmcu=avrtiny cores have no ADIW and SBIW instructions. This was
implemented by clearing all regs out of regclass ADDW_REGS so that
constraint "w" never matched. This corrupted the subset relations of
the register classes as they appear in enum reg_class.
This patch keeps ADDW_REGS like for all other cores, i.e. it contains
R24...R31. Instead of tests like test_hard_reg_class (ADDW_REGS, *)
the code now uses avr_adiw_reg_p (*). And all insns with constraint "w"
get "isa" insn attribute value of "adiw".
Plus, a new built-in macro __AVR_HAVE_ADIW__ is provided, which is more
specific than __AVR_TINY__.
gcc/
PR target/113927
* config/avr/avr.h (AVR_HAVE_ADIW): New macro.
* config/avr/avr-protos.h (avr_adiw_reg_p): New proto.
* config/avr/avr.cc (avr_adiw_reg_p): New function.
(avr_conditional_register_usage) [AVR_TINY]: Don't clear ADDW_REGS.
Replace test_hard_reg_class (ADDW_REGS, ...) with calls to
* config/avr/avr.md: Same.
(attr "isa") <tiny, no_tiny>: Remove.
<adiw, no_adiw>: Add.
(define_insn, define_insn_and_split): When an alternative has
constraint "w", then set attribute "isa" to "adiw".
* config/avr/avr-c.cc (avr_cpu_cpp_builtins) [AVR_HAVE_ADIW]:
Built-in define __AVR_HAVE_ADIW__.
* doc/invoke.texi (AVR Options): Document it.
If register_specialization finds a previous declaration and throws the new
one away, we shouldn't still add the new one to
DECL_TEMPLATE_SPECIALIZATIONS.
PR c++/113612
gcc/cp/ChangeLog:
* pt.cc (process_partial_specialization): Return early
on redeclaration.
Jerry DeLisle [Mon, 12 Feb 2024 21:12:08 +0000 (13:12 -0800)]
libgfortran: Adjust bytes_left and pos for access="STREAM".
During tab edits, the pos (position) and bytes_used
variables were not being set correctly for stream I/O.
Since stream I/O does not have 'real' records, the
format buffer active length must be used instead of
the record length variable.
PR libgfortran/109358
libgfortran/ChangeLog:
* io/transfer.c (formatted_transfer_scalar_write): Adjust
bytes_used and pos variable for stream access.
Jerry DeLisle [Sat, 3 Feb 2024 02:12:33 +0000 (18:12 -0800)]
libgfortran: EN0.0E0 and ES0.0E0 format editing.
F2018 and F2023 standards added zero width exponents. This required
additional special handling in the process of building formatted
floating point strings.
G formatting uses either F or E formatting as documented in
write_float.def comments. This logic changes the format token from FMT_G
to FMT_F or FMT_E. The new formatting requirements interfere with this
process when a FMT_G float string is being built. To avoid this, a new
component called 'pushed' is added to the fnode structure to save this
condition. The 'pushed' condition is then used to bypass portions of
the new ES, E, EN, and D formatting, falling through to the existing
default formatting which is retained.
libgfortran/ChangeLog:
PR libfortran/111022
* io/format.c (get_fnode): Update initialization of fnode.
(parse_format_list): Initialization.
* io/format.h (struct fnode): Added the new 'pushed' component.
* io/write.c (select_buffer): Whitespace.
(write_real): Whitespace.
(write_real_w0): Adjust logic for the d == 0 condition.
* io/write_float.def (determine_precision): Whitespace.
(build_float_string): Calculate width of ..E0 exponents and
adjust logic accordingly.
(build_infnan_string): Whitespace.
(CALCULATE_EXP): Whitespace.
(quadmath_snprintf): Whitespace.
(determine_en_precision): Whitespace.
gcc/testsuite/ChangeLog:
PR libfortran/111022
* gfortran.dg/fmt_error_10.f: Show D+0 exponent.
* gfortran.dg/pr96436_4.f90: Show E+0 exponent.
* gfortran.dg/pr96436_5.f90: Show E+0 exponent.
* gfortran.dg/pr111022.f90: New test.
The warning was raised on accessing SFRs at addresses below the default
page size, as gcc considers accessing addresses in the first page of
memory as suspicious. This doesn't apply to an embedded target like the
avr, where both flash and RAM have zero as a valid address. Zero is also
a valid address in named address spaces (__memx, flash<n> etc.).
This commit implements TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID for the avr
target and reports to gcc that zero is a valid address on all
address spaces. It also disables flag_delete_null_pointer_checks
based on the target hook, and modifies target-supports.exp to add avr
to the list of targets that always keep null pointer checks. This fixes
a bunch of DejaGNU failures that occur otherwise.
PR target/105523
gcc/ChangeLog:
* common/config/avr/avr-common.cc: Remove setting
of OPT_fdelete_null_pointer_checks.
* config/avr/avr.cc (avr_option_override): Clear
flag_delete_null_pointer_checks if zero_address_valid.
(avr_addr_space_zero_address_valid): New function.
(TARGET_ADDR_SPACE_ZERO_ADDRESS_VALID): Provide target
hook.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp
(check_effective_target_keeps_null_pointer_checks): Add
avr.
* gcc.target/avr/pr105523.c: New test.
The linking of libgcc is already present in %(liborig), so the current
situation duplicates libraries. This was not an issue until macOS's new
linker started giving warnings for such cases.
Alexandre Oliva [Wed, 29 Nov 2023 07:00:35 +0000 (04:00 -0300)]
call maybe_return_this in build_clone
__dt_base doesn't get its body from a maybe_return_this caller, it's
rather cloned with the full body within build_clone, and then it's
left alone, without going through finish_function_body or
build_delete_destructor_body, which call maybe_return_this.
Now, this is correct as far as the generated code is concerned, since
the cloned body of a cdtor that returns this is also a cdtor body that
returns this. The problem is that the decl for THIS is also cloned,
and it doesn't get the warning suppression introduced by
maybe_return_this, so Wuse-after-free3.C fails with an excess warning
at the closing brace of the dtor body.
I've split out the warning suppression from maybe_return_this, and
arranged to call that bit from the relevant build_clone case.
Unfortunately, because the warning is silenced for all uses of the
THIS decl, rather than only for the ABI-mandated return stmt, this
also silences the very warning that the testcase checks for.
I'm not revamping the warning suppression approach to overcome this,
so I'm xfailing the expected warning on ARM EABI, hoping that's the
only target with cdtor_return_this, and leaving it at that.
for gcc/cp/ChangeLog
* decl.cc (maybe_prepare_return_this): Split out of...
(maybe_return_this): ... this.
* cp-tree.h (maybe_prepare_return_this): Declare.
* class.cc (build_clone): Call it.
for gcc/testsuite/ChangeLog
* g++.dg/warn/Wuse-after-free3.C: xfail on arm_eabi.
Alexandre Oliva [Wed, 24 May 2023 06:07:46 +0000 (03:07 -0300)]
[testsuite] tsvc: skip include malloc.h when unavailable
tsvc tests all fail on systems that don't offer a malloc.h, other than
those that explicitly rule that out. Use the preprocessor to test for
malloc.h's availability.
tsvc.h also expects a definition for struct timeval, but it doesn't
include sys/time.h. Add a conditional include thereof.
for gcc/testsuite/ChangeLog
* gcc.dg/vect/tsvc/tsvc.h: Test for and conditionally include
malloc.h and sys/time.h.
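One way to express such a preprocessor test is `__has_include`, which is standard since C++17 and available as an extension in GCC's C front end. The sketch below is an assumption about the general technique, not a copy of the tsvc.h change; the `SKETCH_HAVE_*` macros and `alloc_floats` are hypothetical.

```cpp
#include <cassert>
#include <cstdlib>

// Include the optional headers only when the implementation provides them.
#if defined(__has_include)
# if __has_include(<malloc.h>)
#  include <malloc.h>
#  define SKETCH_HAVE_MALLOC_H 1
# endif
# if __has_include(<sys/time.h>)
#  include <sys/time.h>
#  define SKETCH_HAVE_SYS_TIME_H 1
# endif
#endif

// Allocation that does not depend on malloc.h: std::malloc is declared
// in <cstdlib> on every hosted implementation.
float *alloc_floats(std::size_t n)
{
  assert(n > 0);
  return static_cast<float *>(std::malloc(n * sizeof(float)));
}
```

Code that needs something only malloc.h declares can then be wrapped in `#ifdef SKETCH_HAVE_MALLOC_H`, so the header compiles on systems without it.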
testsuite: Pattern does not match when using --specs=nano.specs
When running the testsuite for newlib nano, the --specs=nano.specs
option is used. This option prepends cpp_unique_options with
"-isystem =/include/newlib-nano" so that the newlib nano header files
override the newlib standard ones. As the -isystem option is prepended,
the -quiet option is no longer the first option to cc1. Adjust the test
accordingly.
Patch has been verified on Windows and Linux.
gcc/testsuite/ChangeLog:
* gcc.misc-tests/options.exp: Allow other options before the
-quiet option for cc1.
Alexandre Oliva [Wed, 29 Nov 2023 07:00:28 +0000 (04:00 -0300)]
c++: for contracts, cdtors never return this
When targetm.cxx.cdtor_return_this() holds, cdtors have a
non-VOID_TYPE_P result, but IMHO this ABI implementation detail
shouldn't leak to the abstract language conceptual framework, in which
cdtors don't have return values. For contracts, specifically those
that establish postconditions on results, such a leakage is present,
and the present patch puts an end to it: with it, cdtors get an error
for result postconditions regardless of the ABI. This fixes
g++.dg/contracts/contracts-ctor-dtor2.C on arm-eabi.
for gcc/cp/ChangeLog
* contracts.cc (check_postcondition_result): Cope with
cdtor_return_this.
Jonathan Wakely [Fri, 10 Nov 2023 21:06:15 +0000 (21:06 +0000)]
libstdc++: Do not use assume attribute for Clang [PR112467]
Clang has an 'assume' attribute, but it's a function attribute not a
statement attribute. The recently-added use of the statement form causes
an error with Clang.
libstdc++-v3/ChangeLog:
PR libstdc++/112467
* include/bits/stl_bvector.h (_M_assume_normalized): Do not use
statement form of assume attribute for Clang.
Alexandre Oliva [Thu, 9 Nov 2023 03:01:37 +0000 (00:01 -0300)]
libstdc++: optimize bit iterators assuming normalization [PR110807]
The representation of bit iterators, using a pointer into an array of
words, and an unsigned bit offset into that word, makes for some
optimization challenges: because the compiler doesn't know that the
offset is always in a certain narrow range, beginning at zero and
ending before the word bitwidth, when a function loads an offset that
it hasn't normalized itself, it may fail to derive certain reasonable
conclusions, even to the point of retaining useless calls that elicit
incorrect warnings.
Case at hand: The 110807.cc testcase for bit vectors assigns a 1-bit
list to a global bit vector variable. Based on the compile-time
constant length of the list, we decide in _M_insert_range whether to
use the existing storage or to allocate new storage for the vector.
After allocation, we decide in _M_copy_aligned how to copy any
preexisting portions of the vector to the newly-allocated storage.
When copying two or more words, we use __builtin_memmove.
However, because we compute the available room using bit offsets
without range information, even comparing them with constants, we fail
to infer ranges for the preexisting vector depending on word size, and
may thus retain the memmove call despite knowing we've only allocated
one word.
Other parts of the compiler then detect the mismatch between the
constant allocation size and the much larger range that could
theoretically be copied into the newly-allocated storage if we could
reach the call.
Ensuring the compiler is aware of the constraints on the offset range
enables it to do a much better job at optimizing. Using attribute
assume (_M_offset <= ...) didn't work, because gimple lowered that to
something that vrp could only use to ensure 'this' was non-NULL.
Exposing _M_offset as an automatic variable/gimple register outside
the unevaluated assume operand enabled the optimizer to do its job.
Rather than placing such load-then-assume constructs all over, I
introduced an always-inline member function in bit iterators that does
the job of conveying to the compiler the information that the
assumption is supposed to hold, and various calls throughout functions
pertaining to bit iterators that might not otherwise know that the
offsets have to be in range, so that the compiler no longer needs to
make conservative assumptions that prevent optimizations.
With the explicit assumptions, the compiler can correlate the test for
available storage in the vector with the test for how much storage
might need to be copied, and determine that, if we're not asking for
enough room for two or more words, we can omit entirely the code to
copy two or more words, without any runtime overhead whatsoever: no
traces remain of the undefined behavior or of the tests that inform
the compiler about the assumptions that must hold.
for libstdc++-v3/ChangeLog
PR libstdc++/110807
* include/bits/stl_bvector.h (_Bit_iterator_base): Add
_M_assume_normalized member function. Call it in _M_bump_up,
_M_bump_down, _M_incr, operator==, operator<=>, operator<, and
operator-.
(_Bit_iterator): Also call it in operator*.
(_Bit_const_iterator): Likewise.
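The load-then-assume idea can be sketched outside libstdc++. Note the substitution: this sketch conveys the assumption with `__builtin_unreachable()` instead of the assume attribute the commit discusses, and `BitCursor`, `assume_normalized`, `bump_up` and `advance` are hypothetical names.

```cpp
#include <cassert>
#include <climits>

constexpr unsigned kWordBits = CHAR_BIT * sizeof(unsigned long);

struct BitCursor
{
  unsigned long *word;
  unsigned offset;  // invariant: offset < kWordBits

  // Tell the optimizer that the offset is in range.  The offset is first
  // loaded into a local, mirroring the load-then-assume shape above.
  void assume_normalized() const
  {
    unsigned ofst = offset;
#if defined(__GNUC__)
    if (!(ofst < kWordBits))
      __builtin_unreachable();
#else
    (void) ofst;  // no hint available; behaviour is unchanged
#endif
  }

  // Move to the next bit, stepping to the next word on wrap-around.
  void bump_up()
  {
    assume_normalized();
    if (offset++ == kWordBits - 1)
      {
        offset = 0;
        ++word;
      }
  }
};

// Advances the cursor by n bits, one bit at a time.
void advance(BitCursor &c, unsigned n)
{
  for (unsigned i = 0; i < n; ++i)
    c.bump_up();
  assert(c.offset < kWordBits);
}
```

The hint has no runtime effect when the invariant holds; violating it is undefined behaviour, which is why it belongs in one small member function rather than being scattered across callers.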
Jonathan Wakely [Tue, 9 Jan 2024 15:22:46 +0000 (15:22 +0000)]
libstdc++: Prefer posix_memalign for aligned-new [PR113258]
As described in PR libstdc++/113258 there are old versions of tcmalloc
which replace malloc and related APIs, but do not replace aligned_alloc
because it didn't exist at the time they were released. This means that
when operator new(size_t, align_val_t) uses aligned_alloc to obtain
memory, it comes from libc's aligned_alloc not from tcmalloc. But when
operator delete(void*, size_t, align_val_t) uses free to deallocate the
memory, that goes to tcmalloc's replacement version of free, which
doesn't know how to free it.
If we give preference to the older posix_memalign instead of
aligned_alloc then we're more likely to use a function that will be
compatible with the replacement version of free. Because posix_memalign
has been around for longer, it's more likely that old third-party malloc
replacements will also replace posix_memalign alongside malloc and free.
libstdc++-v3/ChangeLog:
PR libstdc++/113258
* libsupc++/new_opa.cc: Prefer to use posix_memalign if
available.
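A minimal sketch of an aligned allocation built on posix_memalign, assuming a POSIX system; `aligned_alloc_sketch` and `aligned_free_sketch` are hypothetical names and this is not the code in new_opa.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>  // posix_memalign and free (POSIX)

// Returns size bytes aligned to align, or nullptr on failure.
void *aligned_alloc_sketch(std::size_t size, std::size_t align)
{
  // posix_memalign needs a power-of-two alignment that is also a
  // multiple of sizeof(void *).
  assert(align != 0 && (align & (align - 1)) == 0);
  if (align < sizeof(void *))
    align = sizeof(void *);
  void *p = nullptr;
  if (posix_memalign(&p, align, size) != 0)
    return nullptr;
  return p;
}

// Memory from posix_memalign is released with plain free, which is the
// function a malloc replacement is most likely to override as well.
void aligned_free_sketch(void *p)
{
  free(p);
}
```

Pairing posix_memalign with free keeps both halves of the allocation inside whatever allocator has replaced the malloc family, as long as that allocator also replaces posix_memalign.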
Jonathan Wakely [Thu, 11 Jan 2024 15:09:12 +0000 (15:09 +0000)]
libstdc++: Fix non-portable results from 64-bit std::subtract_with_carry_engine [PR107466]
I implemented the resolution of LWG 3809 in r13-4364-ga64775a0edd469 but
it was recently noted in the MSVC STL github repo that the change causes
possible truncation for 64-bit seeds. Whether the truncation occurs (and
to what value) depends on the width of uint_least32_t which is not
portable, so the output of the PRNG for 64-bit seed values is no longer
the same as in C++20, and no longer portable across platforms.
That new issue was filed as LWG 4014. I proposed a new change which
reduces the seed by the LCG's modulus before the conversion to
uint_least32_t. This ensures that 64-bit seed values are consistently
reduced by the modulus before any truncation. This removes the
platform-dependent behaviour and restores the old behaviour for
std::subtract_with_carry_engine specializations using a 64-bit result
type (such as std::ranlux48_base).
libstdc++-v3/ChangeLog:
PR libstdc++/107466
* include/bits/random.tcc (subtract_with_carry_engine::seed):
Implement proposed resolution of LWG 4014.
* testsuite/26_numerics/random/pr60037-neg.cc: Adjust dg-error
line number.
* testsuite/26_numerics/random/subtract_with_carry_engine/cons/lwg3809.cc:
Check for expected result of 64-bit engine with seed that
doesn't fit in 32 bits.
Jonathan Wakely [Thu, 1 Feb 2024 10:08:05 +0000 (10:08 +0000)]
libstdc++: Implement some missing functions for net::ip::network_v6
libstdc++-v3/ChangeLog:
* include/experimental/internet (network_v6::network): Define.
(network_v6::hosts): Finish implementing.
(network_v6::to_string): Do not concatenate std::string to
arbitrary std::basic_string specialization.
* testsuite/experimental/net/internet/network/v6/cons.cc: New
test.
Jonathan Wakely [Wed, 31 Jan 2024 10:41:49 +0000 (10:41 +0000)]
libstdc++: Avoid reusing moved-from iterators in PSTL tests [PR90276]
The reverse_invoker utility for PSTL tests uses forwarding references for
all parameters, but some of those parameters get forwarded to move
constructors which then leave the objects in a moved-from state. When
the parameters are forwarded a second time that results in making new
copies of moved-from iterators. For libstdc++ debug mode iterators, the
moved-from state is singular, which means copying them will abort at
runtime.
The fix is to make copies of iterator arguments instead of forwarding
them.
The callers of reverse_invoker::operator() also forward the iterators
multiple times, but that's OK because reverse_invoker accepts them by
forwarding reference but then breaks the chain of forwarding and copies
them as lvalues.
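The problem and the fix can be modelled with a small sketch. DebugIter
and the function names below are hypothetical stand-ins, not the PSTL
test utilities. The assert in the copy constructor imitates debug mode
aborting when a singular iterator is copied.

```cpp
#include <cassert>
#include <utility>

// Stand-in for a debug-mode iterator: moving from it leaves it singular,
// and copying a singular iterator is treated as a fatal error.
struct DebugIter
{
  bool singular = false;

  DebugIter() = default;

  DebugIter(const DebugIter& other) : singular(other.singular)
  {
    assert(!other.singular && "copy of singular iterator");
  }

  DebugIter(DebugIter&& other) noexcept : singular(other.singular)
  {
    other.singular = true;
  }
};

// Takes the iterator by value, like the algorithms under test.
bool consume(DebugIter it)
{
  return it.singular;
}

// Forwarding an rvalue moves it into consume(); afterwards the caller's
// object is singular, so forwarding it again would copy a singular value.
template<typename It>
bool singular_after_forward(It&& it)
{
  consume(std::forward<It>(it));
  return it.singular;
}

// The fix: use the parameter as an lvalue so every call makes a copy
// and the original is never moved from.
template<typename It>
bool invoke_twice_by_copy(It&& it)
{
  bool first = consume(it);
  bool second = consume(it);
  return !first && !second;
}
```

Calling singular_after_forward with a temporary returns true, which is
the state that made the second use abort. invoke_twice_by_copy can be
called with the same argument and both uses see a valid iterator.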
libstdc++-v3/ChangeLog:
PR libstdc++/90276
* testsuite/util/pstl/test_utils.h (reverse_invoker): Do not use
perfect forwarding for iterator arguments.
Jonathan Wakely [Thu, 1 Feb 2024 21:40:33 +0000 (21:40 +0000)]
libstdc++: Allow explicit conversion of string views with different traits
This was changed by LWG 3857.
libstdc++-v3/ChangeLog:
* include/std/string_view (basic_string_view(R&&)): Remove
constraint that traits_type must be the same, as per LWG 3857.
* testsuite/21_strings/basic_string_view/cons/char/range_c++20.cc:
Explicit conversion between different specializations should be
allowed.
* testsuite/21_strings/basic_string_view/cons/wchar_t/range_c++20.cc:
Likewise.
Andrew Pinski [Fri, 24 Nov 2023 02:55:30 +0000 (18:55 -0800)]
Fix contracts-tmpl-spec2.C on targets where plain char is unsigned by default
Since contracts-tmpl-spec2.C is just testing contracts, I thought it would be better
to just add `-fsigned-char` to the options rather than change the testcase to support
both cases.
Committed after testing on aarch64-linux-gnu.
gcc/testsuite/ChangeLog:
PR testsuite/108321
* g++.dg/contracts/contracts-tmpl-spec2.C: Add -fsigned-char
to options.
testsuite: Replace many dg-require-thread-fence with dg-require-atomic-cmpxchg-word
These tests actually use a form of atomic compare and exchange
operation, not just atomic loading and storing. Some targets (not
supported by e.g. libatomic) have atomic loading and storing, but not
compare and exchange, yielding linker errors for missing library
functions.
This change is just for existing uses of
dg-require-thread-fence. It does not fix any other tests
that should also be gated on dg-require-atomic-cmpxchg-word.
Some targets (armv6-m) support inline atomic load and store,
i.e. dg-require-thread-fence matches, but not atomic operations like
compare and exchange.
This directive can be used to replace uses of dg-require-thread-fence
where an atomic operation is actually used.
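As an illustration of the distinction, the sketch below separates code
that only needs atomic load and store from code that needs compare and
exchange. The function names are hypothetical and the snippet is not
taken from any of the affected tests.

```cpp
#include <atomic>
#include <cassert>

// Needs only atomic store and load, which targets such as armv6-m can
// typically expand inline.
int publish_and_read(std::atomic<int>& slot, int value)
{
  slot.store(value, std::memory_order_release);
  return slot.load(std::memory_order_acquire);
}

// Needs an atomic compare and exchange. On a target without that
// operation this may become a call to a library function, and the
// link fails if no such function is provided.
bool try_claim(std::atomic<int>& owner, int id)
{
  assert(id != 0);  // 0 means "unowned" in this sketch
  int expected = 0;
  return owner.compare_exchange_strong(expected, id);
}
```

A test shaped like publish_and_read would be adequately gated by
dg-require-thread-fence, whereas one shaped like try_claim is the kind
that wants dg-require-atomic-cmpxchg-word.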
* testsuite/lib/dg-options.exp (dg-require-atomic-cmpxchg-word):
New proc.
* testsuite/lib/libstdc++.exp (check_v3_target_atomic_cmpxchg_word):
Ditto.
libstdc++: Add dg-require-thread-fence in several tests
Some targets like arm-eabi with newlib and default settings rely on
__sync_synchronize() to ensure synchronization. Newlib does not
implement it by default, to make users aware they have to take special
care.
This makes a few tests fail to link.
This patch requires the missing thread-fence effective target in the
tests that need it, making them UNSUPPORTED instead of FAIL and
UNRESOLVED.
The PR shows us ICEing due to an unrecognizable TFmode save emitted by
aarch64_process_components. The problem is that for T{I,F,D}mode we
conservatively require mems to be in range for x-register ldp/stp. That
is because (at least for TImode) it can be allocated to both GPRs and
FPRs, and in the GPR case that is an x-reg ldp/stp, and the FPR case is
a q-register load/store.
As Richard pointed out in the PR, aarch64_get_separate_components
already checks that the offsets are suitable for a single load, so we
just need to choose a mode in aarch64_reg_save_mode that gives the full
q-register range. In this patch, we choose V16QImode as an alternative
16-byte "bag-of-bits" mode that doesn't have the artificial range
restrictions imposed on T{I,F,D}mode.
Unlike for GCC 14 we need additional handling in the load/store pair
code as various cases are not expecting to see V16QImode (particularly
the writeback patterns, but also aarch64_gen_load_pair).
Richard Biener [Mon, 20 Nov 2023 10:12:43 +0000 (11:12 +0100)]
tree-optimization/112618 - unused .MASK_CALL
We have to make sure to remove unused .MASK_CALL internal function
calls after vectorization.
PR tree-optimization/112618
* tree-vect-loop.cc (vect_transform_loop_stmt): For not
relevant and unused .MASK_CALL make sure we remove the
scalar stmt.
Richard Biener [Fri, 10 Nov 2023 11:39:11 +0000 (12:39 +0100)]
tree-optimization/110221 - SLP and loop mask/len
The following fixes the issue that when SLP stmts are internal defs
but appear invariant because they end up only using invariant defs
then they get scheduled outside of the loop. This nice optimization
breaks down when loop masks or lens are applied since those are not
explicitly tracked as dependences. The following makes sure to never
schedule internal defs outside of the vectorized loop when the
loop uses masks/lens.
PR tree-optimization/110221
* tree-vect-slp.cc (vect_schedule_slp_node): When loop
masking / len is applied make sure to not schedule
internal defs outside of the loop.
The following fixes a wrong pattern that didn't match the behavior
of the original fold_widened_comparison in that get_unwidened
returned a constant always in the wider type. But here we're
using (int) 4294967295u without the conversion applied. Fixed
by doing as earlier in the pattern - matching constants only
if the conversion was actually applied.
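Why the outstanding conversion matters can be shown at the source level.
This is only an illustration and not the testcase from the PR. The
function names are hypothetical, and the sketch assumes that converting
4294967295u to int yields -1, as it does with GCC on targets with a
32-bit int.

```cpp
#include <cassert>

// Comparison with the (int) conversion applied to the constant:
// the right-hand side is -1, so the result is false for any bool.
bool compare_after_conversion(bool b)
{
  const int rhs = static_cast<int>(4294967295u);
  assert(rhs == -1);  // assumption of this sketch
  return static_cast<int>(b) <= rhs;
}

// Comparison against the constant as if the conversion had not been
// applied: the right-hand side is the largest 32-bit unsigned value,
// so the result is true for any bool.
bool compare_without_conversion(bool b)
{
  return static_cast<unsigned>(b) <= 4294967295u;
}
```

The two functions give opposite answers for every input, so a
simplification that picks up the constant without its conversion can
fold the comparison to the wrong value.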
PR middle-end/110176
* match.pd (zext (bool) <= (int) 4294967295u): Make sure
to match INTEGER_CST only without outstanding conversion.
When running the DejaGNU testsuite on a toolchain built for native
Windows, the path /dev/null can't be used to discard output.
On native Windows, the resource is instead named "nul".
In 17_intro/tag_type_explicit_ctor.cc, the following statement would
fail to match when the DejaGNU testsuite is running in cygwin with a
native toolchain.
// dg-error 53 "explicit" "" { target hosted }
The "target hosted"-check is using cpp to verify if _GLIBCXX_HOSTED is
defined and discards the output by simply redirecting it to /dev/null.
In v3_target_compile, it's overridden to "nul" for MinGW targets, but
the same rule applies when host is cygwin, so replace the condition
with a check for Windows.
The error in the log would look like this for the "target hosted" check:
cc1plus.exe: fatal error: opening output file /dev/null: No such file or directory
The tag_type_explicit_ctor.cc test fails with this on Windows:
.../tag_type_explicit_ctor.cc:53: error: converting to 'std::defer_lock_t' from initializer list would use explicit constructor 'constexpr std::defer_lock_t::defer_lock_t()'
.../tag_type_explicit_ctor.cc:54: error: converting to 'std::try_to_lock_t' from initializer list would use explicit constructor 'constexpr std::try_to_lock_t::try_to_lock_t()'
.../tag_type_explicit_ctor.cc:55: error: converting to 'std::try_to_lock_t' from initializer list would use explicit constructor 'constexpr std::try_to_lock_t::try_to_lock_t()'
.../tag_type_explicit_ctor.cc:67: error: converting to 'std::defer_lock_t' from initializer list would use explicit constructor 'constexpr std::defer_lock_t::defer_lock_t()'
.../tag_type_explicit_ctor.cc:68: error: converting to 'std::try_to_lock_t' from initializer list would use explicit constructor 'constexpr std::try_to_lock_t::try_to_lock_t()'
.../tag_type_explicit_ctor.cc:69: error: converting to 'std::adopt_lock_t' from initializer list would use explicit constructor 'constexpr std::adopt_lock_t::adopt_lock_t()'
Patch has been verified on Windows and Linux.
libstdc++-v3:
* testsuite/lib/libstdc++.exp: Use "nul" for Windows, "/dev/null"
for other environments.
Jason Merrill [Mon, 5 Feb 2024 18:54:22 +0000 (13:54 -0500)]
c++: prvalue of array type [PR111286]
Here we want to build a prvalue array to bind to the T reference, but we
were wrongly trying to strip cv-quals from the array prvalue, which should
be treated the same as a class prvalue.
PR c++/111286
gcc/cp/ChangeLog:
* tree.cc (rvalue): Don't drop cv-quals from an array.
Jason Merrill [Thu, 1 Jun 2023 18:41:07 +0000 (14:41 -0400)]
varasm: check float size [PR109359]
In PR95226, the testcase was failing because we tried to output_constant a
NOP_EXPR to float from a double REAL_CST, and so we output a double where
the caller wanted a float. That doesn't happen anymore, but with the
output_constant hunk we will ICE in that situation rather than emit the
wrong number of bytes.
Part of the problem was that initializer_constant_valid_p_1 returned true
for that NOP_EXPR, because it compared the sizes of integer types but not
floating-point types. So the C++ front end assumed it didn't need to fold
the initializer.
Xi Ruoyao [Fri, 2 Feb 2024 19:35:07 +0000 (03:35 +0800)]
MIPS: Fix wrong MSA FP vector negation
We expanded (neg x) to (minus const0 x) for MSA FP vectors. This is
wrong because -0.0 is not 0 - 0.0, and it causes some Python tests to
fail when Python is built with MSA enabled.
Use the bnegi.df instructions to simply reverse the sign bit instead.
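The scalar sketch below illustrates the difference for a single float
lane, assuming IEEE 754 binary32 and the default rounding mode. The
helper flip_bit is a hypothetical model of what a bnegi-style
instruction does to one element; it is not the MSA pattern itself.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// The old expansion: 0 - x. For x == +0.0 this yields +0.0, not -0.0.
float negate_by_subtraction(float x)
{
  return 0.0f - x;
}

// Hypothetical model of flipping one bit of a 32-bit element.
float flip_bit(float x, unsigned bit)
{
  static_assert(sizeof(float) == sizeof(std::uint32_t),
                "sketch assumes a 32-bit float");
  assert(bit < 32);
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof bits);
  bits ^= std::uint32_t(1) << bit;
  std::memcpy(&x, &bits, sizeof bits);
  return x;
}

// The new approach: flip the sign bit (bit 31 of a binary32 value).
float negate_by_sign_flip(float x)
{
  return flip_bit(x, 31);
}
```

With x equal to +0.0f, negate_by_subtraction returns a value whose sign
bit is clear, while negate_by_sign_flip returns -0.0f. For ordinary
values such as 1.5f both approaches produce -1.5f.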
gcc/ChangeLog:
* config/mips/mips-msa.md (elmsgnbit): New define_mode_attr.
(neg<mode>2): Change the mode iterator from MSA to IMSA because
in FP arithmetic we cannot use (0 - x) for -x.
(neg<mode>2): New define_insn to implement FP vector negation,
using a bnegi instruction to negate the sign bit.
Jason Merrill [Thu, 1 Feb 2024 22:23:53 +0000 (17:23 -0500)]
c++: no_unique_address and constexpr [PR112439]
Here, because we don't build a CONSTRUCTOR for an empty base, we were
wrongly marking the Foo CONSTRUCTOR as complete after initializing the Empty
member. Fixed by checking empty_base here as well.
PR c++/112439
gcc/cp/ChangeLog:
* constexpr.cc (cxx_eval_store_expression): Check empty_base
before marking a CONSTRUCTOR readonly.