Iain Buclaw [Tue, 7 Nov 2023 13:04:07 +0000 (14:04 +0100)]
libphobos: Fix regression d21 loops in getCpuInfo0B in Solaris/x86 kernel zone
This function assumes that cpuid would return "invalid domain" when a
sub-leaf index greater than what's supported is requested. This turned
out not to always be the case when running on some virtual machines.
As the loop only does anything for levels 0 and 1, make that a hard
limit for number of times the loop is ran.
PR d/112408
libphobos/ChangeLog:
* libdruntime/core/cpuid.d (getCpuInfo0B): Limit number of times loop
runs.
Kewen Lin [Thu, 12 Oct 2023 05:05:03 +0000 (00:05 -0500)]
rs6000: Make 32 bit stack_protect support prefixed insn [PR111367]
As PR111367 shows, with prefixed insn supported, some of
checkings consider it's able to leverage prefixed insn
for stack protect related load/store, but since we don't
actually change the emitted assembly for 32 bit, it can
cause the assembler error as exposed.
Mike's commit r10-4547-gce6a6c007e5a98 has already handled
the 64 bit case (DImode), this patch is to treat the 32
bit case (SImode) by making use of mode iterator P and
ptrload attribute iterator, also fixes the constraints
to match the emitted operand formats.
PR target/111367
gcc/ChangeLog:
* config/rs6000/rs6000.md (stack_protect_setsi): Support prefixed
instruction emission and incorporate to stack_protect_set<mode>.
(stack_protect_setdi): Rename to ...
(stack_protect_set<mode>): ... this, adjust constraint.
(stack_protect_testsi): Support prefixed instruction emission and
incorporate to stack_protect_test<mode>.
(stack_protect_testdi): Rename to ...
(stack_protect_test<mode>): ... this, adjust constraint.
rax is used to save and restore DFmode value. In RA both GENERAL_REGS
and SSE_REGS cost zero since we didn't disparage the
alternative in movdf_internal pattern, according to register
allocation order, GENERAL_REGS is allocated. The patch add ? for
alternative (r,v) and (v,r) just like we did for movsf/hf/bf_internal
pattern, after that we get optimal RA.
Andrew Pinski [Thu, 5 Oct 2023 19:21:19 +0000 (12:21 -0700)]
MATCH: Fix infinite loop between `vec_cond(vec_cond(a,b,0), c, d)` and `a & b`
Match has a pattern which converts `vec_cond(vec_cond(a,b,0), c, d)`
into `vec_cond(a & b, c, d)` but since in this case a is a comparison
fold will change `a & b` back into `vec_cond(a,b,0)` which causes an
infinite loop.
The best way to fix this is to enable the patterns for vec_cond(*,vec_cond,*)
only for GIMPLE so we don't get an infinite loop for fold any more.
Jonathan Wakely [Wed, 4 Oct 2023 11:07:11 +0000 (12:07 +0100)]
libstdc++: Fix testsuite failures with -O0
Backport the prune.exp change from r12-4425-g1595fe44e11a96 to fix two
testsuite failures when testing with -O0:
FAIL: 20_util/uses_allocator/69293_neg.cc (test for excess errors)
FAIL: 20_util/uses_allocator/cons_neg.cc (test for excess errors)
Also force some 20_util/integer_comparisons/ xfail tests to use -O2 so
that the errors match the dg-error directives.
Jonathan Wakely [Tue, 1 Feb 2022 14:02:56 +0000 (14:02 +0000)]
libstdc++: Add more tests for filesystem directory iterators
The PR 97731 test was added to verify a fix to the Filesystem TS code,
but we should also have the same test to avoid similar regressions in
the C++17 std::filesystem code.
Also add tests for directory_options::follow_directory_symlink
Jonathan Wakely [Tue, 21 Mar 2023 12:29:08 +0000 (12:29 +0000)]
libstdc++: Make std::filesystem::copy_file work for procfs [PR108178]
The size reported by stat is always zero for some special files such as
those under /proc, which means the current copy_file implementation
thinks there is nothing to copy. Instead of trusting the stat value, try
to read a character from a streambuf and check for EOF.
For the backport, we also need to avoid trying to use sendfile when stat
reports a zero size, so that we use streambufs to copy the file.
libstdc++-v3/ChangeLog:
PR libstdc++/108178
* src/filesystem/ops-common.h (do_copy_file): Check for empty
files by trying to read a character.
* testsuite/27_io/filesystem/operations/copy_file_108178.cc:
New test.
Jonathan Wakely [Thu, 2 Feb 2023 16:00:21 +0000 (16:00 +0000)]
libstdc++: Use ENOSYS for unsupported filesystem ops on AVR
Because avr-libc <errno.h> defines most error numbers with duplicate
values it's not sufficient to check #ifdef ENOTSUP when deciding which
std::errc constant to use for the filesystem library's __unsupported()
helper. Add a special case for AVR to always use the ENOSYS value.
libstdc++-v3/ChangeLog:
* src/filesystem/ops-common.h [AVR] (__unsupported): Always use
errc::function_not_supported instead of errc::not_supported.
Jonathan Wakely [Tue, 28 Jun 2022 08:26:12 +0000 (09:26 +0100)]
libstdc++: Do not optimize away storing pathname if it's needed
libstdc++-v3/ChangeLog:
* src/c++17/fs_dir.cc (_Dir::_Dir) [!_GLIBCXX_HAVE_OPENAT]:
Always store pathname if we don't have openat or unlinkat,
because the full path is needed to open sub-directories and
remove entries.
Alexandre Oliva [Fri, 24 Jun 2022 02:20:53 +0000 (23:20 -0300)]
libstdc++: check for openat
rtems6.0 has fdopendir, and fcntl.h defines AT_FDCWD and declares
openat, but there's no openat in libc. Adjust dir-common.h to not
assume ::openat just because of AT_FDCWD.
for libstdc++-v3/ChangeLog
* acinclude.m4 (GLIBCXX_CHECK_FILESYSTEM_DEPS): Check for
openat.
* configure, config.h.in: Rebuilt.
* src/filesystem/dir-common.h (openat): Use ::openat if
_GLIBCXX_HAVE_OPENAT.
* src/filesystem/dir.cc (dir_and_pathname): Use dirfd if
_GLIBCXX_HAVE_OPENAT.
Jonathan Wakely [Tue, 8 Feb 2022 21:05:30 +0000 (21:05 +0000)]
libstdc++: Fix directory iterator build for newlib
When building for newlib HAVE_OPENAT and HAVE_UNLINKAT are (sometimes?)
defined, but <fcntl.h> is only included when HAVE_DIRENT_H is defined.
Since directory iterators are completely useless without <dirent.h>,
just override the HAVE_OPENAT and HAVE_UNLINKAT detection when we don't
have <dirent.h>.
libstdc++-v3/ChangeLog:
* src/filesystem/dir-common.h (_GLIBCXX_HAVE_DIRFD): Undefine
when <dirent.h> is not available.
(_GLIBCXX_HAVE_UNLINKAT): Likewise.
Jonathan Wakely [Tue, 8 Feb 2022 15:57:58 +0000 (15:57 +0000)]
libstdc++: Simplify resource management in directory iterators
This replaces the _Dir constructor that takes ownership of an existing
DIR* resource with one that takes a _Dir_base rvalue instead. This means
a raw DIR* is never passed around, but is always owned by a _Dir_base
object.
libstdc++-v3/ChangeLog:
* src/c++17/fs_dir.cc (_Dir(DIR*, const path&)): Change first
parameter to _Dir_base&&.
* src/filesystem/dir-common.h (_Dir_base(DIR*)): Remove.
* src/filesystem/dir.cc (_Dir(DIR*, const path&)): Change first
parameter to _Dir_base&&.
Jonathan Wakely [Mon, 7 Feb 2022 23:36:47 +0000 (23:36 +0000)]
libstdc++: Fix filesystem::remove_all for Windows [PR104161]
The recursive_directory_iterator::__erase member was failing for
Windows, because the entry._M_type value is always file_type::none
(because _Dir_base::advance doesn't populate it for Windows) and
top.unlink uses fs::remove which sets an error using the
system_category. That meant that ec.value() was a Windows error code and
not an errno value, so the comparisons to EPERM and EISDIR failed.
Instead of depending on a specific Windows error code for attempting to
remove a directory, just use directory_entry::refresh() to query the
type first. This doesn't avoid the TOCTTOU races with directory
symlinks, but we can't avoid them on Windows without openat and
unlinkat, and creating symlinks requires admin privs on Windows anyway.
This also fixes the fs::remove_all(const path&) overload, which was
supposed to use the same logic as the other overload, but I forgot to
change it before my previous commit.
libstdc++-v3/ChangeLog:
PR libstdc++/104161
* src/c++17/fs_dir.cc (fs::recursive_directory_iterator::__erase):
[i_GLIBCXX_FILESYSTEM_IS_WINDOWS]: Refresh entry._M_type member,
instead of checking for errno values indicating a directory.
* src/c++17/fs_ops.cc (fs::remove_all(const path&)): Use similar
logic to non-throwing overload.
(fs::remove_all(const path&, error_code&)): Add comments.
* src/filesystem/ops-common.h: Likewise.
This fixes the remaining filesystem::remove_all race condition by using
POSIX openat to recurse into sub-directories and using POSIX unlinkat to
remove files. This avoids the remaining race where the directory being
removed is replaced with a symlink after the directory has been opened,
so that the filesystem::remove("subdir/file") resolves to "target/file"
instead, because "subdir" has been removed and replaced with a symlink.
The previous patch only fixed the case where the directory was replaced
with a symlink before we tried to open it, but it still used the full
(potentially compromised) path as an argument to filesystem::remove.
The first part of the fix is to use openat when recursing into a
sub-directory with recursive_directory_iterator. This means that opening
"dir/subdir" uses the file descriptor for "dir", and so is sure to open
"dir/subdir" and not "symlink/subdir". (The previous patch to use
O_NOFOLLOW already ensured we won't open "dir/symlink/" here.)
The second part of the fix is to use unlinkat for the remove_all
operation. Previously we used a directory_iterator to get the name of
each file in a directory and then used filesystem::remove(iter->path())
on that name. This meant that any checks (e.g. O_NOFOLLOW) done by the
iterator could be invalidated before the remove operation on that
pathname. The directory iterator contains an open DIR stream, which we
can use to obtain a file descriptor to pass to unlinkat. This ensures
that the file being deleted really is contained within the directory
we're iterating over, rather than using a pathname that could resolve to
some other file.
The filesystem::remove_all function previously used a (non-recursive)
filesystem::directory_iterator for each directory, and called itself
recursively for sub-directories. The new implementation uses a single
filesystem::recursive_directory_iterator object, and calls a new __erase
member function on that iterator. That new __erase member function does
the actual work of removing a file (or a directory after its contents
have been iterated over and removed) using unlinkat. That means we don't
need to expose the DIR stream or its file descriptor to the remove_all
function, it's still encapuslated by the iterator class.
It would be possible to add a __rewind member to directory iterators
too, to call rewinddir after each modification to the directory. That
would make it more likely for filesystem::remove_all to successfully
remove everything even if files are being written to the directory tree
while removing it. It's unclear if that is actually prefereable, or if
it's better to fail and report an error at the first opportunity.
The necessary APIs (openat, unlinkat, fdopendir, dirfd) are defined in
POSIX.1-2008, and in Glibc since 2.10. But if the target doesn't provide
them, the original code (with race conditions) is still used.
This also reduces the number of small memory allocations needed for
std::filesystem::remove_all, because we do not store the full path to
every directory entry that is iterated over. The new filename_only
option means we only store the filename in the directory entry, as that
is all we need in order to use openat or unlinkat.
Finally, rather than duplicating everything for the Filesystem TS, the
std::experimental::filesystem::remove_all implementation now just calls
std::filesystem::remove_all to do the work.
libstdc++-v3/ChangeLog:
PR libstdc++/104161
* acinclude.m4 (GLIBCXX_CHECK_FILESYSTEM_DEPS): Check for dirfd
and unlinkat.
* config.h.in: Regenerate.
* configure: Regenerate.
* include/bits/fs_dir.h (recursive_directory_iterator): Declare
remove_all overloads as friends.
(recursive_directory_iterator::__erase): Declare new member
function.
* include/bits/fs_fwd.h (remove, remove_all): Declare.
* src/c++17/fs_dir.cc (_Dir): Add filename_only parameter to
constructor. Pass file descriptor argument to base constructor.
(_Dir::dir_and_pathname, _Dir::open_subdir, _Dir::do_unlink)
(_Dir::unlink, _Dir::rmdir): Define new member functions.
(directory_iterator): Pass filename_only argument to _Dir
constructor.
(recursive_directory_iterator::_Dir_stack): Adjust constructor
parameters to take a _Dir rvalue instead of creating one.
(_Dir_stack::orig): Add data member for storing original path.
(_Dir_stack::report_error): Define new member function.
(__directory_iterator_nofollow): Move here from dir-common.h and
fix value to be a power of two.
(__directory_iterator_filename_only): Define new constant.
(recursive_directory_iterator): Construct _Dir object and move
into _M_dirs stack. Pass skip_permission_denied argument to first
advance call.
(recursive_directory_iterator::increment): Use _Dir::open_subdir.
(recursive_directory_iterator::__erase): Define new member
function.
* src/c++17/fs_ops.cc (ErrorReporter, do_remove_all): Remove.
(fs::remove_all): Use new recursive_directory_iterator::__erase
member function.
* src/filesystem/dir-common.h (_Dir_base): Add int parameter to
constructor and use openat to implement nofollow semantics.
(_Dir_base::fdcwd, _Dir_base::set_close_on_exec, _Dir_base::openat):
Define new member functions.
(__directory_iterator_nofollow): Move to fs_dir.cc.
* src/filesystem/dir.cc (_Dir): Pass file descriptor argument to
base constructor.
(_Dir::dir_and_pathname, _Dir::open_subdir): Define new member
functions.
(recursive_directory_iterator::_Dir_stack): Adjust constructor
parameters to take a _Dir rvalue instead of creating one.
(recursive_directory_iterator): Check for new nofollow option.
Construct _Dir object and move into _M_dirs stack. Pass
skip_permission_denied argument to first advance call.
(recursive_directory_iterator::increment): Use _Dir::open_subdir.
* src/filesystem/ops.cc (fs::remove_all): Use C++17 remove_all.
Jonathan Wakely [Sun, 23 Jan 2022 21:45:16 +0000 (21:45 +0000)]
libstdc++: Avoid symlink race in filesystem::remove_all [PR104161]
This adds a new internal flag to the filesystem::directory_iterator
constructor that makes it fail if the path is a symlink that resolves to
a directory. This prevents filesystem::remove_all from following a
symlink to a directory, rather than deleting the symlink itself.
We can also use that new flag in recursive_directory_iterator to ensure
that we don't follow symlinks if the follow_directory_symlink option is
not set.
This also moves an error check in filesystem::remove_all after the while
loop, so that errors from the directory_iterator constructor are
reproted, instead of continuing to the filesystem::remove call below.
libstdc++-v3/ChangeLog:
PR libstdc++/104161
* acinclude.m4 (GLIBCXX_CHECK_FILESYSTEM_DEPS): Check for
fdopendir.
* config.h.in: Regenerate.
* configure: Regenerate.
* src/c++17/fs_dir.cc (_Dir): Add nofollow flag to constructor
and pass it to base class constructor.
(directory_iterator): Pass nofollow flag to _Dir constructor.
(fs::recursive_directory_iterator::increment): Likewise.
* src/c++17/fs_ops.cc (do_remove_all): Use nofollow option for
directory_iterator constructor. Move error check outside loop.
* src/filesystem/dir-common.h (_Dir_base): Add nofollow flag to
constructor and when it's set use ::open with O_NOFOLLOW and
O_DIRECTORY.
* src/filesystem/dir.cc (_Dir): Add nofollow flag to constructor
and pass it to base class constructor.
(directory_iterator): Pass nofollow flag to _Dir constructor.
(fs::recursive_directory_iterator::increment): Likewise.
* src/filesystem/ops.cc (remove_all): Use nofollow option for
directory_iterator constructor. Move error check outside loop.
Jonathan Wakely [Sat, 2 Oct 2021 20:18:19 +0000 (21:18 +0100)]
libstdc++: Fix typos in std::filesystem code
There were a couple of typos in r12-4070 and r12-4071 which don't show
up when building for POSIX targets.
libstdc++-v3/ChangeLog:
* src/c++17/fs_ops.cc (create_directory): Fix typo in enum name.
* src/filesystem/ops-common.h (__last_system_error): Add
explicit cast to avoid narrowing conversion.
(do_space): Fix type in function name.
Jonathan Wakely [Tue, 11 May 2021 17:47:18 +0000 (18:47 +0100)]
libstdc++: Avoid unconditional use of errc::not_supported [PR 99327]
The errc::not_supported constant is only defined if ENOTSUP is defined,
which is not true for all targets. Many uses of errc::not_supported in
the filesystem library do not actually match the intended meaning of
ENOTSUP described by POSIX. They should be using ENOSYS instead
(i.e. errc::function_not_supported).
This change ensures that appropriate error codes are used by the
filesystem library. The remaining uses of errc::not_supported are
replaced with a call to a new helper function so that an alternative
value will be used on targets that don't support errc::not_supported.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
PR libstdc++/99327
* src/filesystem/ops-common.h (__unsupported): New function to
return a suitable error code for missing functionality.
(posix::off_t): New typedef.
(posix::*): Set errno to ENOSYS instead of ENOTSUP for no-op
fallback implementations.
(do_copy_file): Replace uses of errc::not_supported.
* src/c++17/fs_ops.cc (fs::copy, fs::copy_file, create_dir)
(fs::create_directory, fs::create_directory_symlink)
(fs::create_hard_link, fs::create_symlink, fs::current_path)
(fs::equivalent, do_stat, fs::file_size, fs::hard_link_count)
(fs::last_write_time, fs::permissions, fs::read_symlink):
Replace uses of errc::not_supported.
(fs::resize_file): Qualify off_t.
* src/filesystem/ops.cc (fs::copy, fs::copy_file, create_dir)
(fs::create_directory, fs::create_directory_symlink)
(fs::create_hard_link, fs::create_symlink, fs::current_path)
(fs::equivalent, do_stat, fs::file_size, fs::last_write_time)
(fs::permissions, fs::read_symlink, fs::system_complete):
Replace uses of errc::not_supported.
(fs::resize_file): Qualify off_t and enable unconditionally.
* testsuite/19_diagnostics/system_error/cons-1.cc: Likewise.
Jonathan Wakely [Wed, 10 Feb 2021 18:00:00 +0000 (18:00 +0000)]
libstdc++: Add utility for creating std::error_code from OS errors
This adds a helper function to encapsulate obtaining an error code for
errors from OS calls. For Windows we want to use GetLastError() and the
system error category, but otherwise just use errno and the generic
error category.
This should not be used to replace existing uses of
ec.assign(errno, generic_category()) because in those cases we really do
want to get the value of errno, not a system-specific error. Only the
cases that currently use GetLastError() are replace by this new
function.
Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:
* src/filesystem/ops-common.h (last_error): New helper function.
(filesystem::do_space): Use last_error().
* src/c++17/fs_ops.cc (fs::absolute, fs::create_hard_link)
(fs::equivalent, fs::remove, fs::temp_directory_path): Use
last_error().
* src/filesystem/ops.cc (fs::create_hard_link)
(fs::remove, fs::temp_directory_path): Likewise.
Pat Haugen [Tue, 19 Sep 2023 18:19:59 +0000 (13:19 -0500)]
Disable generation of scalar modulo instructions.
It was recently discovered that the scalar modulo instructions can suffer
noticeable performance issues for certain input values. This patch disables
their generation since the equivalent div/mul/sub sequence does not suffer
the same problem.
gcc/
* config/rs6000/rs6000.c (rs6000_rtx_costs): Check whether the
modulo instruction is disabled.
* config/rs6000/rs6000.h (RS6000_DISABLE_SCALAR_MODULO): New.
* config/rs6000/rs6000.md (mod<mode>3, *mod<mode>3): Check it.
(define_expand umod<mode>3): New.
(define_insn umod<mode>3): Rename to *umod<mode>3 and check if the modulo
instruction is disabled.
(umodti3, modti3): Check if the modulo instruction is disabled.
Tim Song [Wed, 6 Sep 2023 17:31:55 +0000 (19:31 +0200)]
libstdc++: Force _Hash_node_value_base methods inline to fix abi (PR111050)
https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=1b6f0476837205932613ddb2b3429a55c26c409d
changed _Hash_node_value_base to no longer derive from _Hash_node_base, which means
that its member functions expect _M_storage to be at a different offset. So explosions
result if an out-of-line definition is emitted for any of the member functions (say,
in a non-optimized build) and the resulting object file is then linked with code built
using older version of GCC/libstdc++.
Jonathan Wakely [Wed, 9 Aug 2023 10:11:31 +0000 (11:11 +0100)]
libstdc++: Fix constexpr functions to conform to older standards
Some constexpr functions were inadvertently relying on relaxed constexpr
rules from later standards.
libstdc++-v3/ChangeLog:
* include/experimental/bits/fs_path.h (path::string): Use
_GLIBCXX17_CONSTEXPR not _GLIBCXX_CONSTEXPR for 'if constexpr'.
* include/std/charconv (__to_chars_8): Initialize variable for
C++17 constexpr rules.
aarch64: Make stack smash canary protect saved registers
AArch64 normally puts the saved registers near the bottom of the frame,
immediately above any dynamic allocations. But this means that a
stack-smash attack on those dynamic allocations could overwrite the
saved registers without needing to reach as far as the stack smash
canary.
The same thing could also happen for variable-sized arguments that are
passed by value, since those are allocated before a call and popped on
return.
This patch avoids that by putting the locals (and thus the canary) below
the saved registers when stack smash protection is active.
The patch fixes CVE-2023-4039.
gcc/
* config/aarch64/aarch64.c (aarch64_save_regs_above_locals_p):
New function.
(aarch64_layout_frame): Use it to decide whether locals should
go above or below the saved registers.
(aarch64_expand_prologue): Update stack layout comment.
Emit a stack tie after the final adjustment.
gcc/testsuite/
* gcc.target/aarch64/stack-protector-8.c: New test.
* gcc.target/aarch64/stack-protector-9.c: Likewise.
After previous patches, it's no longer necessary to store
saved_regs_size and below_hard_fp_saved_regs_size in the frame info.
All measurements instead use the top or bottom of the frame as
reference points.
aarch64: Explicitly record probe registers in frame info
The stack frame is currently divided into three areas:
A: the area above the hard frame pointer
B: the SVE saves below the hard frame pointer
C: the outgoing arguments
If the stack frame is allocated in one chunk, the allocation needs a
probe if the frame size is >= guard_size - 1KiB. In addition, if the
function is not a leaf function, it must probe an address no more than
1KiB above the outgoing SP. We ensured the second condition by
(1) using single-chunk allocations for non-leaf functions only if
the link register save slot is within 512 bytes of the bottom
of the frame; and
(2) using the link register save as a probe (meaning, for instance,
that it can't be individually shrink wrapped)
If instead the stack is allocated in multiple chunks, then:
* an allocation involving only the outgoing arguments (C above) requires
a probe if the allocation size is > 1KiB
* any other allocation requires a probe if the allocation size
is >= guard_size - 1KiB
* second and subsequent allocations require the previous allocation
to probe at the bottom of the allocated area, regardless of the size
of that previous allocation
The final point means that, unlike for single allocations,
it can be necessary to have both a non-SVE register probe and
an SVE register probe. For example:
* allocate A, probe using a non-SVE register save
* allocate B, probe using an SVE register save
* allocate C
The non-SVE register used in this case was again the link register.
It was previously used even if the link register save slot was some
bytes above the bottom of the non-SVE register saves, but an earlier
patch avoided that by putting the link register save slot first.
As a belt-and-braces fix, this patch explicitly records which
probe registers we're using and allows the non-SVE probe to be
whichever register comes first (as for SVE).
The patch also avoids unnecessary probes in sve/pcs/stack_clash_3.c.
gcc/
* config/aarch64/aarch64.h (aarch64_frame::sve_save_and_probe)
(aarch64_frame::hard_fp_save_and_probe): New fields.
* config/aarch64/aarch64.c (aarch64_layout_frame): Initialize them.
Rather than asserting that a leaf function saves LR, instead assert
that a leaf function saves something.
(aarch64_get_separate_components): Prevent the chosen probe
registers from being individually shrink-wrapped.
(aarch64_allocate_and_probe_stack_space): Remove workaround for
probe registers that aren't at the bottom of the previous allocation.
Previous patches ensured that the final frame allocation only needs
a probe when the size is strictly greater than 1KiB. It's therefore
safe to use the normal 1024 probe offset in all cases.
The main motivation for doing this is to simplify the code and
remove the number of special cases.
gcc/
* config/aarch64/aarch64.c (aarch64_allocate_and_probe_stack_space):
Always probe the residual allocation at offset 1024, asserting
that that is in range.
gcc/testsuite/
* gcc.target/aarch64/stack-check-prologue-17.c: Expect the probe
to be at offset 1024 rather than offset 0.
* gcc.target/aarch64/stack-check-prologue-18.c: Likewise.
-fstack-clash-protection uses the save of LR as a probe for the next
allocation. The next allocation could be:
* another part of the static frame, e.g. when allocating SVE save slots
or outgoing arguments
* an alloca in the same function
* an allocation made by a callee function
However, when -fomit-frame-pointer is used, the LR save slot is placed
above the other GPR save slots. It could therefore be up to 80 bytes
above the base of the GPR save area (which is also the hard fp address).
aarch64_allocate_and_probe_stack_space took this into account when
deciding how much subsequent space could be allocated without needing
a probe. However, it interacted badly with:
/* If doing a small final adjustment, we always probe at offset 0.
This is done to avoid issues when LR is not at position 0 or when
the final adjustment is smaller than the probing offset. */
else if (final_adjustment_p && rounded_size == 0)
residual_probe_offset = 0;
which forces any allocation that is smaller than the guard page size
to be probed at offset 0 rather than the usual offset 1024. It was
therefore possible to construct cases in which we had:
* a probe using LR at SP + 80 bytes (or some other value >= 16)
* an allocation of the guard page size - 16 bytes
* a probe at SP + 0
which allocates guard page size + 64 consecutive unprobed bytes.
This patch requires the LR probe to be in the first 16 bytes of the
save area when stack clash protection is active. Doing it
unconditionally would cause code-quality regressions, but a later
patch deals with that.
The new comment doesn't say that the probe register is required
to be LR, since a later patch removes that restriction.
gcc/
* config/aarch64/aarch64.c (aarch64_layout_frame): Ensure that
the LR save slot is in the first 16 bytes of the register save area.
(aarch64_allocate_and_probe_stack_space): Remove workaround for
when LR was not in the first 16 bytes.
gcc/testsuite/
* gcc.target/aarch64/stack-check-prologue-18.c: New test.
The AArch64 ABI says that, when stack clash protection is used,
there can be a maximum of 1KiB of unprobed space at sp on entry
to a function. Therefore, we need to probe when allocating
>= guard_size - 1KiB of data (>= rather than >). This is what
GCC does.
If an allocation is exactly guard_size bytes, it is enough to allocate
those bytes and probe once at offset 1024. It isn't possible to use a
single probe at any other offset: higher would conmplicate later code,
by leaving more unprobed space than usual, while lower would risk
leaving an entire page unprobed. For simplicity, the code probes all
allocations at offset 1024.
Some register saves also act as probes. If we need to allocate
more space below the last such register save probe, we need to
probe the allocation if it is > 1KiB. Again, this allocation is
then sometimes (but not always) probed at offset 1024. This sort of
allocation is currently only used for outgoing arguments, which are
rarely this big.
However, the code also probed if this final outgoing-arguments
allocation was == 1KiB, rather than just > 1KiB. This isn't
necessary, since the register save then probes at offset 1024
as required. Continuing to probe allocations of exactly 1KiB
would complicate later patches.
gcc/
* config/aarch64/aarch64.c (aarch64_allocate_and_probe_stack_space):
Don't probe final allocations that are exactly 1KiB in size (after
unprobed space above the final allocation has been deducted).
gcc/testsuite/
* gcc.target/aarch64/stack-check-prologue-17.c: New test.
After previous patches, it no longer really makes sense to allocate
the top of the frame in terms of varargs_and_saved_regs_size and
saved_regs_and_above.
gcc/
* config/aarch64/aarch64.c (aarch64_layout_frame): Simplify
the allocation of the top of the frame.
aarch64: Measure reg_offset from the bottom of the frame
reg_offset was measured from the bottom of the saved register area.
This made perfect sense with the original layout, since the bottom
of the saved register area was also the hard frame pointer address.
It became slightly less obvious with SVE, since we save SVE
registers below the hard frame pointer, but it still made sense.
However, if we want to allow different frame layouts, it's more
convenient and obvious to measure reg_offset from the bottom of
the frame. After previous patches, it's also a slight simplification
in its own right.
gcc/
* config/aarch64/aarch64.h (aarch64_frame): Add comment above
reg_offset.
* config/aarch64/aarch64.c (aarch64_layout_frame): Walk offsets
from the bottom of the frame, rather than the bottom of the saved
register area. Measure reg_offset from the bottom of the frame
rather than the bottom of the saved register area.
(aarch64_save_callee_saves): Update accordingly.
(aarch64_restore_callee_saves): Likewise.
(aarch64_get_separate_components): Likewise.
(aarch64_process_components): Likewise.
aarch64: Rename hard_fp_offset to bytes_above_hard_fp
Similarly to the previous locals_offset patch, hard_fp_offset
was described as:
/* Offset from the base of the frame (incomming SP) to the
hard_frame_pointer. This value is always a multiple of
STACK_BOUNDARY. */
poly_int64 hard_fp_offset;
which again took an “upside-down” view: higher offsets meant lower
addresses. This patch renames the field to bytes_above_hard_fp instead.
aarch64: Rename locals_offset to bytes_above_locals
locals_offset was described as:
/* Offset from the base of the frame (incomming SP) to the
top of the locals area. This value is always a multiple of
STACK_BOUNDARY. */
This is implicitly an “upside down” view of the frame: the incoming
SP is at offset 0, and anything N bytes below the incoming SP is at
offset N (rather than -N).
However, reg_offset instead uses a “right way up” view; that is,
it views offsets in address terms. Something above X is at a
positive offset from X and something below X is at a negative
offset from X.
Also, even on FRAME_GROWS_DOWNWARD targets like AArch64,
target-independent code views offsets in address terms too:
locals are allocated at negative offsets to virtual_stack_vars.
It seems confusing to have *_offset fields of the same structure
using different polarities like this. This patch tries to avoid
that by renaming locals_offset to bytes_above_locals.