git.ipfire.org Git - thirdparty/gcc.git/commit

libstdc++: Conditionally use floating-point fetch_add builtins

- Some hardware has support for floating point atomic fetch_add (and
  similar).
- There are existing compilers targetting this hardware that use
  libstdc++ -- e.g. NVC++.
- Since the libstdc++ atomic<float>::fetch_add and similar is written
  directly as a CAS loop these compilers can not emit optimal code when
  seeing such constructs.
- I hope to use __atomic_fetch_add builtins on floating point types
  directly in libstdc++ so these compilers can emit better code.
- Clang already handles some floating point types in the
  __atomic_fetch_add family of builtins.
- In order to only use this when available, I originally thought I could
  check against the resolved versions of the builtin in a manner
  something like `__has_builtin(__atomic_fetch_add_<fp-suffix>)`.
  I then realised that clang does not expose resolved versions of these
  atomic builtins to the user.
  From the clang discourse it was suggested we instead use SFINAE (which
  clang already supports).
- I have recently pushed a patch for allowing the use of SFINAE on
  builtins: r15-6042-g9ed094a817ecaf
  Now that patch is committed, this patch does not change what happens
  for GCC, while it uses the builtin for codegen with clang.
- I have previously sent a patchset upstream adding the ability to use
  __atomic_fetch_add and similar on floating point types.
  https://gcc.gnu.org/pipermail/gcc-patches/2024-November/668754.html
  Once that patchset is upstream (plus the automatic linking of
  libatomic as Joseph pointed out in the email below
  https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665408.html )
  then current GCC should start to use the builtin branch added in this
  patch.

So *currently*, this patch allows external compilers (NVC++ in
particular) to generate better code, and similarly lets clang understand
the operation better since it maps to a known builtin.

I hope that by GCC 16 this patch would also allow GCC to understand the
operation better via mapping to a known builtin.

libstdc++-v3/ChangeLog:

* include/bits/atomic_base.h (__atomic_fetch_addable): Define
new concept.
(__atomic_impl::__fetch_add_flt): Use new concept to make use of
__atomic_fetch_add when available.
(__atomic_fetch_subtractable, __fetch_sub_flt): Likewise.
(__atomic_add_fetchable, __add_fetch_flt): Likewise.
(__atomic_sub_fetchable, __sub_fetch_flt): Likewise.

Signed-off-by: Matthew Malcomson <mmalcomson@nvidia.com>
Co-authored-by: Jonathan Wakely <jwakely@redhat.com>

author	Matthew Malcomson <mmalcomson@nvidia.com>
	Fri, 7 Feb 2025 14:49:11 +0000 (14:49 +0000)
committer	Jonathan Wakely <redi@gcc.gnu.org>
	Fri, 14 Feb 2025 14:50:02 +0000 (14:50 +0000)
commit	5ced917508eee7eb499e19feeb3def1fa1842bb4
tree	61b034bf2987ece6564a04a0e7b4cd552a317739	tree
parent	b01664a6197f57615d3c62594037c575dfdd9035	commit \| diff