git.ipfire.org Git - thirdparty/gcc.git/log

openmp: Improve testsuite/libgomp.c/affinity-1.c testcase

I've noticed that while I have added hopefully sufficient test coverage
for the case where one uses simple number or !number as p-interval,
I haven't added any coverage for number:len:stride or number:len.

This patch adds that.

2021-10-15 Jakub Jelinek <jakub@redhat.com>

* testsuite/libgomp.c/affinity-1.c (struct places): Change name field
type from char [50] to const char *.
(places_array): Add a testcase for simplified syntax place followed
by length or length and stride.

(cherry picked from commit a10794eafb151b9274d673dfae93459d437cbe4a)

openmp: Handle OpenMP 5.1 simplified OMP_PLACES syntax

In addition to adding ll_caches and numa_domain abstract names
to OMP_PLACES syntax, OpenMP 5.1 also added one syntax simplification:
https://github.com/OpenMP/spec/issues/2080
https://github.com/OpenMP/spec/pull/2081
in particular that in the grammar place non-terminal is now
not only { res-list } but also res (i.e. a non-negative integer),
which stands as a shortcut for { res }
So, one can specify OMP_PLACES=0,4,8,12 with the meaning
OMP_PLACES={0},{4},{8},{12} or OMP_PLACES=0:4 instead of OMP_PLACES={0}:4
or OMP_PLACES={0},{1},{2},{3} etc.

This patch implements that.

2021-10-15 Jakub Jelinek <jakub@redhat.com>

* env.c (parse_one_place): Handle non-negative-number the same
as { non-negative-number }. Reject even !number:1 and
!number:1:stride or !place:1 or !place:1:stride instead of just
length other than 1.
* libgomp.texi (OpenMP 5.1): Document OMP_PLACES syntax extensions
and OMP_NUM_TEAMS/OMP_TEAMS_THREAD_LIMIT and
omp_{set_num,get_max}_teams/omp_{s,g}et_teams_thread_limit features
as implemented.
* testsuite/libgomp.c/affinity-1.c: Add a test for the 5.1 place
simplified syntax.

(cherry picked from commit 4a0fed0c0c7241562eaa5f1a4c916b689429ad86)

openmp: Fix up strtoul and strtoull uses in libgomp

Yesterday when working on numa_domains, I've noticed because of a bug
in my patch a hang on a large NUMA machine.  I've fixed the bug, but
also discovered that the hang was a result of making wrong assumptions
about strtoul/strtoull.  All the uses were for portability setting
errno = 0 before the calls and treating non-zero errno after the call
as invalid input, but for the case where there are no valid digits at
all strtoul may set errno to EINVAL, but doesn't have to and with
glibc doesn't do that.  So, this patch goes through all the strtoul calls
and next to errno != 0 checks adds also endptr == startptr check.
Haven't done it in places where we immediately reject strtoul returning 0
the same as we reject errno != 0, because strtoul must return 0 in the
case where it sets endptr to the start pointer.  In some spots the code
was using errno = 0; x = strtoul (p, &p, 10); if (errno) { /*invalid*/ }
and those spots had to be changed to
errno = 0; x = strtoul (p, &end, 10); if (errno || end == p) { /*invalid*/ }
p = end;

2021-10-15  Jakub Jelinek  <jakub@redhat.com>

* env.c (parse_schedule): For strtoul or strtoull calls which don't
clearly reject return value 0 as invalid handle the case where end
pointer is the same as first argument as invalid.
(parse_unsigned_long_1): Likewise.
(parse_one_place): Likewise.
(parse_places_var): Likewise.
(parse_stacksize): Likewise.
(parse_spincount): Likewise.
(parse_affinity): Likewise.
(parse_gomp_openacc_dim): Likewise.  Avoid strict aliasing violation.
Make code valid C89.
* config/linux/affinity.c (gomp_affinity_find_last_cache_level):
For strtoul calls which don't clearly reject return value 0 as
invalid handle the case where end pointer is the same as first
argument as invalid.
(gomp_affinity_init_level_1): Likewise.
(gomp_affinity_init_numa_domains): Likewise.
* config/rtems/proc.c (parse_thread_pools): Likewise.

(cherry picked from commit c057ed9c52c6a63a1a692268f916b1a9131cd4b7)

openmp: Fix up handling of OMP_PLACES=threads(1)

When writing the places-*.c tests, I've noticed that we mishandle threads
abstract name with specified num-places if num-places isn't a multiple of
number of hw threads in a core. It then happily ignores the maximum count
and overwrites for the remaining hw threads in a core further places that
haven't been allocated.

2021-10-15 Jakub Jelinek <jakub@redhat.com>

* config/linux/affinity.c (gomp_affinity_init_level_1): For level 1
after creating count places clean up and return immediately.
* testsuite/libgomp.c/places-6.c: New test.
* testsuite/libgomp.c/places-7.c: New test.
* testsuite/libgomp.c/places-8.c: New test.
* testsuite/libgomp.c/places-9.c: New test.
* testsuite/libgomp.c/places-10.c: New test.

(cherry picked from commit 4764049dd620affcd3e2658dc7f03a6616370a29)

openmp: Add support for OMP_PLACES=numa_domains

This adds support for numa_domains abstract name in OMP_PLACES, also new
in OpenMP 5.1.

Way to test this is
OMP_PLACES=numa_domains OMP_DISPLAY_ENV=true LD_PRELOAD=.libs/libgomp.so.1 /bin/true
and see what it prints on OMP_PLACES line.
For non-NUMA machines it should print a single place that covers all CPUs,
for NUMA machine one place for each NUMA node with corresponding CPUs.

2021-10-15 Jakub Jelinek <jakub@redhat.com>

* env.c (parse_places_var): Handle numa_domains as level 5.
* config/linux/affinity.c (gomp_affinity_init_numa_domains): New
function.
(gomp_affinity_init_level): Use it instead of
gomp_affinity_init_level_1 for level == 5.
* testsuite/libgomp.c/places-5.c: New test.

(cherry picked from commit e7ce32c783c8d38ef8a1ab227fd05cbab41da75b)

openmp: Add support for OMP_PLACES=ll_caches

This patch implements support for ll_caches abstract name in OMP_PLACES,
which stands for places where logical cpus in each place share the last
level cache.

This seems to work fine for me on x86 and kernel sources show that it is
in common code, but on some machines on CompileFarm the files I'm using,
i.e.
/sys/devices/system/cpu/cpuN/cache/indexN/level
/sys/devices/system/cpu/cpuN/cache/indexN/shared_cpu_list
don't exist, is that because they have too old kernel and newer kernels
are fine or should I implement some fallback methods (which)?
E.g. on gcc112.fsffrance.org I see just shared_cpu_map and not shared_cpu_list
(with shared_cpu_map being harder to parse) and on another box I didn't even
see the cache subdirectories.

Way to test this is
OMP_PLACES=ll_caches OMP_DISPLAY_ENV=true LD_PRELOAD=.libs/libgomp.so.1 /bin/true
and see what it prints on OMP_PLACES line.

2021-10-15 Jakub Jelinek <jakub@redhat.com>

* env.c (parse_places_var): Handle ll_caches as level 4.
* config/linux/affinity.c (gomp_affinity_find_last_cache_level): New
function.
(gomp_affinity_init_level_1): Handle level 4 as logical cpus sharing
last level cache.
(gomp_affinity_init_level): Likewise.
* testsuite/libgomp.c/places-1.c: New test.
* testsuite/libgomp.c/places-2.c: New test.
* testsuite/libgomp.c/places-3.c: New test.
* testsuite/libgomp.c/places-4.c: New test.

(cherry picked from commit 5809be05a2813f2a95d9787f388185fa31fbf3a2)

openmp: Mark declare variant directive in documentation as supported in Fortran

2021-10-14 Kwok Cheung Yeung <kcy@codesourcery.com>

libgomp/
* libgomp.texi (OpenMP 5.0): Update entry for declare variant
directive.

(cherry picked from commit 2c4666fb0686a8f5a55821f1527351dc71c018b4)

openmp, fortran: Add support for OpenMP declare variant directive in Fortran

2021-10-14  Kwok Cheung Yeung  <kcy@codesourcery.com>

gcc/c-family/

* c-omp.c (c_omp_check_context_selector): Rename to
omp_check_context_selector and move to omp-general.c.
(c_omp_mark_declare_variant): Rename to omp_mark_declare_variant and
move to omp-general.c.

gcc/c/

* c-parser.c (c_finish_omp_declare_variant): Change call from
c_omp_check_context_selector to omp_check_context_selector. Change
call from c_omp_mark_declare_variant to omp_mark_declare_variant.

gcc/cp/

* decl.c (omp_declare_variant_finalize_one): Change call from
c_omp_mark_declare_variant to omp_mark_declare_variant.
* parser.c (cp_finish_omp_declare_variant): Change call from
c_omp_check_context_selector to omp_check_context_selector.

gcc/fortran/

* gfortran.h (enum gfc_statement): Add ST_OMP_DECLARE_VARIANT.
(enum gfc_omp_trait_property_kind): New.
(struct gfc_omp_trait_property): New.
(gfc_get_omp_trait_property): New macro.
(struct gfc_omp_selector): New.
(gfc_get_omp_selector): New macro.
(struct gfc_omp_set_selector): New.
(gfc_get_omp_set_selector): New macro.
(struct gfc_omp_declare_variant): New.
(gfc_get_omp_declare_variant): New macro.
(struct gfc_namespace): Add omp_declare_variant field.
(gfc_free_omp_declare_variant_list): New prototype.
* match.h (gfc_match_omp_declare_variant): New prototype.
* openmp.c (gfc_free_omp_trait_property_list): New.
(gfc_free_omp_selector_list): New.
(gfc_free_omp_set_selector_list): New.
(gfc_free_omp_declare_variant_list): New.
(gfc_match_omp_clauses): Add extra optional argument.  Handle end of
clauses for context selectors.
(omp_construct_selectors, omp_device_selectors,
omp_implementation_selectors, omp_user_selectors): New.
(gfc_match_omp_context_selector): New.
(gfc_match_omp_context_selector_specification): New.
(gfc_match_omp_declare_variant): New.
* parse.c: Include tree-core.h and omp-general.h.
(decode_omp_directive): Handle 'declare variant'.
(case_omp_decl): Include ST_OMP_DECLARE_VARIANT.
(gfc_ascii_statement): Handle ST_OMP_DECLARE_VARIANT.
(gfc_parse_file): Initialize omp_requires_mask.
* symbol.c (gfc_free_namespace): Call
gfc_free_omp_declare_variant_list.
* trans-decl.c (gfc_get_extern_function_decl): Call
gfc_trans_omp_declare_variant.
(gfc_create_function_decl): Call gfc_trans_omp_declare_variant.
* trans-openmp.c (gfc_trans_omp_declare_variant): New.
* trans-stmt.h (gfc_trans_omp_declare_variant): New prototype.

gcc/

* omp-general.c (omp_check_context_selector):  Move from c-omp.c.
(omp_mark_declare_variant): Move from c-omp.c.
(omp_context_name_list_prop): Update for Fortran strings.
* omp-general.h (omp_check_context_selector): New prototype.
(omp_mark_declare_variant): New prototype.

gcc/testsuite/

* gfortran.dg/gomp/declare-variant-1.f90: New test.
* gfortran.dg/gomp/declare-variant-10.f90: New test.
* gfortran.dg/gomp/declare-variant-11.f90: New test.
* gfortran.dg/gomp/declare-variant-12.f90: New test.
* gfortran.dg/gomp/declare-variant-13.f90: New test.
* gfortran.dg/gomp/declare-variant-14.f90: New test.
* gfortran.dg/gomp/declare-variant-15.f90: New test.
* gfortran.dg/gomp/declare-variant-16.f90: New test.
* gfortran.dg/gomp/declare-variant-17.f90: New test.
* gfortran.dg/gomp/declare-variant-18.f90: New test.
* gfortran.dg/gomp/declare-variant-19.f90: New test.
* gfortran.dg/gomp/declare-variant-2.f90: New test.
* gfortran.dg/gomp/declare-variant-2a.f90: New test.
* gfortran.dg/gomp/declare-variant-3.f90: New test.
* gfortran.dg/gomp/declare-variant-4.f90: New test.
* gfortran.dg/gomp/declare-variant-5.f90: New test.
* gfortran.dg/gomp/declare-variant-6.f90: New test.
* gfortran.dg/gomp/declare-variant-7.f90: New test.
* gfortran.dg/gomp/declare-variant-8.f90: New test.
* gfortran.dg/gomp/declare-variant-9.f90: New test.

libgomp/

* testsuite/libgomp.fortran/declare-variant-1.f90: New test.

(cherry picked from commit 724ee5a0093da443563ae98ec5cb76164c36be80)

gomp/target-device-ancestor-*.f90: Fix testcase of OG11

Contrary to GCC 12 mainline, OG11 defers the error for
'omp requires reverse_offload' until runtime (via libgomp).
Update the testcases accordingly.

gcc/testsuite/ChangeLog:

* gfortran.dg/gomp/target-device-ancestor-2.f90: Remove dg-error
for the requires-reverse_offload sorry.
* gfortran.dg/gomp/target-device-ancestor-3.f90: Likewise.
* gfortran.dg/gomp/target-device-ancestor-4.f90: Likewise.

Merge remote-tracking branch 'origin/releases/gcc-11' into devel/omp/gcc-11

Merge up to r11-9146-gb707ac10d5a9af6d77fd21d9d7996bdd0264d28e (Oct 13, 21)

Fortran: Fix Bind(C) Array-Descriptor Conversion

NOTE: This patch has been submitted for mainline at
https://gcc.gnu.org/pipermail/gcc-patches/2021-October/581575.html
but is not yet on mainline. Hence, this is not a cherry pick.

gfortran uses internally a different array descriptor ("gfc") as
Fortran 2018 alias TS291113 defines for C interoperability via
ISO_Fortran_binding.h ("CFI").  Hence, when calling a C function
from Fortran, it has to be converted in the callee - and if a
BIND(C) procedure is written in Fortran, the CFI argument has
to be converted to gfc in order work with the rest of the FE
code and the library calls.

Before this patch, part was handled in the FE generated code and
other parts in libgfortran.  With this patch, all code is generated
and CFI is defined as proper type - visible in the debugger and to
the middle end - avoiding both alias issues and missed optimization
issues.

This patch also fixes issues like: intent(out) deallocation in
the bind(C) callee, using the CFI descriptor also for allocatable
and pointer scalars and for len=* character strings.
For 'select rank', it also optimizes the code + avoid accessing
uninitialized memory if the dummy argument is allocatable/a pointer.
It additionally rejects passing a descriptorless type(*) to an
assumed-rank dummy argument. [F2018:C711]

PR fortran/102086
PR fortran/92189
PR fortran/92621
PR fortran/101308
PR fortran/101309
PR fortran/101635
PR fortran/92482

gcc/fortran/ChangeLog:

        * decl.c (gfc_verify_c_interop_param): Remove 'sorry' for
        scalar allocatable/pointer and len=*.
        * expr.c (is_CFI_desc): Return true for for those.
* gfortran.h (CFI_type_kind_shift, CFI_type_mask,
CFI_type_from_type_kind, CFI_VERSION, CFI_MAX_RANK,
CFI_attribute_pointer, CFI_attribute_allocatable,
CFI_attribute_other, CFI_type_Integer, CFI_type_Logical,
CFI_type_Real, CFI_type_Complex, CFI_type_Character,
CFI_type_ucs4_char, CFI_type_struct, CFI_type_cptr,
CFI_type_cfunptr, CFI_type_other): New #define.
* trans-array.c (CFI_FIELD_BASE_ADDR, CFI_FIELD_ELEM_LEN,
CFI_FIELD_VERSION, CFI_FIELD_RANK, CFI_FIELD_ATTRIBUTE,
CFI_FIELD_TYPE, CFI_FIELD_DIM, CFI_DIM_FIELD_LOWER_BOUND,
CFI_DIM_FIELD_EXTENT, CFI_DIM_FIELD_SM,
gfc_get_cfi_descriptor_field, gfc_get_cfi_desc_base_addr,
gfc_get_cfi_desc_elem_len, gfc_get_cfi_desc_version,
gfc_get_cfi_desc_rank, gfc_get_cfi_desc_type,
gfc_get_cfi_desc_attribute, gfc_get_cfi_dim_item,
gfc_get_cfi_dim_lbound, gfc_get_cfi_dim_extent, gfc_get_cfi_dim_sm):
New define/functions to access the CFI array descriptor.
(gfc_conv_descriptor_type): New function for the GFC descriptor.
(gfc_get_array_span): Handle expr of CFI descriptors and
assumed-type descriptors.
(gfc_trans_array_bounds): Remove 'static'.
(gfc_conv_expr_descriptor): For assumed type, use the dtype of
the actual argument.
(structure_alloc_comps): Remove ' ' inside tabs.
* trans-array.h (gfc_trans_array_bounds, gfc_conv_descriptor_type,
gfc_get_cfi_desc_base_addr, gfc_get_cfi_desc_elem_len,
gfc_get_cfi_desc_version, gfc_get_cfi_desc_rank,
gfc_get_cfi_desc_type, gfc_get_cfi_desc_attribute,
gfc_get_cfi_dim_lbound, gfc_get_cfi_dim_extent, gfc_get_cfi_dim_sm):
New prototypes.
* trans-decl.c (gfor_fndecl_cfi_to_gfc, gfor_fndecl_gfc_to_cfi):
Remove global vars.
(gfc_build_builtin_function_decls): Remove their initialization.
(gfc_get_symbol_decl, create_function_arglist,
(gfc_trans_deferred_vars): Update for CFI.
(convert_CFI_desc): Remove and replace by ...
(gfc_conv_cfi_to_gfc): ... this function
(gfc_generate_function_code): Call it; create local GFC var for CFI.
* trans-expr.c (gfc_maybe_dereference_var): Handle CFI.
(gfc_conv_subref_array_arg): Handle the if-noncontigous-only copy in
when the result should be a descriptor.
(gfc_conv_gfc_desc_to_cfi_desc): Completely rewritten.
(gfc_conv_procedure_call): CFI fixes.
* trans-openmp.c (gfc_omp_is_optional_argument,
gfc_omp_check_optional_argument): Handle optional
CFI.
* trans-stmt.c (gfc_trans_select_rank_cases): Cleanup, avoid invalid
code for allocatable/pointer dummies, which cannot be assumed size.
* trans-types.c (gfc_cfi_descriptor_base): New global var.
(gfc_get_dtype_rank_type): Skip rank init for rank < 0.
(gfc_sym_type): Handle CFI dummies.
(gfc_get_function_type): Update call.
(gfc_get_cfi_dim_type, gfc_get_cfi_type): New.
* trans-types.h (gfc_sym_type): Update prototype.
(gfc_get_cfi_type): New prototype.
* trans.c (gfc_trans_runtime_check): Make conditions more consistent
to avoid '<logical> AND_THEN <long int>' in conditions.
* trans.h (gfor_fndecl_cfi_to_gfc, gfor_fndecl_gfc_to_cfi): Remove
global-var declaration.

libgfortran/ChangeLog:

* ISO_Fortran_binding.h (CFI_type_cfunptr): Make unique type again.
* runtime/ISO_Fortran_binding.c (cfi_desc_to_gfc_desc,
gfc_desc_to_cfi_desc): Add comment that those are no longer called
by new code.

libgomp/ChangeLog:

* testsuite/libgomp.fortran/optional-bind-c.f90: New test.

gcc/testsuite/ChangeLog:

* gfortran.dg/ISO_Fortran_binding_4.f90: Extend testcase.
* gfortran.dg/PR100914.f90: Remove xfail.
* gfortran.dg/PR100915.c: Expect CFI_type_cfunptr.
* gfortran.dg/PR100915.f90: Handle CFI_type_cfunptr != CFI_type_cptr.
* gfortran.dg/PR93963.f90: Extend select-rank tests.
* gfortran.dg/bind-c-intent-out.f90: Change to dg-do run,
update scan-dump.
* gfortran.dg/bind_c_array_params_2.f90: Update/extend scan-dump.
* gfortran.dg/bind_c_char_10.f90: Update scan-dump.
* gfortran.dg/bind_c_char_8.f90: Remove dg-error "sorry".
* gfortran.dg/c-interop/allocatable-dummy.f90: Remove xfail.
* gfortran.dg/c-interop/c1255-1.f90: Likewise.
* gfortran.dg/c-interop/c407c-1.f90: Update dg-error.
* gfortran.dg/c-interop/cf-descriptor-5.f90: Remove xfail.
* gfortran.dg/c-interop/cf-out-descriptor-3.f90: Likewise.
* gfortran.dg/c-interop/cf-out-descriptor-4.f90: Likewise.
* gfortran.dg/c-interop/cf-out-descriptor-5.f90: Likewise.
* gfortran.dg/c-interop/contiguous-2.f90: Likewise.
* gfortran.dg/c-interop/contiguous-3.f90: Likewise.
* gfortran.dg/c-interop/deferred-character-1.f90: Likewise.
* gfortran.dg/c-interop/deferred-character-2.f90: Likewise.
* gfortran.dg/c-interop/fc-descriptor-3.f90: Likewise.
* gfortran.dg/c-interop/fc-descriptor-5.f90: Likewise.
* gfortran.dg/c-interop/fc-descriptor-6.f90: Likewise.
* gfortran.dg/c-interop/fc-out-descriptor-3.f90: Likewise.
* gfortran.dg/c-interop/fc-out-descriptor-4.f90: Likewise.
* gfortran.dg/c-interop/fc-out-descriptor-5.f90: Likewise.
* gfortran.dg/c-interop/fc-out-descriptor-6.f90: Likewise.
* gfortran.dg/c-interop/ff-descriptor-5.f90: Likewise.
* gfortran.dg/c-interop/ff-descriptor-6.f90: Likewise.
* gfortran.dg/c-interop/fc-descriptor-7.f90: Remove xfail + extend.
* gfortran.dg/c-interop/fc-descriptor-7-c.c: Update for changes.
* gfortran.dg/c-interop/shape.f90: Add implicit none.
* gfortran.dg/c-interop/typecodes-array-char-c.c: Add kind=4 char.
* gfortran.dg/c-interop/typecodes-array-char.f90: Likewise.
* gfortran.dg/c-interop/typecodes-array-float128.f90: Remove xfail.
* gfortran.dg/c-interop/typecodes-scalar-basic.f90: Likewise.
* gfortran.dg/c-interop/typecodes-scalar-float128.f90: Likewise.
* gfortran.dg/c-interop/typecodes-scalar-int128.f90: Likewise.
* gfortran.dg/c-interop/typecodes-scalar-longdouble.f90: Likewise.
* gfortran.dg/iso_c_binding_char_1.f90: Remove dg-error "sorry".
* gfortran.dg/pr93792.f90: Turn XFAIL into PASS.
* gfortran.dg/ISO_Fortran_binding_19.f90: New test.
* gfortran.dg/assumed_type_12.f90: New test.
* gfortran.dg/assumed_type_13.c: New test.
* gfortran.dg/assumed_type_13.f90: New test.
* gfortran.dg/bind-c-char-descr.f90: New test.
* gfortran.dg/bind-c-contiguous-1.c: New test.
* gfortran.dg/bind-c-contiguous-1.f90: New test.
* gfortran.dg/bind-c-contiguous-2.f90: New test.
* gfortran.dg/bind-c-contiguous-3.c: New test.
* gfortran.dg/bind-c-contiguous-3.f90: New test.
* gfortran.dg/bind-c-contiguous-4.c: New test.
* gfortran.dg/bind-c-contiguous-4.f90: New test.
* gfortran.dg/bind-c-contiguous-5.c: New test.
* gfortran.dg/bind-c-contiguous-5.f90: New test.

Fortran: dump-parse-tree.c fixes for OpenMP

gcc/fortran/ChangeLog:

* dump-parse-tree.c (show_omp_clauses): Handle ancestor modifier,
avoid ICE for GFC_OMP_ATOMIC_SWAP.
* gfortran.h (gfc_omp_clauses): Change 'anecestor' into a bitfield.

(cherry picked from commit 77c7abe3588d407ed820224f8d1b1a17a16831a0)

Add support for 32-bit hppa targets in muldi3 expander

2021-10-13 John David Anglin <danglin@gcc.gnu.org>

gcc/ChangeLog:

* config/pa/pa.md (muldi3): Add support for inlining 64-bit
multiplication on 32-bit PA 1.1 and 2.0 targets.

libstdc++: Fix various bugs in ranges_algo.h [PR100187, ...]

This fixes some bugs with our ranges algorithms in uncommon situations,
such as when the return type of a predicate is a non-copyable class type
that's implicitly convertible to bool (PR100187), when a comparison
predicate isn't invocable as an rvalue (PR100237), and when the return
type of a projection function is non-copyable (PR100249).

This also fixes PR100287, which reports that we're moving __first twice
when constructing with it an empty subrange in ranges::partition.

libstdc++-v3/ChangeLog:

PR libstdc++/100187
PR libstdc++/100237
PR libstdc++/100249
PR libstdc++/100287
* include/bits/ranges_algo.h (__search_n_fn::operator()): Give
the __value_comp lambda an explicit bool return type.
(__is_permutation_fn::operator()): Give the __proj_scan local
variable auto&& return type. Give the __comp_scan lambda an
explicit bool return type.
(__remove_fn::operator()): Give the __pred lambda an explicit
bool return type.
(__partition_fn::operator()): Don't std::move __first twice
when returning an empty subrange.
(__min_fn::operator()): Don't std::move __comp.
(__max_fn::operator()): Likewise.
(__minmax_fn::operator()): Likewise.

(cherry picked from commit d91e7eab3a2c3957c2220ad71e62d9fc78cccb9b)

Darwin, D: Fix bootstrap when target does not support -Bstatic/dynamic.

This fixes a bootstrap fail because saw_static_libcxx was unused for
targets without support for -Bstatic/dynamic.

The fix applied pushes the -static-libstdc++ back onto the command
line, which allows a target to substitute a static version of the
c++ standard library using specs.

Signed-off-by: Iain Sandoe <iain@sandoe.co.uk>
gcc/d/ChangeLog:

* d-spec.cc (lang_specific_driver): Push the -static-libstdc++
option back onto the command line for targets without support
for -Bstatic/dynamic.

(cherry picked from commit e24760533b62bb7068e63eb8da49dbca2837d38d)

Daily bump.

libstdc++: Fix ip::tcp::resolver test failure on Solaris

Solaris 11 does not have "http" in /etc/services, which causes this test
to fail. Try some other services until we find one that works.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

* testsuite/experimental/net/internet/resolver/ops/lookup.cc:
Try other service if "http" fails.

(cherry picked from commit 48b20d46f9597a4b1e19e0e2d4a0c68d056d7662)

libstdc++: Make Networking TS headers more portable [PR100285]

Add more preprocessor conditions to check for constants being defined
before using them, so that the Networking TS headers can be compiled on
a wider range of platforms.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

PR libstdc++/100285
* configure.ac: Check for O_NONBLOCK.
* configure: Regenerate.
* include/experimental/internet: Include <ws2tcpip.h> for
Windows. Use preprocessor conditions around more constants.
* include/experimental/socket: Use preprocessor conditions
around more constants.
* testsuite/experimental/net/internet/resolver/base.cc: Only use
constants when the corresponding C macro is defined.
* testsuite/experimental/net/socket/basic_socket.cc: Likewise.
* testsuite/experimental/net/socket/socket_base.cc: Likewise.
Make preprocessor checks more fine-grained.

libstdc++: fix is_default_constructible for hash containers [PR 100863]

The recent change to _Hashtable_ebo_helper for this PR broke the
is_default_constructible trait for a hash container with a non-default
constructible allocator. That happens because the constructor needs to
be user-provided in order to initialize the member, and so is not
defined as deleted when the type is not default constructible.

By making _Hashtable derive from _Enable_special_members we can ensure
that the default constructor for the std::unordered_xxx containers is
deleted when it would be ill-formed. This makes the trait give the
correct answer.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

PR libstdc++/100863
* include/bits/hashtable.h (_Hashtable): Conditionally delete
default constructor by deriving from _Enable_special_members.
* testsuite/23_containers/unordered_map/cons/default.cc: New test.
* testsuite/23_containers/unordered_set/cons/default.cc: New test.

(cherry picked from commit 89ec3b67dbe856a447d068b053bc19559f136f43)

libstdc++: Value-initialize objects held by EBO helpers [PR 100863]

The allocator, hash function and equality function should all be
value-initialized by the default constructor of an unordered container.
Do it in the EBO helper, so we don't have to get it right in multiple
places.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

PR libstdc++/100863
PR libstdc++/65816
* include/bits/hashtable_policy.h (_Hashtable_ebo_helper):
Value-initialize subobject.
* testsuite/23_containers/unordered_map/allocator/default_init.cc:
Remove XFAIL.
* testsuite/23_containers/unordered_set/allocator/default_init.cc:
Remove XFAIL.

(cherry picked from commit f8f0193b5b83f6e85d65015e79c803295baf5166)

libstdc++: Allow lualatex to be used for Doxygen PDF

This allows the Doxygen PDF to be built using lualatex instead of
pdflatex, which solves a problem with pdflatex running out of memory
sometimes. This is done by adding a --latex_cmd option to the
run_doxygen script, which then sets the specified command in the
generated user.cfg file used by Doxygen. The makefile is adjusted to
pass --latex_cmd=$(LATEX_CMD) to the script, so using running make with
LATEX_CMD=lualatex will override the default.

Additionally, this does some refactoring of the doc/Makefile.am rules
and the run_doxygen script.

libstdc++-v3/ChangeLog:

* doc/Makefile.am: Simplify doxygen recipes and use --latex_cmd.
* doc/Makefile.in: Regenerate.
* doc/doxygen/user.cfg.in (LATEX_CMD_NAME): Add placeholder
value.
* scripts/run_doxygen (print_usage): Always print to stdout and
do not exit.
(fail): New function for exiting on error.
(parse_options): Handle --latex_cmd. Do not treat --help the
same as errors. Simplify handling of required arguments.

(cherry picked from commit e3b6d3a887fc0df09ea742c9c5a5acbc27c11ea7)

libstdc++: Reduce output of 'make doc-pdf-doxygen'

Use '@' to prevent Make from echoing the recipe, so that users don't see
this every time:

  if [ -f ${doxygen_pdf} ]; then
    mv ${doxygen_pdf} ${api_pdf} ;
    echo ":: PDF file is ${api_pdf}";
  else
    echo "... error";
    grep -F 'LaTeX Error' ${doxygen_outdir}/latex/refman.log;
    grep -F 'TeX capacity exceeded, sorry' ${doxygen_outdir}/latex/refman.log;
    exit 12;
  fi

The presence of the "error" strings in the output makes it look like an
error happened. By suppressing the echoing user's will only see "error"
if the 'else' branch is taken.

libstdc++-v3/ChangeLog:

* doc/Makefile.am (stamp-pdf-doxygen): Improve comment about
dealing with errors. Use '@' to prevent shell command being
echoed.
* doc/Makefile.in: Regenerate.

(cherry picked from commit 43a35b26e2fd2fab9c0c3ebac67e3a6c439daef4)

libstdc++: Add warnings for some C++23 deprecations

LWG 3036 deprecates std::pmr::polymorphic_allocator<T>::destroy in
favour of the equivalent member of std::allocator_traits.

LWG 3170 deprecates std::allocator<T>::is_always_equal in favour of
the equivalent member of std::allocator_traits.

This also updates a comment to note that we support the LWG 3541 change
(even before the issue was opened).

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

* include/bits/allocator.h (allocator::is_always_equal): Deprecate.
* include/bits/iterator_concepts.h (indirectly_readable_traits):
Add LWG issue number to comment.
* include/std/memory_resource (polymorphic_allocator::release):
Deprecate.
* testsuite/20_util/allocator/requirements/typedefs.cc: Add
dg-warning for deprecation. Also check std::allocator<void>.

(cherry picked from commit 5bfcfe3087eb05b76395c9efbfc1abbf3f9e1a03)

libstdc++: Fix 17_intro/names.cc failures on Solaris

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

* testsuite/17_intro/names.cc: Undefine some more names used
by Solaris system headers.

(cherry picked from commit 69b09c5599b201ac039db564c303f7b20d87e0df)

libstdc++: Remove __gnu_cxx::rope::erase(size_type) [PR102048]

This function claims to remove a single character at index p, but it
actually removes p+1 characters beginning at p. So r.erase(0) removes
the first character, but r.erase(1) removes the second and third, and
r.erase(2) removes the second, third and fourth. This is not a useful
API.

The overload is present in the SGI STL <stl_rope.h> header that we
imported, but it isn't documented in the API reference. The erase
overloads that are documented are:

erase(const iterator& p)
erase(const iterator& f, const iterator& l)
erase(size_type i, size_type n);

Having an erase(size_type p) overload that erases a single character (as
the comment says it does) might be useful, but would be inconsistent
with std::basic_string::erase(size_type p = 0, size_type n = npos),
which erases from p to the end of the string when called with a single
argument.

Since the function isn't part of the documented API, doesn't do what it
claims to do (or anything useful) and "fixing" it would leave it
inconsistent with basic_string, I'm just removing that overload.

libstdc++-v3/ChangeLog:

PR libstdc++/102048
* include/ext/rope (rope::erase(size_type)): Remove broken
function.

(cherry picked from commit 2cd229dec8d6716938de5052479d059d306969da)

libstdc++: Skip filesystem tests that depend on permissions [PR90787]

Tests that depend on filesystem permissions FAIL if run on Windows or as
root. Add a helper function to detect those cases, so the tests can skip
those checks gracefully.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

PR libstdc++/90787
* testsuite/27_io/filesystem/iterators/directory_iterator.cc:
Use new __gnu_test::permissions_are_testable() function.
* testsuite/27_io/filesystem/iterators/recursive_directory_iterator.cc:
Likewise.
* testsuite/27_io/filesystem/operations/exists.cc: Likewise.
* testsuite/27_io/filesystem/operations/is_empty.cc: Likewise.
* testsuite/27_io/filesystem/operations/remove.cc: Likewise.
* testsuite/27_io/filesystem/operations/remove_all.cc: Likewise.
* testsuite/27_io/filesystem/operations/status.cc: Likewise.
* testsuite/27_io/filesystem/operations/symlink_status.cc:
Likewise.
* testsuite/27_io/filesystem/operations/temp_directory_path.cc:
Likewise.
* testsuite/experimental/filesystem/iterators/directory_iterator.cc:
Likewise.
* testsuite/experimental/filesystem/iterators/recursive_directory_iterator.cc:
Likewise.
* testsuite/experimental/filesystem/operations/exists.cc:
Likewise.
* testsuite/experimental/filesystem/operations/is_empty.cc:
Likewise.
* testsuite/experimental/filesystem/operations/remove.cc:
Likewise.
* testsuite/experimental/filesystem/operations/remove_all.cc:
Likewise.
* testsuite/experimental/filesystem/operations/temp_directory_path.cc:
Likewise.
* testsuite/util/testsuite_fs.h (__gnu_test::permissions_are_testable):
New function to guess whether testing permissions will work.

(cherry picked from commit 29b2fd371f18169141e20b90effa7205db68fb11)

libstdc++: Add missing std::move to ranges::copy/move/reverse_copy [PR101599]

In passing, this also renames the template parameter _O2 to _Out2 in
ranges::partition_copy and uglifies two of its function parameters,
out_true and out_false.

PR libstdc++/101599

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (__reverse_copy_fn::operator()):
Add missing std::move in return statement.
(__partition_copy_fn::operator()): Rename templtae parameter
_O2 to _Out2. Uglify function parameters out_true and out_false.
* include/bits/ranges_algobase.h (__copy_or_move): Add missing
std::move to recursive call that unwraps a __normal_iterator
output iterator.
* testsuite/25_algorithms/copy/constrained.cc (test06): New test.
* testsuite/25_algorithms/move/constrained.cc (test05): New test.

(cherry picked from commit 14d8a5ae472ca5743016f37da2dd4770d83dea21)

libstdc++: Fix up implementation of LWG 3533 [PR101589]

In r12-569 I accidentally applied the LWG 3533 change to
elements_view::iterator::base instead to elements_view::base.

This patch corrects this, and also applies the corresponding LWG 3533
change to lazy_split_view::inner-iter::base now that we implement P2210.

PR libstdc++/101589

libstdc++-v3/ChangeLog:

* include/std/ranges (lazy_split_view::_InnerIter::base): Make
the const& overload unconstrained and return a const reference
as per LWG 3533. Make unconditionally noexcept.
(elements_view::base): Revert accidental r12-569 change.
(elements_view::_Iterator::base): Make the const& overload
unconstrained and return a const reference as per LWG 3533.
Make unconditionally noexcept.

(cherry picked from commit 4414057186b227edf5b5efa527732bfcdf39d575)

libstdc++: Add missing std::move to join_view::iterator ctor [PR101483]

PR libstdc++/101483

libstdc++-v3/ChangeLog:

* include/std/ranges (join_view::_Iterator::_Iterator): Add
missing std::move.

(cherry picked from commit 0e1bb3c88c7bd624bc34d6cebe3df9532f1858f0)

libstdc++: Define split_view::_InnerIter::base as per P2210

libstdc++-v3/ChangeLog:

* include/std/ranges (split_view::_InnerIter::base): Define as
per P2210.

(cherry picked from commit 85a594f7dc8ea5c765e136f162debb668139ebd4)

libstdc++: Implement LWG 3555 changes to transform/elements_view

libstdc++-v3/ChangeLog:

* include/std/ranges (transform_view::_Iterator::_S_iter_concept):
Consider _Base instead of _Vp as per LWG 3555.
(elements_view::_Iterator::_S_iter_concept): Likewise.

(cherry picked from commit bc046a60cfdd7145fd1e644184ced04d89feb871)

libstdc++: Implement LWG 3553 changes to split_view

libstdc++-v3/ChangeLog:

* include/std/ranges (split_view::_OuterIter::value_type::begin):
Remove the non-const overload, and remove the copyable constraint
on the const overload as per LWG 3553.

(cherry picked from commit 15736576df739fdcc5e795961dae30c7b0c87967)

libstdc++: Implement LWG 3546 changes to common_iterator

libstdc++-v3/ChangeLog:

* include/bits/stl_iterator.h
(__detail::__common_iter_use_postfix_proxy): Add
move_constructible constraint as per LWG 3546.
(common_iterator::__postfix_proxy): Adjust initializer of
_M_keep as per LWG 3546.

(cherry picked from commit 4123650bd0ae53153142949ab5305eb48ec86390)

libstdc++: Implement LWG 3557 change to convertible_to

libstdc++-v3/ChangeLog:

* include/std/concepts (convertible_to): Just use declval as per
LWG 3557.

(cherry picked from commit 83faf7eacd2081a373afb6069fd923c2dc497271)

libstdc++: Move ranges algos used by <ranges> into ranges_util.h

The <ranges> header defines simplified copies of some ranges algorithms
in order to avoid including the entirety of ranges_algo.h. A subsequent
patch is going to want to use ranges::search in <ranges> as well, and
that algorithm is more complicated compared to the other copied ones.

So rather than additionally copying ranges::search into <ranges>, this
patch splits out all the ranges algos used by <ranges> (including
ranges::search) from ranges_algo.h to ranges_util.h, and deletes the
simplified copies in <ranges>. This seems like the best place to
put these algorithms, as ranges_util.h is currently included only from
<ranges> and ranges_algo.h.

libstdc++-v3/ChangeLog:

* include/bits/ranges_algo.h (__find_fn, find, __find_if_fn)
(find_if, __find_if_not_fn, find_if_not, _in_in_result)
(__mismatch_fn, mismatch, __search_fn, search): Move to ...
* include/bits/ranges_util.h: ... here.
* include/std/ranges (__detail::find, __detail::find_if)
(__detail::find_if_not, __detail::mismatch): Remove.
(filter_view): Use ranges::find_if instead.
(drop_while_view): Use ranges::find_if_not instead.
(split_view): Use ranges::find and ranges::mismatch instead.

(cherry picked from commit 2786064d91f46cbdb35a543a883155a3982b9478)

libstdc++: Implement LWG 3490 change to drop_while_view::begin()

libstdc++-v3/ChangeLog:

PR libstdc++/100606
* include/std/ranges (drop_while_view::begin): Assert the
precondition added by LWG 3490.

(cherry picked from commit 11784fe27d879a10dc8a79212c37f50d4f7146f3)

libstdc++: Fix test that fails for C++20

Restore the test for 'a < a' that was removed by r12-2537 because
it is ill-formed. We still want to test operator< for tuple, we just
need to not use std::nullptr_t in that tuple type.

libstdc++-v3/ChangeLog:

* testsuite/20_util/tuple/comparison_operators/overloaded.cc:
Restore test for operator<.

(cherry picked from commit 727137d6ca6d3d401a0c1b4df6b9aae8b97dacd5)

libstdc++: Fix move construction of std::tuple with array elements [PR101960]

The r12-3022 commit only fixed the case where an array is the last
element of the tuple. This fixes the other cases too. We can just define
the move constructor as defaulted, which does the right thing. Changing
the move constructor to be trivial would be an ABI break, but since the
last base class still has a non-trivial move constructor, defining the
derived ones as defaulted doesn't change anything.

libstdc++-v3/ChangeLog:

PR libstdc++/101960
* include/std/tuple (_Tuple_impl(_Tuple_impl&&)): Define as
defauled.
* testsuite/20_util/tuple/cons/101960.cc: Check tuples with
array elements before the last element.

(cherry picked from commit 7481021364e75ba583972e15ed421a53988368ea)

[og11] nvptx: Revert "[nvptx] Expand OpenACC child function arguments to use CUDA params space"

This reverts commit 31e53aef12f574a8534f8aea219b5466edb75b32.

Re-measuring the effect of this patch across a set of benchmarks
shows that it appears to have an overall slightly negative effect on
performance. Thus, we are reverting it.

libstdc++: Fix testcase for newly-implemented C++20 semantics [PR102535]

libstdc++-v3/ChangeLog:

PR c++/102535
* testsuite/20_util/is_trivially_constructible/value.cc: Adjust
expected value for C++20.

(cherry picked from commit 7646847df71e57edca5ec5b8c3c3dc4550dcb49d)

libstdc++: Move test that depends on wchar_t I/O to wchar_t sub-directory

This fixes a FAIL when --disable-wchar_t is used.

libstdc++-v3/ChangeLog:

* testsuite/27_io/basic_filebuf/close/81256.cc: Moved to...
* testsuite/27_io/basic_filebuf/close/wchar_t/81256.cc: ...here.

(cherry picked from commit cfeff094e6410844d2324193610cb7a512d67713)

libstdc++: Ensure std::span and std::string_view are trivially copyable (P2251R1)

The recently approved P2251R1 paper requires these types to be trivially
copyable. They always have been in libstdc++, but add tests to check it.

libstdc++-v3/ChangeLog:

* testsuite/21_strings/basic_string_view/requirements/trivially_copyable.cc:
New test.
* testsuite/23_containers/span/trivially_copyable.cc: New test.

(cherry picked from commit 1f51e9af7b615838424214e6aaea0de793cb10fe)

libstdc++: Fix std::numeric_limits::lowest() test for strict modes

This test uses std::is_integral to decide whether we are testing an
integral or floating-point type. But that fails for __int128 because
is_integral<__int128> is false in strict modes. By using
numeric_limits::is_integer instead we get the right answer for all types
that have a numeric_limits specialization.

We can also simplify the test by removing the unnecessary tag
dispatching.

libstdc++-v3/ChangeLog:

* testsuite/18_support/numeric_limits/lowest.cc: Use
numeric_limits<T>::is_integer instead of is_integral<T>::value.

(cherry picked from commit 45ba5426c129993704a73e6ace4016eaa950d7ee)

libstdc++: Fix move construction of std::tuple with array elements [PR101960]

An array member cannot be direct-initialized in a ctor-initializer-list,
so use the base class' move constructor, which does the right thing for
both arrays and non-arrays.

This constructor could be defaulted, but that would make it trivial for
some specializations, which would change the argument passing ABI. Do
that for the versioned namespace only.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

PR libstdc++/101960
* include/std/tuple (_Tuple_impl(_Tuple_impl&&)): Use base
class' move constructor. Define as defaulted for versioned
namespace.
* testsuite/20_util/tuple/cons/101960.cc: New test.

(cherry picked from commit 0187e0d7360f327f88d8b2294668669306ae4630)

libstdc++: Fix CTAD for debug sequence containers

This fixes some 23_containers/*/cons/deduction.cc failures seen with
-std=c++17/-D_GLIBCXX_DEBUG, caused by non-immediate errors when
substituting template arguments into an incorrect specialization of the
std::__cxx1998 base class. This happens because the size_type member of
the debug container is _Base_type::size_type, so is non-deducible, and
the deduced types get substituted into _Base_type, triggering the
static_assert that checks the allocator's value_type matches the
container's.

The solution is to make the C(size_type, const T&, const Alloc&)
constructors of the debug sequence containers non-deducible. In order to
make CTAD work again deduction guides that use std::size_t for the first
argument are added.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

* include/debug/deque (deque(size_type, const T&, const A&)):
Prevent class template argument deduction and replace with a
deduction guide.
* include/debug/forward_list (forward_list(size_type, const T&, const A&)):
Likewise.
* include/debug/list (list(size_type, const T&, const A&)):
Likewise.
* include/debug/vector (vector(size_type, const T&, const A&)):
Likewise.

(cherry picked from commit 085c2f8f0e13d7c1515ce86755a52a31faf0cf47)

libstdc++: Install GDB pretty printers for debug library

The additional libraries installed by --enable-libstdcxx-debug are built
without optimization to aid debugging, but the Python pretty printers
are not installed alongside them. This means that you can step through
the unoptimized library code, but at the expense of pretty printing the
library types.

This remedies the situation by installing another copy of the GDB hooks
alongside the debug version of libstdc++.so.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

* python/Makefile.am [GLIBCXX_BUILD_DEBUG] (install-data-local):
Install another copy of the GDB hook.
* python/Makefile.in: Regenerate.

(cherry picked from commit db853ff78a34fef25bc16133e0367a64526f9f4e)

libstdc++: Add additional overload of std::lerp [PR101870]

The [cmath.syn] p1 wording about additional overloads sufficient to
handle any arithmetic types also applies to std::lerp. This adds a new
overload of std::lerp that does the required promotions to support
arguments of arbitrary arithmetic types.

A new __promoted_t alias template is added, which the C++17 function
templates std::hypot and std::lerp can use to avoid instantiating the
__promote_3 class template.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

PR libstdc++/101870
* include/c_global/cmath (hypot): Use __promoted_t.
(lerp): Add new overload accepting any arithmetic types.
* include/ext/type_traits.h (__promoted_t): New alias template.
* testsuite/26_numerics/lerp.cc: Moved to...
* testsuite/26_numerics/lerp/1.cc: ...here.
* testsuite/26_numerics/lerp/constexpr.cc: New test.
* testsuite/26_numerics/lerp/version.cc: New test.

(cherry picked from commit 9017326e19fe278d5f62898cca4682b17f8e8e07)

libstdc++: Add pretty printer for std::error_code and std::error_condition

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py (StdErrorCodePrinter): Define.
(build_libstdcxx_dictionary): Register printer for
std::error_code and std::error_condition.
* testsuite/libstdc++-prettyprinters/cxx11.cc: Test it.

(cherry picked from commit 2db38d9fcacf522fe9b98ba847e79ba33abdcadc)

libstdc++: Optimize std::function move constructor [PR101923]

PR 101923 points out that the unconditional swap in the std::function
move constructor makes it slower than copying an empty std::function.
The copy constructor has to check for the empty case before doing
anything, and that makes it very fast for the empty case.

Adding the same check to the move constructor avoids copying the
_Any_data POD when we don't need to. We can also inline the effects of
swap, by copying each member and then zeroing the pointer members.

This makes moving an empty object at least as fast as copying an empty
object.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

PR libstdc++/101923
* include/bits/std_function.h (function(function&&)): Check for
non-empty parameter before doing any work.

(cherry picked from commit 0808b0df9c4d31f4c362b9c85fb538b6aafcb517)

libstdc++: std::system_category should know meaning of zero [PR102425]

Although 0 is not an errno value, it should still be recognized as
corresponding to a value belonging to the generic_category().

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

PR libstdc++/102425
* src/c++11/system_error.cc
(system_error_category::default_error_condition): Add 0 to
switch.
* testsuite/19_diagnostics/error_category/102425.cc: New test.

(cherry picked from commit ce01e2e64c340dadb55a8a24c545a13e654804d4)

libstdc++: Fix UB in atomic_ref/wait_notify.cc [PR101761]

Remove UB in atomic_ref/wait_notify test.

Signed-off-by: Thomas Rodgers <trodgers@redhat.com>
libstdc++-v3/ChangeLog:

PR libstdc++/101761
* testsuite/29_atomics/atomic_ref/wait_notify.cc (test): Use
va and vb as arguments to wait/notify, remove unused bb local.

(cherry picked from commit f9f1a6efaaeeec06d5c07378734cb8eb47b976a7)

libstdc++: Remove non-deducible parameter for std::advance overload

This was just a copy and paste error.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

* include/bits/fs_path.h (advance): Remove non-deducible
template parameter.

(cherry picked from commit 21c760510d31253074577a14021fdc6ad44084b6)

libstdc++: Fix inefficiency in filesystem::absolute [PR99876]

When the path is already absolute, the call to current_path() is
wasteful, because operator/ will ignore the left operand anyway.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

PR libstdc++/99876
* src/c++17/fs_ops.cc (fs::absolute): Call non-throwing form,
to avoid unnecessary current_path() call.

(cherry picked from commit 07b990ee23e0c7a92d362dbb25fd5d57d95eb8be)

libstdc++: Add missing return for atomic timed wait [PR102074]

This adds a missing return statement to the non-futex wait-until
operation.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

PR libstdc++/102074
* include/bits/atomic_timed_wait.h (__timed_waiter_pool)
[!_GLIBCXX_HAVE_PLATFORM_TIMED_WAIT]: Add missing return.

(cherry picked from commit 763eb1f19239ebb19c0f87590a4f02300c02c52b)

libstdc++: Fix last std::tuple constructor missing 'constexpr' [PR102270]

Also rename the test so it actually runs.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

PR libstdc++/102270
* include/std/tuple (_Tuple_impl): Add constexpr to constructor
missed in previous patch.
* testsuite/20_util/tuple/cons/102270.C: Moved to...
* testsuite/20_util/tuple/cons/102270.cc: ...here.
* testsuite/util/testsuite_allocator.h (SimpleAllocator): Add
constexpr to constructor so it can be used for C++20 tests.

(cherry picked from commit 1fa2c5a695bb962ffcf8abed49f69cdcc59d0e61)

libstdc++: Add missing 'constexpr' to std::tuple [PR102270]

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

PR libstdc++/102270
* include/std/tuple (_Head_base, _Tuple_impl): Add
_GLIBCXX20_CONSTEXPR to allocator-extended constructors.
(tuple<>::swap(tuple&)): Add _GLIBCXX20_CONSTEXPR.
* testsuite/20_util/tuple/cons/102270.C: New test.

(cherry picked from commit 734b2c2eedca50d966e22540fc136158c3633393)

libstdc++: Rename tests with incorrect extension

The libstdc++ testsuite only runs .cc files, so these two old tests have
never been run.

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

* testsuite/26_numerics/valarray/dr630-3.C: Moved to...
* testsuite/26_numerics/valarray/dr630-3.cc: ...here.
* testsuite/27_io/basic_iostream/cons/16251.C: Moved to...
* testsuite/27_io/basic_iostream/cons/16251.cc: ...here.

(cherry picked from commit 749c31b345c2a37106b57ce805ea46a6d4765e09)

libstdc++: Add missing constraint to std::span deduction guide [PR102280]

Signed-off-by: Jonathan Wakely <jwakely@redhat.com>
libstdc++-v3/ChangeLog:

PR libstdc++/102280
* include/std/span (span(Range&&)): Add constraint to deduction
guide.

(cherry picked from commit e67917f5df9d84f5aed3513b3931a82870d25135)

libstdc++: Add missing header to test

We need to include <iterator> (or one of the containers) to get a
definition for std::begin.

libstdc++-v3/ChangeLog:

* testsuite/25_algorithms/is_permutation/2.cc: Include <iterator>.

(cherry picked from commit 94311bf34704ebecf745043fe2df03df201052fe)

libstdc++: Add test for std::cmp_greater

This was omitted from the commit that added these comparisons.

libstdc++-v3/ChangeLog:

* testsuite/20_util/integer_comparisons/greater.cc: New test.

(cherry picked from commit 824e0855732c601e0866d0e8a9264a85f758213e)

libstdc++: Fix std::match_results::end() for failed matches [PR102667]

The end() function needs to consider whether the underlying vector is
empty, not whether the match_results object is empty. That's because the
underlying vector will always contain at least three elements for a
match_results object that is "ready". It contains three extra elements
which are stored in the vector but are not considered part of sequence,
and so should not be part of the [begin(),end()) range.

libstdc++-v3/ChangeLog:

PR libstdc++/102667
* include/bits/regex.h (match_result::empty()): Optimize by
calling the base function directly.
(match_results::end()): Check _Base_type::empty() not empty().
* testsuite/28_regex/match_results/102667.C: New test.

(cherry picked from commit 84088dc4bb6a546c896a068dc201463493babf43)

Fix PR target/102588

We need a 32-byte wide integer mode (OImode) in order to handle structure
returns in the 64-bit ABI.

gcc/
PR target/102588
* config/sparc/sparc-modes.def (OI): New integer mode.

Fortran version of libgomp.c-c++-common/icv-{3,4}.c

This adds the Fortran testsuite coverage of
omp_{get_max,set_num}_threads and omp_{s,g}et_teams_thread_limit

libgomp/
* testsuite/libgomp.fortran/icv-3.f90: New.
* testsuite/libgomp.fortran/icv-4.f90: New.

(cherry picked from commit f5a538e1647ae67cf204c5c3b1bd9cca5224dfd1)

Fortran: Various CLASS + assumed-rank fixed [PR102541]

Starting point was PR102541, were a previous patch caused an invalid
e->ref access for class. When testing, it turned out that for
CLASS to CLASS the code was never executed - additionally, issues
appeared for optional and a bogus error for -fcheck=all. In particular:

There were a bunch of issues related to optional CLASS, can have the
'attr.dummy' set in CLASS_DATA (sym) - but sometimes also in 'sym'!?!
Additionally, gfc_variable_attr could return pointer = 1 for nonpointers
when the expr is no longer "var" but "var%_data".

PR fortran/102541

gcc/fortran/ChangeLog:

* check.c (gfc_check_present): Handle optional CLASS.
* interface.c (gfc_compare_actual_formal): Likewise.
* trans-array.c (gfc_trans_g77_array): Likewise.
* trans-decl.c (gfc_build_dummy_array_decl): Likewise.
* trans-types.c (gfc_sym_type): Likewise.
* primary.c (gfc_variable_attr): Fixes for dummy and
pointer when 'class%_data' is passed.
* trans-expr.c (set_dtype_for_unallocated, gfc_conv_procedure_call):
For assumed-rank dummy, fix setting rank for dealloc/notassoc actual
and setting ubound to -1 for assumed-size actuals.

gcc/testsuite/ChangeLog:

* gfortran.dg/assumed_rank_24.f90: New test.

(cherry picked from commit eb92cd57a1ebe7cd7589bdbec34d9ae337752ead)

openmp: Avoid calling clear_type_padding_in_mask in the common case where there can't be any padding

We can use the clear_padding_type_may_have_padding_p function, which
is conservative for e.g. RECORD_TYPE/UNION_TYPE, but for the floating and
complex floating types is accurate. clear_type_padding_in_mask is
more expensive because we need to allocate memory, fill it, call the function
which itself is more expensive and then analyze the memory, so for the
common case of float/double atomics or even long double on most targets
we can avoid that.

2021-10-12 Jakub Jelinek <jakub@redhat.com>

gcc/
* gimple-fold.h (clear_padding_type_may_have_padding_p): Declare.
* gimple-fold.c (clear_padding_type_may_have_padding_p): No longer
static.
gcc/c-family/
* c-omp.c (c_finish_omp_atomic): Use
clear_padding_type_may_have_padding_p.

(cherry picked from commit 8e1fe3f779185cc678493ceda42c2e620a5c1387)

openmp: Add documentation for omp_{get_max, set_num}_threads and omp_{s, g}et_teams_thread_limit

This patch adds documentation for these new OpenMP 5.1 APIs as well as
two new environment variables - OMP_NUM_TEAMS and OMP_TEAMS_THREAD_LIMIT.

2021-10-12 Jakub Jelinek <jakub@redhat.com>

* libgomp.texi (omp_get_max_teams, omp_get_teams_thread_limit,
omp_set_num_teams, omp_set_teams_thread_limit, OMP_NUM_TEAMS,
OMP_TEAMS_THREAD_LIMIT): Document.

(cherry picked from commit 4096bf82a0cda5f79d7e5921b73897f76e00a800)

openmp: Fix up warnings on libgomp.info build

When building libgomp documentation, I see
makeinfo --split-size=5000000  -I ../../../libgomp/../gcc/doc/include -I ../../../libgomp -o libgomp.info ../../../libgomp/libgomp.texi
../../../libgomp/libgomp.texi:503: warning: node next `omp_get_default_device' in menu `omp_get_device_num' and in sectioning `omp_get_dynamic' differ
../../../libgomp/libgomp.texi:528: warning: node prev `omp_get_dynamic' in menu `omp_get_device_num' and in sectioning `omp_get_default_device' differ
../../../libgomp/libgomp.texi:560: warning: node next `omp_get_initial_device' in menu `omp_get_level' and in sectioning `omp_get_device_num' differ
../../../libgomp/libgomp.texi:587: warning: node next `omp_get_device_num' in menu `omp_get_dynamic' and in sectioning `omp_get_level' differ
../../../libgomp/libgomp.texi:587: warning: node prev `omp_get_device_num' in menu `omp_get_default_device' and in sectioning `omp_get_initial_device' differ
../../../libgomp/libgomp.texi:615: warning: node prev `omp_get_level' in menu `omp_get_initial_device' and in sectioning `omp_get_device_num' differ
warnings.  This patch fixes those.

2021-10-12  Jakub Jelinek  <jakub@redhat.com>

* libgomp.texi (omp_get_device_num): Move @node before omp_get_dynamic
to avoid makeinfo warnings.

(cherry picked from commit de7fa7063e99d29b6cc2024a90a3755a1001a154)

openmp: Add testsuite coverage for omp_{get_max,set_num}_threads and omp_{s,g}et_teams_thread_limit

This adds (C/C++ only) testsuite coverage for these new OpenMP 5.1 APIs.

2021-10-12 Jakub Jelinek <jakub@redhat.com>

* testsuite/libgomp.c-c++-common/icv-3.c: New test.
* testsuite/libgomp.c-c++-common/icv-4.c: New test.

(cherry picked from commit 88f5ad524a152711b7345b6bee2e06c5af0e88bc)

libgomp: alloc* test fixes [PR102628, PR102668]

As reported, the alloc-9.c test and alloc-{1,2,3}.F* and alloc-11.f90
tests fail on powerpc64-linux with -m32.
The reason why it fails just there is that malloc doesn't guarantee there
128-bit alignment (historically glibc guaranteed 2 * sizeof (void *)
alignment from malloc).

There are two separate issues.
One is a thinko on my side.
In this part of alloc-9.c test (copied to alloc-11.f90), we have
2 allocators, a with pool size 1024B and alignment 16B and default fallback
and a2 with pool size 512B and alignment 32B and a as fallback allocator.
We start at no allocations in both at line 194 and do:
  p = (int *) omp_alloc (sizeof (int), a2);
// This succeeds in a2 and needs 4+overhead bytes (which includes the 32B alignment)
  p = (int *) omp_realloc (p, 420, a, a2);
// This allocates 420 bytes+overhead in a, with 16B alignment and deallocates the above
  q = (int *) omp_alloc (sizeof (int), a);
// This allocates 4+overhead bytes in a, with 16B alignment
  q = (int *) omp_realloc (q, 420, a2, a);
// This allocates 420+overhead in a2 with 32B alignment
  q = (int *) omp_realloc (q, 768, a2, a2);
// This attempts to reallocate, but as there are elevated alignment
// requirements doesn't try to just realloc (even if it wanted to try that
// a2 is almost full, with 512-420-overhead bytes left in it), so it
// tries to alloc in a2, but there is no space left in the pool, falls
// back to a, which already has 420+overhead bytes allocated in it and
// 1024-420-overhead bytes left and so fails too and fails to default
// non-pool allocator that allocates it, but doesn't guarantee alignment
// higher than malloc guarantees.
// But, the test expected 16B alignment.

So, I've slightly lowered the allocation sizes in that part of the test
420->320 and 768 -> 568, so that the last test still fails to allocate
in a2 (568 > 512-320-overhead) but succeeds in a as fallback, which was
the intent of the test.

Another thing is that alloc-1.F90 seems to be transcription of
libgomp.c-c++-common/alloc-1.c into Fortran, but alloc-1.c had:
  q = (int *) omp_alloc (768, a2);
  if ((((uintptr_t) q) % 16) != 0)
    abort ();
  q[0] = 7;
  q[767 / sizeof (int)] = 8;
  r = (int *) omp_alloc (512, a2);
  if ((((uintptr_t) r) % __alignof (int)) != 0)
    abort ();
there but Fortran has:
        cq = omp_alloc (768_c_size_t, a2)
        if (mod (transfer (cq, intptr), 16_c_intptr_t) /= 0) stop 12
        call c_f_pointer (cq, q, [768 / c_sizeof (i)])
        q(1) = 7
        q(768 / c_sizeof (i)) = 8
        cr = omp_alloc (512_c_size_t, a2)
        if (mod (transfer (cr, intptr), 16_c_intptr_t) /= 0) stop 13
I'm changing the latter to 4_c_intptr_t because other spots in the
testcase do that, Fortran sadly doesn't have c_alignof, but strictly
speaking it isn't correct, __alignof (int) could be on some architectures
smaller than 4.
So probably alloc-1.F90 etc. should also have
! { dg-additional-sources alloc-7.c }
! { dg-prune-output "command-line option '-fintrinsic-modules-path=.*' is valid for Fortran but not for C" }
and use get__alignof_int.

2021-10-12  Jakub Jelinek  <jakub@redhat.com>

PR libgomp/102628
PR libgomp/102668
* testsuite/libgomp.c-c++-common/alloc-9.c (main): Decrease
allocation sizes from 420 to 320 and from 768 to 568.
* testsuite/libgomp.fortran/alloc-11.f90: Likewise.
* testsuite/libgomp.fortran/alloc-1.F90: Change expected alignment
for cr from 16 to 4.

(cherry picked from commit 342aedf0e5f324cc2fb026466a8cc5cc7f839183)

gfortran.dg/gomp/defaultmap-2.f90: Use dg-message not -dg-note

gcc/testsuite/
* gfortran.dg/gomp/defaultmap-2.f90: Replace unsupported
dg-note by generic dg-message.

Daily bump.

tree-optimization: [PR102622]: wrong code due to signed one bit integer and "a?-1:0"

Since the problem was already fixed on this branch, we just want to add the
testcase so it does not regress there.

PR tree-optimization/102622

gcc/testsuite/ChangeLog:

* gcc.c-torture/execute/bitfld-10.c: New test.

(cherry picked from commit 882d806c1a8f9d2d2ade1133de88d63e5d4fe40c)

doc: improve -fsanitize=undefined description

gcc/ChangeLog:
* doc/invoke.texi: Add link to UndefinedBehaviorSanitizer
documentation, mention UBSAN_OPTIONS, similar to what is done
for AddressSanitizer.

(cherry picked from commit 1c0a83eff7bb5b1db997a9726ae6542aec893baa)

libgomp: Add tests for omp_atv_serialized and deprecate omp_atv_sequential.

The variable omp_atv_sequential was replaced by omp_atv_serialized in OpenMP
5.1. This was already implemented by Jakub (C/C++, commit ea82325afec) and
Tobias (Fortran, commit fff15bad1ab).

This patch adds two tests to check if omp_atv_serialized is available (one test
for C/C++ and one for Fortran). Besides that omp_atv_sequential is marked as
deprecated in C/C++ and Fortran for OpenMP 5.1.

libgomp/ChangeLog:

* allocator.c (omp_init_allocator): Replace omp_atv_sequential with
omp_atv_serialized.
* omp.h.in: Add deprecated flag for omp_atv_sequential.
* omp_lib.f90.in: Add deprecated flag for omp_atv_sequential.
* testsuite/libgomp.c-c++-common/alloc-10.c: New test.
* testsuite/libgomp.fortran/alloc-12.f90: New test.

(cherry picked from commit f70977936a306e2fb4140361ba47bf5d5cc0a47d)

Merge remote-tracking branch 'origin/releases/gcc-11' into devel/omp/gcc-11

Merge up to r11-9094-gb3dfc8635d26b9c3cbe7731dd7b8be8a2242eab9 (Oct 11, 2021)

openmp: Add omp_set_num_teams, omp_get_max_teams, omp_[gs]et_teams_thread_limit

OpenMP 5.1 adds env vars and functions to set and query new ICVs used
as fallback if thread_limit or num_teams clauses aren't specified on
teams construct.

The following patch implements those, though further work will be needed:
1) OpenMP 5.1 also changed the num_teams clause, so that it can specify
   both lower and upper limit for how many teams should be created and
   changed the meaning when only one expression is provided, instead of
   num_teams(expr) in 5.0 meaning num_teams(1:expr) in 5.1, it now means
   num_teams(expr:expr), i.e. while previously we could create 1 to expr
   teams, in 5.1 we have some low limit by default equal to the single
   expression provided and may not create fewer teams.
   For host teams (which we don't currently implement efficiently for
   NUMA hosts) we trivially satisfy it now by always honoring what the
   user asked for, but for the offloading teams I think we'll need to
   rethink the APIs; currently teams construct is just a call that returns
   and possibly lowers the number of teams; and whenever possible we try
   to evaluate num_teams/thread_limit already on the target construct
   and the GOMP_teams call just sets the number of teams to the minimum
   of provided and requested teams; for some cases e.g. where target
   is not combined with teams and num_teams expression calls some functions
   etc., we need to call those functions in the target region and so it is
   late to figure number of teams, but also hw could just limit what it
   is willing to create; in that case I'm afraid we need to run the target
   body multiple times and arrange for omp_get_team_num () returning the
   right values
2) we need to finally implement the NUMA handling for GOMP_teams_reg
3) I now realize I haven't added some testcase coverage, will do that
   incrementally
4) libgomp.texi needs updates for these new APIs, but also others like
   the allocator

2021-10-11  Jakub Jelinek  <jakub@redhat.com>

gcc/
* omp-low.c (omp_runtime_api_call): Handle omp_get_max_teams,
omp_[sg]et_teams_thread_limit and omp_set_num_teams.
libgomp/
* omp.h.in (omp_set_num_teams, omp_get_max_teams,
omp_set_teams_thread_limit, omp_get_teams_thread_limit): Declare.
* omp_lib.f90.in (omp_set_num_teams, omp_get_max_teams,
omp_set_teams_thread_limit, omp_get_teams_thread_limit): Declare.
* omp_lib.h.in (omp_set_num_teams, omp_get_max_teams,
omp_set_teams_thread_limit, omp_get_teams_thread_limit): Declare.
* libgomp.h (gomp_nteams_var, gomp_teams_thread_limit_var): Declare.
* libgomp.map (OMP_5.1): Export omp_get_max_teams{,_},
omp_get_teams_thread_limit{,_}, omp_set_num_teams{,_,_8_} and
omp_set_teams_thread_limit{,_,_8_}.
* icv.c (omp_set_num_teams, omp_get_max_teams,
omp_set_teams_thread_limit, omp_get_teams_thread_limit): New
functions.
* env.c (gomp_nteams_var, gomp_teams_thread_limit_var): Define.
(omp_display_env): Print OMP_NUM_TEAMS and OMP_TEAMS_THREAD_LIMIT.
(initialize_env): Handle OMP_NUM_TEAMS and OMP_TEAMS_THREAD_LIMIT env
vars.
* teams.c (GOMP_teams_reg): If thread_limit is not specified, use
gomp_teams_thread_limit_var as fallback if not zero.  If num_teams
is not specified, use gomp_nteams_var.
* fortran.c (omp_set_num_teams, omp_get_max_teams,
omp_set_teams_thread_limit, omp_get_teams_thread_limit): Add
ialias_redirect.
(omp_set_num_teams_, omp_set_num_teams_8_, omp_get_max_teams_,
omp_set_teams_thread_limit_, omp_set_teams_thread_limit_8_,
omp_get_teams_thread_limit_): New functions.

(cherry picked from commit 07dd3bcda17f97cf5476c3d6f2f2501c1e0712e6)

Daily bump.

var-tracking: Fix a wrong-debug issue caused by my r10-7665 var-tracking change [PR102441]

Since my r10-7665-g33c45e51b4914008064d9b77f2c1fc0eea1ad060 change, we get
wrong-debug on e.g. the following testcase at -O2 -g on x86_64-linux for the
x parameter:
void bar (int *r);
int
foo (int x)
{
  int r = 0;
  bar (&r);
  return r;
}
At the start of function, we have
        subq    $24, %rsp
        leaq    12(%rsp), %rdi
instructions.  The x parameter is passed in %rdi, but isn't used in the
function and so the leaq instruction overwrites %rdi without remembering
%rdi anywhere.  Before the r10-7665 change (which was trying to fix a large
(3% for 32-bit, 1% for 64-bit x86-64) debug info/loc growth introduced with
r10-7515), the leaq insn above resulted in a MO_VAL_SET micro-operation that
said that the value of sp + 12, a cselib_sp_derived_value_p, is stored into
the %rdi register.  The r10-7665 change added a change to add_stores that
added no micro-operation for the leaq store, with the rationale that the sp
based values can be and will be always computable some other more compact
and primarily more stable way (cfa based expression like DW_OP_fbreg, that
is the same in the whole function).  That is true.  But by throwing the
micro-operation on the floor, we miss another important part of the
MO_VAL_SET, in particular that the destination of the store, %rdi in this
case, now has a different value from what it had before, so the vt_*
dataflow code thinks that even after the leaq instruction %rdi still holds
the x argument value (and changes it to DW_OP_entry_value (%rdi) only in the
middle of the call to bar).  Previously and with the patches below,
the location for x changes already at the end of leaq instruction to
DW_OP_entry_value (%rdi).

My first attempt to fix this was instead of dropping the MO_VAL_SET add
a MO_CLOBBER operation:
--- gcc/var-tracking.c.jj       2021-05-04 21:02:24.196799586 +0200
+++ gcc/var-tracking.c  2021-09-24 19:23:16.420154828 +0200
@@ -6133,7 +6133,9 @@ add_stores (rtx loc, const_rtx expr, voi
     {
       if (preserve)
        preserve_value (v);
-      return;
+      mo.type = MO_CLOBBER;
+      mo.u.loc = loc;
+      goto log_and_return;
     }

   nloc = replace_expr_with_values (oloc);
so don't track that the value lives in the loc destination, but track
that the previous value doesn't live there anymore.  That failed bootstrap
miserably, the vt_* code isn't prepared to see MO_CLOBBER of a MEM that
isn't tracked (e.g. has MEM_EXPR on it that the var-tracking code wants
to track, i.e. track_p in add_stores).  On the other side, thinking about
it more, in the most common case where a cselib_sp_derived_value_p value
is stored into the sp register (and which is the reason why PR94495
testcase got larger), dropping the micro-operation on the floor is the
right thing, because we have that cselib_sp_derived_value_p tracking, any
reads from the sp hard register will be treated as
cselib_sp_derived_value_p.
Then I've tried 3 different patches described below and in the end
what is committed is patch2.
Additionally, I've gathered statistics from cc1plus by always reverting the
var-tracking.c change after finished bootstrap/regtest and rebuilding the
stage3 var-tracking.o and cc1plus, such that it would be comparable.
dwlocstat and .debug_{info,loclists} section sizes detailed below.
patch3 uses MO_VAL_SET (i.e. essentially reversion of the r10-7665
change) when destination is not a REG_P and !track_p, otherwise if
destination is sp drops the micro-operation on the floor (i.e. no change),
otherwise adds a MO_CLOBBER.
patch1 is similar, except it checks for destination not equal to sp and
!track_p, i.e. for !track_p REG_P destinations other than sp it will use
MO_VAL_SET rather than MO_CLOBBER.
Finally, patch2, the shortest patch, uses MO_VAL_SET whenever destination
is not sp and otherwise drops the micro-operation on the floor.
All the 3 patches don't affect the PR94495 testcase, all the changes
there were caused by stores of sp based values into %rsp.

While the patch2 (and patch1 which results in exactly the same sizes)
causes the largest debug loclists/info growth from the 3, it is still quite
minor (0.651% on 64-bit and 0.114% on 32-bit) compared
to the 1% and 3% PR94495 was trying to solve, and I actually think it is the
best thing to do.  Because, if we have say
  int q[10];
  int *p = &q[0];
or similar and we load the &q[0] sp based value into some hard register,
by noting in the debug info that p lives in some hard reg for some part
of the function and a user is trying to change the p var in the debugger,
if we say it lives in some register or memory, there is some chance that
the changing of the value could work successfully (of course, nothing
is guaranteed, we don't have tracking of where each var lives at which
moment for changing purposes (i.e. what register, memory or else you need
to change in order to change behavior of the code)), while if we just say
that p's location is DW_OP_fbreg 16 DW_OP_stack_value, that is a read-only
value one can just print but not change.  Now, for stores of variable
values into the sp register, I don't think we have such an issue, you don't
want debugger to change your stack pointer when user asks to change value
of some variable whose value lives in the stack pointer, that would pretty
much always result in misbehavior of the program.
So, my preference from these 3 is patch2 and that is being committed.

64-bit cc1plus
==============
vanilla
cov%    samples cumul
0..10   1064665/37%     1064665/37%
11..20  35972/1%        1100637/38%
21..30  47969/1%        1148606/40%
31..40  45787/1%        1194393/42%
41..50  57529/2%        1251922/44%
51..60  53974/1%        1305896/46%
61..70  112055/3%       1417951/50%
71..80  79420/2%        1497371/52%
81..90  126225/4%       1623596/57%
91..100 1206682/42%     2830278/100%
  [34] .debug_info       PROGBITS        0000000000000000 2f1c74c a44949f 00      0   0  1
  [38] .debug_loclists   PROGBITS        0000000000000000 ff5d046 506e947 00      0   0  1
patch1 (same as patch2)
cov%    samples cumul
0..10   1064685/37%     1064685/37%
11..20  36011/1%        1100696/38%
21..30  47975/1%        1148671/40%
31..40  45799/1%        1194470/42%
41..50  57566/2%        1252036/44%
51..60  54011/1%        1306047/46%
61..70  112068/3%       1418115/50%
71..80  79421/2%        1497536/52%
81..90  126171/4%       1623707/57%
91..100 1206571/42%     2830278/100%
  [34] .debug_info       PROGBITS        0000000000000000 2f1c74c a448f27 00      0   0  1
  [38] .debug_loclists   PROGBITS        0000000000000000 ff608bc 52070dd 00      0   0  1
patch3
cov%    samples cumul
0..10   1064698/37%     1064698/37%
11..20  36018/1%        1100716/38%
21..30  47977/1%        1148693/40%
31..40  45804/1%        1194497/42%
41..50  57562/2%        1252059/44%
51..60  54018/1%        1306077/46%
61..70  112071/3%       1418148/50%
71..80  79424/2%        1497572/52%
81..90  126172/4%       1623744/57%
91..100 1206534/42%     2830278/100%
  [34] .debug_info       PROGBITS        0000000000000000 2f1c74c a449548 00      0   0  1
  [38] .debug_loclists   PROGBITS        0000000000000000 ff5df39 507acd8 00      0   0  1
So, size of .debug_info+.debug_loclists grows for vanilla -> patch1 (or patch2) by
0.651% and for vanilla -> patch3 by 0.020%.

32-bit cc1plus
==============
vanilla
cov%    samples cumul
0..10   1061892/37%     1061892/37%
11..20  34002/1%        1095894/39%
21..30  43513/1%        1139407/40%
31..40  41667/1%        1181074/42%
41..50  59144/2%        1240218/44%
51..60  47009/1%        1287227/45%
61..70  105069/3%       1392296/49%
71..80  72990/2%        1465286/52%
81..90  125988/4%       1591274/56%
91..100 1208726/43%     2800000/100%
  [33] .debug_info       PROGBITS        00000000 351ab10 8b1c83d 00      0   0  1
  [37] .debug_loclists   PROGBITS        00000000 ebc816e 3fe44fd 00      0   0  1
patch1 (same as patch2)
cov%    samples cumul
0..10   1061999/37%     1061999/37%
11..20  34065/1%        1096064/39%
21..30  43557/1%        1139621/40%
31..40  41690/1%        1181311/42%
41..50  59191/2%        1240502/44%
51..60  47143/1%        1287645/45%
61..70  105045/3%       1392690/49%
71..80  73021/2%        1465711/52%
81..90  125885/4%       1591596/56%
91..100 1208404/43%     2800000/100%
  [33] .debug_info       PROGBITS        00000000 351ab10 8b1c597 00      0   0  1
  [37] .debug_loclists   PROGBITS        00000000 ebca915 401ffad 00      0   0  1
patch3
cov%    samples cumul
0..10   1062006/37%     1062006/37%
11..20  34073/1%        1096079/39%
21..30  43559/1%        1139638/40%
31..40  41693/1%        1181331/42%
41..50  59189/2%        1240520/44%
51..60  47142/1%        1287662/45%
61..70  105054/3%       1392716/49%
71..80  73027/2%        1465743/52%
81..90  125874/4%       1591617/56%
91..100 1208383/43%     2800000/100%
  [33] .debug_info       PROGBITS        00000000 351ab10 8b1c690 00      0   0  1
  [37] .debug_loclists   PROGBITS        00000000 ebca40a 4020a6e 00      0   0  1
So, size of .debug_info+.debug_loclists grows for vanilla -> patch1 (or patch2) by
0.114% and for vanilla -> patch3 by 0.116%.

2021-10-10  Jakub Jelinek  <jakub@redhat.com>

PR debug/102441
* var-tracking.c (add_stores): For cselib_sp_derived_value_p values
use MO_VAL_SET if loc is not sp.

(cherry picked from commit 9583b26f3701ea0456405d84f9a898451a2f7452)

Daily bump.

openmp: Add support for OpenMP 5.1 structured-block-sequences

Related to this is the addition of structured-block-sequence in OpenMP 5.1,
which doesn't change anything for Fortran, but for C/C++ allows multiple
statements instead of just one possibly compound around the separating
directives (section and scan).

I've also made some updates to the OpenMP 5.1 support list in libgomp.texi.

2021-10-09  Jakub Jelinek  <jakub@redhat.com>

gcc/c/
* c-parser.c (c_parser_omp_structured_block_sequence): New function.
(c_parser_omp_scan_loop_body): Use it.
(c_parser_omp_sections_scope): Likewise.
gcc/cp/
* parser.c (cp_parser_omp_structured_block): Remove disallow_omp_attrs
argument.
(cp_parser_omp_structured_block_sequence): New function.
(cp_parser_omp_scan_loop_body): Use it.
(cp_parser_omp_sections_scope): Likewise.
gcc/testsuite/
* c-c++-common/gomp/sections1.c (foo): Don't expect errors on
multiple statements in between section directive(s).  Add testcases
for invalid no statements in between section directive(s).
* gcc.dg/gomp/sections-2.c (foo): Don't expect errors on
multiple statements in between section directive(s).
* g++.dg/gomp/sections-2.C (foo): Likewise.
* g++.dg/gomp/attrs-6.C (foo): Add testcases for multiple
statements in between section directive(s).
(bar): Add testcases for multiple statements in between scan
directive.
* g++.dg/gomp/attrs-7.C (bar): Adjust expected error recovery.
libgomp/
* libgomp.texi (OpenMP 5.1): Mention implemented support for
structured block sequences in C/C++.  Mention support for
unconstrained/reproducible modifiers on order clause.
Mention partial (C/C++ only) support of extentensions to atomics
construct.  Mention partial (C/C++ on clause only) support of
align/allocator modifiers on allocate clause.

(cherry picked from commit 875124eb0822d8905d73bd30d51421ba8afde282)

Daily bump.

Fortran: Add diagnostic for F2018:C839 (TS29113:C535c)

2021-10-08 Sandra Loosemore <sandra@codesourcery.com>

PR fortran/54753

gcc/fortran/
* interface.c (gfc_compare_actual_formal): Add diagnostic
for F2018:C839. Refactor shared code and fix bugs with class
array info lookup, and extend similar diagnostic from PR94110
to also cover class types.

gcc/testsuite/
* gfortran.dg/c-interop/c535c-1.f90: Rewrite and expand.
* gfortran.dg/c-interop/c535c-2.f90: Remove xfails.
* gfortran.dg/c-interop/c535c-3.f90: Likewise.
* gfortran.dg/c-interop/c535c-4.f90: Likewise.
* gfortran.dg/PR94110.f90: Extend to cover class types.

(cherry picked from commit 7afb61087d2cb7a6d27463bab5a7567fac69f97a)

Fortran: Avoid var initialization in interfaces [PR54753]

Intent(out) implies deallocation/default initialization; however, it is
pointless to do this for dummy-arguments symbols of procedures which are
inside an INTERFACE block. – This also fixes a bogus error for the attached
included testcase, but fixing the non-interface version still has to be done.

PR fortran/54753

gcc/fortran/ChangeLog:

* resolve.c (can_generate_init, resolve_fl_variable_derived,
resolve_symbol): Only do initialization with intent(out) if not
inside of an interface block.

(cherry picked from commit 51d9ef7747b2dc439f7456303f0784faf5cdb1d3)

openmp: Fix up declare target handling for vars with DECL_LOCAL_DECL_ALIAS [PR102640]

The introduction of DECL_LOCAL_DECL_ALIAS and push_local_extern_decl_alias
in r11-3699-g4e62aca0e0520e4ed2532f2d8153581190621c1a broke the following
testcase.  The following patch fixes it by treating similarly not just
the variable to or link clause is put on, but also its DECL_LOCAL_DECL_ALIAS
if any.  If it hasn't been created yet, when it is created it will copy
attributes and therefore should get it for free, and as it is an extern,
nothing more than attributes is needed for it.

2021-10-08  Jakub Jelinek  <jakub@redhat.com>

PR c++/102640
gcc/cp/
* parser.c (handle_omp_declare_target_clause): New function.
(cp_parser_omp_declare_target): Use it.
gcc/testsuite/
* c-c++-common/gomp/pr102640.c: New test.

(cherry picked from commit db3d7270b42fe27fb05664c4fdf524ab7ad13a75)

Daily bump.

c++: variadic ttp constraint subsumption [PR99904]

Here we're crashing when level-lowering the variadic constraint C<Ts...>
on the template template parameter TT because tsubst_pack_expansion expects
processing_template_decl to be set during a partial substitution.

PR c++/99904

gcc/cp/ChangeLog:

* pt.c (is_compatible_template_arg): Set processing_template_decl
around tsubst_constraint_info.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-ttp4.C: New test.

(cherry picked from commit 2e6e0d86a06389056d0e7fecc99c547420ad787a)

Daily bump.

c++: unifying equal NONTYPE_ARGUMENT_PACKs [PR102547]

Here during partial ordering of the two partial specializations we end
up in unify with parm=arg=NONTYPE_ARGUMENT_PACK<V0, V1>, and crash shortly
thereafter because uses_template_parms(parms) calls potential_const_expr
which doesn't handle NONTYPE_ARGUMENT_PACK.

This patch fixes this by extending potential_constant_expression to handle
NONTYPE_ARGUMENT_PACK appropriately.

PR c++/102547

gcc/cp/ChangeLog:

* constexpr.c (potential_constant_expression_1): Handle
NONTYPE_ARGUMENT_PACK.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/variadic-partial2.C: New test.
* g++.dg/cpp0x/variadic-partial2a.C: New test.

(cherry picked from commit d4c470c376b4cb82c9a0b7e8a4b88c44d5e4289d)

c++: __is_trivially_xible and multi-arg aggr paren init [PR102535]

is_xible_helper assumes only 0- and 1-argument ctors can be trivial, but
C++20 aggregate paren init means multi-arg ctors can now be trivial too.
This patch relaxes the relevant early exit check accordingly.

PR c++/102535

gcc/cp/ChangeLog:

* method.c (is_xible_helper): Don't exit early for multi-arg
ctors in C++20.

gcc/testsuite/ChangeLog:

* g++.dg/ext/is_trivially_constructible7.C: New test.

(cherry picked from commit 9845c52db38f15740861435f38f7e5ad8a8de2ec)

c++: defaulted comparisons and vptr fields [PR95567]

We need to explicitly skip over vptr fields when synthesizing a
defaulted comparison operator, because next_initializable_field
doesn't do so for us.

PR c++/95567

gcc/cp/ChangeLog:

* method.c (build_comparison_op): Skip DECL_VIRTUAL_P fields.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/spaceship-virtual1.C: New test.

(cherry picked from commit b6bca2e631b54f992c058ca8e445b45e9816690b)

real: fix encoding of negative IEEE double/quad values [PR98216]

In encode_ieee_double/quad, the assignment

unsigned long WORD = r->sign << 31;

is intended to set the 31st bit of WORD whenever the sign bit is set.
But on LP64 hosts it also unintentionally sets the upper 32 bits of WORD,
because r->sign gets promoted from unsigned:1 to int and then the result
of the shift (equal to INT_MIN) gets sign extended from int to long.

In the C++ frontend, this bug causes incorrect mangling of negative
floating point values because the output of real_to_target called from
write_real_cst unexpectedly has the upper 32 bits of this word set,
which the caller doesn't mask out.

This patch fixes this by avoiding the unwanted sign extension. Note
that r0-53976 fixed the same bug in encode_ieee_single long ago.

PR c++/98216
PR c++/91292

gcc/ChangeLog:

* real.c (encode_ieee_double): Avoid unwanted sign extension.
(encode_ieee_quad): Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/nontype-float2.C: New test.

(cherry picked from commit 34947d4e97ee72b26491cfe5ff4fa8258fadbe95)

c++: concept-ids and value-dependence [PR102412]

The problem here is that uses_template_parms returns true for all
concept-ids (even those with non-dependent arguments), so when a concept-id
is used as a default template argument then during deduction the default
argument is considered dependent even after substituting into it, which
leads to deduction failure (from type_unification_real).

This patch fixes this by implementing the resolution of CWG 2446 which
says a concept-id is dependent only if its arguments are.

DR 2446
PR c++/102412

gcc/cp/ChangeLog:

* constexpr.c (cxx_eval_constant_expression)
<case TEMPLATE_ID_EXPR>: Check value_dependent_expression_p
instead of processing_template_decl.
* pt.c (value_dependent_expression_p) <case TEMPLATE_ID_EXPR>:
Return true only if any_dependent_template_arguments_p.
(instantiation_dependent_r) <case CALL_EXPR>: Remove this case.
<case TEMPLATE_ID_EXPR>: Likewise.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-nondep2.C: New test.
* g++.dg/cpp2a/concepts-nondep3.C: New test.

(cherry picked from commit 9329344a6d81a6a5e3bd171167ebc7b158bb44f4)

c++: constrained variable template issues [PR98486]

This fixes some issues with constrained variable templates:

  - Constraints aren't checked when explicitly specializing a variable
    template.
  - Constraints aren't attached to a static data member template at
    parse time.
  - Constraints don't get propagated when (partially) instantiating a
    static data member template, so we need to make sure to look up
    constraints using the most general template during satisfaction.

PR c++/98486

gcc/cp/ChangeLog:

* constraint.cc (get_normalized_constraints_from_decl): Always
look up constraints using the most general template.
* decl.c (grokdeclarator): Set constraints on a static data
member template.
* pt.c (determine_specialization): Check constraints on a
variable template.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-var-templ1.C: New test.
* g++.dg/cpp2a/concepts-var-templ1a.C: New test.
* g++.dg/cpp2a/concepts-var-templ1b.C: New test.

(cherry picked from commit 2e2e65a46d2674bed53afd211493876ee2b79453)

c++: empty union member activation during constexpr [PR102163]

Here, the union's constructor is defined to activate its empty data
member _M_rest, but during constexpr evaluation of this constructor the
subobject constructor call O::O(&_M_rest, 42) doesn't produce a side
effect that actually activates the member, so the union still appears
uninitialized after its constructor has run. This patch fixes this by
using a dummy MODIFY_EXPR in this situation, whose evaluation ensures
the member gets activated.

PR c++/102163

gcc/cp/ChangeLog:

* constexpr.c (cxx_eval_call_expression): After evaluating a
subobject constructor call for an empty union member, produce a
side effect that makes sure the member gets activated.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-empty17.C: New test.

(cherry picked from commit de07cff96abd43f6f65dcf333958899c2ec42598)

c++: aggregate CTAD and brace elision [PR101344]

Here the problem is ultimately that collect_ctor_idx_types always
recurses into an eligible sub-CONSTRUCTOR regardless of whether the
corresponding pair of braces was elided in the original initializer.
This causes us to reject some completely-braced forms of aggregate
CTAD as in the first testcase below, because collect_ctor_idx_types
effectively assumes that the original initializer is always minimally
braced (and so the aggregate deduction candidate is given a function
type that's incompatible with the original completely-braced initializer).

In order to fix this, collect_ctor_idx_types needs to somehow know the
shape of the original initializer when iterating over the reshaped
initializer. To that end this patch makes reshape_init flag sub-ctors
that were built to undo brace elision in the original ctor, so that
collect_ctor_idx_types that determine whether to recurse into a sub-ctor
by simply inspecting this flag.

This happens to also fix PR101820, which is about aggregate CTAD using
designated initializers, for much the same reasons.

A curious case is the "intermediately-braced" initialization of 'e3'
(which we reject) in the first testcase below. It seems to me we're
behaving as specified here (according to [over.match.class.deduct]/1)
because the initializer element x_1={1, 2, 3, 4} corresponds to the
subobject e_1=E::t, hence the type T_1 of the first function parameter
of the aggregate deduction candidate is T(&&)[2][2], but T can't be
deduced from x_1 using this parameter type (as opposed to say T(&&)[4]).

PR c++/101344
PR c++/101803

gcc/cp/ChangeLog:

* cp-tree.h (CONSTRUCTOR_BRACES_ELIDED_P): Define.
* decl.c (reshape_init_r): Set it.
* pt.c (collect_ctor_idx_types): Recurse into a sub-CONSTRUCTOR
iff CONSTRUCTOR_BRACES_ELIDED_P.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/class-deduction-aggr11.C: New test.
* g++.dg/cpp2a/class-deduction-aggr12.C: New test.

(cherry picked from commit be4a4fb516688d7cfe28a80a4aa333f4ecf0b518)

c++: ignore explicit dguides during NTTP CTAD [PR101883]

Since (template) argument passing is a copy-initialization context,
we mustn't consider explicit deduction guides when deducing a CTAD
placeholder type of an NTTP.

PR c++/101883

gcc/cp/ChangeLog:

* pt.c (convert_template_argument): Pass LOOKUP_IMPLICIT to
do_auto_deduction.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/nontype-class49.C: New test.

(cherry picked from commit a6b3db3e8625a3cba1240f0b5e1a29bd6c68b8ca)

Merge remote-tracking branch 'origin/releases/gcc-11' into devel/omp/gcc-11

Merge up to r11-9077-g7d7630fb6636530d5154d68a777b1d7d7086cfe3 (Oct 6, 2021)

Fortran: Fix deprecate warning with parameter

Only warn with !GCC$ ATTRIBUTES DEPRECATED if
deprecated PARMETERS are actually used.

gcc/fortran/ChangeLog:

* resolve.c (resolve_values): Only show
deprecated warning if attr.referenced.

gcc/testsuite/ChangeLog:

* gfortran.dg/attr_deprecated-2.f90: New test.

(cherry picked from commit ece8b0fce6bbfb1e531de8164da47eeed80d3cf1)

openmp: Optimize for OpenMP atomics 2x__builtin_clear_padding+__builtin_memcmp if possible

For the few long double types that do have padding bits, e.g. on x86
the clear_type_padding_in_mask computed mask is
ff ff ff ff ff ff ff ff ff ff 00 00 for 32-bit and
ff ff ff ff ff ff ff ff ff ff 00 00 00 00 00 00 for 64-bit.
Instead of doing __builtin_clear_padding on both operands that will clear the
last 2 or 6 bytes and then memcmp on the whole 12/16 bytes, we can just
memcmp 10 bytes. The code also handles if the padding would be at the start
or both at the start and end, but everything on byte boundaries only and
non-padding bits being contiguous.
This works around a tree-ssa-dse.c bug (but we need to fix it anyway,
as libstdc++ won't do this and as it can deal with arbitrary types, it even
can't do that generally).

2021-10-06 Jakub Jelinek <jakub@redhat.com>

PR tree-optimization/102571
* c-omp.c (c_finish_omp_atomic): Optimize the case where type has
padding, but the non-padding bits are contiguous set of bytes
by adjusting the memcmp call arguments instead of emitting
__builtin_clear_padding and then comparing all the type's bytes.

(cherry picked from commit ba837323dbda2bca5a1c8a4c78092a88241dcfa3)