Jakub Jelinek [Tue, 8 Nov 2022 11:21:55 +0000 (12:21 +0100)]
i386: Improve vector [GL]E{,U} comparison against vector constants [PR107546]
For integer vector comparisons without XOP before AVX512{F,VL} we are
constrained by only GT and EQ being supported in HW.
For GTU we play tricks to implement it using GT or unsigned saturating
subtraction, for LT/LTU we swap the operands and thus turn it into
GT/GTU. For LE/LEU we handle it by using GT/GTU and negating the
result and for GE/GEU by using GT/GTU on swapped operands and negating
the result.
If the second operand is a CONST_VECTOR, we can usually do better though,
we can avoid the negation. For LE/LEU cst by doing LT/LTU cst+1 (and
then cst+1 GT/GTU x) and for GE/GEU cst by doing GT/GTU cst-1, provided
there is no wrap-around on those cst+1 or cst-1.
GIMPLE canonicalizes x < cst to x <= cst-1 etc. (the rule is smaller
absolute value on constant), but only for scalars or uniform vectors,
so in some cases this undoes that canonicalization in order to avoid
the extra negation, but it handles also non-uniform constants.
E.g. with -mavx2 the testcase assembly difference is:
- movl $47, %eax
+ movl $48, %eax
vmovdqa %xmm0, %xmm1
vmovd %eax, %xmm0
vpbroadcastb %xmm0, %xmm0
- vpminsb %xmm0, %xmm1, %xmm0
- vpcmpeqb %xmm1, %xmm0, %xmm0
+ vpcmpgtb %xmm1, %xmm0, %xmm0
and
- vmovdqa %xmm0, %xmm1
- vmovdqa .LC1(%rip), %xmm0
- vpminsb %xmm1, %xmm0, %xmm1
- vpcmpeqb %xmm1, %xmm0, %xmm0
+ vpcmpgtb .LC1(%rip), %xmm0, %xmm0
while with just SSE2:
- pcmpgtb .LC0(%rip), %xmm0
- pxor %xmm1, %xmm1
- pcmpeqb %xmm1, %xmm0
+ movdqa %xmm0, %xmm1
+ movdqa .LC0(%rip), %xmm0
+ pcmpgtb %xmm1, %xmm0
and
- movdqa %xmm0, %xmm1
- movdqa .LC1(%rip), %xmm0
- pcmpgtb %xmm1, %xmm0
- pxor %xmm1, %xmm1
- pcmpeqb %xmm1, %xmm0
+ pcmpgtb .LC1(%rip), %xmm0
2022-11-08 Jakub Jelinek <jakub@redhat.com>
PR target/107546
* config/i386/predicates.md (vector_or_const_vector_operand): New
predicate.
* config/i386/sse.md (vec_cmp<mode><sseintvecmodelower>,
vec_cmpv2div2di, vec_cmpu<mode><sseintvecmodelower>,
vec_cmpuv2div2di): Use nonimmediate_or_const_vector_operand
predicate instead of nonimmediate_operand and
vector_or_const_vector_operand instead of vector_operand.
* config/i386/i386-expand.cc (ix86_expand_int_sse_cmp): For
LE/LEU or GE/GEU with CONST_VECTOR cop1 try to transform those
into LT/LTU or GT/GTU with larger or smaller by one cop1 if
there is no wrap-around. Force CONST_VECTOR cop0 or cop1 into
REG. Formatting fix.
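The LE/GE rewrite described in this commit can be modelled on a single signed byte lane. This is only a scalar sketch, not the expander code: the helper names lane_gt, le_via_gt, ge_via_gt and rewrites_agree are made up for illustration, and lane_gt stands in for the one "greater than" comparison the hardware is assumed to provide.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Scalar stand-in for one signed byte lane of a GT-only vector compare.
inline bool lane_gt(std::int8_t a, std::int8_t b) { return a > b; }

// x <= c rewritten as (c + 1) > x; only valid when c + 1 does not wrap.
inline bool le_via_gt(std::int8_t x, std::int8_t c) {
  assert(c != std::numeric_limits<std::int8_t>::max());
  return lane_gt(static_cast<std::int8_t>(c + 1), x);
}

// x >= c rewritten as x > (c - 1); only valid when c - 1 does not wrap.
inline bool ge_via_gt(std::int8_t x, std::int8_t c) {
  assert(c != std::numeric_limits<std::int8_t>::min());
  return lane_gt(x, static_cast<std::int8_t>(c - 1));
}

// Compare both rewrites with the direct comparisons for every lane value,
// skipping the constants for which c + 1 or c - 1 would wrap around.
inline bool rewrites_agree() {
  for (int c = -128; c <= 127; ++c) {
    for (int x = -128; x <= 127; ++x) {
      const std::int8_t xs = static_cast<std::int8_t>(x);
      const std::int8_t cs = static_cast<std::int8_t>(c);
      if (c != 127 && le_via_gt(xs, cs) != (x <= c)) return false;
      if (c != -128 && ge_via_gt(xs, cs) != (x >= c)) return false;
    }
  }
  return true;
}
```

With the constant 47 from the -mavx2 testcase, le_via_gt(x, 47) compares 48 > x, which matches the change from $47 to $48 in the assembly above.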
Eric Botcazou [Tue, 18 Oct 2022 09:32:02 +0000 (11:32 +0200)]
ada: Fix oversight in implementation of allocators for storage models
When the allocator is of an unconstrained array type and has an initializing
expression, the copy of the initializing expression must be done separately
from that of the bounds.
gcc/ada/
* gcc-interface/utils2.cc (build_allocator): For unconstrained
array types with a storage model and an initializing expression,
copy the initialization expression separately from the bounds. In
all cases with a storage model, pass the locally computed size for
the store.
Steve Baird [Tue, 25 Oct 2022 23:59:29 +0000 (16:59 -0700)]
ada: Compile-time simplification of 'Image incorrectly ignores Put_Image
In the case of Some_Enumeration_Type'Image (<some static value>),
the compiler will replace this expression in its internal program
representation with a corresponding string literal. This is incorrect
if the Put_Image aspect has been specified (directly or via inheritance)
for the enumeration type.
gcc/ada/
* sem_attr.adb
(Eval_Attribute): Don't simplify 'Image call if Put_Image has been
specified.
Before this patch, a classwide contract expression was preanalyzed
only when its primitive operation's type was frozen. It caused name
resolution to be off in the cases where the freezing took place
after the end of the declaration list the primitive operation was
declared in.
This patch makes it so that if the compiler gets to the end of
the declaration list before the type is frozen, it preanalyzes the
classwide contract expression, so that the names are resolved in the
right context.
gcc/ada/
* contracts.adb
(Preanalyze_Class_Conditions): New procedure.
(Preanalyze_Condition): Moved out from Merge_Class_Conditions in
order to be spec-visible.
* contracts.ads
(Preanalyze_Class_Conditions): New procedure.
* sem_prag.adb
(Analyze_Pre_Post_Condition_In_Decl_Part): Call
Preanalyze_Class_Conditions when necessary.
ada: Set Support_Atomic_Primitives for VxWorks 7 runtimes
gcc/ada/
* libgnat/system-vxworks7-aarch64-rtp-smp.ads: Set
Support_Atomic_Primitives to True.
* libgnat/system-vxworks7-aarch64.ads: Set
Support_Atomic_Primitives to True.
* libgnat/system-vxworks7-arm-rtp-smp.ads: Set
Support_Atomic_Primitives to True.
* libgnat/system-vxworks7-arm.ads: Set Support_Atomic_Primitives
to True.
* libgnat/system-vxworks7-ppc-kernel.ads: Set
Support_Atomic_Primitives to False.
* libgnat/system-vxworks7-ppc-rtp-smp.ads: Set
Support_Atomic_Primitives to False.
* libgnat/system-vxworks7-ppc64-kernel.ads: Set
Support_Atomic_Primitives to True.
* libgnat/system-vxworks7-ppc64-rtp-smp.ads: Set
Support_Atomic_Primitives to True.
* libgnat/system-vxworks7-x86-kernel.ads: Set
Support_Atomic_Primitives to True.
* libgnat/system-vxworks7-x86-rtp-smp.ads: Set
Support_Atomic_Primitives to True.
* libgnat/system-vxworks7-x86_64-kernel.ads: Set
Support_Atomic_Primitives to True.
* libgnat/system-vxworks7-x86_64-rtp-smp.ads: Set
Support_Atomic_Primitives to True.
Piotr Trojanek [Fri, 21 Oct 2022 18:12:15 +0000 (20:12 +0200)]
ada: Propagate aspect Ghost when instantiating null formal procedures
When instantiating a generic package that includes a formal subprogram
declaration with Ghost aspect and a subprogram_default of null, e.g.:
generic
with procedure Proc is null with Ghost;
package P is ...
the Ghost aspect should be propagated to the internally generated null
subprogram, so this null subprogram can be used in contexts that require
ghost entities.
gcc/ada/
* sem_ch12.adb (Instantiate_Formal_Subprogram): Copy aspect Ghost
from formal subprogram declaration to the internally generated
procedure.
Eric Botcazou [Thu, 20 Oct 2022 18:41:08 +0000 (20:41 +0200)]
ada: Implement RM 4.5.7(10/3) name resolution rule
This rule deals with the specific case of a conditional expression that is
the operand of a type conversion and effectively distributes the conversion
to the dependent expressions with the help of the dynamic semantics.
gcc/ada/
* sem_ch4.adb (Analyze_Case_Expression): Compute the
interpretations of the expression only at the end of the analysis,
but skip doing it if it is the operand of a type conversion.
(Analyze_If_Expression): Likewise.
* sem_res.adb (Resolve): Deal specially with conditional
expression that is the operand of a type conversion.
(Resolve_Dependent_Expression): New procedure.
(Resolve_Case_Expression): Call Resolve_Dependent_Expression.
(Resolve_If_Expression): Likewise.
(Resolve_If_Expression.Apply_Check): Take result type as
parameter.
(Resolve_Type_Conversion): Do not warn about a redundant
conversion when the operand is a conditional expression.
Javier Miranda [Sun, 16 Oct 2022 19:48:53 +0000 (19:48 +0000)]
ada: Enforce matching of extra formals
This patch enforces matching of extra formals in overridden subprograms,
subprogram renamings, and subprograms to which attributes 'Access,
'Unchecked_Access, or 'Unrestricted_Access is applied (for these access
cases the subprogram is checked against its corresponding subprogram
type). This enforcement is an internal consistency check, not an
implementation of some language legality rule.
gcc/ada/
* debug.adb
(Debug_Flag_Underscore_XX): Switch -gnatd_X used temporarily to allow
disabling extra formal checks.
* exp_attr.adb
(Expand_N_Attribute_Reference [access types]): Add extra formals
to the subprogram referenced in the prefix of 'Unchecked_Access,
'Unrestricted_Access or 'Access; required to check that its extra
formals match the extra formals of the corresponding subprogram type.
* exp_ch3.adb
(Stream_Operation_OK): Declaration moved to the public part of the
package.
(Validate_Tagged_Type_Extra_Formals): New subprogram.
(Expand_Freeze_Record_Type): Improve the code that takes care of
adding the extra formals of dispatching primitives; extended to
add also the extra formals to renamings of dispatching primitives.
* exp_ch3.ads
(Stream_Operation_OK): Declaration moved from the package body.
* exp_ch6.adb
(Check_BIP_Actuals): Complete documentation.
(Has_BIP_Extra_Formal): Subprogram declaration moved to the public
part of the package. In addition, a parameter has been added to
disable an assertion that requires its use with frozen entities.
(Duplicate_Params_Without_Extra_Actuals): New subprogram.
(Check_Subprogram_Variant): Emit the call without duplicating the
extra formals since they will be added when the call is analyzed.
(Expand_Call_Helper): Ensure that the called subprogram has all its
extra formals, enforce assertion checking extra formals on thunks,
and mark calls from thunks as processed-BIP-calls to avoid adding
their extra formals twice.
(Is_Build_In_Place_Function): Return False for entities with foreign
convention.
(Is_Build_In_Place_Function_Call): Return True also for non-BIP functions
that have BIP formals since the extra actuals are required.
(Make_Build_In_Place_Call_In_Object_Declaration): Occurrences of
Is_Return_Object replaced by the local variable Is_OK_Return_Object
that evaluates to False for scopes with foreign convention.
(Might_Have_Tasks): Fix check of class-wide limited record types.
(Needs_BIP_Task_Actuals): Remove assertion to allow calling this
function in more contexts; in addition it returns False for functions
returning objects with foreign convention.
(Needs_BIP_Finalization_Master): Likewise.
(Needs_BIP_Alloc_Form): Likewise.
(Validate_Subprogram_Calls): Check that the number of actuals (including
extra actuals) of calls in the subtree N match their corresponding
formals.
* exp_ch6.ads
(Has_BIP_Extra_Formal): Subprogram declaration moved to the public
part of the package. In addition, a parameter has been added to
disable an assertion that requires its use with frozen entities.
(Is_Build_In_Place_Function_Call): Complete documentation.
(Validate_Subprogram_Calls): Check that the number of actuals (including
extra actuals) of calls in the subtree N match their corresponding
formals.
* freeze.adb
(Check_Itype): Add extra formals to anonymous access subprogram itypes.
(Freeze_Expression): Improve code that disables the addition of extra
formals to functions with foreign convention.
(Check_Extra_Formals): Moved to package Sem_Ch6 as Extra_Formals_OK.
(Freeze_Subprogram): Add extra formals to non-dispatching subprograms.
* frontend.adb
(Frontend): Validate all the subprogram calls; it can be disabled using
switch -gnatd_X
* sem_ch3.adb
(Access_Subprogram_Declaration): Defer the addition of extra formals to
the freezing point so that we know the convention.
(Check_Anonymous_Access_Component): Likewise.
(Derive_Subprogram): Fix documentation.
* sem_ch6.adb
(Has_Reliable_Extra_Formals): New subprogram.
(Check_Anonymous_Return): Fix check of access to class-wide limited
record types.
(Check_Untagged_Equality): Placed in alphabetical order.
(Extra_Formals_OK): Subprogram moved from freeze.adb.
(Extra_Formals_Match_OK): New subprogram.
(Has_BIP_Formals): New subprogram.
(Has_Extra_Formals): New subprograms.
(Needs_Accessibility_Check_Extra): New subprogram.
(Parent_Subprogram): New subprogram.
(Add_Extra_Formal): Minor code cleanup.
(Create_Extra_Formals): Enforce matching extra formals on overridden
and aliased entities.
* sem_ch6.ads
(Extra_Formals_Match_OK): New subprogram.
(Extra_Formals_OK): Subprogram moved from freeze.adb.
* sem_eval.adb
(Compile_Time_Known_Value): Improve predicate to avoid assertion
failure; found working on this ticket; this change does not
affect the behavior of the compiler because this subprogram
has an exception handler that returns False when the assertion
fails.
* sem_util.adb
(Needs_Result_Accessibility_Level): Do not return False for dispatching
operations compiled with Ada_Version < 2012 since they may be
overridden by primitives compiled with Ada_Version >= Ada_2012.
Bob Duff [Fri, 21 Oct 2022 15:09:49 +0000 (11:09 -0400)]
ada: Move warnings switches -- initial work
This patch prepares to move warning switches from Opt into Warnsw.
gcc/ada/
* warnsw.ads, warnsw.adb, fe.h, err_vars.ads, errout.ads: Move
Warning_Doc_Switch from Err_Vars to Warnsw. Access
Warn_On_Questionable_Layout on the C side via a function rather
than a variable, because we plan to turn the variables into
renamings, and you can't Export renamings.
* erroutc.adb, switch-c.adb, errout.adb: Likewise.
* gcc-interface/decl.cc: Use Get_Warn_On_Questionable_Layout
instead of Warn_On_Questionable_Layout.
* gcc-interface/Makefile.in (GNATMAKE_OBJS): Add warnsw.o, because
it is indirectly imported via Errout.
* gcc-interface/Make-lang.in (GNATBIND_OBJS): Likewise and remove
restrict.o (not needed).
Steve Baird [Wed, 19 Oct 2022 19:42:55 +0000 (12:42 -0700)]
ada: Improve handling of declare expressions in deferred-freezing contexts
In some cases where a declare expression occurs in a deferred-freezing
context (e.g., within the default value for a discriminant or for a formal
parameter, or within the expression of an expression function), the compiler
generates a bugbox.
gcc/ada/
* sem_ch3.adb
(Analyze_Object_Declaration): Do not perform expansion actions if
In_Spec_Expression is true.
Eric Botcazou [Thu, 20 Oct 2022 09:05:16 +0000 (11:05 +0200)]
ada: Minor consistency tweaks in Sem_Ch4
This ensures that, during the analysis of the qualified expressions, type
conversions and unchecked type conversions, the determination of the type
of the node and the analysis of its expression are done in the same order.
No functional changes.
gcc/ada/
* sem_ch4.adb (Analyze_Qualified_Expression): Analyze the
expression only after setting the type.
(Analyze_Unchecked_Type_Conversion): Likewise.
(Analyze_Short_Circuit): Likewise for the operands.
(Analyze_Type_Conversion): Minor tweaks.
(Analyze_Unchecked_Expression): Likewise.
ada: Preanalyze classwide contracts as spec expressions
Classwide contracts are "spec expressions" as defined in the
documentation in sem.ads. Before this patch, the instances of
classwide contracts that are destined to class conditions merging
were not preanalyzed as spec expressions. That caused preanalysis to
emit spurious errors in some cases.
gcc/ada/
* contracts.adb (Preanalyze_Condition): Use
Preanalyze_Spec_Expression.
Piotr Trojanek [Thu, 17 Jun 2021 17:44:40 +0000 (19:44 +0200)]
ada: Fix expansion of 'Wide_Image and 'Wide_Wide_Image on composite types
Attributes Wide_Image and Wide_Wide_Image applied to composite types are
now expanded just like attribute Image.
gcc/ada/
* exp_imgv.adb
(Expand_Wide_Image_Attribute): Handle just like attribute Image.
(Expand_Wide_Wide_Image_Attribute): Likewise.
* exp_put_image.adb
(Build_Image_Call): Adapt to also work for Wide and Wide_Wide
attributes.
* exp_put_image.ads
(Build_Image_Call): Update comment.
* rtsfind.ads
(RE_Id): Support wide variants of Get.
(RE_Unit_Table): Likewise.
Piotr Trojanek [Tue, 6 Sep 2022 21:20:47 +0000 (23:20 +0200)]
ada: Remove unneeded code in handling formal type defaults
Unneeded code found while experimenting with improved detection of
unreferenced objects.
gcc/ada/
* sem_ch12.adb (Validate_Formal_Type_Default): Remove call to
Collect_Interfaces, which had no effect apart from populating a
list that was not used; fix style.
Piotr Trojanek [Tue, 6 Sep 2022 21:28:26 +0000 (23:28 +0200)]
ada: Cleanup local variable that is only set as an out parameter
Minor improvements; found experimenting with improved detection of
unreferenced objects.
gcc/ada/
* exp_spark.adb (SPARK_Freeze_Type): Refine type of a local
object.
* sem_ch3.adb (Derive_Subprograms): Remove initial value for
New_Subp, which is only written as an out parameter and never
read.
Piotr Trojanek [Thu, 14 Oct 2021 21:31:21 +0000 (23:31 +0200)]
ada: Reject limited objects in array and record delta aggregates
For array delta aggregates the base expression cannot be limited; for
record delta aggregates the base expression can only be limited if it is
a newly constructed object.
gcc/ada/
* sem_aggr.adb (Resolve_Delta_Aggregate): Implement rules related
to limited objects appearing as the base expression.
Piotr Trojanek [Thu, 14 Oct 2021 21:24:54 +0000 (23:24 +0200)]
ada: Allow initialization of limited objects with delta aggregates
Objects of a limited type can be initialized with "aggregates", which is
a collective term for ordinary aggregates (i.e. record aggregates and
array aggregates), extension aggregates and finally for delta
aggregates (introduced by Ada 2022).
gcc/ada/
* sem_ch3.adb (OK_For_Limited_Init_In_05): Handle delta aggregates
just like other aggregates.
Piotr Trojanek [Tue, 18 Oct 2022 12:31:00 +0000 (14:31 +0200)]
ada: Raise Tag_Error when Ada.Tags operations are called with No_Tag
Implement missing behavior of RM 13.9 (25.1/3): Tag_Error is raised by a
call of Interface_Ancestor_Tags and Is_Descendant_At_Same_Level, if any
tag passed is No_Tag. This change also fixes Descendant_Tag, which
relies on Is_Descendant_At_Same_Level. The remaining operations already
worked properly.
gcc/ada/
* libgnat/a-tags.adb
(Interface_Ancestor_Tags): Raise Tag_Error on No_Tag.
(Is_Descendant_At_Same_Level): Likewise.
Tweak analyzer handling of strchr, so that we show the
"when 'strchr' returns non-NULL" message for that execution path.
gcc/analyzer/ChangeLog:
* region-model-impl-calls.cc (region_model::impl_call_strchr):
Move to on_call_post. Handle both outcomes using bifurcation,
rather than just the "not found" case.
* region-model.cc (region_model::on_call_pre): Move
BUILT_IN_STRCHR and "strchr" to...
(region_model::on_call_post): ...here.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/strchr-1.c (test_literal): Detect writing to a
string literal. Verify that we emit the "when '__builtin_strchr'
returns non-NULL" message.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Jason Merrill [Fri, 4 Nov 2022 19:22:45 +0000 (15:22 -0400)]
c++: implement P2468R2, the equality operator you are looking for
This paper is resolving the problem of well-formed C++17 code becoming
ambiguous in C++20 due to asymmetrical operator== being compared with itself
in reverse. I had previously implemented a tiebreaker such that if the two
candidates were functions with the same parameter types, we would prefer the
non-reversed candidate. But the committee went with a different approach:
if there's an operator!= with the same parameter types as the operator==,
don't consider the reversed form of the ==.
So this patch implements that, and changes my old tiebreaker to give a
pedwarn if it is used. I also noticed that we were giving duplicate errors
for some testcases, and fixed the tourney logic to avoid that.
As a result, a lot of tests of the form
struct A { bool operator==(const A&); };
need to be fixed to add a const function-cv-qualifier, e.g.
struct A { bool operator==(const A&) const; };
The committee thought such code ought to be fixed, so breaking it was fine.
18_support/comparisons/algorithms/fallback.cc also breaks with this patch,
because of the similarly asymmetrical
bool operator==(const S&, S&) { return true; }
As a result, some of the asserts need to be reversed.
The H test in spaceship-eq15.C is specified in the standard to be
well-formed because the op!= in the inline namespace is not found by the
search, but that seems wrong to me. I've implemented that behavior, but
disabled it for now; if we decide that is the way we want to go, we can just
remove the "0 &&" in add_candidates to enable it.
Co-authored-by: Jakub Jelinek <jakub@redhat.com>
gcc/cp/ChangeLog:
* cp-tree.h (fns_correspond): Declare.
* decl.cc (fns_correspond): New.
* call.cc (add_candidates): Look for op!= matching op==.
(joust): Complain about non-standard reversed tiebreaker.
(tourney): Fix champ_compared_to_predecessor logic.
(build_new_op): Don't complain about error_mark_node not having
'bool' type.
* pt.cc (tsubst_copy_and_build): Don't try to be permissive
when seen_error().
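The const-qualification fix mentioned in this commit can be shown with a minimal sketch. The member value and the helper same() are made up for illustration. With the const qualifier in place, the ordinary candidate and the reversed C++20 candidate take the same parameter types, so the comparison is not asymmetrical.

```cpp
#include <cassert>

struct A {
  int value;
  // const-qualified: the implicit object parameter is const A&, matching
  // the explicit parameter, so reversing the operands changes nothing.
  bool operator==(const A& other) const { return value == other.value; }
};

// Works on const operands in either order.
inline bool same(const A& lhs, const A& rhs) { return lhs == rhs; }

inline void demo_equality() {
  const A one{1};
  const A two{2};
  assert(same(one, one));
  assert(!same(one, two));
}
```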
Andrew MacLeod [Mon, 7 Nov 2022 20:07:35 +0000 (15:07 -0500)]
Add transitive inferred range processing.
Rewalk statements at the end of a block to see if any inferred ranges
affect earlier calculations and register those as inferred ranges.
gcc/
PR tree-optimization/104530
* gimple-range-cache.cc (ranger_cache::register_inferred_value):
New. Split from:
(ranger_cache::apply_inferred_ranges): Move setting cache to
separate function.
* gimple-range-cache.h (register_inferred_value): New prototype.
* gimple-range-infer.cc (infer_range_manager::has_range_p): New.
* gimple-range-infer.h (has_range_p): New prototype.
* gimple-range.cc (register_transitive_inferred_ranges): New.
* gimple-range.h (register_transitive_inferred_ranges): New proto.
* tree-vrp.cc (rvrp_folder::fold_stmt): Check for transitive inferred
ranges at the end of the block before folding final stmt.
Jakub Jelinek [Mon, 7 Nov 2022 23:35:09 +0000 (00:35 +0100)]
libstdc++: Fix up libstdc++ build against glibc 2.25 or older [PR107562]
On Mon, Nov 07, 2022 at 05:48:42PM +0000, Jonathan Wakely wrote:
> On Mon, 7 Nov 2022 at 16:11, Joseph Myers <joseph@codesourcery.com> wrote:
> >
> > On Wed, 2 Nov 2022, Jakub Jelinek via Gcc-patches wrote:
> >
> > > APIs. So that one can build gcc against older glibc and then compile
> > > user programs on newer glibc, the patch uses weak references unless
> > > gcc is compiled against glibc 2.26+. strfromf128 unfortunately can't
> >
> > This support for older glibc doesn't actually seem to be working, on an
> > older system with glibc 2.19 I'm seeing
> >
> > /scratch/jmyers/fsf/gcc-mainline/libstdc++-v3/src/c++17/floating_to_chars.cc:52:3: error: expected initializer before '__asm'
> > 52 | __asm ("strfromf128");
> > | ^~~~~
> >
> > and a series of subsequent errors.
>
> This seems to "fix" it (not sure if it's right though):
>
> #ifndef _GLIBCXX_HAVE_FLOAT128_MATH
> extern "C" _Float128 __strtof128(const char*, char**)
> __attribute__((__weak__));
> #endif
> extern "C" _Float128 __strtof128(const char*, char**)
> __asm ("strtof128");
It is, but floating_from_chars.cc has the same problem,
and I think we can avoid the duplication, like this:
2022-11-08 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/107562
* src/c++17/floating_from_chars.cc (__strtof128): Put __asm before
__attribute__.
* src/c++17/floating_to_chars.cc (__strfromf128): Likewise.
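The declarator order this fix relies on can be sketched with an ordinary libc function instead of strtof128, so that it does not depend on _Float128 support. This assumes GCC or Clang on an ELF target such as Linux. The names demo_strtod and parse_or_zero are hypothetical and are not part of libstdc++.

```cpp
#include <cassert>

// Hypothetical local name bound to the C library symbol "strtod".
// The asm label comes first and the attribute second, the order the
// fix above switches to.
extern "C" double demo_strtod(const char*, char**)
    __asm("strtod") __attribute__((__weak__));

inline double parse_or_zero(const char* text) {
  // A weak reference has a null address when the symbol is not available.
  if (&demo_strtod == nullptr) return 0.0;
  return demo_strtod(text, nullptr);
}

inline void demo_weak_reference() {
  const double parsed = parse_or_zero("1.5");
  // 0.0 is the fallback taken when the weak symbol is unresolved.
  assert(parsed == 1.5 || parsed == 0.0);
}
```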
David Faust [Mon, 7 Nov 2022 18:30:52 +0000 (10:30 -0800)]
bpf: cleanup missed refactor
Commit 068baae1864 "bpf: add preserve_field_info builtin" factored out
some repeated code to a new function maybe_make_core_relo (), but missed
using it in one place. Clean that up.
gcc/
* config/bpf/bpf.cc (handle_attr_preserve): Use maybe_make_core_relo().
Aldy Hernandez [Fri, 4 Nov 2022 21:24:42 +0000 (22:24 +0100)]
Improve multiplication by powers of 2 in range-ops.
For unsigned numbers, multiplication by X, where X is a power of 2 is
[0,0][X,+INF].
This patch causes a regression to g++.dg/pr71488.C where
-Wstringop-overflow gets the same IL as before, but better ranges
cause it to issue a bogus warning. I will create a PR with some
notes.
No discernible changes in performance.
Tested on x86-64 Linux.
PR tree-optimization/55157
gcc/ChangeLog:
* range-op.cc (operator_mult::wi_fold): Optimize multiplications
by powers of 2.
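The [0,0][X,+INF] claim can be checked exhaustively on a small unsigned type. The sketch below uses 16-bit values and made-up helper names. It is an illustration of the arithmetic fact only and has nothing to do with how range-op.cc represents ranges.

```cpp
#include <cassert>
#include <cstdint>

// True when v is 0 or lies in [x, max], i.e. inside [0,0][X,+INF].
inline bool zero_or_at_least(std::uint16_t v, std::uint16_t x) {
  return v == 0 || v >= x;
}

// Check every 16-bit operand against every 16-bit power of two.
inline bool mult_by_pow2_range_holds() {
  for (unsigned shift = 0; shift < 16; ++shift) {
    const std::uint16_t x = static_cast<std::uint16_t>(1u << shift);
    for (unsigned n = 0; n <= 0xffffu; ++n) {
      // The cast models unsigned wrap-around modulo 2^16.
      const std::uint16_t product = static_cast<std::uint16_t>(n * x);
      if (!zero_or_at_least(product, x)) return false;
    }
  }
  return true;
}

inline void demo_mult_range() { assert(mult_by_pow2_range_holds()); }
```

The property holds even when the product wraps, because the wrapped result is still a multiple of X.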
Patrick Palka [Mon, 7 Nov 2022 18:29:30 +0000 (13:29 -0500)]
libstdc++: Implement ranges::cartesian_product_view from P2374R4
This also implements the proposed resolutions of the tentatively ready
LWG issues 3760, 3761 and 3801 for cartesian_product_view.
I'm not sure how/if we should implement the recommended practice of:
iterator::difference_type should be the smallest signed-integer-like
type that is sufficiently wide to store the product of the maximum
sizes of all underlying ranges if such a type exists
because for e.g.
extern std::vector<int> x, y;
auto v = views::cartesian_product(x, y);
IIUC it'd mean difference_type should be __int128 (on 64-bit systems),
which seems quite wasteful: in practice the size of any cartesian product
probably won't exceed the precision of say ptrdiff_t, and using anything
larger will just incur unnecessary space/time overhead. It's also
probably not worth the complexity to use less precision than ptrdiff_t
(when possible) either. So this patch defines difference_type as
which should mean it's at least as large as the difference_type of each
underlying range, and at least as large as ptrdiff_t. This patch also
adds assertions to catch any overflow that occurs due to this choice of
difference_type.
libstdc++-v3/ChangeLog:
* include/std/ranges (__maybe_const_t): New alias for
__detail::__maybe_const_t.
(__detail::__cartesian_product_is_random_access): Define.
(__detail::__cartesian_product_common_arg): Define.
(__detail::__cartesian_product_is_bidirectional): Define.
(__detail::__cartesian_product_is_common): Define.
(__detail::__cartesian_product_is_sized): Define.
(__detail::__cartesian_is_sized_sentinel): Define.
(__detail::__cartesian_common_arg_end): Define.
(cartesian_product_view): Define.
(cartesian_product_view::_Iterator): Define.
(views::__detail::__can_cartesian_product_view): Define.
(views::_CartesianProduct, views::cartesian_product): Define.
* testsuite/std/ranges/cartesian_product/1.cc: New test.
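One way to obtain a type with the two properties described above (at least as large as ptrdiff_t and as each underlying difference type) is std::common_type. The sketch below is an assumption made for illustration and is not taken from the patch. The names range_diff_t, product_diff_t and product_size are hypothetical, and __builtin_mul_overflow is a GCC/Clang builtin.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <utility>
#include <vector>

// Difference type of a range's iterator (pre-C++20 spelling).
template <typename Range>
using range_diff_t = typename std::iterator_traits<
    decltype(std::begin(std::declval<Range&>()))>::difference_type;

// Common type of ptrdiff_t and every underlying difference type.
template <typename... Ranges>
using product_diff_t =
    typename std::common_type<std::ptrdiff_t, range_diff_t<Ranges>...>::type;

static_assert(sizeof(product_diff_t<std::vector<int>, std::vector<long>>) >=
                  sizeof(std::ptrdiff_t),
              "the product difference type is at least as wide as ptrdiff_t");

// Multiply two sizes and assert that the result fits in Diff.
template <typename Diff>
Diff product_size(Diff a, Diff b) {
  Diff out;
  const bool overflowed = __builtin_mul_overflow(a, b, &out);
  assert(!overflowed && "product does not fit in the difference type");
  (void)overflowed;
  return out;
}
```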
Richard Purdie [Mon, 7 Nov 2022 16:26:44 +0000 (17:26 +0100)]
Fix NULL filename handling
The previous commit introduced a regression as some Ada tests end up passing
NULL as the filename to remap_filename. Handle this as before to fix them.
Jakub Jelinek [Mon, 7 Nov 2022 14:17:21 +0000 (15:17 +0100)]
libstdc++: Update from latest fast_float [PR107468]
The following patch updates from fast_float trunk. That way
it grabs two of the 4 LOCAL_PATCHES, some smaller tweaks, to_extended
cleanups and most importantly fix for the incorrect rounding case,
PR107468 aka https://github.com/fastfloat/fast_float/issues/149
Using std::fegetround proved too slow in benchmarks, so instead of
doing that the patch limits the fast path where it uses floating
point multiplication rather than integral to cases where we can
prove there will be no rounding (the multiplication will be exact, not
just that the two multiplication or division operation arguments are
exactly representable).
2022-11-07 Jakub Jelinek <jakub@redhat.com>
PR libstdc++/107468
* src/c++17/fast_float/MERGE: Adjust for merge from upstream.
* src/c++17/fast_float/LOCAL_PATCHES: Remove commits that were
upstreamed.
* src/c++17/fast_float/README.md: Merge from fast_float 662497742fea7055f0e0ee27e5a7ddc382c2c38e commit.
* src/c++17/fast_float/fast_float.h: Likewise.
* testsuite/20_util/from_chars/pr107468.cc: New test.
Jakub Jelinek [Mon, 7 Nov 2022 14:15:50 +0000 (15:15 +0100)]
libstdc++: Add _Float128 to_chars/from_chars support for x86, ia64 and ppc64le with glibc
The following patch adds std::{to,from}_chars support for std::float128_t
on glibc 2.26+ for {i?86,x86_64,ia64,powerpc64le}-linux.
When long double is already IEEE quad, previous changes already handle
it by using long double overloads in _Float128 overloads.
The powerpc64le case (with explicit or implicit -mabi=ibmlongdouble)
is handled by using the __float128/__ieee128 entrypoints which are
already in the library and used for -mabi=ieeelongdouble.
For i?86, x86_64 and ia64 this patch adds new library entrypoints,
mostly by enabling the code that was already there for powerpc64le-linux.
Those use __float128 or __ieee128, the patch uses _Float128 for the
exported overloads and internally as template parameter. While
powerpc64le-linux uses __sprintfieee128 and __strtoieee128,
for _Float128 the patch uses the glibc 2.26 strfromf128 and strtof128
APIs. So that one can build gcc against older glibc and then compile
user programs on newer glibc, the patch uses weak references unless
gcc is compiled against glibc 2.26+. strfromf128 unfortunately can't
handle %.0Lf and %.*Le, %.*Lf, %.*Lg format strings sprintf/__sprintfieee128
use, we need to remove the L from those and replace * with actually
directly printing the precision into the format string (i.e. it can
handle %.0f and %.27f (floating point type is implied from the function
name)).
Unlike the std::{,b}float16_t support, this one actually exports APIs
with std::float128_t aka _Float128 in the mangled name, because no
standard format is a superset of it. On the other side, e.g. on i?86/x86_64
it doesn't have restrictions like for _Float16/__bf16 which ISAs need
to be enabled in order to use it.
The denorm_min case in the testcase is temporarily commented out because
of the ERANGE subnormal issue Patrick posted patch for.
2022-11-07 Jakub Jelinek <jakub@redhat.com>
* include/std/charconv (from_chars, to_chars): Add _Float128
overloads if _GLIBCXX_HAVE_FLOAT128_MATH is defined.
* config/abi/pre/gnu.ver (GLIBCXX_3.4.31): Export
_ZSt8to_charsPcS_DF128_, _ZSt8to_charsPcS_DF128_St12chars_format,
_ZSt8to_charsPcS_DF128_St12chars_formati and
_ZSt10from_charsPKcS0_RDF128_St12chars_format.
* src/c++17/floating_from_chars.cc (USE_STRTOF128_FOR_FROM_CHARS):
Define if needed.
(__strtof128): Declare.
(from_chars_impl): Handle _Float128.
(from_chars): New _Float128 overload if USE_STRTOF128_FOR_FROM_CHARS
is defined.
* src/c++17/floating_to_chars.cc (__strfromf128): Declare.
(FLOAT128_TO_CHARS): Define even when _Float128 is supported and
wider than long double.
(F128_type): Use _Float128 for that case.
(floating_type_traits): Specialize for F128_type rather than
__float128.
(sprintf_ld): Add length argument. Handle _Float128.
(__floating_to_chars_shortest, __floating_to_chars_precision):
Pass length to sprintf_ld.
(to_chars): Add _Float128 overloads for the F128_type being
_Float128 cases.
* testsuite/20_util/to_chars/float128_c++23.cc: New test.
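The format-string adjustment described in this commit (drop the L modifier and print the precision into the format instead of passing it through *) can be sketched as follows. The helper make_strfrom_format is hypothetical and is not the code in floating_to_chars.cc.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Build a format such as "%.27f" for a strfromf128-style function, which
// accepts neither the 'L' length modifier nor a '*' precision.
inline std::string make_strfrom_format(int precision, char conversion) {
  char buffer[32];
  // "%%" emits a literal '%'; the precision is printed as decimal digits.
  std::snprintf(buffer, sizeof buffer, "%%.%d%c", precision, conversion);
  return std::string(buffer);
}

inline void demo_format() {
  assert(make_strfrom_format(27, 'f') == "%.27f");
  assert(make_strfrom_format(0, 'f') == "%.0f");
}
```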
Aldy Hernandez [Mon, 7 Nov 2022 07:40:12 +0000 (08:40 +0100)]
[range-op] Restrict division by power of 2 optimization to positive numbers.
The problem here is that we are transforming a division by a power of
2 into a right shift, and using this to shift the maybe nonzero bits.
This gives the wrong result when the number being divided is negative.
In the testcase we are dividing this by 8:
[irange] int [-256, -255] NONZERO 0xffffff01
and coming up with:
[irange] int [-32, -31] NONZERO 0xffffffe0
The maybe nonzero bits are wrong as -31 has the LSB set (0xffffffe1)
whereas the bitmask says the lower 4 bits are off.
PR tree-optimization/107541
gcc/ChangeLog:
* range-op.cc (operator_div::fold_range): Restrict power of 2
optimization to positive numbers.
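The numbers in this commit can be reproduced with a few lines of arithmetic. The helper names below are made up for illustration. The shift is done on unsigned values so that the sketch does not depend on how a compiler shifts negative integers.

```cpp
#include <cassert>
#include <cstdint>

// Two's complement bit pattern of a 32-bit signed value.
constexpr std::uint32_t bits_of(std::int32_t v) {
  return static_cast<std::uint32_t>(v);
}

// Arithmetic right shift of a 32-bit pattern, for 0 < n < 32.
constexpr std::uint32_t shift_mask(std::uint32_t mask, unsigned n) {
  return (mask >> n) | ((mask & 0x80000000u) ? ~(0xffffffffu >> n) : 0u);
}

// A "maybe nonzero" mask is sound for v if v sets no bit outside it.
constexpr bool mask_covers(std::uint32_t mask, std::int32_t v) {
  return (bits_of(v) & ~mask) == 0;
}

inline void demo_div_mask() {
  // Shifting the input mask by 3 gives the mask from the commit message.
  assert(shift_mask(0xffffff01u, 3) == 0xffffffe0u);
  // Division truncates toward zero, so -255 / 8 is -31, not -32.
  assert(-255 / 8 == -31);
  assert(bits_of(-31) == 0xffffffe1u);
  // The shifted mask covers -32 but not -31, which has its LSB set.
  assert(mask_covers(0xffffffe0u, -256 / 8));
  assert(!mask_covers(0xffffffe0u, -255 / 8));
}
```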
Tobias Burnus [Mon, 7 Nov 2022 10:32:33 +0000 (11:32 +0100)]
Fortran: Fix reallocation on assignment for kind=4 strings [PR107508]
The check whether reallocation on assignment was required did not handle
kind=4 characters correctly such that there was always a reallocation,
implying issues with pointer addresses and lower bounds. Additionally,
with all deferred strings, the old memory was not freed on reallocation.
And, finally, inside the block which was only executed if string lengths
or bounds or dynamic types changed, was a subcheck of the same, which
was effectively a no-op but still confusing and, at least with -O0, added
extra instructions to the binary.
PR fortran/107508
gcc/fortran/ChangeLog:
* trans-array.cc (gfc_alloc_allocatable_for_assignment): Fix
string-length check, plug memory leak, and avoid generation of
effectively no-op code.
* trans-expr.cc (alloc_scalar_allocatable_for_assignment): Extend
comment; minor cleanup.
Richard Biener [Tue, 27 Sep 2022 08:16:52 +0000 (10:16 +0200)]
unswitch most profitable condition first
When doing the loop unswitching re-org we promised to follow up
with improvements on the cost modeling. The following makes sure we
try to unswitch on the most profitable condition first. As most profitable
we pick the condition leading to the edge with the highest profile count.
Note the profile is only applied when picking the first unswitching
opportunity since the profile counts are not updated with earlier
unswitchings in mind. Further opportunities are picked in DFS order.
* tree-ssa-loop-unswitch.cc (unswitch_predicate::count): New.
(unswitch_predicate::unswitch_predicate): Initialize count.
(init_loop_unswitch_info): First collect candidates and
determine the outermost loop to unswitch.
(tree_ssa_unswitch_loops): First perform all guard hoisting,
then perform unswitching on innermost loop predicates.
(find_unswitching_predicates_for_bb): Keep track of the
most profitable predicate to unswitch on.
(tree_unswitch_single_loop): Unswitch given predicate if
not NULL.
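The selection rule described in this message amounts to taking a maximum over the candidate predicates. The following is a hedged sketch, not the tree-ssa-loop-unswitch.cc logic: the Candidate record, its fields, and the sample conditions and counts are all invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    condition: str   # hypothetical textual form of a loop-invariant condition
    edge_count: int  # hypothetical profile count of the edge it leads to

def most_profitable(candidates):
    """Return the candidate whose edge has the highest profile count, or None."""
    return max(candidates, key=lambda c: c.edge_count, default=None)

candidates = [
    Candidate("flag_a != 0", 120),
    Candidate("mode == 2", 9000),
    Candidate("ptr == NULL", 3),
]
first = most_profitable(candidates)
print(first.condition)
```

Per the message, the profile is consulted only for this first pick, since counts are not updated after earlier unswitchings; later opportunities are taken in DFS order.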
Martin Liska [Mon, 7 Nov 2022 08:50:21 +0000 (09:50 +0100)]
Mitigate clang warnings:
gcc/range-op.cc:1752:16: warning: 'wi_fold' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
gcc/range-op.cc:1757:16: warning: 'wi_op_overflows' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
gcc/range-op.cc:1759:16: warning: 'op1_range' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
gcc/range-op.cc:1763:16: warning: 'op2_range' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
gcc/range-op.cc:1928:16: warning: 'wi_fold' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
gcc/range-op.cc:1933:16: warning: 'wi_op_overflows' overrides a member function but is not marked 'override' [-Winconsistent-missing-override]
Piotr Trojanek [Tue, 18 Oct 2022 07:33:38 +0000 (09:33 +0200)]
ada: Tune hash function for cross-reference entries
Tune the hash function that combines entity identifiers with source
locations of where those entities are referenced. Previously the source
location was multiplied by 2 ** 7 (i.e. shifted left by 7 bits), then
added to the entity identifier, and finally divided modulo 2 ** 16 (i.e.
masked to only use the lowest 16 bits). This hash routine caused
collisions that could make some tests up to twice as slow.
With a large entity number the source location was contributing only a
few bits to the hash value. This large entity number might correspond to
an entity like Ada.Characters.Latin_1.NUL that occurs thousands of times
in generated code.
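Writing the old scheme out shows how little of the source location survives. The sketch follows only the arithmetic given in this message (multiply by 2 ** 7, add, reduce modulo 2 ** 16); the function name old_hash and the sample entity and location values are assumptions, and the replacement hash is not shown because the message does not describe it.

```python
def old_hash(entity, loc):
    """Hash as described: loc * 2**7 + entity, reduced modulo 2**16."""
    return (loc * 2**7 + entity) % 2**16

entity = 123_456          # hypothetical large entity identifier
loc_a = 1_000             # hypothetical source location
loc_b = loc_a + 2**9      # same low 9 bits, different higher bits

# Shifting by 7 and keeping 16 bits leaves room for only the low 9 bits
# of the location, so locations differing by a multiple of 2**9 collide.
print(old_hash(entity, loc_a) == old_hash(entity, loc_b))

# Count the distinct hash values produced by 10,000 consecutive locations.
print(len({old_hash(entity, loc) for loc in range(10_000)}))
```

For a fixed entity, at most 2 ** 9 = 512 different hash values are reachable however many locations reference it, which fits the collision behaviour reported for frequently referenced entities.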
Piotr Trojanek [Mon, 17 Oct 2022 20:08:37 +0000 (22:08 +0200)]
ada: Fix performance regression related to references in Refined_State
A recently added call to In_Pragma_Expression caused a performance
regression. It might require climbing syntax trees of arbitrarily deep
expressions, while previously references within pragmas were detected in
bounded time.
This patch restores the previous efficiency. However, while the original
code only detected references directly within pragma argument
associations, now we also detect references inside aggregates, e.g.
those in pragma Refined_State.
gcc/ada/
* sem_prag.adb (Non_Significant_Pragma_Reference): Detect
references with aggregates; only assign local variables Id and C
when necessary.
Bob Duff [Mon, 17 Oct 2022 15:56:27 +0000 (11:56 -0400)]
ada: New warning about noncomposing user-defined "="
Print warning for a user-defined "=" that does not compose
as might be expected (i.e. is ignored for predefined "=" of
a containing record or array type). This warning is enabled by
-gnatw_q; we don't enable it by default because it generates
too many false positives. We also don't enable it via -gnatwa.
gcc/ada/
* exp_ch4.adb
(Expand_Array_Equality): Do not test Ltyp = Rtyp here, because
that is necessarily true. Move assertion thereof to more general
place.
(Expand_Composite_Equality): Pass in Outer_Type, for use in
warnings. Rename Typ to be Comp_Type, to more clearly distinguish
it from Outer_Type. Print warning when appropriate.
* exp_ch4.ads: Minor comment fix.
* errout.ads: There is no such pragma as Warning_As_Pragma --
Warning_As_Error must have been intended. Improve comment for ?x?.
* exp_ch3.adb
(Build_Untagged_Equality): Update comment to be accurate for more
recent versions of Ada.
* sem_case.adb
(Choice_Analysis): Declare user-defined "=" functions as abstract.
* sem_util.ads
(Is_Bounded_String): Give RM reference in comment.
* warnsw.ads, warnsw.adb
(Warn_On_Ignored_Equality): Implement new warning switch -gnatw_q.
* doc/gnat_ugn/building_executable_programs_with_gnat.rst:
Document new warning switch.
* gnat_ugn.texi: Regenerate.
Piotr Trojanek [Mon, 17 Oct 2022 14:28:20 +0000 (16:28 +0200)]
ada: Inline composite node kind AST queries
Queries that ultimately examine the same field of an AST
node (e.g. Nkind) are visibly more efficient when inlined.
In particular, routines Is_Body_Or_Package_Declaration and Is_Body can
apparently be inlined into a single Nkind membership test.
This patch fixes some of the performance lost with the recent changes,
which increased the number of calls to Is_Body_Or_Package_Declaration
(as it is typically used to prevent AST search from climbing too far).
However, it should be generally beneficial to inline routines like this.
gcc/ada/
* sem_aux.ads (Is_Body): Annotate with Inline.
* sem_util.ads (Is_Body_Or_Package_Declaration): Likewise.
Quentin Ochem [Fri, 14 Oct 2022 10:30:04 +0000 (06:30 -0400)]
ada: Fixed elaboration of CUDA programs.
The names of imported / exported symbols were not consistent
between the device and the host when compiling for CUDA.
Remove the function Device_Ada_Final_Link_Name as it is no
longer referenced.
gcc/ada/
* bindgen.adb: Fix the way the device init and final symbols are
computed, re-using the normal way these symbols would be computed
with a __device_ prefix. Also fix the "is null;" procedures on
the host side, which are not Ada 95; replace them with a procedure
raising an exception, as it should never be called. Remove the
unused function Device_Ada_Final_Link_Name.
Steve Baird [Fri, 14 Oct 2022 00:07:31 +0000 (17:07 -0700)]
ada: Rework CUDA host-side invocation of device-side elaboration code
When the binder is invoked with a "-d_c" switch, add an argument to that
switch which is the library name on the device side; so "-d_c" becomes
"-d_c=some_library_name". This does not affect the case where "-d_c" is
specified as a switch for compilation (as opposed to binding). Use this
new piece of information in the code generated by the binder to invoke
elaboration code on the device side from the host side.
gcc/ada/
* opt.ads: Declare new string pointer variable, CUDA_Device_Library_Name.
Modify comments for existing Boolean variable Enable_CUDA_Device_Expansion.
* switch-b.adb: When "-d_c" switch is encountered, check that the next
character is an '='; use the remaining characters to initialize
Opt.CUDA_Device_Library_Name.
* bindgen.adb: Remove (for now) most support for host-side invocation of
device-side finalization. Make use of the new CUDA_Device_Library_Name
in determining the string used to refer (on the host side) to the
device-side initialization procedure. Declare the placeholder routine
that is named in the CUDA_Execute pragma (and the CUDA_Register_Function
call) as an exported null procedure, rather than as an imported procedure.
It is not clear whether it is really necessary to specify the link-name
for this should-never-be-called subprogram on the host side, but for now it
shouldn't hurt to do so.
Piotr Trojanek [Fri, 14 Oct 2022 18:22:34 +0000 (20:22 +0200)]
ada: Fix detection of external calls to protected objects in instances
Detection of external-vs-internal calls to protected objects relied on
the scope stack. This didn't work when the call appeared in an instance
of a generic unit, because instances are analyzed in a different context
from where they appear.
gcc/ada/
* exp_ch6.adb (Expand_Protected_Subprogram_Call): Examine scope
tree and not the scope stack.
Bob Duff [Thu, 13 Oct 2022 20:51:08 +0000 (16:51 -0400)]
ada: Suppress warnings on derived True/False
GNAT normally warns on "return ...;" if the "..." is known to be True or
False, but not when it is a Boolean literal True or False. This patch
also suppresses the warning when the type is derived from Boolean, and
has convention C or Fortran (and therefore True is represented as
"nonzero").
Without this fix, GNAT would give warnings like "False is always False".
gcc/ada/
* sem_warn.adb
(Check_For_Warnings): Remove unnecessary exception handler.
(Warn_On_Known_Condition): Suppress warning when we detect a True
or False that has been turned into a more complex expression
because True is represented as "nonzero". (Note that the complex
expression will subsequently be constant-folded to a Boolean True
or False). Also simplify to always print "condition is always ..."
instead of special-casing object names. The special case was
unhelpful, and indeed wrong when the expression is a literal.
Recently routine Safe_To_Capture_Value was adapted, so that various data
properties like validity/nullness/values are tracked also for
in-parameters. Now a similar routine Safe_To_Capture_In_Parameter_Value,
which was only used to track data nullness, is redundant, so this patch
deconstructs it.
Also the removed routine had at least a few problems and limitations, for
example:
1) it only worked for functions and procedures, but not for protected
entries and task types (whose discriminants work very much like
in-parameters)
2) it only worked for subprogram bodies with no spec, because of this
dubious check (here simplified):
if Nkind (Parent (Parent (Current_Scope))) /= N_Subprogram_Body then
return False;
3) it only recognized references within short-circuit operators as
certainly evaluated if they were directly their left hand expression,
e.g.:
X.all and then ...
but not when they were certainly evaluated as part of a bigger
expression on the left hand side, e.g.:
(X.all > 0) and then ...
4) it categorized parameters with 'Unrestricted_Access attribute as safe
to capture, which is not necessarily wrong, but risky (because the
object becomes aliased).
Routine Safe_To_Capture_Value, which is kept by this patch, seems to
behave better in all those situations, though it has its own problems as
well and ideally should be further scrutinized.
gcc/ada/
* checks.adb (Safe_To_Capture_In_Parameter_Value): Remove.
* sem_util.adb (Safe_To_Capture_Value): Stop search at the current
body.
Piotr Trojanek [Wed, 7 Sep 2022 15:24:40 +0000 (17:24 +0200)]
ada: Cleanup detection of code within generic instances
To check if a node is located in a generic instance we can either look
at Instantiation_Location or at the Instantiation_Depth, but just
looking at the location is simpler and more efficient.
Cleanup related to improved detection of references to uninitialized
objects; semantics is unaffected.
gcc/ada/
* sem_ch13.adb (Add_Call): Just look at Instantiation_Depth.
* sem_ch3.adb (Derive_Subprograms): Likewise.
* sem_warn.adb (Check_References): Remove redundant filtering with
Instantiation_Depth that follows filtering with
Instantiation_Location.
* sinput.adb (Instantiation_Depth): Reuse Instantiation_Location.
Piotr Trojanek [Mon, 5 Sep 2022 22:24:17 +0000 (00:24 +0200)]
ada: Remove redundant suppression for non-modified IN OUT parameters
Non-modified IN OUT parameters are first collected and then filtered by
examining uses of their enclosing subprograms. In this filtering we
don't need to look again at properties of the formal parameters
themselves.
Cleanup related to improved detection of references to uninitialized
objects; semantics is unaffected.
gcc/ada/
* sem_warn.adb
(No_Warn_On_In_Out): For subprograms we can simply call
Warnings_Off.
(Output_Non_Modified_In_Out_Warnings): Remove repeated
suppression.
Piotr Trojanek [Thu, 14 Oct 2021 15:50:43 +0000 (17:50 +0200)]
ada: Reject boxes in delta array aggregates
Implement Ada 2022 4.3.4(11/5), which rejects box compound delimiter <>
in delta record aggregates, just like another rule rejects it in delta
array aggregates.
gcc/ada/
* sem_aggr.adb (Resolve_Delta_Array_Aggregate): Reject boxes in
delta array aggregates.
Piotr Trojanek [Fri, 25 Sep 2020 08:43:27 +0000 (10:43 +0200)]
ada: Allow reuse of Enclosing_Declaration_Or_Statement by GNATprove
Move routine Enclosing_Declaration_Or_Statement from body of Sem_Res to spec
of Sem_Util, so it can be reused. In particular, GNATprove needs this
functionality to climb from an arbitrary subexpression with target_name (@)
to the enclosing assignment statement. Behaviour of the compiler is
unaffected.
gcc/ada/
* sem_res.adb (Enclosing_Declaration_Or_Statement): Moved to
Sem_Util.
* sem_util.ads (Enclosing_Declaration_Or_Statement): Moved from
Sem_Res.
* sem_util.adb (Enclosing_Declaration_Or_Statement): Likewise.
Piotr Trojanek [Fri, 12 Aug 2022 10:04:35 +0000 (12:04 +0200)]
ada: Clean up unnecessary call in resolution of overloaded expressions
When experimentally enabling frontend inlining by default, the
unnecessary call to Comes_From_Predefined_Lib_Unit in Resolve appears to
be a performance bottleneck (most likely this call is expensive because
it involves a loop over the currently inlined subprograms).
Code cleanup; semantics is unaffected.
gcc/ada/
* sem_res.adb (Resolve): Only call Comes_From_Predefined_Lib_Unit
when its result might be needed.
Piotr Trojanek [Fri, 2 Sep 2022 11:32:27 +0000 (13:32 +0200)]
ada: Create operator nodes in functional style
A recent patch removed two rewritings, where we kept the operator node
but replaced its operands. This patch removes explicit setting of the
operands; instead, the operator is already created together with its
operands, which seems a bit safer and more consistent with how we
typically create operator nodes.
Piotr Trojanek [Fri, 2 Sep 2022 20:42:57 +0000 (22:42 +0200)]
ada: Don't reuse operator nodes in expansion
This patch removes handling of references to unset objects that relied
on Original_Node. This handling was only needed because of rewriting
that reused operator nodes, for example, when an array inequality like:
by keeping the node for operator "<" and only substituting its operands.
It seems safer to simply create a new operator node when rewriting and
not rely on Original_Node afterwards.
Cleanup related to improved detection of uninitialized objects.
gcc/ada/
* checks.adb (Apply_Arithmetic_Overflow_Strict): Rewrite using a
newly created operator node.
* exp_ch4.adb (Expand_Array_Comparison): Likewise.
* exp_ch6.adb (Add_Call_By_Copy_Code): Rewriting actual parameter
using its own location and not the location of the subprogram
call.
* sem_warn.adb (Check_References): Looping with Original_Node is
no longer needed.
Piotr Trojanek [Wed, 7 Sep 2022 13:02:04 +0000 (15:02 +0200)]
ada: Reject misplaced pragma Obsolescent
Pragma Obsolescent appearing before declaration was putting the
Obsolescent flag on the Standard package, which is certainly wrong. The
problem was that we relied on the Find_Lib_Unit_Name routine without
sanitizing the pragma placement with Check_Valid_Library_Unit_Pragma.
Part of cleaning up the warnings machinery to better handle references
to unset objects.
Piotr Trojanek [Wed, 7 Sep 2022 13:01:16 +0000 (15:01 +0200)]
ada: Fix missing tag for with of an obsolescent function
Fix minor inconsistency in tags of warnings about obsolescent entities.
Part of cleaning up the warnings machinery to better handle references
to unset objects.
gcc/ada/
* sem_warn.adb (Output_Obsolescent_Entity_Warnings): Tag warnings
about obsolescent functions just like we tag similar warnings for
packages and procedures.
Piotr Trojanek [Wed, 12 Oct 2022 10:17:34 +0000 (12:17 +0200)]
ada: Remove useless validity suppression for attribute Input
Attributes 'Input and 'Read are similar, but only 'Read denotes a
subprogram with parameter of mode OUT where operand validity checks need
to be suppressed.
Cleanup related to fix for attributes 'Has_Same_Storage and
'Overlaps_Storage.
gcc/ada/
* exp_attr.adb (Expand_N_Attribute_Reference): Remove useless
skipping for attribute Input.
Kewen Lin [Mon, 7 Nov 2022 08:07:27 +0000 (02:07 -0600)]
vect: Fold LEN_{LOAD,STORE} if it's for the whole vector [PR107412]
As the test case in PR107412 shows, we can fold IFN .LEN_{LOAD,
STORE} into normal vector load/store if the given length is known
to be equal to the length of the whole vector. This should reduce
overall cycles, as a vector access with length in bytes normally has
higher latency than a normal vector access, and it also saves the
setup of the length when a constant length cannot be encoded into
the instruction (such as on power).
PR tree-optimization/107412
gcc/ChangeLog:
* gimple-fold.cc (gimple_fold_mask_load_store_mem_ref): Rename to ...
(gimple_fold_partial_load_store_mem_ref): ... this, add one parameter
mask_p indicating it's for mask or length, and add some handlings for
IFN LEN_{LOAD,STORE}.
(gimple_fold_mask_load): Rename to ...
(gimple_fold_partial_load): ... this, add one parameter mask_p.
(gimple_fold_mask_store): Rename to ...
(gimple_fold_partial_store): ... this, add one parameter mask_p.
(gimple_fold_call): Add the handlings for IFN LEN_{LOAD,STORE},
and adjust calls on gimple_fold_mask_load_store_mem_ref to
gimple_fold_partial_load_store_mem_ref.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr107412.c: New test.
* gcc.target/powerpc/p9-vec-length-epil-8.c: Adjust scan times for
folded LEN_LOAD.
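The folding condition can be modelled on byte lists. This is a simplified sketch and not the gimple-fold.cc implementation: len_load, full_load, can_fold and VECTOR_BYTES are invented names, the bytes beyond the given length are modelled as None, and any target-specific operands are ignored.

```python
VECTOR_BYTES = 16  # assumed vector size for the example

def full_load(memory, offset):
    """Ordinary vector load: read the whole vector."""
    return memory[offset:offset + VECTOR_BYTES]

def len_load(memory, offset, length):
    """Length-controlled load: only the first `length` bytes are read."""
    active = memory[offset:offset + length]
    return active + [None] * (VECTOR_BYTES - length)

def can_fold(length):
    """A plain vector load suffices when the length covers the whole vector."""
    return length == VECTOR_BYTES

memory = list(range(64))
print(can_fold(16), len_load(memory, 8, 16) == full_load(memory, 8))
print(can_fold(5), len_load(memory, 8, 5) == full_load(memory, 8))
```

Only the first case is a candidate for folding; with a shorter length the two loads differ in the bytes past the length, so the length-controlled form has to stay.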
Patrick Palka [Sun, 6 Nov 2022 16:30:47 +0000 (11:30 -0500)]
libstdc++: Declare const global variables inline
The changes inside the regex_constants and execution namespaces seem to
be (the only) unimplemented parts of P0607R0 "Inline Variables for the
Standard Library"; the rest of the changes are to implementation details.
Patrick Palka [Sun, 6 Nov 2022 16:16:00 +0000 (11:16 -0500)]
libstdc++: Move stream initialization into compiled library [PR44952]
This patch moves the static object for constructing the standard streams
out from <iostream> and into the compiled library on systems that support
init priorities. This'll mean <iostream> no longer introduces a separate
global constructor in each TU that includes it.
We can do this only if the init_priority attribute is supported because
we need a way to ensure the stream initialization runs before any
user global initializer, particularly when linking with a static
libstdc++.a.
* include/std/iostream (__ioinit): No longer define here if
the init_priority attribute is usable.
* src/c++98/ios_init.cc (__ioinit): Define here instead if
init_priority is usable, via ...
* src/c++98/ios_base_init.h: ... this new file.