Jakub Jelinek [Thu, 5 Jun 2025 13:47:19 +0000 (15:47 +0200)]
real: Fix up real_from_integer [PR120547]
The function has 2 problems, one is _BitInt specific and the other is
most likely also reproducible only with it.
The first issue is that I've missed updating the function for _BitInt,
maxbitlen as MAX_BITSIZE_MODE_ANY_INT + HOST_BITS_PER_WIDE_INT
obviously isn't guaranteed to be larger than any integral type we might
want to convert at compile time from wide_int to REAL_VALUE_FORMAT.
Just using len instead of it works fine, at least when used after
HOST_BITS_PER_WIDE_INT is added to it and it is truncated to multiples
of HOST_BITS_PER_WIDE_INT.
The other bug is that if the value has too many significant bits (formerly
maxbitlen - cnt_l_z, now len - cnt_l_z), the code just shifts it right and
adds the shift count to the future exponent. That isn't correct for
rounding, as the testcase attempts to show: the internal real format has more
bits than any precision in a supported format, but we still need to
distinguish between values exactly half way between representable floating
point values (those should be rounded to even) and the case when we've
shifted away some non-zero bits, so the value was a tiny bit larger than half
way and then we should round up.
The patch does what e.g. soft-fp does in these cases: a right shift
with a sticky bit in the least significant bit.
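A minimal sketch of the sticky-bit idea on a plain 64-bit value, not the
real.cc code itself. The names shift_right_sticky and round_nearest_even are
hypothetical and only illustrate why the sticky bit keeps "exactly half way"
and "slightly above half way" distinct:

```c
#include <assert.h>
#include <stdint.h>

/* Shift VALUE right by COUNT bits (0 < COUNT < 64) and OR a "sticky" 1
   into the least significant bit if any shifted-away bit was non-zero.  */
uint64_t
shift_right_sticky (uint64_t value, unsigned count)
{
  assert (count > 0 && count < 64);
  uint64_t lost = value & ((UINT64_C (1) << count) - 1);
  return (value >> count) | (lost != 0);
}

/* Drop the low EXTRA bits of VALUE, rounding to nearest, ties to even.  */
uint64_t
round_nearest_even (uint64_t value, unsigned extra)
{
  assert (extra > 0 && extra < 64);
  uint64_t half = UINT64_C (1) << (extra - 1);
  uint64_t rem = value & ((UINT64_C (1) << extra) - 1);
  uint64_t q = value >> extra;
  /* Round up above half way, or exactly at half way when Q is odd.  */
  if (rem > half || (rem == half && (q & 1)))
    q++;
  return q;
}
```

With 0x281 (2.50390625 in units of 256), a plain shift by 4 leaves 0x28, which
looks like an exact tie and rounds down to 2; the sticky shift leaves 0x29,
which rounds up to 3 as it should.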
2025-06-05 Jakub Jelinek <jakub@redhat.com>
PR middle-end/120547
* real.cc (real_from_integer): Remove maxbitlen variable, use
len instead of that. When shifting right, or in 1 if any of the
shifted away bits are non-zero. Formatting fix.
Jonathan Wakely [Wed, 4 Jun 2025 17:22:28 +0000 (18:22 +0100)]
libstdc++: Fix std::format thousands separators when sign present [PR120548]
The leading sign character should be skipped when deciding whether to
insert thousands separators into a floating-point format.
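A rough sketch of the idea in C, not the libstdc++ implementation. The helper
group_thousands is hypothetical, and the fixed group size of three and the
caller-supplied separator stand in for the locale's grouping rules:

```c
#include <stddef.h>
#include <string.h>

/* Copy the numeric string IN to OUT (of size OUTSZ), inserting SEP between
   groups of three digits of the integer part.  Returns OUT, or NULL if
   OUT is too small.  */
char *
group_thousands (const char *in, char sep, char *out, size_t outsz)
{
  size_t o = 0;

  /* Pass a leading sign through first, so it is not counted as a digit.  */
  if (*in == '+' || *in == '-' || *in == ' ')
    {
      if (outsz < 2)
        return NULL;
      out[o++] = *in++;
    }

  size_t ndigits = strspn (in, "0123456789");
  for (size_t i = 0; i < ndigits; i++)
    {
      /* A separator goes before each digit that starts a full group.  */
      if (i != 0 && (ndigits - i) % 3 == 0)
        {
          if (o + 1 >= outsz)
            return NULL;
          out[o++] = sep;
        }
      if (o + 1 >= outsz)
        return NULL;
      out[o++] = in[i];
    }

  /* Copy the rest (decimal point, fraction, exponent) unchanged.  */
  size_t rest = strlen (in + ndigits);
  if (o + rest >= outsz)
    return NULL;
  memcpy (out + o, in + ndigits, rest + 1);
  return out;
}
```

If the sign were counted as part of the digit string, "-123.45" would be
grouped as if it had four digits and come out as "-,123.45".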
libstdc++-v3/ChangeLog:
PR libstdc++/120548
* include/std/format (__formatter_fp::_M_localize): Do not
include a leading sign character in the string to be grouped.
* testsuite/std/format/functions/format.cc: Check grouping when
sign is present in the output.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
Jan Hubicka [Thu, 5 Jun 2025 13:24:36 +0000 (15:24 +0200)]
Fix handling of GUESSED_LOCAL in auto-fdo and preserve more static profile
This patch fixes ICE where GUESSED_LOCAL was kept in autofdo profile.
It may make more sense to turn GUESSED_LOCAL 0 into GUESSED 0, since it seems
a bit more informative than autofdo 0 (which really means that the count is below
the 2% threshold or that info was lost due to some code transformation).
The patch also modifies code setting probabilities of edge to keep reliable
predictions of 0 or 1.
gcc/ChangeLog:
* auto-profile.cc (update_count_by_afdo_count): Fix handling
of GUESSED_LOCAL.
(afdo_calculate_branch_prob): Preserve static profile for
probabilities 0 and 1.
Pan Li [Thu, 5 Jun 2025 03:04:33 +0000 (11:04 +0800)]
RISC-V: Fix ICE for gcc.dg/graphite/pr33576.c with rv32gcv
RVV has no insn of the form v2 = div (vec_dup (x), v1), so generated
RTL of that shape hits the unreachable assert when expanding the
insn. This patch removes op div from the binary op form
(vec_dup (x), v) to avoid matching the pattern by mistake.
No new test is introduced, as pr33576.c already covers this.
The following test suite passed for this patch series:
* The rv64gcv full regression test.
gcc/ChangeLog:
* config/riscv/autovec-opt.md: Leverage vdup_v and v_vdup
binary op for different patterns.
* config/riscv/vector-iterators.md: Add vdup_v and v_vdup
binary op iterators.
Jeff Law [Thu, 5 Jun 2025 12:17:25 +0000 (06:17 -0600)]
[RISC-V] Improve sequences to generate -1, 1 in some cases.
This patch has a minor improvement to if-converted sequences based on
observations I found while evaluating another patch from Shreya to handle more
cases with zicond insns.
Specifically there is a smaller/faster way than zicond to generate a -1,1
result when the condition is testing the sign bit.
So let's consider these two tests (rv64):
long foo1 (long c, long a) { return c >= 0 ? 1 : -1; }
long foo2 (long c, long a) { return c < 0 ? -1 : 1; }
So if we right arithmetic shift c by 63 bits, that splats the sign bit across a
register giving us 0, -1 for the first test and -1, 0 for the second test. We
then unconditionally turn on the LSB resulting in 1, -1 for the first case and
-1, 1 for the second.
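The two steps can be written out in C as a sketch. The function name
sign_select is hypothetical, and the code assumes that >> on a negative
signed value is an arithmetic shift, which ISO C leaves implementation-defined
(GCC documents it as sign-extending):

```c
#include <limits.h>

/* Branch-free equivalent of  c >= 0 ? 1 : -1.  */
long
sign_select (long c)
{
  /* Splat the sign bit across the register: 0 for c >= 0, -1 for c < 0.  */
  long splat = c >> (sizeof (long) * CHAR_BIT - 1);
  /* Unconditionally turn on the LSB: 0 becomes 1, -1 stays -1.  */
  return splat | 1;
}
```

On rv64 this corresponds to one arithmetic shift by 63 followed by one OR with
an immediate.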
This is implemented as a 4->2 splitter. There's another pair of cases we don't
handle because we don't have 4->3 splitters. Specifically if the true/false
values are reversed in the above examples without reversing the condition.
Raphael is playing a bit in the gimple space to see what opportunities might
exist to recognize more idioms in phiopt and generate better code earlier. No
idea how that's likely to pan out.
This is a pretty consistent small win. It's been through the rounds in my
tester. Just waiting on a green light from pre-commit testing.
gcc/
* config/riscv/zicond.md: Add new splitters to select
1, -1 or -1, 1 based on a sign bit test.
gcc/testsuite/
* gcc.target/riscv/nozicond-1.c: New test.
* gcc.target/riscv/nozicond-2.c: New test.
Jiawei [Thu, 5 Jun 2025 05:59:14 +0000 (13:59 +0800)]
RISC-V: Support Ssu64xl extension.
Support the Ssu64xl extension, which requires UXLEN to be 64.
gcc/ChangeLog:
* config/riscv/riscv-ext.def: New extension definition.
* config/riscv/riscv-ext.opt: New extension mask.
* doc/riscv-ext.texi: Document the new extension.
Jiawei [Thu, 5 Jun 2025 05:52:08 +0000 (13:52 +0800)]
RISC-V: Support Sstvecd extension.
Support the Sstvecd extension, which allows Supervisor Trap Vector
Base Address register (stvec) to support Direct mode.
gcc/ChangeLog:
* config/riscv/riscv-ext.def: New extension definition.
* config/riscv/riscv-ext.opt: New extension mask.
* doc/riscv-ext.texi: Document the new extension.
Jiawei [Thu, 5 Jun 2025 05:46:39 +0000 (13:46 +0800)]
RISC-V: Support Sstvala extension.
Support the Sstvala extension, which provides all needed values in
Supervisor Trap Value register (stval).
gcc/ChangeLog:
* config/riscv/riscv-ext.def: New extension definition.
* config/riscv/riscv-ext.opt: New extension mask.
* doc/riscv-ext.texi: Document the new extension.
Jiawei [Thu, 5 Jun 2025 05:33:21 +0000 (13:33 +0800)]
RISC-V: Support Sscounterenw extension.
Support the Sscounterenw extension, which allows writeable enables for any
supported counter.
gcc/ChangeLog:
* config/riscv/riscv-ext.def: New extension definition.
* config/riscv/riscv-ext.opt: New extension mask.
* doc/riscv-ext.texi: Document the new extension.
Jiawei [Thu, 5 Jun 2025 05:15:02 +0000 (13:15 +0800)]
RISC-V: Support Ssccptr extension.
Support the Ssccptr extension, which allows the main memory to support
page table reads.
gcc/ChangeLog:
* config/riscv/riscv-ext.def: New extension definition.
* config/riscv/riscv-ext.opt: New extension mask.
* doc/riscv-ext.texi: Document the new extension.
Jiawei [Thu, 5 Jun 2025 03:24:43 +0000 (11:24 +0800)]
RISC-V: Support Smrnmi extension.
Support the Smrnmi extension, which provides new CSRs
for Machine mode Non-Maskable Interrupts.
gcc/ChangeLog:
* config/riscv/riscv-ext.def: New extension definition.
* config/riscv/riscv-ext.opt: New extension mask.
* doc/riscv-ext.texi: Document the new extension.
Jiawei [Thu, 5 Jun 2025 02:16:19 +0000 (10:16 +0800)]
RISC-V: Support Sm/scsrind extensions.
Support the Sm/scsrind extensions, which provide indirect access to
machine-level CSRs.
gcc/ChangeLog:
* config/riscv/riscv-ext.def: New extension definition.
* config/riscv/riscv-ext.opt: New extension mask.
* doc/riscv-ext.texi: Document the new extension.
What happens is that symtab_remove_unreachable_nodes leaves the last symbol
in kind of a limbo state: in .remove_symbols, we have:
opt7_pkg__enum_name_table/13 (Opt7_Pkg.Enum_Name_Table)
Type: variable
Body removed by symtab_remove_unreachable_nodes
Visibility: externally_visible semantic_interposition external public
References:
Referring: opt7_pkg__image/2 (read) opt7_pkg__image/2 (read)
Availability: not_available
Varpool flags: initialized read-only const-value-known
This means that the "body" (DECL_INITIAL) of the symbol has been disregarded
during reachability analysis, causing the first two symbols to be discarded:
but the DECL_INITIAL is explicitly preserved for later constant folding,
which makes it possible to retrofit the DECLs corresponding to the first
two symbols in the GIMPLE IR and ultimately leads to the crash.
gcc/
* tree-vect-data-refs.cc (vect_can_force_dr_alignment_p): Return
false if the variable has no symtab node.
gcc/testsuite/
* gnat.dg/specs/opt7.ads: New test.
* gnat.dg/specs/opt7_pkg.ads: New helper.
* gnat.dg/specs/opt7_pkg.adb: Likewise.
Spencer Abson [Tue, 3 Jun 2025 12:15:12 +0000 (12:15 +0000)]
middle-end: Fix operation_could_trap_p for FIX_TRUNC expressions
Floating-point to integer conversions can be inexact or invalid (e.g., due to
overflow or NaN). However, since users of operation_could_trap_p infer the
bool FP_OPERATION argument from the expression's type, the FIX_TRUNC family
are considered non-trapping here.
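As an illustration of which inputs make such a conversion invalid or inexact,
here is a sketch of a guarded conversion. The helper fix_trunc_or is
hypothetical and is not part of GCC; it assumes int is narrow enough that
INT_MIN - 1 and INT_MAX + 1 are exactly representable as double (true for
32-bit int):

```c
#include <limits.h>

/* Convert X to int by truncation, or return FALLBACK when the conversion
   would be invalid (NaN, or truncated value out of range for int).  */
int
fix_trunc_or (double x, int fallback)
{
  if (x != x)                 /* NaN: the conversion is invalid.  */
    return fallback;
  if (x <= (double) INT_MIN - 1.0 || x >= (double) INT_MAX + 1.0)
    return fallback;          /* Overflow: the conversion is invalid.  */
  return (int) x;             /* In range; may still be inexact.  */
}
```

An unguarded (int) x on the rejected inputs is where the floating-point
exception would be raised, which is why the operation cannot be treated as
non-trapping.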
* gcc.target/aarch64/sve/pr96357.c: Change to avoid producing
a conditional FIX_TRUNC_EXPR, whilst still reproducing the bug
in PR96357.
* gcc.dg/tree-ssa/ifcvt-fix-trunc-1.c: New test.
* gcc.dg/tree-ssa/ifcvt-fix-trunc-2.c: Likewise.
Tomasz Kamiński [Mon, 2 Jun 2025 07:06:56 +0000 (09:06 +0200)]
libstdc++: Fix formatting of 3-digits months,day,weekday and hour [PR120481]
This patch fixes the handling of multiple-digit values for the month, day, weekday
and hour, when used with the %m, %d, %e, %m, %u, %w, %H, and %D, %F specifiers.
The values are now printed unmodified. This patch also fixes printing a negative
year with %F, where the value was not padded to four digits.
Furthermore, %I and %p are adjusted to handle input with hour values set to
over 24 hours. In that case the value is interpreted modulo 24. This was already
the case for %r (locale's 12-hour clock), as we convert the input into seconds.
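A small sketch of the modulo-24 interpretation for %I and %p, in C rather than
the chrono_io.h code. The names hour_for_I and marker_for_p are hypothetical,
the hour count is assumed non-negative, and "AM"/"PM" stand in for the
locale's markers:

```c
/* Hour printed by %I: reduce modulo 24, then map to the 12-hour clock.  */
unsigned
hour_for_I (unsigned long hours)
{
  unsigned h = (unsigned) (hours % 24) % 12;
  return h == 0 ? 12 : h;
}

/* Marker printed by %p, decided on the hour reduced modulo 24.  */
const char *
marker_for_p (unsigned long hours)
{
  return hours % 24 < 12 ? "AM" : "PM";
}
```

So an input of 25 hours is treated like 01 AM, and 36 hours like 12 PM.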
In the case of %u and %w we print values unchanged, which makes the behavior of these
specifiers equivalent to printing the iso_encoding and c_encoding respectively.
As constructing a weekday from the value 7 initializes it with 0, the !ok() weekday
values are always greater than or equal to eight, so they are clearly distinguishable.
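The weekday encodings referred to here can be modelled in a few lines of C.
This is a sketch of the std::chrono::weekday rules, with hypothetical
free-function names operating on the stored value:

```c
#include <stdbool.h>

/* Value stored after construction from WD: 7 is normalized to 0 (Sunday).  */
unsigned
weekday_from (unsigned wd)
{
  return wd == 7 ? 0 : wd;
}

/* Printed by %w: Sunday is 0.  */
unsigned
c_encoding (unsigned stored)
{
  return stored;
}

/* Printed by %u: Sunday is 7.  */
unsigned
iso_encoding (unsigned stored)
{
  return stored == 0 ? 7 : stored;
}

/* Valid weekdays are stored as 0 to 6.  */
bool
weekday_ok (unsigned stored)
{
  return stored <= 6;
}
```

Since 7 never survives construction, any printed value of 8 or more can only
come from a weekday that is not ok().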
For the month, weekday and day values, which can have at most 3 decimal digits
(range [0, 255]), we use the new _S_str_d1 and _S_str_d2 functions, which return a
string_view containing the textual representation, without padding or padded to two
digits. These functions accept a 3-character buffer that is used for 3-digit numbers.
In other cases, we return the _S_digit and _S_two_digits results directly. The former
is changed to return string_view to facilitate this.
For %F and %D, when at least one component has more digits than expected (2 for
month and day, 4 for year), we produce the output using format_to with an appropriate
format string. Otherwise the representation is produced in a local char buffer.
To simply fill this buffer, the _S_fill_two_digits function was added. We also
make sure that the minus is not included in the year width for %F.
The handling of %C, %Y, %y was adjusted to use a similar pattern for years with
more than two digits. To support that, the order of characters in _S_chars was
adjusted so it contains the "-{}" string.
For %H, we print values of 3 or more digits using format_to. The handling
of large hour values in %T and %R was changed so that they are printed using format_to;
otherwise we use the same stack buffer as for minutes to print them.
PR libstdc++/120481
libstdc++-v3/ChangeLog:
* include/bits/chrono_io.h (__format::_S_chars): Reorder so it
contains "-{}".
(__format::_S_colon, __format::_S_slash, __format::_S_space)
(__format::_S_plus_minus): Updated starting indices.
(__format::_S_minus_empty_spec): Define.
(__formatter_chrono::_M_C_y_Y, __formatter_chrono::_M_R_T):
Rework implementation.
(__formatter_chrono::_M_d_e, __formatter_chrono::_M_F)
(__formatter_chrono::_M_m, __formatter_chrono::_M_u_w)
(__formatter_chrono::_M_H_I, __formatter_chrono::_M_p):
Handle multi digits values.
(__formatter_chrono::_S_digit): Return string view.
(__formatter_chrono::_S_str_d1, __formatter_chrono::_S_str_d2)
(__formatter_chrono::_S_fill_two_digits): Define.
* testsuite/std/time/format/empty_spec.cc: Update test for
year_month_day, that uses '%F'.
* testsuite/std/time/format/pr120481.cc: New test.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Tobias Burnus [Thu, 5 Jun 2025 08:36:21 +0000 (10:36 +0200)]
gcn: Update --with-arch= for newer archs
Replace hard-coded list of supported devices by directly checking
config/gcn/gcn-devices.def.
gcc/ChangeLog:
* config.gcc (--with-{arch,tune}): Use .def file to validate gcn
processor names.
* doc/install.texi (amdgcn*-*-*): Update list of devices supported
by --with-arch/--with-tune.
squirek [Mon, 13 Jan 2025 21:04:51 +0000 (21:04 +0000)]
ada: Confusing "modified by call, but value overwritten" warning
The patch fixes an issue in the compiler whereby not referencing a local
variable used in multiple procedure calls as an "out" actual in between
calls would lead to a warning despite "-gnatw.o" not being present.
Additionally, this meant that using pragma Unreferenced on such variables
would not be able to silence such warnings.
gcc/ada/ChangeLog:
* sem_warn.adb
(Warn_On_Useless_Assignment): Disable out value "overwritten" warning
when we are not warning on unread out parameters (e.g. "-gnatw.o").
The documentation comment under SFN_Patterns was misleading. This patch
fixes it and also fixes Get_Default_File_Name which assumed the comment
was correct.
gcc/ada/ChangeLog:
* fname-uf.adb: Fix documentation comment.
(Get_Default_File_Name): Fix indices of default patterns.
ada: Tweak wording of documentation comments in Atree
This patch removes an outdated reference to the concept of node
extensions in comments. It also slightly clarifies the documentation of
Atree.Relocate_Node.
Javier Miranda [Thu, 6 Feb 2025 09:40:57 +0000 (09:40 +0000)]
ada: Spurious compilation error with repeated loop index
When multiple for-loop statements in the same scope use the
same index name to iterate through container elements, the
compiler reports a spurious error indicating a conflict
between index names.
gcc/ada/ChangeLog:
* exp_ch7.adb (Process_Object_Declaration): Avoid generating
duplicate names for master nodes.
If the body of a loop includes a raise statement then the loop should not be
considered to be free of side-effects and therefore eligible for elimination
by the compiler.
gcc/ada/ChangeLog:
* sem_util.adb
(Side_Effect_Free_Statements): Return False if the statement list
includes an explicit (i.e. Comes_From_Source) raise statement.
The generation of the check mandated by Ada issue AI05-0073 was not
handled properly for protected types when used through subtypes. This
patch fixes the issue.
gcc/ada/ChangeLog:
* exp_ch4.adb (Tagged_Membership): Fix for protected types.
Bob Duff [Tue, 4 Feb 2025 19:36:03 +0000 (14:36 -0500)]
ada: Improve efficiency of very large shift counts
For a call to an intrinsic shift function with a large Amount, for
example Shift_Right(..., Amount => Natural'Last), and a
compile-time-known value, the compiler would take an absurdly long time
to compute the value. This patch fixes that by special-casing shift
counts that are larger than the size of the thing being shifted.
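The special case can be sketched in C for an 8-bit value. This is only an
illustration of the clamping, not the sem_eval.adb code; the name
fold_shift_right_u8 is hypothetical:

```c
#include <stdint.h>

/* Fold a right shift of an 8-bit VALUE by AMOUNT as a division by
   2**AMOUNT, clamping AMOUNT to the size in bits first.  */
uint8_t
fold_shift_right_u8 (uint8_t value, unsigned long amount)
{
  const unsigned long size_in_bits = 8;

  /* Any larger count gives the same result as shifting by the size.  */
  if (amount > size_in_bits)
    amount = size_in_bits;

  /* 2**amount is now at most 256, so it is cheap to compute.  */
  unsigned divisor = 1u << amount;
  return (uint8_t) (value / divisor);
}
```

Without the clamp, a count such as 1_000_001 would require materializing
2**1_000_001 before dividing.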
gcc/ada/ChangeLog:
* sem_eval.adb (Fold_Shift): If the Amount parameter is greater
than the size in bits, use the size. For example, if we are
shifting an Unsigned_8 value, then Amount => 1_000_001 gives the
same result as Amount => 8. This change avoids computing the value
of 2**1_000_000, which takes too long and uses too much memory.
Note that the computation we're talking about is a compile-time
computation. Minor cleanup. DRY.
* sem_eval.ads (Fold_Str, Fold_Uint, Fold_Ureal): Fold the
comments into one comment, because DRY. Remove useless
verbiage.
The way we fetch the path to shared objects for traceback generation is
not perfectly precise. This patch adds a sanity check to mitigate the
consequences of incorrect shared object paths. It's motivated by a real
world failure in a GNATSAS test.
Eric Botcazou [Fri, 24 Jan 2025 09:26:13 +0000 (10:26 +0100)]
ada: Implement built-in-place expansion of two-pass array aggregates
These are array aggregates containing only component associations that are
iterated with iterator specifications, as per RM 4.3.3(20.2/5-20.4/5).
It is implemented for the array aggregates that are used to initialize an
object, as specified by RM 7.6(17.2/3-17.3/3) for immutably limited types
and types that need finalization, but for all types like other aggregates.
gcc/ada/ChangeLog:
* exp_aggr.adb (Build_Two_Pass_Aggr_Code): New function containing
most of the code initially present in Two_Pass_Aggregate_Expansion.
(Two_Pass_Aggregate_Expansion): Remove redundant N parameter.
Implement built-in-place expansion for (static) object declarations
and allocators, using Build_Two_Pass_Aggr_Code for the main work.
(Expand_Array_Aggregate): Adjust Two_Pass_Aggregate_Expansion call.
Replace Etype (N) by Typ in a couple of places.
* exp_ch3.adb (Expand_Freeze_Array_Type): Remove special case for
two-pass array aggregates.
(Expand_N_Object_Declaration): Do not adjust the object when it is
initialized by a two-pass array aggregate.
* exp_ch4.adb (Expand_Allocator_Expression): Apply the processing
used for container aggregates to two-pass array aggregates.
* exp_ch6.adb (Validate_Subprogram_Calls): Skip calls present in
initialization expressions of N_Object_Declaration nodes that have
No_Initialization set.
* sem_ch3.adb (Analyze_Object_Declaration): Detect the cases of an
array originally initialized by an aggregate consistently.
Viljar Indus [Mon, 20 Jan 2025 13:10:22 +0000 (15:10 +0200)]
ada: Reject Valid_Value arguments originating from Standard
The constraint for Valid_Value not applying to types from Standard
should also apply to all types derived from those types.
gcc/ada/ChangeLog:
* doc/gnat_rm/implementation_defined_attributes.rst: Update the
documentation for Valid_Value.
* sem_attr.adb (Analyze_Attribute): Reject types where
the root type originates from Standard.
* gnat_rm.texi: Regenerate.
* gnat_ugn.texi: Regenerate.
Gary Dismukes [Sat, 18 Jan 2025 01:11:12 +0000 (01:11 +0000)]
ada: Error about assignment to limited target on aggregate with "for of" iterator
The compiler reports a spurious error about an assignment to a limited
object on an aggregate of an array type with limited components that has
an association with a "for of" iterator. This is fixed by arranging to
have the Assignment_OK flag set on the indexed_names generated by the
expander for initializing the aggregate object.
gcc/ada/ChangeLog:
* exp_aggr.adb (Two_Pass_Aggregate_Expansion): Change call to Make_Assignment
for the indexed aggregate object to call Change_Make_OK_Assignment instead.
Steve Baird [Tue, 14 Jan 2025 23:53:57 +0000 (15:53 -0800)]
ada: Add error message for a declared-too-late abstract state constituent
In the error case of an undefined abstract state constituent, we want to
help users distinguish between the case where the constituent is
"really" undefined versus being defined "too late" (i.e., after a body).
So in the latter case we generate an additional message.
gcc/ada/ChangeLog:
* sem_prag.adb
(Analyze_Constituent): In the specific case of a defined-too-late
abstract state constituent, generate an additional error message.
Viljar Indus [Mon, 20 Jan 2025 18:04:59 +0000 (20:04 +0200)]
ada: Fix various issues in the SARIF report
gcc/ada/ChangeLog:
* diagnostics-sarif_emitter.adb (Print_Invocations): fix
commandLine and executionSuccessful nodes.
Fix typo in the name for startLine.
* osint.adb (Modified Get_Current_Dir) Fix generation of
the current directory.
(Relative_Path): Avoid relative paths starting with a
path separator.
* osint.ads: Update the documentation for Relative_Path.
ada: Fix unnecessarily large allocation in New_String
This patch fixes an issue where Interfaces.C.Strings.New_String
allocates more memory than necessary when passed a string that contains
a NUL character.
gcc/ada/ChangeLog:
* libgnat/i-cstrin.adb (New_String): Fix size of allocation.
squirek [Fri, 17 Jan 2025 15:38:43 +0000 (15:38 +0000)]
ada: Implement use implies with experimental extension
The patch implements the experimental feature to allow use package
clauses within the context area to imply with.
gcc/ada/ChangeLog:
* sem_ch8.adb (Analyze_Package_Name): Add code to expand use
clauses such that they have an implicit with associated with them
when extensions are enabled.
* sem_ch10.ads (Analyze_With_Clause): New.
* sem_ch10.adb (Analyze_With_Clause): Add comes from source check
for warning.
(Expand_With_Clause): Moved to the spec.
* sem_util.adb, sem_util.ads
(Is_In_Context_Clause): Moved from sem_prag.
* sem_prag.adb (Analyze_Pragma): Update calls to
Is_In_Context_Clause.
(Is_In_Context_Clause): Moved to sem_util.
Piotr Trojanek [Thu, 16 Jan 2025 16:41:56 +0000 (17:41 +0100)]
ada: Extend and clarify documentation of stack size settings for Windows
The original documentation for more recent versions of Windows didn't specify
whether the specified stack size acts as a "reserved" or "committed" stack
size.
Also, clarify the wording for older versions of Windows.
squirek [Thu, 16 Jan 2025 17:09:49 +0000 (17:09 +0000)]
ada: Spurious accessibility error with -gnatc
The patch fixes an issue in the compiler whereby a spurious accessibility
error gets generated in semantic checking mode (-gnatc) when an explicitly
aliased formal gets used as an actual for an access discriminant in a return
object.
gcc/ada/ChangeLog:
* accessibility.adb (Check_Return_Construct_Accessibility):
Disable check generation when we are only checking semantics.
* opt.ads: Add new flag for -gnatc mode
* switch-c.adb (Scan_Front_End_Switches): Set flag for -gnatc mode
Viljar Indus [Thu, 9 Jan 2025 10:37:56 +0000 (12:37 +0200)]
ada: Mark the types of operator arguments as used
When a use type clause is used then it makes the type and
all of its operators use visible in the context. When analyzing
whether a use type clause is effective we should additionally
mark the types of an overloaded operator as cases where the
use type clause is effective.
gcc/ada/ChangeLog:
* sem_ch8.adb (Mark_Use_Type): Additionally mark the types
of the parameters and return values as used when analyzing an
operator.
Eric Botcazou [Thu, 16 Jan 2025 14:51:00 +0000 (15:51 +0100)]
ada: Fix couple of remaining incompatibilities with CHERI architecture
These are the usual problematic patterns in the expanded code.
gcc/ada/ChangeLog:
* exp_ch9.adb (Build_Dispatching_Requeue): Take 'Tag of the
concurrent object instead of doing an unchecked conversion.
* exp_pakd.adb (Expand_Packed_Address_Reference): Perform address
arithmetic using an operator of System.Storage_Elements.
Eric Botcazou [Wed, 15 Jan 2025 19:37:48 +0000 (20:37 +0100)]
ada: Fix buffer overflow for function call returning discriminated limited record
This occurs when the discriminated limited record type is declared with
default values for its discriminants, is not controlled, and the context
of the call is anonymous, i.e. the result of the call is not assigned
to an object. In this case, a temporary is created to hold the result
of the call, with the default values of the discriminants, but the result
may have different values for the discriminants and, in particular, may
be larger than the temporary, which leads to a buffer overflow.
This problem does not occur when the context is an object declaration, so
the fix just makes sure that the expansion in an anonymous context always
uses the model of an object declaration. It requires a minor tweak to the
helper function Entity_Of of the Sem_Util package.
gcc/ada/ChangeLog:
* exp_ch6.adb (Expand_Actuals): Remove obsolete comment.
(Make_Build_In_Place_Call_In_Anonymous_Context): Always use a proper
object declaration initialized with the function call in the cases
where a temporary is needed, with Assignment_OK set on it.
* sem_util.adb (Entity_Of): Deal with rewritten function call first.
This patch fixes an integer underflow issue on calls of the form
New_Char_Array (X) with X'Last < X'First - 2. That integer underflow
caused attempts at allocating impossibly large amount of memory in some
cases.
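A generic C sketch of this class of bug, not the i-cstrin.adb code. The
functions naive_length and safe_length are hypothetical, and the bounds are
modelled as signed longs converted to size_t:

```c
#include <stddef.h>

/* Element count computed directly from the bounds.  For an empty range
   with LAST < FIRST - 1 the negative result wraps to a huge size_t.  */
size_t
naive_length (long first, long last)
{
  return (size_t) (last - first + 1);
}

/* Same computation, but any range with LAST < FIRST counts as empty.  */
size_t
safe_length (long first, long last)
{
  return last < first ? 0 : (size_t) (last - first) + 1;
}
```

Passing the naive result to an allocator is what turns an empty array into a
request for an impossibly large amount of memory.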
gcc/ada/ChangeLog:
* libgnat/i-cstrin.adb (Position_Of_Nul): Change specification and
adjust body accordingly.
(New_Char_Array): Fix size of allocation.
(To_Chars_Ptr): Adapt to Position_Of_Nul change.
This patch adds a missing "-quiet" switch to the compiler invocations
performed by generated oracles. Without that switch, log lines could be
present before bug boxes for crashes in gigi and that caused the crash
detection logic to fail.
ada: Fix Generate_Minimal_Reproducer on instantiations
Before this patch, the code that creates a copy of the semantic closure
with the default naming convention was incorrect when the compiler was
processing a library unit that was an instantiation of a generic with a
body. This patch adds code to detect that situation and adjusts the
copying process accordingly.
gcc/ada/ChangeLog:
* generate_minimal_reproducer.adb (Generate_Minimal_Reproducer):
Fix when main library item is an instantiation.
Steve Baird [Mon, 13 Jan 2025 22:18:26 +0000 (14:18 -0800)]
ada: Fix compile-time failure due to duplicated attribute subprograms.
For a given type, and for certain attributes (the 4 streaming attributes
and, for Ada2022, the Put_Image attribute), the compiler needs to keep track
of whether a subprogram has already been generated for the given
type/attribute pair. In some cases this was being done incorrectly;
the compiler ended up generating duplicate subprograms (with the same
name), resulting in compilation failures. This could occur if the prefix
of an attribute reference denoted a subtype (more precisely, a non-first
subtype). This includes the case of a subtype declaration that is implicitly
introduced by the compiler to capture the binding between a formal type
in a generic and the corresponding actual type in an instantiation.
gcc/ada/ChangeLog:
* exp_attr.adb (Expand_N_Attribute_Reference): When accessing the
maps declared in package Cached_Attribute_Ops, the key value
passed to Get or to Set should never be the entity node for a
subtype. Use the entity of the corresponding type declaration
instead.
Viljar Indus [Tue, 14 Jan 2025 11:31:04 +0000 (13:31 +0200)]
ada: Mark constants inside a declare expression as referenced
Expressions within a declare expression were simply bound to
locally defined constants. However they were never marked as
referenced. This would trigger an unreferenced constant warning
if -gnatwu was used.
gcc/ada/ChangeLog:
* sem_res.adb (Resolve_Declare_Expression): Mark used
local variables inside a declare expression as referenced.
Javier Miranda [Tue, 14 Jan 2025 11:08:57 +0000 (11:08 +0000)]
ada: Cleanup preanalysis of static expressions (part 6)
Rename Preanalyze_Spec_Expression as Preanalyze_And_Resolve_Spec_Expression,
Preanalyze_Assert_Expression as Preanalyze_And_Resolve_Assert_Expression,
and Preanalyze_Default_Expression as Preanalyze_And_Resolve_Default_Expression;
cleanup the version of Preanalyze_Assert_Expression without context type.
gcc/ada/ChangeLog:
* sem.ads: Update reference to renamed subprogram in documentation.
* sem_ch3.ads (Preanalyze_Assert_Expression): Renamed.
(Preanalyze_Spec_Expression): Renamed.
* sem_ch3.adb (Preanalyze_Assert_Expression): Renamed and code cleanup.
(Preanalyze_Spec_Expression): Renamed.
(Preanalyze_Default_Expression): Renamed.
* contracts.adb: Update calls to renamed subprograms.
* exp_pakd.adb: Ditto.
* exp_util.adb: Ditto.
* freeze.adb: Ditto.
* sem_ch12.adb: Ditto.
* sem_ch13.adb: Ditto.
* sem_ch6.adb: Ditto.
* sem_prag.adb: Ditto.
* sem_res.adb (Preanalyze_And_Resolve): Add to the version without
context type the special handling for GNATprove mode provided by
the version with context type; required to cleanup the body of
Preanalyze_Assert_Expression.
squirek [Tue, 14 Jan 2025 06:40:08 +0000 (06:40 +0000)]
ada: Spurious accessibility error with -gnatc
The patch fixes an issue in the compiler whereby a spurious accessibility
error gets generated in semantic checking mode (-gnatc) when an explicitly
aliased formal gets used as an actual for an access discriminant in a return
object.
gcc/ada/ChangeLog:
* accessibility.adb
(Check_Return_Construct_Accessibility): Disable check generation
when we are only checking semantics.
Viljar Indus [Mon, 2 Dec 2024 10:18:06 +0000 (12:18 +0200)]
ada: Use absolute paths in SARIF reports
gcc/ada/ChangeLog:
* diagnostics-json_utils.adb: Add new method To_File_Uri to
convert any path to the URI standard.
* diagnostics-json_utils.ads: Likewise.
* diagnostics-sarif_emitter.adb: Converted Artifact_Change
types to use the Source_File_Index instead of the file name
to store the source file.
Removed the body from Destroy (Elem : in out Artifact_Change)
since it no longer contained elements with dynamic memory.
Updated the implementation of Equals (L, R : Artifact_Change)
to take into account the changes for Artifact_Change.
Print_Artifact_Location: Use the Source_File_Index as an
input argument. Now prints the uriBaseId attribute and a
relative path from the uriBaseId to the file in question as
the value of the uri attribute.
New method Print_Original_Uri_Base_Ids to print the
originalUriBaseIds node.
Print_Run now prints the originalUriBaseIds node.
Use constants instead of strings for all the SARIF attributes.
* osint.adb: Add new method Relative_Path to calculate the
relative path from a base directory.
Add new method Root to calculate the root of each directory.
Add new method Get_Current_Dir to get the current working
directory for the execution environment.
* osint.ads: Likewise.
* clean.adb: Use full names for calls to Get_Current_Dir.
* gnatls.adb: Likewise.
Steve Baird [Fri, 10 Jan 2025 21:15:18 +0000 (13:15 -0800)]
ada: Avoid calling Resolve with Stand.Any_Fixed as the expected type
When we call Resolve for an expression, we pass in the expected type
for that expression. In the absence of semantic errors, that expected type
should never be any of the "Any_xxx" types declared in stand.ads (e.g.,
Any_Array, Any_Numeric, Any_Real). In particular, it should never be Any_Fixed.
Fix a case in which this rule was being violated.
gcc/ada/ChangeLog:
* sem_res.adb
(Set_Mixed_Mode_Operand): If we are about to call Resolve
passing in Any_Fixed as the expected type, then instead pass in
the fixed point type of the other operand (i.e., B_Typ).
Gary Dismukes [Fri, 10 Jan 2025 22:39:52 +0000 (22:39 +0000)]
ada: Compiler crash on array aggregate association iterating over function result
The compiler triggers a bug box when compiling an array aggregate with
an iterated_component_association that iterates over another array object,
failing when trying to retrieve a Choices field, which isn't an allowed
field for N_Iterated_Component_Association nodes. This occurs in procedure
Check_Function_Writable_Actuals, which wasn't accounting for the iterated
association forms.
gcc/ada/ChangeLog:
* sem_util.adb (Check_Function_Writable_Actuals): Add handling for
N_Iterated_Component_Association and N_Iterated_Element_Association.
Fix a typo in an RM reference (6.4.1(20/3) => 6.4.1(6.20/3)).
(Collect_Expression_Ids): New procedure factoring code for collecting
identifiers from expressions of aggregate associations.
(Handle_Association_Choices): New procedure factoring code for handling
id collection for expressions of aggregate associations with multiple
choices. Removed redundant test of Box_Present from original code.
Hongyu Wang [Thu, 5 Jun 2025 06:45:08 +0000 (14:45 +0800)]
tree-sra: Use MOVE_MAX for sra size limit [PR112824]
Currently SRA uses UNITS_PER_WORD to define the maximum scalarization size,
but targets like x86 allow operations on larger sizes, so components
such as vector variables in an aggregate can be larger than
UNITS_PER_WORD. Use MOVE_MAX instead of UNITS_PER_WORD to allow
SRA for aggregates with vector components.
gcc/ChangeLog:
PR middle-end/112824
* tree-sra.cc (sra_get_max_scalarization_size): Use MOVE_MAX
instead of UNITS_PER_WORD to define max_scalarization_size.
We do not need to generate this code early, since it does not affect
any of the analysis. Lowering it later takes less code, and avoids
modifying the initial await expression which will simplify changes
to analysis to deal with open PRs.
gcc/cp/ChangeLog:
* coroutines.cc (expand_one_await_expression): Set the
initial_await_resume_called flag here.
(build_actor_fn): Populate the frame accessor for the
initial_await_resume_called flag.
(cp_coroutine_transform::wrap_original_function_body): Do
not modify the initial_await expression to include the
initial_await_resume_called flag here.
Hu, Lin1 [Tue, 27 May 2025 11:09:04 +0000 (19:09 +0800)]
i386: Fix vmovvdup's mem attribute
Some vmovvdup patterns have the type attribute sselog1, so their mem attribute
is both. Change the type attribute to match the other vmovvdup patterns.
gcc/ChangeLog:
* config/i386/sse.md
(avx512f_movddup512<mask_name>): Change sselog1 to ssemov.
(avx_movddup256<mask_name>): Ditto.
(*vec_dupv2di): Change alternative 4's type attribute from sselog1
to ssemov.
This patch introduces a new testcase to verify the merging of profiles
is performed for cloned functions.
Since this is invoked very early, before the pass manager, we need to
set up the dumping explicitly. This is similar to the handling in
finish_optimization_passes.
OpenMP: Fix regressions in metadirective-target-device-2.c [PR120518]
My previous patch that added a CLEANUP_POINT_EXPR around the device_num
selector expression in the C++ front end broke the testcase
c-c++-common/gomp/metadirective-target-device-2.c on offload targets.
It confused the code in omp_device_num_check that tries to bypass error
checking and do early resolution when the expression is a call to one
of the OpenMP library functions. The solution is to make that code smart
enough to look inside a CLEANUP_POINT_EXPR.
gcc/ChangeLog
PR c++/120518
* omp-general.cc (omp_device_num_check): Look inside a
CLEANUP_POINT_EXPR when trying to optimize special cases.
Jonathan Wakely [Wed, 28 May 2025 14:19:18 +0000 (15:19 +0100)]
libstdc++: Make system_clock::to_time_t always_inline [PR99832]
For some 32-bit targets Glibc supports changing the size of time_t to be
64 bits by defining _TIME_BITS=64. That causes an ABI change which
would affect std::chrono::system_clock::to_time_t. Because to_time_t is
not a function template, its mangled name does not depend on the return
type, so it has the same mangled name whether it returns a 32-bit time_t
or a 64-bit time_t. On targets where the size of time_t can be selected
at preprocessing time, that can cause ODR violations, e.g. the linker
selects a definition of to_time_t that returns a 32-bit value but a
caller expects 64-bit and so reads 32 bits of garbage from the stack.
This commit adds always_inline to to_time_t so that all callers inline
the conversion to time_t, and will do so using whatever type time_t
happens to be in that translation unit.
Existing objects compiled before this change will either have inlined
the function anyway (which is likely if compiled with any optimization
enabled) or will contain a COMDAT definition of the inline function and
so still be able to find it at link-time.
The attribute is also added to system_clock::from_time_t, because that's
an equally simple function and it seems reasonable for them to both be
always inlined.
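As a rough illustration of the approach described above, the sketch below marks a conversion helper always_inline so that each caller computes the result with the time_t of its own translation unit. The name to_time_t_sketch is hypothetical and this is not the code in include/bits/chrono.h; [[gnu::always_inline]] is the GCC/Clang attribute spelling.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>

// Hypothetical stand-in for system_clock::to_time_t (not the libstdc++ code).
// Because the body is always inlined into the caller, no out-of-line symbol
// with a fixed return width is needed at link time.
[[gnu::always_inline]] inline std::time_t
to_time_t_sketch(std::chrono::system_clock::time_point tp) noexcept
{
  using namespace std::chrono;
  // The cast uses whatever width time_t has in this translation unit.
  return static_cast<std::time_t>(
      duration_cast<seconds>(tp.time_since_epoch()).count());
}
```

A translation unit built with _TIME_BITS=64 and one built without it would each get a conversion matching its own time_t, rather than sharing one definition chosen by the linker.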
libstdc++-v3/ChangeLog:
PR libstdc++/99832
* include/bits/chrono.h (system_clock::to_time_t): Add
always_inline attribute to be agnostic to the underlying type of
time_t.
(system_clock::from_time_t): Add always_inline for consistency
with to_time_t.
* testsuite/20_util/system_clock/99832.cc: New test.
Nathan Myers [Wed, 4 Jun 2025 18:52:29 +0000 (14:52 -0400)]
libstdc++: sstream from string_view (P2495R3) [PR119741]
Add constructors to stringbuf, stringstream, istringstream, and ostringstream,
and a matching overload of str(sv) in each, that take anything convertible to
a string_view in places where the existing ctors and function take a string.
Note this change omits the constraint applied to the istringstream constructor
from string cited as a "drive-by" in P2495R3, as we have determined it is
redundant.
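A minimal usage sketch of the new constructors, assuming the feature-test macro __cpp_lib_sstream_from_string_view named in the ChangeLog is what guards them. The helper first_word is hypothetical; on a library without the feature the fallback branch copies into a std::string first.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <string_view>

// Hypothetical helper: extracts the first whitespace-delimited word.
inline std::string first_word(std::string_view sv)
{
#ifdef __cpp_lib_sstream_from_string_view
  std::istringstream in(sv);               // P2495R3: construct from string_view
#else
  std::istringstream in{std::string(sv)};  // older libraries: go through std::string
#endif
  std::string word;
  in >> word;
  return word;
}
```

Either branch yields the same result; the difference is only whether the caller has to spell out the intermediate std::string.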
libstdc++-v3/ChangeLog:
PR libstdc++/119741
* include/std/sstream: full implementation, really just
decls, requires clause and plumbing.
* include/bits/version.def, include/bits/version.h:
new preprocessor symbol
__cpp_lib_sstream_from_string_view.
* testsuite/27_io/basic_stringbuf/cons/char/string_view.cc:
New tests.
* testsuite/27_io/basic_istringstream/cons/char/string_view.cc:
New tests.
* testsuite/27_io/basic_ostringstream/cons/char/string_view.cc:
New tests.
* testsuite/27_io/basic_stringstream/cons/char/string_view.cc:
New tests.
* testsuite/27_io/basic_stringbuf/cons/wchar_t/string_view.cc:
New tests.
* testsuite/27_io/basic_istringstream/cons/wchar_t/string_view.cc:
New tests.
* testsuite/27_io/basic_ostringstream/cons/wchar_t/string_view.cc:
New tests.
* testsuite/27_io/basic_stringstream/cons/wchar_t/string_view.cc:
New tests.
This implements the library changes in P0849R8 "auto(x): decay-copy
in the language" which consist of replacing most uses of the
exposition-only function decay-copy with auto(x) throughout the library
wording. We implement this as a DR against C++20 since there should be
no behavior change in practice (especially in light of LWG 3724 which
makes decay-copy SFINAE-friendly).
The main difference between decay-copy and auto(x) is that decay-copy
materializes its argument unlike auto(x), and so the latter is a no-op
when its argument is a prvalue. Effectively the former could introduce
an unnecessary move constructor call in some contexts. In C++20 and
earlier we could emulate auto(x) with decay_t<decltype((x))>(x).
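The extra move can be observed with a small experiment. The sketch below is only an illustration under C++17 rules: Tracked, decay_copy_sketch and AUTO_SKETCH are hypothetical names, and decay_copy_sketch is a plain function-template version of decay-copy rather than libstdc++'s __decay_copy.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Counts move-constructor calls so the difference is observable.
struct Tracked {
  static inline int moves = 0;
  Tracked() = default;
  Tracked(const Tracked&) = default;
  Tracked(Tracked&&) noexcept { ++moves; }
};

// Function-style decay-copy: binding a prvalue to T&& materializes it,
// and the return statement then move-constructs the result.
template <typename T>
std::decay_t<T> decay_copy_sketch(T&& t) { return std::forward<T>(t); }

// Pre-C++23 emulation of auto(x): a functional cast to the decayed type.
#define AUTO_SKETCH(x) std::decay_t<decltype((x))>(x)

inline int moves_via_decay_copy() {
  Tracked::moves = 0;
  Tracked t = decay_copy_sketch(Tracked{});
  (void)t;
  return Tracked::moves;
}

inline int moves_via_cast() {
  Tracked::moves = 0;
  Tracked t = AUTO_SKETCH(Tracked{});  // prvalue to prvalue: copy elision applies
  (void)t;
  return Tracked::moves;
}
```

With a prvalue argument the function form performs one move, while the cast form performs none because of guaranteed copy elision.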
After this paper the only remaining uses of decay-copy in the standard
are in the specification of some range adaptors. In our implementation
of those range adaptors I believe decay-copy is already implied which is
why we don't use __decay_copy explicitly there. So since it's apparently
no longer needed this patch goes ahead and removes __decay_copy.
Jason Merrill [Wed, 4 Jun 2025 17:31:02 +0000 (13:31 -0400)]
c++: constexpr prvalues vs genericize [PR120502]
Here constexpr evaluation was getting confused by the result of
split_nonconstant_init, which leaves an INIT_EXPR from an empty CONSTRUCTOR
to be followed by member initialization. As a result
CONSTRUCTOR_NO_CLEARING was set for the time_zone, and
cxx_eval_store_expression didn't set it again for the initial clobber in the
basic_string constructor, so when cxx_fold_indirect_ref wants to check
whether the anonymous union active member had type non_trivial_if, we see
that we don't currently have a value for the anonymous union, try to add
one, and fail.
So let's do constexpr evaluation before split_nonconstant_init.
PR c++/120502
gcc/cp/ChangeLog:
* cp-gimplify.cc (cp_fold_r) [TARGET_EXPR]: Do constexpr evaluation
before genericize.
* constexpr.cc (cxx_eval_store_expression): Add comment.
Thomas Schwinge [Mon, 26 May 2025 11:31:54 +0000 (13:31 +0200)]
Avoid SIGSEGV in nvptx 'mkoffload' for voluminous PTX code
In commit 50be486dff4ea2676ed022e9524ef190b92ae2b1
"nvptx: libgomp+mkoffload.cc: Prepare for reverse offload fn lookup", some
additional tracking of the PTX code was added, and this assumes that
potentially every single character of PTX code needs to be tracked as a new
chunk of PTX code. That's problematic if we're dealing with voluminous PTX
code (for example, non-trivial C++ code), and the 'file_idx' 'alloca'tion then
causes stack overflow. For example:
FAIL: libgomp.c++/target-std__valarray-1.C (test for excess errors)
UNRESOLVED: libgomp.c++/target-std__valarray-1.C compilation failed to produce executable
lto-wrapper: fatal error: [...]/build-gcc/gcc//accel/nvptx-none/mkoffload terminated with signal 11 [Segmentation fault], core dumped
gcc/
* config/nvptx/mkoffload.cc (process): Use an 'auto_vec' for
'file_idx'.
Andrew Pinski [Fri, 21 Feb 2025 06:05:38 +0000 (22:05 -0800)]
gimple-fold: Implement simple copy propagation for aggregates [PR14295]
This implements a simple copy propagation for aggregates in a similar
fashion to what we already do for copy prop of zeroing.
Right now this only looks at the previous vdef statement but this allows us
to catch a lot of cases that show up in C++ code.
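For illustration, the kind of source pattern this targets looks roughly like the first function below, and the second function is a hand-written picture of the intended effect. Agg, copy_via_temporary and copy_direct are hypothetical names, and the second function is not actual compiler output.

```cpp
#include <cassert>

struct Agg { int data[16]; };

// Two aggregate copies in a row: the second reads what the first just wrote.
inline void copy_via_temporary(Agg* dst, const Agg* src)
{
  Agg tmp = *src;
  *dst = tmp;
}

// Hand-written equivalent after propagating the source of the first copy
// into the second one; the temporary then becomes dead.
inline void copy_direct(Agg* dst, const Agg* src)
{
  *dst = *src;
}
```

Both functions leave *dst with the same contents; the propagated form simply avoids going through the intermediate aggregate.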
This used to delete aggregate copies that are to the same location (PR57361),
but that was found to delete statements that are needed as aliasing markers.
So we need to keep them around until that is solved. Note DSE will delete the statements
anyway, so no testcase is added since we expose the latent bug in the same way.
See https://gcc.gnu.org/pipermail/gcc-patches/2025-May/685003.html for the testcase and
explanation.
Also adds a variant of pr22237.c which was found while working on this patch.
Changes since v1:
* v2: change check for vuse to use default definition.
Remove dest/src arguments for optimize_agr_copyprop
Changed dump messages slightly.
Added stats
Don't delete `a = a` until aliasing markers are added.
* tree-ssa-forwprop.cc (optimize_agr_copyprop): New function.
(pass_forwprop::execute): Call optimize_agr_copyprop for load/store statements.
gcc/testsuite/ChangeLog:
* gcc.dg/tree-ssa/20031106-6.c: Un-xfail. Add scan for forwprop1.
* g++.dg/opt/pr66119.C: Disable forwprop since that does
the copy prop now.
* gcc.dg/tree-ssa/pr108358-a.c: New test.
* gcc.dg/tree-ssa/pr114169-1.c: New test.
* gcc.c-torture/execute/builtins/pr22237-1-lib.c: New test.
* gcc.c-torture/execute/builtins/pr22237-1.c: New test.
* gcc.dg/tree-ssa/pr57361.c: Disable forwprop1.
* gcc.dg/tree-ssa/pr57361-1.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Pengfei Li [Wed, 4 Jun 2025 15:59:44 +0000 (16:59 +0100)]
match.pd: Fold (x + y) >> 1 into IFN_AVG_FLOOR (x, y) for vectors
This patch folds vector expressions of the form (x + y) >> 1 into
IFN_AVG_FLOOR (x, y), reducing instruction count on platforms that
support averaging operations. For example, it can help improve the
codegen on AArch64 from:
add v0.4s, v0.4s, v31.4s
ushr v0.4s, v0.4s, 1
to:
uhadd v0.4s, v0.4s, v31.4s
As this folding is only valid when the most significant bit of each
element in both x and y is known to be zero, this patch checks leading
zero bits of elements in x and y, and extends get_nonzero_bits_1() to
handle uniform vectors. When the input is a uniform vector, the function
now returns the nonzero bits of its element.
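The need for the most-significant-bit condition can be seen with a scalar model of one 8-bit lane. This is only a sketch with hypothetical helper names; it models the arithmetic, not the match.pd pattern itself.

```cpp
#include <cassert>
#include <cstdint>

// What the source expression computes for an 8-bit lane: the sum wraps
// to 8 bits before the shift, so the carry is lost.
constexpr std::uint8_t add_then_shift(std::uint8_t x, std::uint8_t y)
{
  return static_cast<std::uint8_t>(static_cast<std::uint8_t>(x + y) >> 1);
}

// Floor average computed in a wider type, keeping the carry, which is
// what an averaging operation produces.
constexpr std::uint8_t avg_floor(std::uint8_t x, std::uint8_t y)
{
  return static_cast<std::uint8_t>((static_cast<unsigned>(x) + y) >> 1);
}

// True when the top bit of the lane is known to be zero.
constexpr bool msb_clear(std::uint8_t v) { return (v & 0x80u) == 0; }
```

For x = 100 and y = 27 both functions return 63. For x = 200 and y = 100 the sum wraps, so add_then_shift gives 22 while avg_floor gives 150, which is why the fold is restricted to inputs whose top bit is zero.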
Additionally, this patch adds more checks to reject vector types in bit
constant propagation (tree-bit-ccp), since tree-bit-ccp was designed for
scalar values only, and the new vector logic in get_nonzero_bits_1()
could lead to incorrect propagation results.
The result was many hundreds of warnings. The vast bulk of them were
recommendations for declaring variables as const, recommendations for
changing C-style casts to C++ casts, cheery notes about shadowed
variables, and complaints that malloc() results weren't being checked
for errors.
Two and a half days of applied OCD on my part has reduced the number of
warnings down to zero.
Xi Ruoyao [Sun, 11 May 2025 08:44:31 +0000 (16:44 +0800)]
ext-dce: Don't refine live width with SUBREG mode if !TRULY_NOOP_TRUNCATION_MODES_P [PR 120050]
If we see a promoted subreg and TRULY_NOOP_TRUNCATION says the
truncation is not a noop, then all bits of the inner reg are live. We
cannot reduce the live mask to that of the mode of the subreg.
gcc/ChangeLog:
PR rtl-optimization/120050
* ext-dce.cc (ext_dce_process_uses): Break early if a SUBREG in
rhs is promoted and the truncation from the inner mode to the
outer mode is not a noop when handling SETs.
Jakub Jelinek [Wed, 4 Jun 2025 15:22:58 +0000 (17:22 +0200)]
ranger: Some parameter formatting fixes
When reading the code, I've noticed various function definitions
with misaligned parameters, they should IMHO always align below the first
character after opening ( and in most cases they do, but in some
cases they were indented more or less. Perhaps the functions changed
name or something.
Jakub Jelinek [Wed, 4 Jun 2025 15:21:51 +0000 (17:21 +0200)]
ranger: Add support for float <-> float casts [PR120231]
I've noticed we don't even support say float -> double and other
scalar floating point to scalar floating point conversions in the
ranger, we just end up with VARYING for those.
The following patch attempts to fix that.
The reverse cast case uses float_binary_op_range_finish e.g. because
if the result isn't infinite, then the source couldn't be infinite
either even if the reverse fold_range would suggest that.
The patch also special-cases guaranteed widening casts (where
we have assurance that all the source type values are exactly
representable in the destination type; using ieee_bits for that).
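To make the widening case concrete, here is a small sketch of the idea. It uses std::numeric_limits as a stand-in for the ieee_bits check mentioned above, which is an assumption for illustration; widening_is_exact and widen_range are hypothetical names, not ranger APIs.

```cpp
#include <cassert>
#include <limits>
#include <utility>

// Sufficient condition for every Src value (including denormals) to be
// exactly representable in Dst: at least as many mantissa bits, and an
// exponent range that covers Src's largest and smallest magnitudes.
template <typename Src, typename Dst>
constexpr bool widening_is_exact()
{
  using S = std::numeric_limits<Src>;
  using D = std::numeric_limits<Dst>;
  return S::is_iec559 && D::is_iec559
      && D::digits >= S::digits
      && D::max_exponent >= S::max_exponent
      && D::min_exponent - 1 <= S::min_exponent - S::digits;
}

// When the cast is exact and monotonic, a [lo, hi] range of the source
// maps to the range of the converted endpoints.
inline std::pair<double, double> widen_range(float lo, float hi)
{
  static_assert(widening_is_exact<float, double>(),
                "float -> double must be value-preserving");
  return {static_cast<double>(lo), static_cast<double>(hi)};
}
```

The reverse direction (double to float) fails the check, so a range there would have to account for rounding at the endpoints.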
2025-06-04 Jakub Jelinek <jakub@redhat.com>
PR tree-optimization/120231
* range-op-mixed.h (operator_cast::fold_range): Add overload
with 3 {,const} frange & operands. Change parameter names and
add final override keywords for float <-> integer cast overloads.
(operator_cast::op1_range): Likewise.
* range-op-float.cc (operator_cast::fold_range): New overload
with 3 {,const} frange & operands.
(operator_cast::op1_range): Likewise.
Tomasz Kamiński [Tue, 3 Jun 2025 09:40:17 +0000 (11:40 +0200)]
libstdc++: Test for formatting with empty spec for time points.
Add tests for the behavior of the ostream operator and of formatting
with an empty chrono-spec for the chrono types. Current coverage is:
* time point, zoned_time and local_time_format in this commit,
* duration and hh_mm_ss in r16-1099-gac0a04b7a254fb,
* calendar types in r16-1016-g28a17985dd34b7.
libstdc++-v3/ChangeLog:
* testsuite/std/time/format/empty_spec.cc: New tests.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Patrick Palka [Wed, 4 Jun 2025 14:29:47 +0000 (10:29 -0400)]
libstdc++: Implement C++23 P1659R3 starts_with and ends_with
This implements ranges::starts_with and ranges::ends_with from the C++23
paper P1659R3. The corresponding _S_impl member functions take optional
size parameters __n1 and __n2 of the two ranges, where -1 means
the corresponding size is not known.
Dongyan Chen [Wed, 4 Jun 2025 14:03:31 +0000 (08:03 -0600)]
[PATCH] RISC-V: Imply zicsr for svade and svadu extensions.
This patch makes the svade and svadu extensions imply zicsr.
According to the riscv-privileged spec, the svade and svadu extensions
are privileged instructions, so they should imply zicsr.
Dongyan Chen [Wed, 4 Jun 2025 13:57:01 +0000 (07:57 -0600)]
[PATCH v2] RISC-V: Add svbare extension.
This patch adds support for the svbare extension, which is part of the RVA23
profile, so that GCC can recognize and process it correctly at compile time.
Jonathan Wakely [Mon, 2 Jun 2025 22:01:40 +0000 (23:01 +0100)]
libstdc++: Refactor __semaphore_base member functions
Replace the _S_get_current and _S_do_try_acquire static member functions
with non-static member functions _M_get_current and _M_do_try_acquire.
This means they don't need the address of _M_counter passed in.
libstdc++-v3/ChangeLog:
* include/bits/semaphore_base.h (_S_get_current): Replace with
non-static _M_get_current.
(_S_do_try_acquire): Replace with non-static _M_do_try_acquire.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
There's a deadlock in std::counting_semaphore that occurs when the
semaphore is under contention. The bug happens when one thread tries to
acquire the semaphore, calling __semaphore_base::_S_do_try_acquire to
atomically decrement the counter using compare_exchange_strong. If the
counter is non-zero (and so should be possible to decrement) but another
thread changes it (either incrementing or decrementing it) then the
compare_exchange fails and _S_do_try_acquire returns false. Because that
function is used by the predicate passed to __atomic_wait_address, when
it returns false the thread does a futex wait until the value changes.
However, when the predicate is false because the compare_exchange failed
due to not matching the expected value, waiting for the value to change
is incorrect. The correct behaviour would be to retry the
compare_exchange using the new value (as long as it's still non-zero).
Waiting for the value to change again means we can block forever,
because it might never change again.
The predicate should only test the value, not also attempt to alter it,
and its return value should mean only one thing, not conflate a busy
semaphore that cannot be acquired with a contended one that can be
acquired by retrying.
The correct behaviour of __semaphore_base::_M_acquire would be to
attempt the decrement, and to retry immediately if it failed due to
contention on the variable (i.e. due to the variable not having the
expected value). It should only wait for the value to change when the
value is zero, because that's the only time we can't decrement it.
This commit moves the _S_do_try_acquire call out of the predicate and
loops while it is false, only doing an atomic wait when the counter's
value is zero. The predicate used for the atomic wait now only checks
whether the value is decrementable (non-zero), without also trying to
perform that decrement.
In order for the caller to tell whether it should retry a failed
_S_do_try_acquire or should wait for the value to be non-zero, the value
obtained by a failed compare_exchange needs to be passed back to the
caller. _S_do_try_acquire is changed to take its parameter by reference,
so that the caller gets the new value and can check whether it's zero.
In order to avoid doing another atomic load after returning from an
atomic wait, the predicate is also changed to capture the local __val by
reference, and then assign to __val when it sees a non-zero value. That
makes the new value available to _M_acquire, so it can be passed to
_S_do_try_acquire as the expected value of the compare_exchange.
Although this means that the predicate is modifying data again, not just
checking a value, this modification is safe. It's not changing the
semaphore's counter, only changing a local variable in the caller to
avoid a redundant atomic load.
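The retry-versus-wait distinction described above can be sketched as follows. This is a simplified model, not the libstdc++ implementation: semaphore_sketch and its members are hypothetical names, and std::this_thread::yield() in a loop stands in for the futex-based atomic wait.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

class semaphore_sketch {
  std::atomic<int> counter_;

  // On failure, old holds the value actually observed, so the caller can
  // tell "counter is zero" apart from "lost a race, retry".
  bool do_try_acquire(int& old) noexcept
  {
    if (old == 0)
      return false;
    return counter_.compare_exchange_strong(old, old - 1,
                                            std::memory_order_acquire,
                                            std::memory_order_relaxed);
  }

public:
  explicit semaphore_sketch(int initial) noexcept : counter_(initial) {}

  void acquire() noexcept
  {
    int val = counter_.load(std::memory_order_relaxed);
    while (!do_try_acquire(val)) {
      // Wait only while the counter is zero. A failed compare-exchange
      // with a non-zero value is retried immediately with the new value.
      while (val == 0) {
        std::this_thread::yield();  // stand-in for the atomic wait
        val = counter_.load(std::memory_order_relaxed);
      }
    }
  }

  void release() noexcept { counter_.fetch_add(1, std::memory_order_release); }

  int value() const noexcept { return counter_.load(std::memory_order_relaxed); }
};
```

The inner loop only observes the counter and records the non-zero value it saw in the local val; the decrement itself happens solely in do_try_acquire, mirroring the split between predicate and compare_exchange described in the commit message.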
Equivalent changes are made to _M_try_acquire_until and
_M_try_acquire_for. They have the same bug, although they can escape the
deadlock if the wait is interrupted by timing out. For _M_acquire
there's no time out so it potentially waits forever.
_M_try_acquire also has the same bug, but can be simplified to just
calling _M_try_acquire_for(0ns). A timeout of zero results in calling
__wait_impl with the __spin_only flag set, so that the value is loaded
and checked in a spin loop but there is no futex wait. This means that
_M_try_acquire can still succeed under light contention if the counter
is being changed concurrently, at the cost of a little extra overhead.
It would be possible to implement _M_try_acquire as nothing more than an
atomic load and a compare_exchange, but it would fail when there is any
contention.
libstdc++-v3/ChangeLog:
PR libstdc++/104928
* include/bits/semaphore_base.h (_S_do_try_acquire): Take old
value by reference.
(_M_acquire): Move _S_do_try_acquire call out of the predicate
and loop on its result. Make the predicate capture and update
the local copy of the value.
(_M_try_acquire_until, _M_try_acquire_for): Likewise.
(_M_try_acquire): Just call _M_try_acquire_for.
* testsuite/30_threads/semaphore/104928-2.cc: New test.
* testsuite/30_threads/semaphore/104928.cc: New test.
Reviewed-by: Tomasz Kamiński <tkaminsk@redhat.com>
In the comment trail for PR119966, I'd said that the validate_subreg
condition:
/* The outer size must be ordered wrt the register size, otherwise
we wouldn't know at compile time how many registers the outer
mode occupies. */
if (!ordered_p (osize, regsize))
return false;
"is also potentially relevant" for paradoxical subregs. But I'd
forgotten an important caveat. If the inner size is smaller than
a register, we know that the inner value will only occupy a single
register. Although the paradoxical subreg might extend that single
register to multiple registers by padding with undefined bits,
the register size that matters for the extension is:
REGMODE_NATURAL_SIZE (omode)
rather than regsize's:
REGMODE_NATURAL_SIZE (imode)
The ordered check is still relevant if the inner value spans
multiple registers.
Enabling the check above for paradoxical subregs led to an ICE in the
testcase, where we tried to generate a VNx4QI paradoxical subreg of a
QI scalar. This was previously allowed, and AFAIK worked correctly.
The patch doesn't have the effect of relaxing the condition for
non-paradoxical subregs, since:
So even before the patch for PR119966, the condition only existed for
the maybe_gt (isize, regsize) case.
The term "block" used in the comment is taken from the rtl.texi
documentation of subregs.
gcc/
PR rtl-optimization/120447
* emit-rtl.cc (validate_subreg): Restrict ordered_p test
between osize and regsize to cases where the inner value
occupies multiple blocks.
gcc/testsuite/
PR rtl-optimization/120447
* gcc.dg/pr120447.c: New test.
Tobias Burnus [Wed, 4 Jun 2025 11:25:05 +0000 (13:25 +0200)]
libgomp.texi (omp_interop_*): Add note about 5.2-to-6.0 incompatibility
GCC uses the 6.0 types - which are unfortunately not quite compatible with
code expecting 5.1/5.2 data types. Therefore, this commit adds a note to
hopefully reduce surprises. Namely:
For C/C++: while OpenMP 5.1 and 5.2 used 'int *ret_code', OpenMP 6.0 uses
'omp_interop_rc_t *ret_code' in omp_interop_{int,ptr,str} and 'int' instead
of 'omp_interop_rc_t ret_code' in omp_get_interop_rc_desc.
Neither C nor C++ like passing the wrong pointer type, albeit for C, GCC < 14
and clang only warn (gcc >= r14-6037-g9715c545d33b3a has an error) and
using -fpermissive turns it into a warning and -Wno-incompatible-pointer-types
silences it for C.
C++ also dislikes passing an int to an enum, albeit -fpermissive turns the
error into a warning with g++ (but not clang++). And, here, using an enum
on the caller side works with both int and enum on the callee side.
libgomp/ChangeLog:
* libgomp.texi (omp_interop_{int,ptr,str,rc_desc}): Add note about
the 'ret_code' type change in OpenMP 6.
Tomasz Kamiński [Wed, 4 Jun 2025 09:05:11 +0000 (11:05 +0200)]
libstdc++: Fix format call and test formatting with empty specs for durations.
This patch fixes an obvious error where the output iterator argument was
missing from the call to format_to when durations with custom representation
types are used.
It also adds tests for the behavior of the ostream operator and of formatting
with an empty chrono-spec for the chrono types. Current coverage is:
* duration and hh_mm_ss in this commit,
* calendar types in r16-1016-g28a17985dd34b7.
libstdc++-v3/ChangeLog:
* include/bits/chrono_io.h (__formatter_chrono::_M_s): Add missing
__out argument to format_to call.
* testsuite/std/time/format/empty_spec.cc: New test.
Reviewed-by: Jonathan Wakely <jwakely@redhat.com>
Signed-off-by: Tomasz Kamiński <tkaminsk@redhat.com>
Program received signal SIGSEGV, Segmentation fault.
0x000000000131174f in prepare_call_arguments (
bb=<basic_block 0x7fffe99dfba0 (2)>, insn=0x7fffe980cc60)
at /export/gnu/import/git/sources/gcc/gcc/var-tracking.cc:6277
6277 fndecl = MEM_EXPR (XEXP (call, 0));
(gdb) bt
bb=<basic_block 0x7fffe99dfba0 (2)>, insn=0x7fffe980cc60)
at /export/gnu/import/git/sources/gcc/gcc/var-tracking.cc:6277
at /export/gnu/import/git/sources/gcc/gcc/var-tracking.cc:10297
at /export/gnu/import/git/sources/gcc/gcc/var-tracking.cc:10526
at /export/gnu/import/git/sources/gcc/gcc/var-tracking.cc:10579
at /export/gnu/import/git/sources/gcc/gcc/var-tracking.cc:10616
Update prepare_call_arguments to check MEM_P before using MEM_EXPR.
gcc/
PR debug/120525
* var-tracking.cc (prepare_call_arguments): Use MEM_EXPR only
if MEM_P is true.
Fortran: Fix missing substring ref for allocatable saved vars [PR120483]
Compute a substring ref on an allocatable static character array
using pointer arithmetic. Using an array type corrupts the type
layout and crashes omp generation.
PR fortran/120483
gcc/fortran/ChangeLog:
* trans-expr.cc (gfc_conv_substring): Use pointer arithmetic on
static allocatable char arrays.
Hu, Lin1 [Mon, 10 Mar 2025 08:52:22 +0000 (16:52 +0800)]
i386: Add more peephole2 for APX NDD
The patch aims to optimize
movb (%rdi), %al
movq %rdi, %rbx
xorl %esi, %eax, %edx
movb %dl, (%rdi)
cmpb %sil, %al
jne
to
xorb %sil, (%rdi)
movq %rdi, %rbx
jne
This removes two mov instructions and one cmp instruction.
Because APX NDD allows the destination register and source register to differ,
some of the original peephole2 patterns no longer apply. Add new peephole2
patterns for APX NDD.
gcc/ChangeLog:
* config/i386/i386.md (define_peephole2): Define some new peephole2 for
APX NDD.
Hu, Lin1 [Wed, 19 Feb 2025 07:51:40 +0000 (15:51 +0800)]
i386: Add more forms peephole2 for adc/sbb
Enabling -mapxf changes some adc/sbb patterns.
As a result, GCC emits an extra mov, as in
movq 8(%rdi), %rax
adcq %rax, 8(%rsi), %rax
movq %rax, 8(%rdi)
rather than
movq 8(%rsi), %rax
adcq %rax, 8(%rdi)
The patch adds more kinds of peephole2 to eliminate the extra mov.
gcc/ChangeLog:
* config/i386/i386.md: Add 4 new peephole2 patterns by swapping the
original peephole2's operand order to support the new pattern.
Martin Uecker [Sun, 1 Jun 2025 18:34:52 +0000 (20:34 +0200)]
c: Move checking assertions from recursion when forming composite types to avoid ICE.
The checking assertion in composite_type_internal for structures and unions may
fail if there are self-referential types. To avoid this, we move them out of
the recursion. This should also be more efficient and covers other types.
We have to ignore some cases where we form composite types with qualifiers
not matching (PR120510).