Jason Merrill [Tue, 26 Nov 2024 21:19:05 +0000 (16:19 -0500)]
libcpp: modules and -include again
I enabled include translation to header units in r15-1104-ga29f481bbcaf2b,
but it seems that patch wasn't sufficient, as any diagnostics in the main
source file would show up as coming from the header instead.
Fixed by setting buffer->file for leaving the file transition that my
previous patch made us enter. And don't push a buffer of newlines; in this
case that messes up line numbers instead of aligning them.
libcpp/ChangeLog:
* files.cc (_cpp_stack_file): Handle -include of header unit more
specially.
gcc/testsuite/ChangeLog:
* g++.dg/modules/dashinclude-1_b.C: Add an #error.
* g++.dg/modules/dashinclude-1_a.H: Remove dg-module-do run.
Andi Kleen [Thu, 31 Oct 2024 17:26:16 +0000 (10:26 -0700)]
PR117350: Keep assembler name for abstract decls for autofdo
autofdo looks up inline stacks and tries to match them with the profile
data using their symbol name. Make sure all decls that can be in an inline stack
have a valid assembler name.
This fixes a bootstrap problem with autoprofiledbootstrap and LTO.
2024-10-30 Jason Merrill <jason@redhat.com>
Andrew Pinski <quic_apinski@quicinc.com>
Andi Kleen <ak@gcc.gnu.org>
gcc/ChangeLog:
PR bootstrap/117350
* tree.cc (need_assembler_name_p): Keep assembler name
for abstract declarations when autofdo is used.
Harald Anlauf [Tue, 26 Nov 2024 19:37:35 +0000 (20:37 +0100)]
Fortran: fix minor front-end memleaks
gcc/fortran/ChangeLog:
* expr.cc (find_inquiry_ref): Fix memleak introduced by scanning
the reference chain to find and simplify inquiry references.
* symbol.cc (gfc_copy_formal_args_intr): Free formal namespace
when not needed to avoid a front-end memleak.
Andrew Pinski [Tue, 26 Nov 2024 21:38:15 +0000 (13:38 -0800)]
aarch64: Update error message check for __builtin_launder check of sve-sizeless-2.C
r15-3614-g9fe57e4879de93 changed the error message for __builtin_launder but this testcase
was not updated for the new format of the error message since it is an aarch64 specific
testcase.
This patch updates the expected error message.
Pushed as obvious after testing to see the testcase now works.
gcc/testsuite/ChangeLog:
* g++.dg/ext/sve-sizeless-2.C: Update the expected error message
for __builtin_launder.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Andrew Pinski [Tue, 26 Nov 2024 21:05:00 +0000 (13:05 -0800)]
aarch64: Fix fp8_scalar_1.c's stacktest1
The function body test was expecting:
umov w0, v0.b[0]
strb w0, [sp, 15]
But the code generation was improved after r15-5375-gbeec291225be to just:
str b0, [sp, 15]
which is correct and better because it no longer needs to move between SIMD registers
and the GPRs.
This changes the function body test to expect the new, better code generation.
Pushed as obvious after a test of the testcase to make sure it now passes.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/fp8_scalar_1.c (stacktest1): Fix for new
improved code generation.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
David Malcolm [Tue, 26 Nov 2024 21:09:37 +0000 (16:09 -0500)]
selftest: invoke "diff" when ASSERT_STREQ fails
Currently when ASSERT_STREQ or ASSERT_STREQ_AT fail we print
both strings to stderr. However it can be hard to figure out
the problem (e.g. for 1-character differences in long strings).
Extend the output by writing out the strings to tempfiles and
invoking "diff -up" on them when we have such a selftest failure,
to (I hope) simplify debugging.
gcc/ChangeLog:
* selftest.cc (selftest::print_diff): New function.
(selftest::assert_streq): Call it when we have non-equal
non-null strings.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
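As a rough illustration of the approach described in the selftest commit above (not the actual selftest.cc code), the sketch below writes both strings to temporary files and runs "diff -up" on them when they differ. The Python names print_diff and assert_streq merely echo the ChangeLog entries; the error text and the fallback when no diff program is installed are assumptions.

```python
import os
import shutil
import subprocess
import tempfile


def print_diff(expected, actual):
    """Return `diff -up` output for two strings, or None if diff is unusable."""
    if shutil.which("diff") is None:
        return None  # assumption: silently skip when no diff program exists
    paths = []
    try:
        for text in (expected, actual):
            # delete=False so the file survives close() and diff can read it.
            with tempfile.NamedTemporaryFile("w", suffix=".txt",
                                             delete=False) as f:
                f.write(text)
                paths.append(f.name)
        result = subprocess.run(["diff", "-up", *paths],
                                capture_output=True, text=True)
        # diff exits with 0 (identical), 1 (different) or 2 (trouble).
        return result.stdout if result.returncode in (0, 1) else None
    finally:
        for path in paths:
            os.unlink(path)


def assert_streq(expected, actual):
    """Raise AssertionError; append a diff for non-equal non-null strings."""
    if expected == actual:
        return
    message = "ASSERT_STREQ failed: %r != %r" % (expected, actual)
    if expected is not None and actual is not None:
        diff = print_diff(expected, actual)
        if diff:
            message += "\n" + diff
    raise AssertionError(message)
```

The diff is only attempted when both strings are non-null, matching the condition named in the ChangeLog entry.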
David Malcolm [Tue, 26 Nov 2024 21:01:35 +0000 (16:01 -0500)]
testsuite: rename plugins from .c to .cc
In r12-6650-g5c69acb32329d4 we updated our sources from .c to .cc
since for some time GCC has been implemented in C++, not C.
GCC plugins are also implemented in C++, not C, but the plugins
in our testsuite still have .c extensions.
Rename the plugin implementation files in the testsuite from .c to .cc,
for consistency with GCC's implementation files (as opposed to .C,
which is used in C++ parts of the testsuite).
Don't rename the files that the plugins are tested *on*.
David Malcolm [Tue, 26 Nov 2024 20:58:25 +0000 (15:58 -0500)]
csky: use quotes when referring to cpus and archs [PR90160]
gcc/ChangeLog:
PR translation/90160
* config/csky/csky.cc (csky_configure_build_target): Use %qs when
referring to cpu and arch names.
(csky_option_override): Likewise.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
Harald Anlauf [Mon, 25 Nov 2024 21:55:10 +0000 (22:55 +0100)]
Fortran: passing inquiry ref of complex array to assumed rank dummy [PR117774]
PR fortran/117774
gcc/fortran/ChangeLog:
* trans-expr.cc (gfc_conv_procedure_call): When passing an array
to an assumed-rank dummy, terminate search for array reference of
actual argument before an inquiry reference (e.g. INQUIRY_RE,
INQUIRY_IM) so that bounds update works properly.
Alex Coplan [Tue, 26 Nov 2024 15:10:29 +0000 (15:10 +0000)]
gdbhooks: Handle references to vec* in VecPrinter
vec.h has this method:
template<typename T, typename A>
inline T *
vec_safe_push (vec<T, A, vl_embed> *&v, const T &obj CXX_MEM_STAT_INFO)
where v is a reference to a pointer to vec. This matches the regex for
VecPrinter, so gdbhooks.py attempts to print it but chokes on the reference.
I see the following:
#1 0x0000000002b84b7b in vec_safe_push<edge_def*, va_gc> (v=Traceback (most
recent call last):
File "$SRC/gcc/gcc/gdbhooks.py", line 486, in to_string
return '0x%x' % intptr(self.gdbval)
File "$SRC/gcc/gcc/gdbhooks.py", line 168, in intptr
return long(gdbval) if sys.version_info.major == 2 else int(gdbval)
gdb.error: Cannot convert value to long.
This patch makes VecPrinter handle such references by stripping them
(dereferencing) at the top of the relevant functions.
gcc/ChangeLog:
* gdbhooks.py (strip_ref): New. Use it ...
(VecPrinter.to_string): ... here,
(VecPrinter.children): ... and here.
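The reference stripping described in the gdbhooks commit above can be sketched without a running GDB. FakeType, FakeValue and the two TYPE_CODE_* constants below are hypothetical stand-ins for gdb.Type, gdb.Value, gdb.TYPE_CODE_REF and gdb.TYPE_CODE_PTR; only the name strip_ref comes from the ChangeLog, and its body here is an assumption about what such a helper might look like.

```python
TYPE_CODE_REF = "ref"  # stand-in for gdb.TYPE_CODE_REF
TYPE_CODE_PTR = "ptr"  # stand-in for gdb.TYPE_CODE_PTR


class FakeType:
    """Hypothetical stand-in for gdb.Type: only carries a type code."""

    def __init__(self, code):
        self.code = code


class FakeValue:
    """Hypothetical stand-in for gdb.Value: a type code plus a payload."""

    def __init__(self, code, payload):
        self.type = FakeType(code)
        self.payload = payload

    def referenced_value(self):
        # For a reference, the payload is the value being referred to.
        return self.payload

    def __int__(self):
        # Mimic the failure from the traceback: references do not convert.
        if self.type.code == TYPE_CODE_REF:
            raise TypeError("Cannot convert value to long.")
        return self.payload


def strip_ref(val):
    """Return the referenced value if val is a reference, else val itself."""
    if val.type.code == TYPE_CODE_REF:
        return val.referenced_value()
    return val


def to_string(val):
    """Format a pointer-like value as hex, tolerating a reference to it."""
    val = strip_ref(val)
    return "0x%x" % int(val)
```

Calling strip_ref at the top of each printer method keeps the rest of the method unchanged, which is the design the commit message describes.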
Pan Li [Mon, 25 Nov 2024 03:45:30 +0000 (11:45 +0800)]
RISC-V: Refactor the testcases for RVV gather/scatter
This patch refactors the gather/scatter testcases now that the
optimization options are passed to the testcases correctly. It includes:
* Remove unnecessary optimization options.
* Adjust dg-final by any-opts and/or no-opts if the rtl dump changes
on different optimization options (like O2, O3).
The test suites below pass with this patch.
* The rv64gcv full regression test.
This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.
Pan Li [Mon, 25 Nov 2024 03:45:29 +0000 (11:45 +0800)]
RISC-V: Fix incorrect optimization options passing to gather/scatter
Like the strided load/store, the testcases of vector gather/scatter are
designed to pick up different sorts of optimization options, but these
options are actually ignored according to the execution log in gcc.log. This patch
corrects that in almost the same way as the earlier fix for
strided load/store.
The test suites below pass with this patch.
* The rv64gcv full regression test.
This is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48 hours.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/rvv.exp: Fix the incorrect optimization
options passing to testcases.
Jan Hubicka [Tue, 26 Nov 2024 12:52:09 +0000 (13:52 +0100)]
improve std::deque::_M_reallocate_map
Looking into the reason why we still do throw_bad_alloc in the clang binary, I noticed
that quite a few calls come from deque::_M_reallocate_map. This patch adds
unreachable to limit the size of realloc_map. _M_reallocate_map is called only
if the new size is smaller than max_size. map is an array holding pointers to
entries of fixed size.
Since reallocation is done by doubling the map size, I think the maximal size of
map allocated is max_size / deque_buf_size rounded up times two. This should
also be safe against overflows since we have an extra bit.
map size is always at least 8. Theoretically this computation may be wrong for
very large T, but in that case callers should never reallocate.
On the testcase I get:
jh@shroud:~> ~/trunk-install-new4/bin/g++ -O2 dq.C -c ; size -A dq.o | grep text
.text 284 0
.text._ZNSt5dequeIiSaIiEE17_M_reallocate_mapEmb 485 0
.text.unlikely 10 0
jh@shroud:~> ~/trunk-install-new5/bin/g++ -O2 dq.C -c ; size -A dq.o | grep text
.text 284 0
.text._ZNSt5dequeIiSaIiEE17_M_reallocate_mapEmb 465 0
.text.unlikely 10 0
so this saves about 20 bytes in _M_reallocate_map, which I think is worthwhile.
Curiously enough gcc14 does:
jh@shroud:~> g++ -O2 dq.C -c ; size -A dq.o | grep text
.text 604 0
.text.unlikely 10 0
which is 145 bytes smaller. The obvious difference is that _M_reallocate_map gets inlined.
Compiling gcc14 preprocessed file with trunk gives:
jh@shroud:~> g++ -O2 dq.C -S ; size -A dq.o | grep text
.text 762 0
.text.unlikely 10 0
So inlining is due to changes at libstdc++ side, but code size growth is due to
something else.
For clang this reduced the number of throw_bad_new_array_length from 121 to 61.
libstdc++-v3/ChangeLog:
* include/bits/deque.tcc (std::deque::_M_reallocate_map): Add
__builtin_unreachable check to declare that maps are not very large.
* include/bits/stl_deque.h (std::deque::size): Add __builtin_unreachable
to check for maximal size of map.
gcc/testsuite/ChangeLog:
* g++.dg/tree-ssa/deque-1.C: New test.
* g++.dg/tree-ssa/deque-2.C: New test.
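The bound argued for in the deque commit above (max_size / deque_buf_size, rounded up, times two) can be written out as a small calculation. This is only a sketch of the arithmetic, not the libstdc++ code: the 512-byte node size reflects libstdc++'s default _GLIBCXX_DEQUE_BUF_SIZE, and the max_size values passed in are arbitrary illustrative numbers rather than what any real allocator reports.

```python
DEQUE_BUF_BYTES = 512  # libstdc++'s default node size in bytes


def deque_buf_size(elem_size):
    """Elements per node: 512 / sizeof(T), but never fewer than one."""
    if elem_size < DEQUE_BUF_BYTES:
        return DEQUE_BUF_BYTES // elem_size
    return 1


def max_map_size(max_size, elem_size):
    """Upper bound on map entries: ceil(max_size / buf_size) * 2."""
    buf = deque_buf_size(elem_size)
    nodes = (max_size + buf - 1) // buf  # round the division up
    return 2 * nodes
```

For very large element types deque_buf_size collapses to 1, which is the case the commit message flags as theoretically problematic.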
Eric Botcazou [Mon, 11 Nov 2024 10:16:26 +0000 (11:16 +0100)]
ada: Do not use ATTR_ADDR_EXPR for 'Unrestricted_Access
Unlike for 'Access or 'Unchecked_Access, the Attribute_to_gnu routine passes
ATTR_ADDR_EXPR to build_unary_op for 'Unrestricted_Access, which causes the
processing done in build_unary_op to flatten the reference, in particular to
remove all intermediate (view) conversions, which may be problematic for the
SUBSTITUTE_PLACEHOLDER_IN_EXPR machinery.
gcc/ada/ChangeLog:
* gcc-interface/trans.cc (Attribute_to_gnu) <Attr_Access>: Do not
pass ATTR_ADDR_EXPR to build_unary_op for 'Unrestricted_Access.
Eric Botcazou [Tue, 12 Nov 2024 18:46:12 +0000 (19:46 +0100)]
ada: Add minimal support for address clause/aspect on controlled objects
The clause and aspect have been accepted by the compiler for a few years,
but the result is generally an internal compiler error or an incorrect
finalization at run time.
gcc/ada/ChangeLog:
* exp_ch3.adb (Expand_N_Object_Declaration): Do not insert the tag
assignment there if the object has the Address aspect.
* exp_ch7.adb: Add clauses for Aspect package.
(Build_Finalizer.Process_Object_Declaration): Deal with an object
with delayed freezing.
(Insert_Actions_In_Scope_Around): If the target is the declaration
of an object with address clause or aspect, move all the statements
that have been inserted after it into the Initialization_Statements
list of the object.
* freeze.adb (Check_Address_Clause): Do not reassign the tag here,
instead set the appropriate flag on the assignment statement.
Eric Botcazou [Wed, 13 Nov 2024 08:42:01 +0000 (09:42 +0100)]
ada: Minor adjustments to error message for RM B.1(24)
The RM B.1(24) sub-clause says that imported entities cannot be initialized
and it is checked in three contexts, aspect Import, pragma Import and pragma
Import_Object, with slightly different error messages. Moreover, for the
aspect, the error is given twice because that of the pragma is also given.
In addition, if the initialization expression is an aggregate that is not
static, the error is given only for the aspect and not for the two pragmas.
This change aligns the error messages on that of pragma Import and plugs the
aforementioned loophole for the two pragmas.
gcc/ada/ChangeLog:
* sem_ch13.adb (Analyze_Aspect_Export_Import): Add explicit mention
of the declaration in the error message for the Import.
* sem_prag.adb (Process_Extended_Import_Export_Object_Pragma): Also
test Has_Init_Expression on the declaration node for Import_Object
and use the same wording as that of Import.
(Process_Import_Or_Interface): Also test Has_Init_Expression on the
declaration node for Import.
Javier Miranda [Tue, 29 Oct 2024 08:31:28 +0000 (08:31 +0000)]
ada: Refactor code of Check_Ambiguous_Call and Valid_Conversion
Code cleanup; factorizing code.
gcc/ada/ChangeLog:
* sem_ch2.adb (Check_Ambiguous_Call): Replace factorized
code by call to the new subprogram Is_Ambiguous_Operand.
* sem_res.ads (Is_Ambiguous_Operand): New subprogram that
factorizes previous code in Check_Ambiguous_Call and
Valid_Conversion.
* sem_res.adb (Is_Ambiguous_Operand): New subprogram.
(Valid_Tagged_Conversion): Replace factorized code by call to
the new subprogram Is_Ambiguous_Operand.
(Report_Error_N): New subprogram.
(Report_Error_NE): New subprogram.
(Report_Interpretation): New subprogram.
(Conversion_Error_N): Removed; replaced by Report_Error_N.
(Conversion_Error_NE): Removed; replaced by Report_Error_NE.
(Valid_Conversion): Update Opnd_Type after the call to
Is_Ambiguous_Operand in the overloaded case.
Viljar Indus [Mon, 11 Nov 2024 09:01:12 +0000 (11:01 +0200)]
ada: Remove Warn_Runtime_Raise attribute from Error_Msg_Object
The goal of this attribute is to raise a warning to an error when
the -gnatwE flag is used. This is similar to the existing warnings
as error behavior under the Warn_Err flag so it can be merged.
gcc/ada/ChangeLog:
* errout.adb: Set Warn_Err as true if Is_Runtime_Error was
set in the error message.
* erroutc.adb: Remove instances of Warn_Runtime_Raise.
* erroutc.ads: Likewise.
* errutil.adb: Likewise.
Viljar Indus [Mon, 11 Nov 2024 08:19:21 +0000 (10:19 +0200)]
ada: Refactor checking redundant messages
Move common code between errout and errutil into a single function.
gcc/ada/ChangeLog:
* errout.adb: Use Is_Redundant_Error_Message.
* erroutc.adb: Move the common code for checking if a message
can be removed to Is_Redundant_Error_Message.
* erroutc.ads: Add definition of Is_Redundant_Error_Message.
* errutil.adb: Use Is_Redundant_Error_Message.
Viljar Indus [Tue, 5 Nov 2024 08:42:55 +0000 (10:42 +0200)]
ada: Remove Current_Node from Errout
This variable was used for Opt.Include_Subprogram_In_Messages
activated by -gnatdJ. This switch has been removed so this variable
is no longer used.
gcc/ada/ChangeLog:
* errout.ads: Remove Current_Node.
* errout.adb: Remove uses of Current_Node.
* par-ch6.adb: Same as above.
* par-ch7.adb: Same as above.
* par-ch9.adb: Same as above.
Viljar Indus [Mon, 4 Nov 2024 12:16:02 +0000 (14:16 +0200)]
ada: Remove Raise_Exception_On_Error
Raise_Exception_On_Error is never modified so it can be removed.
gcc/ada/ChangeLog:
* err_vars.ads: Remove Raise_Exception_On_Error and
Error_Msg_Exception.
* errout.ads: Same as above.
* errout.adb: Remove uses of Raise_Exception_On_Error and
Error_Msg_Exception.
* errutil.adb: Same as above.
Viljar Indus [Thu, 31 Oct 2024 13:50:46 +0000 (15:50 +0200)]
ada: Store error message kind as an enum
Simplify the storage for the kind of error message under a single
enumerator. This replaces the existing attributes with the following
enumeration values.
* Is_Warning_Msg => Warning
* Is_Style_Msg => Style
* Is_Info_Msg => Info
* Is_Check_Msg => Low_Check, Medium_Check, High_Check
* Is_Serious_Error => Error, if the attribute was false then
Non_Serious_Error.
gcc/ada/ChangeLog:
* diagnostics-converter.adb: Use new enum values instead
of the old attributes.
* diagnostics-switch_repository.adb: Same as above.
* diagnostics-utils.adb: Same as above.
* diagnostics.adb: Same as above.
* diagnostics.ads: Same as above.
* errout.adb: Same as above.
* erroutc.adb: Same as above.
* erroutc.ads: Remove old attributes and replace them
with Error_Msg_Kind.
* errutil.adb: Same as others.
Viljar Indus [Fri, 1 Nov 2024 11:15:21 +0000 (13:15 +0200)]
ada: Refactor code for printing the error location
gcc/ada/ChangeLog:
* errout.adb: Use Output_Msg_Location.
* erroutc.adb: Add common implementation for printing the
error message line.
* erroutc.ads: Add new method Output_Msg_Location.
* errutil.adb: Use Output_Msg_Location.
The old specifications were ambiguous as to whether they expected
actuals to have %s/%b suffixes. The new specifications also increase
modularity across the board.
gcc/ada/ChangeLog:
* uname.ads (Is_Internal_Unit_Name, Is_Predefined_Unit_Name): Change
specifications to take a Unit_Name_Type as input.
(Encoded_Library_Unit_Name): New subprogram.
(Is_Predefined_Unit_Name): New overloaded subprogram.
(Get_External_Unit_Name_String): Make use of new
Encoded_Library_Unit_Name subprogram.
* uname.adb (Is_Internal_Unit_Name, Is_Predefined_Unit_Name): Adapt
bodies to specification changes.
* fname-uf.adb (Get_File_Name): Adapt to Uname interface changes.
Before this patch, the body of Fname.UF.Get_File_Name did a lot of
juggling with the global name buffer, which made it hard to understand.
This patch makes the body use local buffers instead.
gcc/ada/ChangeLog:
* fname-uf.adb (Get_File_Name): Use local name buffers.
Eric Botcazou [Mon, 11 Nov 2024 23:18:00 +0000 (00:18 +0100)]
ada: Fix latent issue exposed by recent change in aggregate expansion
The tag is not assigned when a compile-time known aggregate initializes an
object declared with an address clause/aspect.
gcc/ada/ChangeLog:
* freeze.adb: Remove clauses for Exp_Ch3.
(Check_Address_Clause): Always reassign the tag for an object of a
tagged type if there is an initialization expression.
Paul Thomas [Tue, 26 Nov 2024 08:58:21 +0000 (08:58 +0000)]
Fortran: Partial reversion of r15-5083 [PR117763]
2024-11-26 Paul Thomas <pault@gcc.gnu.org>
gcc/fortran
PR fortran/117763
* trans-array.cc (gfc_get_array_span): Guard against dereferences
of 'expr'. Clean up some typos. Use 'gfc_get_vptr_from_expr'
for clarity and apply a functional reversion of last section
that deals with class dummies.
gcc/testsuite/
PR fortran/117763
* gfortran.dg/pr117763.f90: New test.
we wrongly "propagate" VL=2 from vslidedown into the load.
Although we check whether the "target" instruction has a merge operand
the check only handles cases where the merge operand itself is
loaded, like (2) in the snippet above. For (1) we load the non-merged
operand, assume propagation is valid and continue despite (2).
This patch just re-uses avl_can_be_propagated_p in order to disable
slides altogether in such situations.
gcc/ChangeLog:
* config/riscv/riscv-avlprop.cc (pass_avlprop::get_vlmax_ta_preferred_avl):
Check whether the use insn is valid for propagation.
Jakub Jelinek [Tue, 26 Nov 2024 08:46:51 +0000 (09:46 +0100)]
builtins: Fix up DFP ICEs on __builtin_fpclassify [PR102674]
This patch is similar to the one I've just posted, __builtin_fpclassify also
needs to print decimal float minimum differently and use real_from_string3.
Plus I've done some formatting fixes.
2024-11-26 Jakub Jelinek <jakub@redhat.com>
PR middle-end/102674
* builtins.cc (fold_builtin_fpclassify): Use real_from_string3 rather
than real_from_string. Use "1E%d" format string rather than "0x1p%d"
for decimal float minimum. Formatting fixes.
Jakub Jelinek [Tue, 26 Nov 2024 08:45:21 +0000 (09:45 +0100)]
builtins: Fix up DFP ICEs on __builtin_is{inf,finite,normal} [PR43374]
__builtin_is{inf,finite,normal} builtins ICE on _Decimal{32,64,128,64x}
operands unless those operands are constant.
The problem is that we fold the builtins to comparisons with the largest
finite number, but
a) get_max_float was only handling binary floats
b) real_from_string again assumes binary float
and so we were ICEing in the build_real called after the two calls.
This patch adds decimal handling into get_max_float (well, moves it
from c-cppbuiltin.cc which was printing those for __DEC{32,64,128}_MAX__
macros) and uses real_from_string3 (perhaps it is time to rename it
to just real_from_string now that we can use function overloading)
so that it handles both binary and decimal floats.
2024-11-26 Jakub Jelinek <jakub@redhat.com>
PR middle-end/43374
gcc/
* real.cc (get_max_float): Handle decimal float.
* builtins.cc (fold_builtin_interclass_mathfn): Use
real_from_string3 rather than real_from_string. Use
"1E%d" format string rather than "0x1p%d" for decimal
float minimum.
gcc/c-family/
* c-cppbuiltin.cc (builtin_define_decimal_float_constants): Use
get_max_float.
gcc/testsuite/
* gcc.dg/dfp/pr43374.c: New test.
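The point of the two DFP commits above, that a hex-float string such as "0x1p%d" only makes sense for binary formats while a decimal format needs something like "1E%d", can be shown by analogy. The sketch uses Python's float.fromhex and decimal.Decimal as stand-ins for the binary and decimal parsing paths; it is not GCC's real_from_string3, and the exponents used are illustrative only.

```python
from decimal import Decimal, InvalidOperation


def parse_binary_min(exp):
    """Parse a binary-float minimum written as a hex-float string."""
    return float.fromhex("0x1p%d" % exp)


def parse_decimal_min(exp):
    """Parse a decimal-float minimum written as '1E<exp>'."""
    return Decimal("1E%d" % exp)


def decimal_parser_accepts(text):
    """Report whether the decimal parser understands the given literal."""
    try:
        Decimal(text)
        return True
    except InvalidOperation:
        return False
```

Feeding the hex-float spelling to the decimal parser fails outright here; in GCC the mismatch surfaced later as an ICE in build_real, per the commit message.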
Andrew Pinski [Tue, 26 Nov 2024 08:37:33 +0000 (00:37 -0800)]
affine: Remove unused variable rem from wide_int_constant_multiple_p
This might fix the current bootstrap failure on aarch64, I only tested it
on x86_64. But the rem variable is unused, and for poly_widest_int there
could be a loop if NUM_POLY_INT_COEFFS is 2 or more. In the case of aarch64,
NUM_POLY_INT_COEFFS is 2.
Note that the warning for the unused variable is due to the destructor.
Pushed as obvious after a build for x86_64-linux-gnu.
gcc/ChangeLog:
* tree-affine.cc (wide_int_constant_multiple_p): Remove unused rem variable.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Jonathan Wakely [Sun, 17 Nov 2024 20:46:07 +0000 (20:46 +0000)]
libstdc++: Move std::error_category symbol to separate file [PR117630]
As described in PR 117630 the cow-stdexcept.cc file pulls in symbols
from system_error.cc, which are not actually needed there. Moving the
definition of error_category::_M_message to a separate file should solve
it.
libstdc++-v3/ChangeLog:
PR libstdc++/117630
* src/c++11/Makefile.am: Add new file.
* src/c++11/Makefile.in: Regenerate.
* src/c++11/cow-stdexcept.cc (error_category::_M_message): Move
member function definition to ...
* src/c++11/cow-system_error.cc: New file.
Cui, Lili [Tue, 26 Nov 2024 07:10:23 +0000 (15:10 +0800)]
Optimize 128-bit vector permutation with pand, pandn and por.
This patch introduces a new subroutine in ix86_expand_vec_perm_const_1.
On x86, use mixed constant permutation for V8HImode and V16QImode when
SSE2 is supported. This patch handles certain vector shuffle operations
more efficiently using pand, pandn, and por. This change is intended to
improve assembly code generation for configurations that support SSE2.
gcc/ChangeLog:
PR target/116675
* config/i386/i386-expand.cc (expand_vec_perm_pand_pandn_por):
New subroutine.
(ix86_expand_vec_perm_const_1): Call expand_vec_perm_pand_pandn_por.
gcc/testsuite/ChangeLog:
PR target/116675
* gcc.target/i386/pr116675.c: New test.
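One way to picture the pand/pandn/por strategy from the commit above is as a lane-wise blend: a constant mask keeps some lanes of the first operand, its complement keeps the remaining lanes of the second, and an OR merges them. The sketch below emulates that on lists of 8-bit lanes. The applicability test (result lane i must come from lane i of either input) is an assumption about the kind of permutation meant, and the function names are hypothetical, not those in i386-expand.cc.

```python
def can_use_pand_pandn_por(perm, nelt):
    """True if result lane i is taken from lane i of either input."""
    return all(sel in (i, i + nelt) for i, sel in enumerate(perm))


def blend_with_masks(op0, op1, perm):
    """Emulate pand/pandn/por on lists of 8-bit lanes."""
    nelt = len(op0)
    if not can_use_pand_pandn_por(perm, nelt):
        raise ValueError("permutation moves elements across lanes")
    # All-ones lanes select op0, all-zero lanes select op1.
    mask = [0xFF if sel < nelt else 0x00 for sel in perm]
    kept0 = [a & m for a, m in zip(op0, mask)]            # pand
    kept1 = [b & (~m & 0xFF) for b, m in zip(op1, mask)]  # pandn
    return [x | y for x, y in zip(kept0, kept1)]          # por
```

With four lanes, the selector [0, 5, 2, 7] takes lanes 0 and 2 from the first operand and lanes 1 and 3 from the second.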
Haochen Jiang [Fri, 22 Nov 2024 07:57:47 +0000 (15:57 +0800)]
i386/testsuite: Correct AVX10.2 FP8 test mask usage
Under FP8, we should not use AVX512F_LEN_HALF to get the mask size since
it will get 16 instead of 8 and take the wrong branch of the if condition. Correct
the usage for the vcvtneph2[b,h]f8[,s] runtime test.
Joseph Myers [Tue, 26 Nov 2024 03:25:44 +0000 (03:25 +0000)]
c: Fix ICEs from invalid atomic compound assignment [PR98195, PR117755]
As reported in bug 98195, there are ICEs from an _Atomic compound
assignment with an incomplete type on the RHS, arising from an invalid
temporary being created with such a type. As reported in bug 117755,
there are also (different) ICEs in cases with complete types where the
binary operation itself is invalid, when inside a nested function,
arising from a temporary being created for the RHS, but then not used
because the binary operation returns error_mark_node, resulting in the
temporary not appearing in a TARGET_EXPR, never getting its
DECL_CONTEXT set by the gimplifier and eventually resulting in an ICE
in nested function processing (trying to find a function context for
the temporary) as a result.
Fix the first ICE with an earlier check for a complete type for the
RHS of an assignment so the problematic temporary is never created for
an incomplete type (which changes the error message three existing
tests get for that case; the new message seems as good as the old
one). Fix the second ICE by ensuring that once a temporary has been
created, it always gets a corresponding TARGET_EXPR even on error.
Bootstrapped with no regressions for x86_64-pc-linux-gnu.
PR c/98195
PR c/117755
gcc/c/
* c-typeck.cc (build_atomic_assign): Always create a TARGET_EXPR
for newval even in case of error from binary operation.
(build_modify_expr): Check early for incomplete type of rhs.
Gaius Mulley [Mon, 25 Nov 2024 22:46:16 +0000 (22:46 +0000)]
PR modula2/117777: m2 does not allow single const string in asm volatile
gm2 does not allow single const string in ASM VOLATILE. The bugfix is to
modify AsmOperands in all passes except P3Build.bnf (which is correct).
The remaining passes need to make the term following the ConstExpression
optional.
gcc/m2/ChangeLog:
PR modula2/117777
* gm2-compiler/P0SyntaxCheck.bnf (AsmOperands): Allow term after
ConstExpression to be optional.
* gm2-compiler/P1Build.bnf (AsmOperands): Ditto.
* gm2-compiler/P2Build.bnf (AsmOperands): Ditto.
* gm2-compiler/PCBuild.bnf (AsmOperands): Ditto.
* gm2-compiler/PHBuild.bnf (AsmOperands): Ditto.
gcc/testsuite/ChangeLog:
PR modula2/117777
* gm2/extensions/asm/pass/conststr.mod: New test.
Andrew Pinski [Mon, 25 Nov 2024 22:03:27 +0000 (14:03 -0800)]
build: Move sstream include above safe-ctype.h [PR117771]
sstream in some versions of libstdc++ includes locale, which might not have been
included yet. safe-ctype.h defines toupper, tolower, etc. as macros, so the
C++ header files need to be included beforehand, as the comment in system.h says:
/* Include C++ standard headers before "safe-ctype.h" to avoid GCC
poisoning the ctype macros through safe-ctype.h */
I don't understand how it was working before when memory was included after
safe-ctype.h rather than before. But this makes sstream consistent with the
other C++ headers.
Pushed as obvious after a build for riscv64-elf.
gcc/ChangeLog:
PR target/117771
* system.h: Move the include of sstream above safe-ctype.h.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
H.J. Lu [Sat, 12 Oct 2024 20:53:14 +0000 (04:53 +0800)]
sibcall: Check partial != 0 for BLKmode argument
The outgoing stack slot size may be different from the BLKmode argument
size due to parameter alignment. Check partial != 0 for BLKmode argument
passed on stack.
gcc/
PR middle-end/117098
* calls.cc (store_one_arg): Check partial != 0 for BLKmode argument
passed on stack.
gcc/testsuite/
PR middle-end/117098
* gcc.dg/sibcall-12.c: New test.
hppa: Revise TImode arithmetic patterns to support arith11_operands
2024-11-25 John David Anglin <danglin@gcc.gnu.org>
gcc/ChangeLog:
PR target/117645
* config/pa/pa.md (addti3): Revise pattern to support
arith11_operands. Use "R" operand prefix to print least
significant register of TImode register pair.
(addvti3, subti3, subvti3): Likewise.
(negti2, negvti2): Use "R" operand prefix.
[PR117105][LRA]: Use unique value reload pseudo for early clobber operand
LRA did not generate an insn satisfying the insn constraints on the PR
test. The reason for this is that LRA assigned the same hard reg to
two conflicting reload pseudos. The two reload pseudos
originate from the same pseudo, and LRA tried to optimize by
assigning the same value to both reload pseudos. This is an LRA
optimization to minimize reload insns. The two reload pseudos
conflict because one of them is an early clobber insn operand. The patch
solves this problem by assigning a unique value if the operand is an early
clobber one.
gcc/ChangeLog:
PR target/117105
* lra-constraints.cc (get_reload_reg): Create unique value reload
pseudos for early clobbered operands.
gcc/testsuite/ChangeLog:
PR target/117105
* gcc.target/i386/pr117105.c: New test.
Sandra Loosemore [Sat, 23 Nov 2024 23:59:13 +0000 (23:59 +0000)]
nios2: Remove all support for Nios II target.
nios2 target support in GCC was deprecated in GCC 14 as the
architecture has been EOL'ed by the vendor. This patch removes the
entire port for GCC 15.
There are still references to "nios2" in libffi and libgo. Since those
libraries are imported into the gcc sources from master copies maintained
by other projects, those will need to be addressed elsewhere.
Steve Kargl [Mon, 25 Nov 2024 02:26:03 +0000 (18:26 -0800)]
Fortran: Check IMPURE in BLOCK inside DO CONCURRENT.
PR fortran/117765
gcc/fortran/ChangeLog:
* resolve.cc (check_pure_function): Check the stack to
see if the function is in a nested BLOCK and, if that
block is inside a DO_CONCURRENT, issue an error.
gcc/testsuite/ChangeLog:
* gfortran.dg/impure_fcn_do_concurrent.f90: New test.
Robin Dapp [Thu, 21 Nov 2024 13:49:53 +0000 (14:49 +0100)]
RISC-V: Ensure vtype for full-register moves [PR117544].
As discussed in PR117544 the VTYPE register is not preserved across
function calls. Even though vmv1r-like instructions operate
independently of the actual vtype they still require a valid vtype. As
we cannot guarantee that the vtype is valid we must make sure to emit a
vsetvl between a function call and a vmv1r.v.
This patch makes the necessary changes by splitting the full-reg-move
insns into patterns that use the vtype register and adding vmov to the
types of instructions requiring a vset.
Robin Dapp [Thu, 21 Nov 2024 14:34:37 +0000 (15:34 +0100)]
genemit: Distribute evenly to files [PR111600].
Currently we distribute insn patterns in genemit, partitioning them
by the number of patterns per file. The first 100 into file 1, the
next 100 into file 2, and so on. Depending on the patterns this
can lead to files of very uneven sizes.
Similar to the genmatch split, this patch introduces a dynamic
choose_output () which considers the size of the output files
and selects the shortest one for the next pattern.
gcc/ChangeLog:
PR target/111600
* genemit.cc (handle_arg): Use files instead of filenames.
(main): Ditto.
* gensupport.cc (SIZED_BASED_CHUNKS): Define.
(choose_output): New function.
* gensupport.h (choose_output): Declare.
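The difference between the old count-based partitioning and the size-based selection described in the genemit commit above can be sketched as follows. Only the name choose_output comes from the ChangeLog; its signature here, the helper names and the pattern sizes are assumptions made for illustration, with file sizes modelled as plain integers.

```python
def distribute_fixed(pattern_sizes, per_file):
    """Old scheme: the first per_file patterns go to file 0, and so on."""
    files = []
    for index, size in enumerate(pattern_sizes):
        if index % per_file == 0:
            files.append(0)  # start a new output file
        files[-1] += size
    return files


def choose_output(file_sizes):
    """Return the index of the currently shortest output file."""
    return min(range(len(file_sizes)), key=lambda i: file_sizes[i])


def distribute_by_size(pattern_sizes, num_files):
    """New scheme: emit each pattern into whichever file is shortest."""
    files = [0] * num_files
    for size in pattern_sizes:
        files[choose_output(files)] += size
    return files
```

With two large patterns followed by several small ones, the count-based scheme puts both large patterns into the same file, while the size-based one spreads them out.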
Richard Biener [Mon, 25 Nov 2024 12:32:15 +0000 (13:32 +0100)]
target/116760 - 416.gamess slowdown with SLP
For the TWOTFF loop vectorization the backend scales constructor
and vector extract cost to make higher VFs less profitable. This
heuristic currently fails to consider VMAT_STRIDED_SLP which we
now get with single-lane SLP, causing a huge regression in SPEC 2k6
416.gamess for the respective loop nest.
The following fixes this, matching behavior to that of GCC 14 by
treating single-lane VMAT_STRIDED_SLP the same as VMAT_ELEMENTWISE.
PR target/116760
* config/i386/i386.cc (ix86_vector_costs::add_stmt_cost):
Scale vec_construct for single-lane VMAT_STRIDED_SLP the
same as VMAT_ELEMENTWISE.
* tree-vect-stmts.cc (vectorizable_store): Pass SLP node
down to costing for vec_to_scalar for VMAT_STRIDED_SLP.
Richard Biener [Fri, 22 Nov 2024 12:58:08 +0000 (13:58 +0100)]
Add extra 64bit SSE vector epilogue in some cases
Similar to the X86_TUNE_AVX512_TWO_EPILOGUES tuning which enables
an extra 128bit SSE vector epilogue when doing 512bit AVX512
vectorization in the main loop the following allows a 64bit SSE
vector epilogue to be generated when the previous vector epilogue
still had a vectorization factor of 16 or larger (which usually
means we are operating on char data).
This effectively applies to 256bit and 512bit AVX2/AVX512 main loops,
a 128bit SSE main loop would already get a 64bit SSE vector epilogue.
Together with X86_TUNE_AVX512_TWO_EPILOGUES this means three
vector epilogues for 512bit and two vector epilogues when enabling
256bit vectorization. I have not added another tunable for this
RFC - suggestions on how to avoid inflation there welcome.
This speeds up 525.x264_r to within 5% of the -mprefer-vector-size=128
speed with -mprefer-vector-size=256 or -mprefer-vector-size=512
(the latter only when -mtune-ctrl=avx512_two_epilogues is in effect).
I have not done any further benchmarking, this merely shows the
possibility and looks for guidance on how to expose this to the
uarch tunings or to the user (at all?) if not gating on any uarch
specific tuning.
Note 64bit SSE isn't a native vector size so we rely on emulation
being "complete" (if not epilogue vectorization will only fail, so
it's "safe" in this regard). With AVX512 ISA available an alternative
is a predicated epilog, but due to possible STLF issues user control
would be required here.
* config/i386/i386.cc (ix86_vector_costs::finish_cost): For a
128bit SSE epilogue request a 64bit SSE epilogue if the 128bit
SSE epilogue VF was 16 or higher.
Richard Biener [Mon, 25 Nov 2024 08:46:28 +0000 (09:46 +0100)]
tree-optimization/117767 - VMAT_STRIDED_SLP and alignment
This plugs another hole in alignment checking with VMAT_STRIDED_SLP.
When using an alternate load or store type we have to check whether
that's supported with respect to required vector alignment.
PR tree-optimization/117767
* tree-vect-stmts.cc (vectorizable_store): Check for supported
alignment before using an alternate store vector type.
(vectorizable_load): Likewise for loads.
Jakub Jelinek [Mon, 25 Nov 2024 08:36:41 +0000 (09:36 +0100)]
libsanitizer: Remove -pedantic from AM_CXXFLAGS [PR117732]
We aren't the master repository for the sanitizers and clearly upstream
introduces various extensions in the code.
All we care about is whether it builds and works fine with GCC, so
-pedantic flag is of no use to us, only maybe to upstream if they
cared about it (which they clearly don't).
The following patch removes those and fixes some whitespace nits at the same
time.
Richard Biener [Wed, 10 Jul 2024 10:45:02 +0000 (12:45 +0200)]
tree-optimization/115825 - improve unroll estimates for volatile accesses
The loop unrolling code assumes that one third of all volatile accesses
can be possibly optimized away which is of course not true. This leads
to excessive unrolling in some cases. The following tracks the number
of stmts with side-effects as those are not eliminatable later and
only assumes one third of the other stmts can be further optimized.
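The change in the estimate can be sketched with a small model. This is not the code in tree-ssa-loop-ivcanon.cc: the struct `loop_size_model` and the function `estimated_unrolled_size_model` are hypothetical names, and the model ignores everything except the "one third of the eliminatable stmts" assumption described above.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the per-iteration size summary.
struct loop_size_model {
  unsigned overall;           // stmts in one copy of the loop body
  unsigned not_eliminatable;  // stmts with side effects (must be <= overall)
};

// Estimate the size after unrolling n_unroll times: only stmts without
// side effects are optimistically assumed to shrink by one third.
inline unsigned estimated_unrolled_size_model(loop_size_model s,
                                              unsigned n_unroll) {
  unsigned total = s.overall * n_unroll;
  unsigned pinned = s.not_eliminatable * n_unroll;
  unsigned optimistic = (total - pinned) / 3;
  return total - optimistic;
}
```

In this model a body of 6 volatile accesses unrolled 4 times is estimated at the full 24 stmts, whereas the old assumption would have estimated 16.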
This causes some fallout in the testsuite where we rely on unrolling
even when calls are involved. I have XFAILed g++.dg/warn/Warray-bounds-20.C
but adjusted the others with a #pragma GCC unroll to mimic previous
behavior and retain what the testcase was testing. I've also filed
PR117671 for the case where the size estimation fails to honor the
stmts we then remove by inserting __builtin_unreachable ().
For gcc.dg/tree-ssa/cunroll-2.c the estimate that the code doesn't
grow is clearly bogus and we have explicit code to reject unrolling
for bodies containing calls so I've adjusted the testcase accordingly.
PR tree-optimization/115825
* tree-ssa-loop-ivcanon.cc (loop_size::not_eliminatable_after_peeling):
New.
(loop_size::last_iteration_not_eliminatable_after_peeling): Likewise.
(tree_estimate_loop_size): Count stmts with side-effects as
not optimistically eliminatable.
(estimated_unrolled_size): Compute the number of stmts that can
be optimistically eliminated by followup transforms.
(try_unroll_loop_completely): Adjust.
* gcc.dg/tree-ssa/cunroll-17.c: New testcase.
* gcc.dg/tree-ssa/cunroll-2.c: Adjust to not expect unrolling.
* gcc.dg/pr94600-1.c: Force unrolling.
* c-c++-common/ubsan/unreachable-3.c: Likewise.
* g++.dg/warn/Warray-bounds-20.C: XFAIL cases where we rely on
unrolling loops created by new expressions and not inlined
CTOR invocations.
Kito Cheng [Fri, 15 Nov 2024 04:14:54 +0000 (12:14 +0800)]
asan: Support dynamic shadow offset
AddressSanitizer has supported dynamic shadow offsets since 2016[1], but
GCC hasn't implemented this yet because targets using dynamic shadow
offsets, such as Fuchsia and iOS, are mostly unsupported in GCC.
However, RISC-V 64 switched to dynamic shadow offsets this year[2] because
virtual memory space support varies across different RISC-V cores, such as
Sv39, Sv48, and Sv57. We realized that the best way to handle this
situation is by using a dynamic shadow offset to obtain the offset at
runtime.
We introduce a new target hook, TARGET_ASAN_DYNAMIC_SHADOW_OFFSET_P, to
determine if the target is using a dynamic shadow offset, so this change
won't affect the static offset path. Additionally, TARGET_ASAN_SHADOW_OFFSET
continues to work even if TARGET_ASAN_DYNAMIC_SHADOW_OFFSET_P is non-zero,
ensuring that KASAN functions as expected.
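The difference between the two paths can be sketched as follows. The shadow mapping (address shifted right by 3, plus an offset) is the standard AddressSanitizer scheme; the constant value and the variable `runtime_shadow_offset` are illustrative assumptions, not what the target hook or the runtime actually use.

```cpp
#include <cassert>
#include <cstdint>

constexpr unsigned kShadowScale = 3;  // one shadow byte per 8 application bytes

// Static path: the offset is a compile-time constant (illustrative value).
constexpr uint64_t kStaticShadowOffset = 0x7fff8000;

inline uint64_t shadow_address(uint64_t addr, uint64_t shadow_offset) {
  return (addr >> kShadowScale) + shadow_offset;
}

// Dynamic path: the offset is only known at run time. This variable is a
// hypothetical stand-in for the value the sanitizer runtime provides.
uint64_t runtime_shadow_offset = 0;

inline uint64_t shadow_address_dynamic(uint64_t addr) {
  return shadow_address(addr, runtime_shadow_offset);
}
```

The instrumentation is the same in both cases; the dynamic variant just needs an extra load of the offset before computing the shadow address.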
This patch set has been verified on the Banana Pi F3, currently one of the
most popular RISC-V development boards. All AddressSanitizer-related tests
passed without introducing new regressions.
It was also verified on AArch64 and x86_64 with no regressions in
AddressSanitizer.
Haochen Jiang [Fri, 22 Nov 2024 06:32:16 +0000 (14:32 +0800)]
i386/testsuite: Do not append AVX10.2 option for check_effective_target
When -avx10.2 meets -march with AVX512 enabled, it reports a warning
about a vector size conflict. The warning prevents the tests from running
on a GCC built with arch native on those platforms when
check_effective_target is used.
Remove the AVX10.2 options since we are using inline asm and it actually
does not need options. This eliminates the warning.
Add target-independent store forwarding avoidance pass
This pass detects cases of expensive store forwarding and tries to
avoid them by reordering the stores and using suitable bit insertion
sequences. For example it can transform this:
strb w2, [x1, 1]
ldr x0, [x1] # Expensive store forwarding to larger load.
To:
ldr x0, [x1]
strb w2, [x1]
bfi x0, x2, 0, 8
Assembly like this can appear with bitfields or type punning / unions.
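A source-level pattern of this kind, and a software model of the rewritten sequence, might look like the sketch below. The function names are hypothetical, byte 0 is chosen only to keep the example simple, and the second function merely models the load/store/insert ordering; it is not what the pass emits.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Narrow store followed by a wider load of the same location: the load
// depends on the store that was just issued.
inline uint64_t store_then_load(uint64_t *p, uint8_t b) {
  reinterpret_cast<unsigned char *>(p)[0] = b;
  return *p;
}

// Model of the rewritten order: load the old value first, perform the
// store, then patch the stored byte into the already loaded value.
inline uint64_t load_then_insert(uint64_t *p, uint8_t b) {
  uint64_t v = *p;
  reinterpret_cast<unsigned char *>(p)[0] = b;
  std::memcpy(&v, &b, 1);  // overwrite byte 0 of v's object representation
  return v;
}
```

Both functions leave memory in the same state and return the same value; only the dependency between the store and the wide load differs.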
On stress-ng when running the cpu-union microbenchmark the following
speedups have been observed.
The transformation is rejected on cases that cause store_bit_field to
generate subreg expressions on different register classes. Files
avoid-store-forwarding-4.c and avoid-store-forwarding-5.c contain such
cases and have been marked as XFAIL.
Due to biasing of its operands in store_bit_field, there is a special
handling for machines with BITS_BIG_ENDIAN != BYTES_BIG_ENDIAN. The
need for this was exposed by an issue on the H8 architecture,
which uses big-endian ordering, but BITS_BIG_ENDIAN is false. In that
case, the START parameter of store_bit_field needs to be calculated
from the end of the destination register.
gcc/ChangeLog:
* Makefile.in (OBJS): Add avoid-store-forwarding.o.
* common.opt (favoid-store-forwarding): New option.
* common.opt.urls: Regenerate.
* doc/invoke.texi: New param store-forwarding-max-distance.
* doc/passes.texi: Document new pass.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in: Document new pass.
* params.opt (store-forwarding-max-distance): New param.
* passes.def: Add pass_rtl_avoid_store_forwarding before
pass_early_remat.
* target.def (avoid_store_forwarding_p): New DEFHOOK.
* target.h (struct store_fwd_info): Declare.
* targhooks.cc (default_avoid_store_forwarding_p): New function.
* targhooks.h (default_avoid_store_forwarding_p): Declare.
* tree-pass.h (make_pass_rtl_avoid_store_forwarding): Declare.
* avoid-store-forwarding.cc: New file.
* avoid-store-forwarding.h: New file.
* timevar.def (TV_AVOID_STORE_FORWARDING): New timevar.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/avoid-store-forwarding-1.c: New test.
* gcc.target/aarch64/avoid-store-forwarding-2.c: New test.
* gcc.target/aarch64/avoid-store-forwarding-3.c: New test.
* gcc.target/aarch64/avoid-store-forwarding-4.c: New test.
* gcc.target/aarch64/avoid-store-forwarding-5.c: New test.
* gcc.target/x86_64/abi/callabi/avoid-store-forwarding-1.c: New test.
* gcc.target/x86_64/abi/callabi/avoid-store-forwarding-2.c: New test.
Co-authored-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
Signed-off-by: Konstantinos Eleftheriou <konstantinos.eleftheriou@vrull.eu>
Martin Jambor [Sun, 24 Nov 2024 22:03:43 +0000 (23:03 +0100)]
ipa: Move individual jump function copying to a separate function
When reviewing various IPA bits and pieces I have falsely assumed
that jump function duplication misses copying important bits because
it relies on vec_safe_copy-ing all data in the vector of jump
functions and then just fixes up the few fields it needs to.
Perhaps more importantly, we do want a function to copy one individual
jump function to form jump functions for planned call-graph edges that
model transfer of control to OpenMP outlined regions through calls to
gomp functions.
Therefore, this patch introduces such function and makes
ipa_edge_args_sum_t::duplicate just allocate the new vectors and then
uses the new function to copy the data.
gcc/ChangeLog:
2024-11-01 Martin Jambor <mjambor@suse.cz>
* ipa-prop.cc (ipa_duplicate_jump_function): New function.
(ipa_edge_args_sum_t::duplicate): Move individual jump function
copying to ipa_duplicate_jump_function.
Uros Bizjak [Sun, 24 Nov 2024 21:18:31 +0000 (22:18 +0100)]
testsuite/x86: Add -mfpmath=sse to add_options_for_float16
Add -mfpmath=sse to add_options_for_float16 to avoid error:
'-fexcess-precision=16' is not compatible with '-mfpmath=387'
when compiling gcc.dg/tree-ssa/pow_fold_1.c.
Uros Bizjak [Sun, 24 Nov 2024 21:00:18 +0000 (22:00 +0100)]
i386: x86 can use x >> -y for x >> 32-y [PR36503]
x86 targets mask 32-bit shifts with a 5-bit mask (and 64-bit with 6-bit mask),
so they can use x >> -y instead of x >> 32-y. This form is very common in
bitstream readers, where it's used to read the top N bits from a word.
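The idiom and the equivalence the hardware provides can be sketched as below. Since a negative shift count is undefined in C and C++, the second function models the x86 behaviour portably by masking the negated count to 5 bits; the function names are hypothetical and n is assumed to be in the range 1..32.

```cpp
#include <cassert>
#include <cstdint>

// Bitstream-reader idiom: take the top n bits of a 32-bit word (1 <= n <= 32).
inline uint32_t top_bits(uint32_t word, unsigned n) {
  return word >> (32 - n);
}

// Portable model of the masked hardware shift: the count is reduced
// modulo 32, so -n selects the same shift amount as 32 - n.
inline uint32_t top_bits_masked(uint32_t word, unsigned n) {
  return word >> ((0u - n) & 31u);
}
```

For every n in 1..32 the two functions agree, which is what allows the subtraction from 32 to be replaced by a plain negation.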
Andrew Pinski [Sat, 23 Nov 2024 21:42:47 +0000 (13:42 -0800)]
gimplefe: Fix handling of ')'/'}' after a parse error [PR117741]
The problem here is c_parser_skip_until_found stops at a closing nesting
delimiter without consuming it. So if we don't consume it in
c_parser_gimple_compound_statement, we would go into an infinite loop. The C
parser has similar code in c_parser_statement_after_labels to handle this specific
case too.
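The failure mode can be reproduced with a toy parser. Everything here is hypothetical (the `toy_parser` type, the one-character tokens, the recovery helper); it only mirrors the shape of the problem: error recovery stops in front of a closing delimiter, so the statement loop has to consume a stray one itself or it never advances.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy grammar: a compound statement is '{' ("s;")* '}'. Anything else is
// an error. ')' and '}' are closing delimiters.
struct toy_parser {
  std::string toks;
  std::size_t pos = 0;
  int errors = 0;

  char peek() const { return pos < toks.size() ? toks[pos] : '\0'; }

  // Error recovery: skip past the next ';', but stop in front of a closing
  // delimiter (or end of input) without consuming it.
  void skip_until_semicolon() {
    while (peek() != '\0' && peek() != ')' && peek() != '}')
      if (toks[pos++] == ';')
        return;
  }

  // Returns the number of loop iterations taken.
  int parse_compound() {
    int steps = 0;
    if (peek() == '{') ++pos;
    while (peek() != '}' && peek() != '\0') {
      ++steps;
      if (peek() == 's') {  // well-formed statement "s;"
        ++pos;
        if (peek() == ';') { ++pos; continue; }
      }
      ++errors;
      // Without this, a stray ')' would be seen again on every iteration,
      // because skip_until_semicolon refuses to consume it.
      if (peek() == ')') { ++pos; continue; }
      skip_until_semicolon();
    }
    if (peek() == '}') ++pos;
    return steps;
  }
};
```

On the input `{s;x)s;}` the loop reports two errors (the `x` and the stray `)`) and still reaches the closing brace.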
PR c/117741
gcc/c/ChangeLog:
* gimple-parser.cc (c_parser_gimple_compound_statement): Handle
CPP_CLOSE_PAREN/CPP_CLOSE_SQUARE with an error and skipping the token.
gcc/testsuite/ChangeLog:
* gcc.dg/gimplefe-54.c: New test.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
Eric Botcazou [Sun, 24 Nov 2024 19:23:34 +0000 (20:23 +0100)]
Fix vectorization regressions on the SPARC
This fixes the vectorization regressions present on the SPARC by switching
from vcond[u] patterns to vec_cmp[u] + vcond_mask_ patterns. While I was
at it, I merged the patterns for V4HI/V2SI and V8QI enabled with VIS 3/VIS 4
to follow the model of those enabled with VIS 4B, and standardized all the
mnemonics to the version documented in the Oracle SPARC architecture 2015.
Eric Botcazou [Sun, 24 Nov 2024 14:15:54 +0000 (15:15 +0100)]
Adjust error message for initialized variable in .bss
The current message does not make sense with -fno-zero-initialized-in-bss.
gcc/
* doc/invoke.texi (-fno-zero-initialized-in-bss): Adjust for Ada.
* varasm.cc (get_variable_section): Adjust the error message for an
initialized variable in .bss to -fno-zero-initialized-in-bss.
gcc/testsuite/
* gnat.dg/specs/bss1.ads: New test.
PR fortran/117730
* class.cc (add_proc_comp): Only reject a non_overridable if it
has no overridden procedure and the component is already
present in the vtype.
PR fortran/84674
* resolve.cc (resolve_fl_derived): Do not build a vtable for a
derived type extension that is completely empty.
gcc/testsuite/ChangeLog
PR fortran/117730
* gfortran.dg/pr117730_a.f90: New test.
* gfortran.dg/pr117730_b.f90: New test.
PR fortran/84674
* gfortran.dg/pr84674.f90: New test.
Pan Li [Thu, 21 Nov 2024 06:30:49 +0000 (14:30 +0800)]
RISC-V: Refine the vector stride load/store testcases
The rtl expand dump for IFN check of stride load/store testcase is
different between O2 and O3. It is reasonable to leverage target
no-opts/any-opts to filter them out, instead of the xfail.
The test suites below pass with this patch.
* The rv64gcv full regression test.
It is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48H.
Pan Li [Thu, 21 Nov 2024 06:30:45 +0000 (14:30 +0800)]
RISC-V: Rearrange the test files for vector SAT_TRUNC [NFC]
The test files of vector SAT_TRUNC only have numbers as the suffix.
Rearrange the file name to -{form number}-{target-type}. For example,
test form 3 for uint32_t SAT_TRUNC will have -3-u32.c for asm check
and -run-3-u32.c for the run test.
Meanwhile, moved all related test files to riscv/rvv/autovec/sat/.
It is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48H.
Pan Li [Thu, 21 Nov 2024 06:30:44 +0000 (14:30 +0800)]
RISC-V: Refactor the testcases for vector SAT_SUB
This patch refactors the testcases of vector SAT_SUB after the move
to the rvv/autovec/sat folder. It includes:
* Refine the include header files.
* Remove unnecessary optimization options.
* Adjust dg-final by any-opts and/or no-opts if the rtl dump changes
on different optimization options (like O2, O3).
The test suites below pass with this patch.
* The rv64gcv full regression test.
It is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48H.
Pan Li [Thu, 21 Nov 2024 06:30:43 +0000 (14:30 +0800)]
RISC-V: Rearrange the test files for vector SAT_SUB [NFC]
The test files of vector SAT_SUB only have numbers as the suffix.
Rearrange the file name to -{form number}-{target-type}. For example,
test form 3 for uint32_t SAT_SUB will have -3-u32.c for asm check and
-run-3-u32.c for the run test.
Meanwhile, moved all related test files to riscv/rvv/autovec/sat/.
It is a test-only patch and obvious up to a point; I will commit it
directly if there are no comments in the next 48H.
Lewis Hyatt [Mon, 14 Oct 2024 21:59:46 +0000 (17:59 -0400)]
libcpp: Fix ICE lexing invalid raw string in a deferred pragma [PR117118]
The PR shows that we ICE after lexing an invalid unterminated raw string,
because lex_raw_string() pops the main buffer unexpectedly. Resolve by
handling this case the same way as for other directives.
libcpp/ChangeLog:
PR preprocessor/117118
* lex.cc (lex_raw_string): Treat an unterminated raw string the same
way for a deferred pragma as is done for other directives.
gcc/testsuite/ChangeLog:
PR preprocessor/117118
* c-c++-common/raw-string-directive-3.c: New test.
* c-c++-common/raw-string-directive-4.c: New test.
Lewis Hyatt [Tue, 29 Oct 2024 20:57:12 +0000 (16:57 -0400)]
gimple: Handle tail padding when computing gimple_ops_offset
The array gimple_ops_offset_[], which is used to find the trailing op[]
array for a given gimple struct, is computed assuming that op[] will be
found at sizeof(tree) bytes away from the end of the struct. This is only
correct if the alignment requirement of a pointer is the same as the
alignment requirement of the struct, otherwise there will be padding bytes
that invalidate the calculation. On 64-bit platforms, this generally works
fine because a pointer has 8-byte alignment and none of the structs make use
of more than that. On 32-bit platforms, it also currently works fine because
there are no 64-bit integers in the gimple structs. There are 32-bit
platforms (e.g. sparc) on which a pointer has 4-byte alignment and a
uint64_t has 8-byte alignment. On such platforms, adding a uint64_t to the
gimple structs (as will take place when location_t is changed to be 64-bit)
causes gimple_ops_offset_ to be 4 bytes too large.
It would be nice to use offsetof() to compute the offset exactly, but
offsetof() is not guaranteed to work for these types, because they use
inheritance and so are not standard layout types. This patch attempts to
detect the presence of tail padding by detecting when such padding is reused
by inheritance; the padding should generally be reused for the same reason
that offsetof() is not available, namely that all the relevant types use
inheritance. One could envision systems on which this fix does not go far
enough (e.g., if the ABI forbids reuse of tail padding), but it makes things
better without affecting anything that currently works.
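The detection idea can be sketched as below. This is an illustration under assumptions, not the code added to gimple.cc: `padding_probe`, `tail_padding_adjustment` and the two example structs are hypothetical names, and `void *` stands in for `tree`. The probe derives from the struct and adds one pointer; if the derived type is no larger than the base, the pointer must have been placed in the base's tail padding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Probe type: inherits from G and appends one pointer-sized member.
template <typename G>
struct padding_probe : G {
  void *extra;
};

// If the probe is no larger than G, the extra pointer was placed in G's
// tail padding, so "sizeof (G) - sizeof (pointer)" would overshoot by one
// pointer. Otherwise no adjustment is needed (or none can be detected).
template <typename G>
constexpr std::size_t tail_padding_adjustment() {
  return sizeof(padding_probe<G>) == sizeof(G) ? sizeof(void *) : 0;
}

// A struct whose only member is a pointer has no tail padding.
struct no_padding {
  void *a;
  no_padding() {}
};

// Uses inheritance, like the gimple structs; whether it has reusable tail
// padding depends on the ABI (for example 4-byte pointers combined with an
// 8-byte aligned uint64_t).
struct maybe_padded : no_padding {
  uint64_t loc;
  void *q;
};
```

On a typical 64-bit ABI both adjustments are 0; on a 32-bit ABI with 8-byte aligned uint64_t the second one may come out as sizeof(void *).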
gcc/ChangeLog:
* gimple.cc (get_tail_padding_adjustment): New function.
(DEFGSSTRUCT): Adjust the computation of gimple_ops_offset_ to be
correct in the presence of tail padding.
Lewis Hyatt [Fri, 25 Oct 2024 14:18:12 +0000 (10:18 -0400)]
Support for 64-bit location_t: C++ modules parts
The modules implementation is necessarily sensitive to the internal workings
of class line_map, and so it needed changes in order to handle a 64-bit
location_t. The changes mostly boil down to supporting that in the debug
dumping routines (which is accomplished by using a new custom code %K for
that purpose), and supporting that when streaming in and out from the
module (which is accomplished by using a new loc() function to go along with
existing abstractions like u() or z() for streaming in and out different
data types).
gcc/cp/ChangeLog:
* module.cc (bytes_out::loc): New function.
(bytes_in::loc): New function.
(struct span): Change int fields to location_diff_t.
(range_t): Change from "unsigned int" to "line_map_uint_t".
(struct ord_loc_info): Likewise.
(struct macro_loc_info): Likewise.
(class module_state): Likewise.
(dumper::operator()): Add new code 'K' for dumping a location_t.
(loc_spans::init): Use %K instead of %u for location_t dumps.
(loc_spans::open): Likewise.
(loc_spans::close): Likewise. Adjust bitwise expressions to support
64-bit location_t as well.
(struct module_state_config): Change ordinary_locs and macro_locs
from "unsigned int" to "line_map_uint_t". Reorder fields to improve
packing. Rather than changing the constructor initializer list to
match the new order, switch to NSDMI instead.
(module_state::note_location): Adjust to support 64-bit location_t.
(module_state::write_location): Use %K instead of %u for location_t
dumps. Use loc() instead of u() for streaming location_t.
(module_state::read_location): Likewise.
(module_state::write_ordinary_maps): Likewise.
(module_state::write_macro_maps): Likewise.
(module_state::write_config): Likewise.
(module_state::read_config): Likewise.
(module_state::write_prepare_maps): Use %K instead of %u for
location_t dumps. Adjust variable types and bitwise expressions to
support 64-bit location_t.
(module_state::read_ordinary_maps): Likewise.
(module_state::read_macro_maps): Likewise.
(preprocess_module): Adjust data types to support 64-bit number of
line maps.
Lewis Hyatt [Mon, 28 Oct 2024 16:55:24 +0000 (12:55 -0400)]
Support for 64-bit location_t: Analyzer parts
The analyzer occasionally prints internal location_t values for debugging;
adjust those parts so they will work if location_t is 64-bit. For
simplicity, to avoid hassling with the printf format string, just convert to
(unsigned long long) in either case.
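A minimal sketch of that approach, with a hypothetical helper name and a standalone buffer instead of the analyzer's own printing machinery:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Works unchanged whether Loc is a 32-bit or a 64-bit unsigned type: the
// value is widened to unsigned long long and always printed with %llu.
template <typename Loc>
std::string format_location(Loc loc) {
  char buf[32];
  std::snprintf(buf, sizeof buf, "loc=%llu",
                static_cast<unsigned long long>(loc));
  return buf;
}
```

The cast costs nothing for a 64-bit type and avoids having to select a format specifier based on how location_t is configured.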
gcc/analyzer/ChangeLog:
* checker-event.cc (checker_event::dump): Support printing either
32- or 64-bit location_t values.
* checker-path.cc (checker_path::inject_any_inlined_call_events):
Likewise.
Lewis Hyatt [Mon, 28 Oct 2024 16:52:23 +0000 (12:52 -0400)]
Support for 64-bit location_t: Frontend parts
The C/C++ frontend code contains a couple instances where a callback
receiving a "location_t" argument is prototyped to take "unsigned int"
instead. This will make a difference once location_t can be configured to a
different type, so adjust that now.
Also remove a comment about -flarge-source-files, which will be removed
shortly.
gcc/c-family/ChangeLog:
* c-indentation.cc (should_warn_for_misleading_indentation): Remove
comment about -flarge-source-files.
* c-lex.cc (cb_ident): Change "unsigned int" argument to type
"location_t".
(cb_def_pragma): Likewise.
(cb_define): Likewise.
(cb_undef): Likewise.
Lewis Hyatt [Mon, 28 Oct 2024 21:57:41 +0000 (17:57 -0400)]
libcpp: Fix potential unaligned access in cpp_buffer
libcpp makes use of the cpp_buffer pfile->a_buff to store things while it is
handling macros. It uses it to store pointers (cpp_hashnode*, for macro
arguments) and cpp_macro objects. This works fine because a cpp_hashnode*
and a cpp_macro have the same alignment requirement on either 32-bit or
64-bit systems (namely, the same alignment as a pointer.)
When 64-bit location_t is enabled on a 32-bit system, the alignment
requirement may cease to be the same, because the alignment requirement of a
cpp_macro object changes to that of a uint64_t, which may be larger than that of
a pointer. It's not the case for x86 32-bit, but for example, on sparc, a
pointer has 4-byte alignment while a uint64_t has 8. In that case,
intermixing the two within the same cpp_buffer leads to a misaligned
access. The code path that triggers this is the one in _cpp_commit_buff in
which a hash table with its own allocator (i.e. ggc) is not being used, so
it doesn't happen within the compiler itself, but it happens in the other
libcpp clients, such as genmatch.
Fix that up by ensuring _cpp_commit_buff commits a fully aligned chunk of the
buffer, so it's ready for anything it may be used for next.
Also modify CPP_ALIGN so that it guarantees to return an alignment at least
the size of location_t. Currently it returns the max of a pointer and a
double. I am not aware of any platform where a double may have smaller
alignment than a uint64_t, but it does not hurt to add location_t here to be
sure.
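The alignment rule described above can be sketched as follows. The names `align_probe`, `kDefaultAlignment` and `align_up` are hypothetical, and `uint64_t` stands in for a 64-bit location_t; libcpp's own macros are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// The default alignment is the strictest requirement among the types that
// may end up in the buffer: a double, a pointer, and a 64-bit location.
union align_probe {
  double d;
  void *p;
  uint64_t loc;  // stand-in for a 64-bit location_t
};

constexpr std::size_t kDefaultAlignment = alignof(align_probe);

// Round a size up to the next multiple of the default alignment
// (alignments are powers of two, so masking is sufficient).
constexpr std::size_t align_up(std::size_t size) {
  return (size + kDefaultAlignment - 1) & ~(kDefaultAlignment - 1);
}
```

Committing only rounded-up chunks means the next allocation from the same buffer starts at an address suitable for any of these types.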
libcpp/ChangeLog:
* lex.cc (_cpp_commit_buff): Make sure that the buffer is properly
aligned for the next allocation.
* internal.h (struct dummy): Make sure alignment is large enough for
a location_t, just in case.