In such a case, the legacy add clobbers FLAGS_REG, so an extra cstore
is needed to keep the flag from being reset before it is used. If the
instructions between the flag producer and its user are NF insns, the
setcc/test sequence is not required.
Add a pass to convert legacy flag-clobbering insns to their NF
counterparts. The conversion only happens when
1. APX_NF is enabled.
2. For a BB, a cstore was found, and there are insns between that cstore
and the next insn that explicitly sets FLAGS_REG (test or cmp).
3. All the insns found have an NF counterpart.
The pass is added after rtl-ifcvt, which eliminates some branches when
profitable and can thereby leave flag-clobbering insns between a
cstore and a jcc.
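The three conditions above can be pictured with a small standalone model. This is only a sketch under stated assumptions, not the code in i386-features.cc: the `insn` struct, the `kind` enum and `convert_block` are hypothetical names, an insn is reduced to a kind plus two booleans, and the end of the block is treated as the end of a candidate region (the message above does not say how that case is handled).

```cpp
#include <cstddef>
#include <vector>

namespace sketch {

// Hypothetical, heavily simplified classification of an insn.
enum class kind { cstore, flags_setter, flags_clobber, other };

struct insn {
  kind k;
  bool has_nf;  // an NF counterpart exists (models the has_nf attribute)
  bool is_nf;   // already rewritten to the NF form
};

// Returns how many insns were converted in one basic block.
inline std::size_t convert_block(std::vector<insn> &bb, bool apx_nf_enabled) {
  if (!apx_nf_enabled)  // condition 1
    return 0;
  std::size_t converted = 0;
  for (std::size_t i = 0; i < bb.size(); ++i) {
    if (bb[i].k != kind::cstore)
      continue;
    // Condition 2: look at insns between the cstore and the next
    // explicit setter of the flags (test or cmp).
    std::size_t end = i + 1;
    bool any_clobber = false;
    bool all_have_nf = true;
    while (end < bb.size() && bb[end].k != kind::flags_setter) {
      if (bb[end].k == kind::flags_clobber) {
        any_clobber = true;
        all_have_nf = all_have_nf && bb[end].has_nf;
      }
      ++end;
    }
    // Condition 3: convert only if every candidate has an NF form.
    if (!any_clobber || !all_have_nf)
      continue;
    for (std::size_t j = i + 1; j < end; ++j)
      if (bb[j].k == kind::flags_clobber && !bb[j].is_nf) {
        bb[j].is_nf = true;
        ++converted;
      }
  }
  return converted;
}

}  // namespace sketch
```

The all-or-nothing check mirrors condition 3: if a single flag-clobbering insn in the region has no NF form, the flag is lost anyway, so converting the others would gain nothing.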
gcc/ChangeLog:
* config/i386/i386.md (has_nf): New define_attr, add to all
nf related patterns.
* config/i386/i386-features.cc (apx_nf_convert): New function
to convert Non-NF insns to their NF counterparts.
(class pass_apx_nf_convert): New pass class.
(make_pass_apx_nf_convert): New.
* config/i386/i386-passes.def: Add pass_apx_nf_convert after
rtl_ifcvt.
* config/i386/i386-protos.h (make_pass_apx_nf_convert): Declare.
arm: Fix the expected output of the test pr111235.c [PR115894]
With r15-1619-g3b9b8d6cfdf593, pr111235.c fails due to different
registers used in ldrexd instruction. The key part of this test is that
the compiler generates LDREXD. The registers used for that are pretty
much irrelevant as they are not matched with any other operations within
the test. This patch changes the test to test only for the mnemonic and
not for any of the operands.
This patch adds the Zihintntl instructions to the prefetch pattern.
Zicbop has prefetch instructions. Zihintntl has NTL instructions.
Insert an NTL instruction before the prefetch instruction if the
target has the Zihintntl extension.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_print_operand): Add 'L' letter
to print zihintntl instructions string.
* config/riscv/riscv.md (prefetch): Add zihintntl instructions.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/prefetch-zicbop.c: New test.
* gcc.target/riscv/prefetch-zihintntl.c: New test.
With r15-1619-g3b9b8d6cfdf593, there's a XPASS and a FAIL
for this test-case for cris-elf. Looking at the generated
code, _foo is indeed no longer saved in a register for CRIS.
While that looks like a regression, coremark results are the
same around this revision, so simply adjust the test-case:
remove the target-specific exceptions for cris-*-*.
* gcc.dg/tree-ssa/loop-1.c: Remove target-specific test
and xfail to adjust for recent changes in register allocation.
Feng Wang [Tue, 18 Jun 2024 06:13:35 +0000 (06:13 +0000)]
RISC-V: Add md files for vector BFloat16
V3: Add Bfloat16 vector insn in generic-vector-ooo.md
v2: Rebase
According to the BFloat16 spec, some vector iterators and new patterns
are added in md files.
Signed-off-by: Feng Wang <wangfeng@eswincomputing.com>
gcc/ChangeLog:
* config/riscv/generic-vector-ooo.md: Add def_insn_reservation for vector BFloat16.
* config/riscv/riscv.md: Add new insn name for vector BFloat16.
* config/riscv/vector-iterators.md: Add some iterators for vector BFloat16.
* config/riscv/vector.md: Add some attribute for vector BFloat16.
* config/riscv/vector-bfloat16.md: New file. Add insn patterns for vector BFloat16.
Feng Wang [Mon, 17 Jun 2024 01:59:57 +0000 (01:59 +0000)]
RISC-V: Add Zvfbfmin and Zvfbfwma intrinsic
v3: Modify warning message in riscv.cc
v2: Rebase
According to the intrinsic doc, the 'Zvfbfmin' and 'Zvfbfwma' intrinsic
functions are added by this patch.
Signed-off-by: Feng Wang <wangfeng@eswincomputing.com>
gcc/ChangeLog:
* config/riscv/riscv-vector-builtins-bases.cc (class vfncvtbf16_f):
Add 'Zvfbfmin' intrinsic in bases.
(class vfwcvtbf16_f): Ditto.
(class vfwmaccbf16): Add 'Zvfbfwma' intrinsic in bases.
(BASE): Add BASE macro for 'Zvfbfmin' and 'Zvfbfwma'.
* config/riscv/riscv-vector-builtins-bases.h: Add declaration for 'Zvfbfmin' and 'Zvfbfwma'.
* config/riscv/riscv-vector-builtins-functions.def (REQUIRED_EXTENSIONS):
Add builtins def for 'Zvfbfmin' and 'Zvfbfwma'.
(vfncvtbf16_f): Ditto.
(vfncvtbf16_f_frm): Ditto.
(vfwcvtbf16_f): Ditto.
(vfwmaccbf16): Ditto.
(vfwmaccbf16_frm): Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc (supports_vectype_p):
Add vector intrinsic build judgment for BFloat16.
(build_all): Ditto.
(BASE_NAME_MAX_LEN): Adjust max length.
* config/riscv/riscv-vector-builtins-types.def (DEF_RVV_F32_OPS):
Add new operand type for BFloat16.
(vfloat32mf2_t): Ditto.
(vfloat32m1_t): Ditto.
(vfloat32m2_t): Ditto.
(vfloat32m4_t): Ditto.
(vfloat32m8_t): Ditto.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_F32_OPS): Ditto.
(validate_instance_type_required_extensions):
Add required_ext checking for 'Zvfbfmin' and 'Zvfbfwma'.
* config/riscv/riscv-vector-builtins.h (enum required_ext):
Add required_ext declaration for 'Zvfbfmin' and 'Zvfbfwma'.
(reqired_ext_to_isa_name): Ditto.
(required_extensions_specified): Ditto.
(struct function_group_info): Add match case for 'Zvfbfmin' and 'Zvfbfwma'.
* config/riscv/riscv.cc (riscv_validate_vector_type):
Add required_ext checking for 'Zvfbfmin' and 'Zvfbfwma'.
Hongyu Wang [Sat, 13 Jul 2024 03:45:31 +0000 (11:45 +0800)]
AVX512BF16: Do not allow permutation with vcvtne2ps2bf16 [PR115889]
According to the instruction spec of AVX512BF16, the conversion from
float to BF16 is not a simple truncation. It has special handling for
denormal/nan, and even for a normal float it adds an extra bias that
depends on the least significant bit of the bf16 number. This means we
cannot use vcvtne2ps2bf16 for any bf16 vector shuffle.
The optimization introduced in r15-1368 adds a specific split to convert
HImode permutation with this instruction, so remove it and treat the
BFmode permutation the same as HFmode.
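To see why a rounding conversion cannot stand in for a shuffle, here is a minimal sketch comparing plain truncation with round-to-nearest-even. It is an illustration under assumptions, not the instruction's exact definition: the helper names are hypothetical, NaN is simply quieted, and the denormal handling mentioned above is not modelled at all.

```cpp
#include <cstdint>
#include <cstring>

namespace sketch {

// Reinterpret a 32-bit pattern as a float and back.
inline float float_from_bits(std::uint32_t bits) {
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}

inline std::uint32_t bits_from_float(float f) {
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  return bits;
}

// Plain truncation: keep the upper 16 bits unchanged. A shuffle of
// 16-bit lanes must preserve bits in this way.
inline std::uint16_t bf16_truncate(float f) {
  return static_cast<std::uint16_t>(bits_from_float(f) >> 16);
}

// Round-to-nearest-even: add a bias of 0x7FFF plus the least
// significant bit of the would-be bf16 result, then drop the low half.
inline std::uint16_t bf16_round_nearest_even(float f) {
  std::uint32_t bits = bits_from_float(f);
  if ((bits & 0x7FFFFFFFu) > 0x7F800000u)  // NaN: return a quiet NaN
    return static_cast<std::uint16_t>((bits >> 16) | 0x0040u);
  const std::uint32_t lsb = (bits >> 16) & 1u;
  bits += 0x7FFFu + lsb;
  return static_cast<std::uint16_t>(bits >> 16);
}

}  // namespace sketch
```

For the bit pattern 0x3F818000 the two functions disagree (0x3F81 versus 0x3F82), so data moved through a rounding conversion would not come out bit-identical.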
Feng Wang [Thu, 13 Jun 2024 00:32:14 +0000 (00:32 +0000)]
RISC-V: Add vector type of BFloat16 format
v3: Rebase
v2: Rebase
This patch adds the vector type for the BFloat16 format;
subsequent extensions for Zvfbfmin and Zvfbfwma need to be based
on this patch.
Signed-off-by: Feng Wang <wangfeng@eswincomputing.com>
gcc/ChangeLog:
Roger Sayle [Sun, 14 Jul 2024 16:22:27 +0000 (17:22 +0100)]
i386: Tweak i386-expand.cc to restore bootstrap on RHEL.
This is a minor change to restore bootstrap on systems using gcc 4.8
as a host compiler. The fatal error is:
In file included from gcc/gcc/coretypes.h:471:0,
from gcc/gcc/config/i386/i386-expand.cc:23:
gcc/gcc/config/i386/i386-expand.cc: In function 'void ix86_expand_fp_absneg_operator(rtx_code, machine_mode, rtx_def**)':
./insn-modes.h:315:75: error: temporary of non-literal type 'scalar_float_mode' in a constant expression
#define HFmode (scalar_float_mode ((scalar_float_mode::from_int) E_HFmode))
^
gcc/gcc/config/i386/i386-expand.cc:2179:8: note: in expansion of macro 'HFmode'
case HFmode:
^
The solution is to use the E_?Fmode enumeration constants as case values
in switch statements.
2024-07-14 Roger Sayle <roger@nextmovesoftware.com>
* config/i386/i386-expand.cc (ix86_expand_fp_absneg_operator):
Use E_?Fmode enumeration constants in switch statement.
(ix86_expand_copysign): Likewise.
(ix86_expand_xorsign): Likewise.
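The shape of the fix can be sketched outside of GCC. Everything below is a hypothetical stand-in (the `sketch` namespace, the reduced enum, the wrapper class and `size_in_bytes`); it only assumes, as the error above suggests, that the HFmode macro expands to a temporary of a class type that an old host compiler does not accept in a constant expression, whereas the plain enumerator always is one.

```cpp
namespace sketch {

// Stand-in for the machine_mode enumeration.
enum machine_mode { E_HFmode, E_SFmode, E_DFmode };

// Stand-in for scalar_float_mode: a class wrapper around the enum.
// Without a constexpr constructor it is not a literal type, so a
// temporary of this type cannot be used as a case label.
class scalar_float_mode {
public:
  explicit scalar_float_mode(machine_mode m) : m_mode(m) {}
  operator machine_mode() const { return m_mode; }

private:
  machine_mode m_mode;
};

// Using the E_* enumerators as case values works with any host
// compiler, because enumerators are always constant expressions.
inline int size_in_bytes(machine_mode mode) {
  switch (mode) {
    case E_HFmode:
      return 2;
    case E_SFmode:
      return 4;
    case E_DFmode:
      return 8;
  }
  return 0;
}

}  // namespace sketch
```

A wrapper object can still be passed where the enum is expected, for example `size_in_bytes(scalar_float_mode(E_SFmode))`, because the conversion operator runs at run time on the switch operand rather than in a case label.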
Initializing a char array with a string literal of the same length as
the size of the array is usually a mistake. It is rarely the case that
one wants to create a non-terminated character sequence from a string
literal.
In some cases, for writing faster code, one may want to use arrays
instead of pointers, since that removes the need for storing an array of
pointers apart from the strings themselves.
This forces the programmer to specify a size, which might change if a
new entry is later added. Having no way to enforce null termination is
very dangerous, however, so it is useful to have a warning for this, so
that the compiler can make sure that the programmer didn't make any
mistakes. This warning catches the bug above, so that the programmer
will be able to fix it and write:
This warning already existed as part of -Wc++-compat, but this patch
allows enabling it separately. It is also included in -Wextra, since
it may not always be desired (when unterminated character sequences are
wanted), but it's likely to be desired in most cases.
Since -Wc++-compat now includes this warning, the test has to be modified
to expect the text of the new warning too, in <gcc.dg/Wcxx-compat-14.c>.
* c-typeck.cc (digest_init): Separate warnings about character
arrays being initialized as unterminated character sequences
with string literals, from -Wc++-compat, into a new warning,
-Wunterminated-string-initialization.
gcc/ChangeLog:
* doc/invoke.texi: Document the new
-Wunterminated-string-initialization.
gcc/testsuite/ChangeLog:
* gcc.dg/Wcxx-compat-14.c: Adapt the test to match the new text
of the warning, which doesn't say anything about C++ anymore.
* gcc.dg/Wunterminated-string-initialization.c: New test.
Acked-by: Doug McIlroy <douglas.mcilroy@dartmouth.edu>
Acked-by: Mike Stump <mikestump@comcast.net>
Reviewed-by: Sandra Loosemore <sloosemore@baylibre.com>
Reviewed-by: Martin Uecker <uecker@tugraz.at>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
Reviewed-by: Marek Polacek <polacek@redhat.com>
CRIS: Disable late-combine by default, related PR115883
With late-combine, coremark compiled for cris-elf regresses by 2.6%
in performance and by 0.4% in size, measured at
r15-2005-g13757e50ff0b, when compiled with "-O2 -march=v10".
Earlier, at r15-1880-gce34fcc572a0, numbers were by performance 3.2%
and by size 0.4%, even with the proposed patch to PR115883 (TL;DR: a
presumed bug in LRA or combine exposed by late-combine). Without that
patch, about the same performance results (at that revision).
Similarly around the late-combine commit (r15-1579-g792f97b44ffc5e).
I briefly looked at the performance regression for coremark at
r15-2005-g13757e50ff0b (with/without this patch) as far as seeing that
the stack-frame grew larger (maxing out on hard registers and needing
one more slot) for at least two of the top three* functions that
regressed the most in terms of cycles per call:
matrix_mul_matrix_bitextract (in coremark, 17% slower) and __subdf3
(in libgcc, 6.7% slower). That makes sense when considering that
late-combine "naturally" stretches register life-times. But, looking
at late_combine::combine_into_uses and late_combine::optimizable_set,
nothing stood out to me. I guess there are improvement opportunities in
late_combine::check_register_pressure.
(*) I opted not to look at _dtoa_r (in newlib) mostly because it's
boring and always shows up when something in gcc goes sideways. (It
maxes out on hard registers and is big. End of story.)
Note that the change of default is done in the
TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE worker, not in the
TARGET_OPTION_OVERRIDE worker for reasons stated in the
comment.
* config/cris/cris.cc (cris_option_override_after_change): New
function. Disable late-combine by default.
(cris_option_override): Call the new function.
Mikael Morin [Sat, 13 Jul 2024 18:21:20 +0000 (20:21 +0200)]
fortran: Correctly evaluate scalar MASK arguments of MINLOC/MAXLOC
Add the preliminary code that the generated expression for MASK may depend
on when generating the inline code to evaluate MINLOC or MAXLOC with a
scalar MASK.
The generated code was only keeping the generated expression but not the
preliminary code, which was sufficient for simple cases such as data
references or simple (scalar) function calls, but was bogus with more
complicated ones.
gcc/fortran/ChangeLog:
* trans-intrinsic.cc (gfc_conv_intrinsic_minmaxloc): Add the
preliminary code generated for MASK to the preliminary code of
MINLOC/MAXLOC.
It would be useful to have the @gcc.gnu.org bugzilla account names
in MAINTAINERS. This is because:
(a) Not every non-@gcc.gnu.org email listed in MAINTAINERS is registered
as a bugzilla user.
(b) Only @gcc.gnu.org accounts tend to have full rights to modify tickets.
(c) A maintainer's name and email address aren't always enough to guess
the bugzilla account name.
(d) The users list on bugzilla has many blank entries for "real name".
However, including @gcc.gnu.org in the account name might encourage
people to use it for ordinary email, rather than just for bugzilla.
This patch goes for the compromise of using the unqualified account
name, with some text near the top of the file to explain its usage.
There isn't room in the area maintainer sections for a new column,
so it seemed better to have the account name only in the Write
After Approval section. It's then necessary to list all maintainers
there, even if they have more specific roles as well.
Also, there were some entries that didn't line up with the
prevailing columns (they had one tab too many or one tab too few).
It seemed easier to check for and report this, and other things,
if the file used spaces rather than tabs.
There was one instance of an email address without the trailing ">".
The updates to check-MAINTAINERS.py include a test for that.
The account names in the file were taken from a trawl of the
gcc-cvs archives, with a very small number of manual edits for
ambiguities. There are a handful of names that I couldn't find;
the new column has "-" for those. The names were then filtered
against the bugzilla @gcc.gnu.org user list, with those not
present again being blanked out with "-".
ChangeLog:
* MAINTAINERS: Replace tabs with spaces. Add a bugzilla account
name column to the Write After Approval section. Line up the
email column and fix an entry that was missing the trailing ">".
contrib/ChangeLog:
* check-MAINTAINERS.py (sort_by_surname): Replace with...
(get_surname): ...this.
(has_tab, is_empty): Delete.
(check_group): Take a list of column positions as argument.
Check that lines conform to these column numbers. Check that the
final column is an email in angle brackets. Record surnames on
the fly.
(top level): Reject tabs. Use paragraph counts to identify which
groups of lines should be checked. Report missing sections.
David Malcolm [Sat, 13 Jul 2024 14:34:51 +0000 (10:34 -0400)]
diagnostics: add highlight-a vs highlight-b in colorization and pp_markup
Since r6-4582-g8a64515099e645 (which added class rich_location), ranges
of quoted source code have been colorized using the following rules:
- the primary range uses the color of the kind of the diagnostic,
i.e. "error" vs "warning" etc (defaulting to bold red and bold magenta
respectively)
- secondary ranges alternate between "range1" and "range2" (defaulting
to green and blue respectively)
This works for cases with large numbers of highlighted ranges, but is
suboptimal for common cases.
The following patch adds a pair of color names: "highlight-a" and
"highlight-b", and uses them whenever it makes sense to highlight and
contrast two different things in the source code (e.g. a type mismatch).
These are used by diagnostic-show-locus.cc for highlighting quoted
source. In addition the patch adds colorization to fragments within the
corresponding diagnostic messages themselves, using consistent
colorization between the message and the quoted source code for the two
different things being contrasted.
For example, consider:
demo.c: In function ‘test_bad_format_string_args’:
../../src/demo.c:25:18: warning: format ‘%i’ expects argument of
type ‘int’, but argument 2 has type ‘const char *’ [-Wformat=]
   25 |   printf("hello %i", msg);
      |                 ~^   ~~~
      |                  |   |
      |                  int const char *
      |                 %s
Previously, the types within the message in quotes would be in bold but
not colorized, and the labelled ranges of quoted source code would use
bold magenta for the "int" and non-bold green for the "const char *".
With this patch:
- the "%i" and "int" in the message and the "int" in the quoted source
are all colored bold green
- the "const char *" in the message and in the quoted source are both
colored bold blue
so that the consistent use of contrasting color draws the reader's eyes
to the relationships between the diagnostic message and the source.
I've tried this with gnome-terminal with many themes, including a
variety of light versus dark backgrounds, solarized versus non-solarized
themes, etc, and it was readable in all.
My initial version of the patch used the existing %r and %R facilities
within pretty-print.cc for the messages, but this turned out to be very
uncomfortable, leading to error-prone format strings such as:
  error_at (richloc,
            "invalid operands to binary %s (have %<%r%T%R%> and %<%r%T%R%>)",
            opname,
            "highlight-a", type0,
            "highlight-b", type1);
To avoid requiring monstrosities such as the above, the patch adds a new
"%e" format code to pretty-print.cc, which expects a pp_element *, where
pp_element is a new abstract base class (actually a pp_markup::element),
along with various useful subclasses. This lets the above be written
as:
which I feel is maintainable and clear to translators; the use of %e and
pp_element * captures the type-unsafe part of the variadic call, and the
subclasses allow for type-safety (so e.g. an element_quoted_type expects
a type and a highlighting color). This approach allows for some nice
simplifications within c-format.cc.
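The idea of a single format code that takes a pointer to an abstract element can be shown in isolation. The following is a toy model, not the pretty-print.cc implementation: `element`, `quoted_colored_text` and `format` are hypothetical names, color is rendered as a pseudo-tag rather than a terminal escape sequence, and only "%e" is understood.

```cpp
#include <cstddef>
#include <initializer_list>
#include <string>

namespace sketch {

// Toy analogue of an abstract markup element: it knows how to append
// itself, with quoting and color, to the output.
struct element {
  virtual ~element() = default;
  virtual void add_to(std::string &out) const = 0;
};

// One concrete subclass: quoted text tagged with a named color.
class quoted_colored_text : public element {
public:
  quoted_colored_text(std::string text, std::string color)
      : m_text(text), m_color(color) {}

  void add_to(std::string &out) const override {
    out += "'<" + m_color + ">" + m_text + "</" + m_color + ">'";
  }

private:
  std::string m_text;
  std::string m_color;
};

// Minimal formatter: each "%e" consumes the next element pointer and
// lets the element print itself; all other text is copied verbatim.
inline std::string format(const std::string &fmt,
                          std::initializer_list<const element *> args) {
  std::string out;
  auto next = args.begin();
  for (std::size_t i = 0; i < fmt.size(); ++i) {
    if (fmt[i] == '%' && i + 1 < fmt.size() && fmt[i + 1] == 'e' &&
        next != args.end()) {
      (*next)->add_to(out);
      ++next;
      ++i;  // skip the 'e'
    } else {
      out += fmt[i];
    }
  }
  return out;
}

}  // namespace sketch
```

With two elements `a ("int", "highlight-a")` and `b ("const char *", "highlight-b")`, calling `format ("have %e and %e", {&a, &b})` keeps the format string free of color bookkeeping, and each element's constructor decides which argument types it accepts.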
The patch also extends -Wformat to "teach" it about the new %e and
pp_element *. Doing so requires c-format.cc to be able to determine
if a T * is a pp_element * (i.e. if T is a subclass). To do so I added
a new comp_types callback for comparing types, where the C++ frontend
supplies a suitable implementation (and %e will always be wrong for C).
I've manually tested this on many diagnostics with both C and C++ and it
seems a subtle but significant improvement in readability.
I've added a new option -fno-diagnostics-show-highlight-colors in case
people prefer the old behavior.
gcc/c-family/ChangeLog:
* c-common.cc: Include "tree-pretty-print-markup.h".
(binary_op_error): Use pp_markup::element_quoted_type and %e.
(check_function_arguments): Add "comp_types" param and pass it to
check_function_format.
* c-common.h (check_function_arguments): Add "comp_types" param.
(check_function_format): Likewise.
* c-format.cc: Include "tree-pretty-print-markup.h".
(local_pp_element_ptr_node): New.
(PP_FORMAT_CHAR_TABLE): Add entry for %e.
(struct format_check_context): Add "m_comp_types" field.
(check_function_format): Add "comp_types" param and pass it to
check_format_info.
(check_format_info): Likewise, passing it to format_ctx's ctor.
(check_format_arg): Extract m_comp_types from format_ctx and
pass it to check_format_info_main.
(check_format_info_main): Add "comp_types" param and pass it to
arg_parser's ctor.
(class argument_parser): Add "m_comp_types" field.
(argument_parser::check_argument_type): Pass m_comp_types to
check_format_types.
(handle_subclass_of_pp_element_p): New.
(check_format_types): Add "comp_types" param, and use it to
call handle_subclass_of_pp_element_p.
(class element_format_substring): New.
(class element_expected_type_with_indirection): New.
(format_type_warning): Use element_expected_type_with_indirection
to unify the if (wanted_type_name) branches, reducing from four
emit_warning calls to two. Simplify these further using %e.
Doing so also gives suitable colorization of the text within the
diagnostics.
(init_dynamic_diag_info): Initialize local_pp_element_ptr_node.
(selftest::test_type_mismatch_range_labels): Add nullptr for new
param of gcc_rich_location label overload.
* c-format.h (T_PP_ELEMENT_PTR): New.
* c-type-mismatch.cc: Include "diagnostic-highlight-colors.h".
(binary_op_rich_location::binary_op_rich_location): Use
highlight_colors::lhs and highlight_colors::rhs for the ranges.
* c-type-mismatch.h (class binary_op_rich_location): Add comment
about highlight_colors.
gcc/c/ChangeLog:
* c-objc-common.cc: Include "tree-pretty-print-markup.h".
(print_type): Add optional "highlight_color" param and use it
to show highlight colors in "aka" text.
(pp_markup::element_quoted_type::print_type): New.
* c-typeck.cc: Include "tree-pretty-print-markup.h".
(comp_parm_types): New.
(build_function_call_vec): Pass it to check_function_arguments.
(inform_for_arg): Use %e and highlight colors to contrast actual
versus expected.
(convert_for_assignment): Use highlight_colors::actual for the
rhs_label.
(build_binary_op): Use highlight_colors::lhs and highlight_colors::rhs
for the ranges.
gcc/ChangeLog:
* common.opt (fdiagnostics-show-highlight-colors): New option.
* common.opt.urls: Regenerate.
* coretypes.h (pp_markup::element): New forward decl.
(pp_element): New typedef.
* diagnostic-color.cc (gcc_color_defaults): Add "highlight-a"
and "highlight-b".
* diagnostic-format-json.cc (diagnostic_output_format_init_json):
Disable highlight colors.
* diagnostic-format-sarif.cc (diagnostic_output_format_init_sarif):
Likewise.
* diagnostic-highlight-colors.h: New file.
* diagnostic-path.cc (struct event_range): Pass nullptr for
highlight color of m_rich_loc.
* diagnostic-show-locus.cc (colorizer::set_range): Handle ranges
with m_highlight_color.
(colorizer::STATE_NAMED_COLOR): New.
(colorizer::m_richloc): New field.
(colorizer::colorizer): Add richloc param for initializing
m_richloc.
(colorizer::set_named_color): New.
(colorizer::begin_state): Add case STATE_NAMED_COLOR.
(layout::layout): Pass richloc to m_colorizer's ctor.
(selftest::test_one_liner_labels): Pass nullptr for new param of
gcc_rich_location ctor for labels.
(selftest::test_one_liner_labels_utf8): Likewise.
* diagnostic.h (diagnostic_context::set_show_highlight_colors):
New.
* doc/invoke.texi: Add option -fdiagnostics-show-highlight-colors
and highlight-a and highlight-b color caps.
* doc/ux.texi
(Use color consistently when highlighting mismatches): New
subsection.
* gcc-rich-location.cc (gcc_rich_location::add_expr): Add
"highlight_color" param.
(gcc_rich_location::maybe_add_expr): Likewise.
* gcc-rich-location.h (gcc_rich_location::gcc_rich_location):
Split out into a pair of ctors, where if a range_label is supplied
the caller must also supply a highlight color.
(gcc_rich_location::add_expr): Add "highlight_color" param.
(gcc_rich_location::maybe_add_expr): Likewise.
* gcc.cc (driver_handle_option): Handle
OPT_fdiagnostics_show_highlight_colors.
* lto-wrapper.cc (merge_and_complain): Likewise.
(append_compiler_options): Likewise.
(append_diag_options): Likewise.
(run_gcc): Likewise.
* opts-common.cc (decode_cmdline_options_to_array): Add comment
about -fno-diagnostics-show-highlight-colors.
* opts-global.cc (init_options_once): Preserve
pp_show_highlight_colors in case the global_dc's printer is
recreated.
* opts.cc (common_handle_option): Handle
OPT_fdiagnostics_show_highlight_colors.
(gen_command_line_string): Likewise.
* pretty-print-markup.h: New file.
* pretty-print.cc: Include "pretty-print-markup.h" and
"diagnostic-highlight-colors.h".
(pretty_printer::format): Handle %e.
(pretty_printer::pretty_printer): Handle new field
m_show_highlight_colors.
(pp_string_n): New.
(pp_markup::context::begin_quote): New.
(pp_markup::context::end_quote): New.
(pp_markup::context::begin_color): New.
(pp_markup::context::end_color): New.
(highlight_colors::expected): New.
(highlight_colors::actual): New.
(highlight_colors::lhs): New.
(highlight_colors::rhs): New.
(class selftest::test_element): New.
(selftest::test_pp_format): Add tests of %e.
(selftest::test_urlification): Likewise.
* pretty-print.h (pp_markup::context): New forward decl.
(class chunk_info): Add friend class pp_markup::context.
(class pretty_printer): Add friend pp_show_highlight_colors.
(pretty_printer::m_show_highlight_colors): New field.
(pp_show_highlight_colors): New inline function.
(pp_string_n): New decl.
* substring-locations.cc: Include "diagnostic-highlight-colors.h".
(format_string_diagnostic_t::highlight_color_format_string): New.
(format_string_diagnostic_t::highlight_color_param): New.
(format_string_diagnostic_t::emit_warning_n_va): Use highlight
colors.
* substring-locations.h
(format_string_diagnostic_t::highlight_color_format_string): New.
(format_string_diagnostic_t::highlight_color_param): New.
* toplev.cc (general_init): Initialize global_dc's
show_highlight_colors.
* tree-pretty-print-markup.h: New file.
gcc/cp/ChangeLog:
* call.cc: Include "tree-pretty-print-markup.h".
(implicit_conversion_error): Use highlight_colors::percent_h for
the labelled range.
(op_error_string): Split out into...
(concat_op_error_string): ...this.
(binop_error_string): New.
(op_error): Use %e, binop_error_string, highlight_colors::lhs,
and highlight_colors::rhs.
(maybe_inform_about_fndecl_for_bogus_argument_init): Add
"highlight_color" param; use it for the richloc.
(convert_like_internal): Use highlight_colors::percent_h for the
labelled_range, and highlight_colors::percent_i for the call to
maybe_inform_about_fndecl_for_bogus_argument_init.
(build_over_call): Pass cp_comp_parm_types for new "comp_types"
param of check_function_arguments.
(complain_about_bad_argument): Use highlight_colors::percent_h for
the labelled_range, and highlight_colors::percent_i for the call
to maybe_inform_about_fndecl_for_bogus_argument_init.
* cp-tree.h (maybe_inform_about_fndecl_for_bogus_argument_init):
Add optional highlight_color param.
(cp_comp_parm_types): New decl.
(highlight_colors::const percent_h): New decl.
(highlight_colors::const percent_i): New decl.
* error.cc: Include "tree-pretty-print-markup.h".
(highlight_colors::const percent_h): New defn.
(highlight_colors::const percent_i): New defn.
(type_to_string): Add param "highlight_color" and use it.
(print_nonequal_arg): Likewise.
(print_template_differences): Add params "highlight_color_a" and
"highlight_color_b".
(type_to_string_with_compare): Add params "this_highlight_color"
and "peer_highlight_color".
(print_template_tree_comparison): Add params "highlight_color_a"
and "highlight_color_b".
(cxx_format_postprocessor::handle):
Use highlight_colors::percent_h and highlight_colors::percent_i.
(pp_markup::element_quoted_type::print_type): New.
(range_label_for_type_mismatch::get_text): Pass nullptr for new
params of type_to_string_with_compare.
* typeck.cc (cp_comp_parm_types): New.
(cp_build_function_call_vec): Pass it to check_function_arguments.
(convert_for_assignment): Use highlight_colors::percent_h for the
labelled_range.
gcc/testsuite/ChangeLog:
* g++.dg/diagnostic/bad-binary-ops-highlight-colors.C: New test.
* g++.dg/diagnostic/bad-binary-ops-no-highlight-colors.C: New test.
* g++.dg/plugin/plugin.exp (plugin_test_list): Add
show-template-tree-color-no-highlight-colors.C to
show_template_tree_color_plugin.c.
* g++.dg/plugin/show-template-tree-color-labels.C: Update expected
output to reflect use of highlight-a and highlight-b to contrast
mismatches.
* g++.dg/plugin/show-template-tree-color-no-elide-type.C:
Likewise.
* g++.dg/plugin/show-template-tree-color-no-highlight-colors.C:
New test.
* g++.dg/plugin/show-template-tree-color.C: Update expected output
to reflect use of highlight-a and highlight-b to contrast
mismatches.
* g++.dg/warn/Wformat-gcc_diag-1.C: New test.
* g++.dg/warn/Wformat-gcc_diag-2.C: New test.
* g++.dg/warn/Wformat-gcc_diag-3.C: New test.
* gcc.dg/bad-binary-ops-highlight-colors.c: New test.
* gcc.dg/format/colors.c: New test.
* gcc.dg/plugin/diagnostic_plugin_show_trees.c (show_tree): Pass
nullptr for new param of gcc_rich_location::add_expr.
libcpp/ChangeLog:
* include/rich-location.h (location_range::m_highlight_color): New
field.
(rich_location::rich_location): Add optional label_highlight_color
param.
(rich_location::set_highlight_color): New decl.
(rich_location::add_range): Add optional label_highlight_color
param.
(rich_location::set_range): Likewise.
* line-map.cc (rich_location::rich_location): Add
"label_highlight_color" param and pass it to add_range.
(rich_location::set_highlight_color): New.
(rich_location::add_range): Add "label_highlight_color" param.
(rich_location::set_range): Add "highlight_color" param.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
The testcase g++.dg/vect/pr68762-2.cc exercises this on x86_64 with
partial vector usage enabled and AVX512 support.
PR tree-optimization/115868
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Correctly
compute the number of mask copies required for vect_record_loop_mask.
Jeff Law [Fri, 12 Jul 2024 19:11:33 +0000 (13:11 -0600)]
[PR rtl-optimization/115876] Fix one of two ubsan reported issues in new ext-dce.cc code
David Binderman did a bootstrap build with ubsan enabled which triggered a few
errors in the new ext-dce.cc code. This fixes the trivial case of shifting
negative values.
Bootstrapped and regression tested on x86.
Pushing to the trunk.
gcc/
PR rtl-optimization/115876
* ext-dce.cc (carry_backpropagate): Make mask and mmask unsigned.
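For readers unfamiliar with this class of ubsan report: left-shifting a negative signed value is undefined behavior in C and in C++ before C++20, while the same shift on an unsigned value is well defined. The sketch below is not the code in carry_backpropagate; `high_bits_mask` is a hypothetical helper that only illustrates the pattern of building a mask in an unsigned type.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {

// Build a mask with all bits at position SHIFT and above set.
// Doing this as "(int64_t) -1 << shift" would shift a negative value,
// which ubsan reports; the unsigned form has defined wrap-around
// semantics.
inline std::uint64_t high_bits_mask(unsigned shift) {
  assert(shift < 64);  // shifting by the full width is undefined too
  return ~std::uint64_t{0} << shift;
}

}  // namespace sketch
```

For example, `high_bits_mask (8)` yields 0xFFFFFFFFFFFFFF00 without any signed arithmetic involved.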
Marek Polacek [Fri, 12 Jul 2024 18:40:59 +0000 (14:40 -0400)]
doc: remove @opindex for fconcepts-ts
We're getting complaints from the CI system about this removed option.
I suspect I should have removed the @opindex and @itemx for it. This
patch does that.
gcc/ChangeLog:
* doc/invoke.texi: Remove @opindex and @itemx for -fconcepts-ts.
Daniel Bertalan [Tue, 9 Jul 2024 21:34:46 +0000 (23:34 +0200)]
Fix Xcode 16 build break with NULL != nullptr
As of Xcode 16 beta 2 with the macOS 15 SDK, each re-inclusion of the
stddef.h header causes the NULL macro in C++ to be re-defined to an
integral constant (__null). This makes the workaround in d59a576b8
("Redefine NULL to nullptr") ineffective, as other headers that are
typically included after system.h (such as obstack.h) do include
stddef.h too.
This can be seen by running the sample below through `clang++ -E`
Filed as FB14261859 to Apple and added a comment about it on LLVM PR
86843.
This fixes the cases in --enable-languages=c,c++,objc,obj-c++,rust build
where NULL being an integral constant instead of a null pointer literal
(therefore no longer implicitly converting to a pointer when used as a
template function's argument) caused issues.
gcc/value-pointer-equiv.cc:65:43: error: no viable conversion from `pair<typename __unwrap_ref_decay<long>::type, typename __unwrap_ref_decay<long>::type>' to 'const pair<tree, tree>'
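The failure mode can be reproduced in miniature without any Apple headers. In this sketch, which is not taken from the GCC sources, the literal `0L` stands in for an integral NULL (an assumption; the SDK's `__null` is a compiler builtin) and `make_empty` is a hypothetical function. The point is that template deduction sees the type of the argument rather than a null pointer constant.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

namespace sketch {

// Deduction on nullptr yields std::nullptr_t, which converts to any
// pointer type.
static_assert(
    std::is_same<decltype(std::make_pair(nullptr, nullptr)),
                 std::pair<std::nullptr_t, std::nullptr_t>>::value,
    "nullptr deduces as std::nullptr_t");
static_assert(std::is_convertible<std::nullptr_t, int *>::value,
              "nullptr_t converts to a pointer");

// Deduction on an integral zero yields long. Once it is a value of
// type long rather than a literal, it no longer converts to a pointer.
static_assert(std::is_same<decltype(std::make_pair(0L, 0L)),
                           std::pair<long, long>>::value,
              "integral zero deduces as long");
static_assert(!std::is_convertible<long, int *>::value,
              "a long value does not convert to a pointer");

// Compiles because pair<nullptr_t, nullptr_t> converts to
// pair<int *, int *>.
inline std::pair<int *, int *> make_empty() {
  return std::make_pair(nullptr, nullptr);
}

}  // namespace sketch
```

Replacing `nullptr` with `0L` in `make_empty` gives the same kind of "no viable conversion" error as in the message above.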
Bit of a brown paper bag issue, but: due to the representation
of the insn chain, insn_info::prev_any_insn would sometimes skip
over instructions. This led to an invalid update in the PR when
adding and removing instructions.
I think one of the reasons I failed to spot this when checking
the code is that m_prev_insn_or_last_debug_insn is misnamed:
it's the previous instruction *of the same type* or the last
debug instruction in a group. The patch therefore renames it to
m_prev_sametype_or_last_debug_insn (with the term prev_sametype
already being used in some accessors).
The reason this didn't show up earlier is that (a) prev_any_insn
is rarely used directly, (b) no instructions were lost from the
def-use chains, and (c) only consecutive debug instructions were
skipped when walking the insn chain.
The chaining scheme makes prev_any_insn more complicated than
next_any_insn, prev_nondebug_insn and next_nondebug_insn, but the
object code produced is still relatively simple.
FX Coudert [Fri, 12 Jul 2024 14:39:50 +0000 (15:39 +0100)]
modula2: bootstrap fix for string and vector headers.
This patch fixes the include of headers (<string> and <vector>) which
are included after GCC's system.h has been included. It defines
INCLUDE_STRING before including "system.h". This allows gcc to
bootstrap with Apple clang 15.
gcc/m2/ChangeLog:
* gm2-gcc/m2linemap.cc (INCLUDE_STRING): Define before
include of gcc-consolidation.h.
* gm2spec.cc (INCLUDE_STRING): Define before include of
system.h.
(INCLUDE_VECTOR): Ditto.
Jeff Law [Fri, 12 Jul 2024 13:53:41 +0000 (07:53 -0600)]
[RISC-V] Avoid unnecessary sign extension after memcmp
Similar to the str[n]cmp work, this adjusts the block compare expansion to do
its work in X mode with an appropriate lowpart extraction of the results at the
end of the sequence.
This has gone through my tester on rv32 and rv64, but that's it. Waiting on
pre-commit testing before moving forward.
gcc/
* config/riscv/riscv-string.cc (emit_memcmp_scalar_load_and_compare):
Set RESULT directly rather than using a temporary.
(emit_memcmp_scalar_result_calculation): Similarly.
(riscv_expand_block_compare_scalar): Use CONST0_RTX rather than
generating new RTL.
* config/riscv/riscv.md (cmpmemsi): Pass an X mode temporary to the
expansion routines. If necessary extract low part of the word to store
in final result location.
This fixes an ICE exposed by supporting exported non-function
using-decls. Sometimes when preparing to define a class, xref_tag will
find a using-decl belonging to a different namespace, which triggers the
checking_assert in modules handling.
Ideally I feel that 'lookup_and_check_tag' should be told whether we're
about to define the type and handle erroring on redefinitions itself to
avoid this issue (and provide better diagnostics by acknowledging the
using-declaration), but this is complicated with the current
fragmentation of definition checking. So for this patch we just fixup
the assertion and ensure that pushdecl properly errors on the
conflicting declaration later.
gcc/cp/ChangeLog:
* decl.cc (xref_tag): Move assertion into condition.
* name-lookup.cc (check_module_override): Check for conflicting
types and using-decls.
gcc/testsuite/ChangeLog:
* g++.dg/modules/using-19_a.C: New test.
* g++.dg/modules/using-19_b.C: New test.
Nathaniel Shead [Thu, 27 Jun 2024 01:08:15 +0000 (11:08 +1000)]
c++: Introduce USING_DECLs for non-function usings [PR114683]
With modules, a non-function using-declaration is not completely
interchangeable with the declaration that it refers to; in particular,
such a using-declaration may be exported without revealing the name of
the entity it refers to.
This patch fixes this by building USING_DECLs for all using-declarations
that bind a non-function from a different scope. These new decls can
then have purviewness and exportingness attached to them without
affecting the decl that they refer to.
We do this for all such usings, not just usings that may be revealed in
a module; this way we can verify the change in representation against
the (more comprehensive) non-modules testsuites, and in a future patch
we can use the locations of these using-decls to enhance relevant
diagnostics.
Another possible approach would be to reuse OVERLOADs for this, as is
already done within add_binding_entity for modules. I didn't do this
because lots of code (as well as the names of the accessors) makes
assumptions that OVERLOADs refer to function overload sets, and so
splitting this up reduced semantic burden and made it easier to avoid
unintentional changes. This did mean that we need to move out the
definitions of ovl_iterator::{purview,exporting}_p, because the
structures for module decls are declared later on in cp-tree.h.
Building USING_DECLs changed a couple of code paths when adjusting
bindings; in particular, pushdecl recognises global using-declarations
as usings now, and so checks fall through to update_binding. To not
regress g++.dg/lookup/linkage2.C the checks for 'extern' declarations no
longer were sufficient (they don't handle 'extern "C"'); but
duplicate_decls performed all the relevant checks anyway.
Otherwise in general we strip using-decls from all lookup_* functions
where necessary. Over time for diagnostics purposes it would probably
be good to slowly revert this (especially e.g. lookup_elaborated_type
causes some diagnostic quality regressions here) but this patch doesn't
do so to minimise churn.
This patch also tries not to build USING_DECLs when just redeclaring an
existing declaration, and instead reveals that declaration in-place.
This requires reworking some logic handling CONST_DECLs in module
streaming, since a non-using CONST_DECL may now be exported independently
of its containing enum.
'add_binding_entity' needs to explicitly write the names of unscoped
enumerators so that lazy loading will trigger when the name is found by
name lookup; it does this by pretending that the enum declarations are
always usings so that it doesn't double-write definitions. By also
checking if the enumerator was marked purview/exported we can use that
to override a non-purview/non-exported TYPE_DECL and ensure it's made
visible regardless.
When reading we should get the exported flag on the enumeration
constant, and so should properly create a binding for it. We don't need
to do anything to handle importedness as that checking is skipped for
EK_USINGs.
Some other places assume that module information for a CONST_DECL
inherits module information from its containing type. This includes:
- get_originating_module_decl, for determining if the name was imported
or has module attachment; I don't /think/ this change should affect
that, so I'm leaving this untouched.
- binding_cmp, for sorting by exportedness; since now an enumerator
could be exported without the containing decl being exported, we need
to handle this here too.
PR c++/114683
gcc/cp/ChangeLog:
* cp-tree.h (class ovl_iterator): Move definitions of purview_p
and exporting_p to name-lookup.cc.
* module.cc (depset::hash::add_binding_entity): Strip
using-decls. Remove workarounds. Handle CONST_DECLs with
different purview/exported from their enum.
(enum ct_bind_flags): Remove unnecessary cbf_wrapped flag.
(module_state::write_cluster): Likewise.
(module_state::read_cluster): Build USING_DECL for non-function
usings.
(binding_cmp): Handle CONST_DECLs with different purview and/or
exported from their enum.
(set_instantiating_module): Support CONST_DECLs.
* name-lookup.cc (get_fixed_binding_slot): Strip USING_DECLs.
(name_lookup::process_binding): Strip USING_DECLs.
(name_lookup::process_module_binding): Remove workaround.
(update_binding): Strip USING_DECLs, remove incorrect check for
non-extern variables.
(ovl_iterator::purview_p): Support USING_DECLs.
(ovl_iterator::exporting_p): Support USING_DECLs.
(walk_module_binding): Handle stat hack type.
(do_nonmember_using_decl): Strip USING_DECLs when comparing;
build USING_DECLs for non-function usings in different scope
rather than binding directly.
(get_namespace_binding): Strip USING_DECLs.
(lookup_name): Strip USING_DECLs.
(lookup_elaborated_type): Strip USING_DECLs.
* decl.cc (poplevel): Still support -Wunused for using-decls.
(lookup_and_check_tag): Remove unnecessary strip_using_decl.
* parser.cc (cp_parser_template_name): Likewise.
(cp_parser_nonclass_name): Likewise.
(cp_parser_class_name): Likewise.
gcc/testsuite/ChangeLog:
* g++.dg/lookup/using29.C: Update errors.
* g++.dg/lookup/using53.C: Update errors, add XFAILs.
* g++.dg/modules/using-22_b.C: Remove xfails.
* g++.dg/warn/Wunused-var-18.C: Update error, add check.
* g++.dg/lookup/using68.C: New test.
* g++.dg/modules/using-24_a.C: New test.
* g++.dg/modules/using-24_b.C: New test.
* g++.dg/modules/using-25_a.C: New test.
* g++.dg/modules/using-25_b.C: New test.
* g++.dg/modules/using-enum-4_a.C: New test.
* g++.dg/modules/using-enum-4_b.C: New test.
* g++.dg/modules/using-enum-4_c.C: New test.
Signed-off-by: Nathaniel Shead <nathanieloshead@gmail.com>
Reviewed-by: Jason Merrill <jason@redhat.com>
Currently instructions vgm and vrepi are utilized only for constant
vectors where the element mode equals the element mode of the
corresponding instruction. This patch lifts this restriction by making
use of those instructions for constant vectors even if element modes
do not coincide. For example, the constant vector
(v2di){0x7ffffffe7ffffffe, 0x7ffffffe7ffffffe}
can be loaded via vgmf %v0,1,30. Similarly, the constant vector
Analogously, if the element mode of a constant vector is smaller than the
element mode of a corresponding instruction, we still may make use of
those instructions. For example, the constant vector
(v4si){0x7fff, 0xfffe0000, 0x7fff, 0xfffe0000}
can be loaded via vgmg %v0,17,46. Similarly, the constant vector
(v4si){-1, -16643, -1, -16643}
can be loaded via vrepig %v0,-16643.
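The encodings quoted above can be checked with a little arithmetic. The
following C sketch is only a model, not code from the patch: the helper
names vgm_element_mask and vrepi_element64 are hypothetical, bit 0 is
assumed to be the most significant bit of an element, and only
non-wrapping bit ranges are modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one element produced by vgm: bits START..END
   are set, where bit 0 is the most significant bit of a WIDTH-bit
   element.  Only the non-wrapping case START <= END is modelled.  */
static uint64_t
vgm_element_mask (unsigned width, unsigned start, unsigned end)
{
  assert (width >= 1 && width <= 64 && start <= end && end < width);
  unsigned nbits = end - start + 1;
  uint64_t ones = nbits == 64 ? ~UINT64_C (0) : (UINT64_C (1) << nbits) - 1;
  return ones << (width - 1 - end);
}

/* Hypothetical model of one 64-bit element produced by vrepig: the
   16-bit immediate is sign-extended to the element width.  */
static uint64_t
vrepi_element64 (int16_t imm)
{
  return (uint64_t) (int64_t) imm;
}
```

Under these assumptions vgm_element_mask (32, 1, 30) is 0x7ffffffe, so two
such words fill one V2DI element with 0x7ffffffe7ffffffe. Likewise
vgm_element_mask (64, 17, 46) is 0x00007ffffffe0000, whose high and low
words are 0x7fff and 0xfffe0000, and vrepi_element64 (-16643) reads as the
word pair -1, -16643.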
Additionally this patch enables vgm, vgbm, vrepi for partial vectors,
i.e., vectors of size less than 16 bytes. Basically this is done by
treating a vector as a full vector resulting in replicating constants
into the ignored bits whereas vgbm sets those to zero.
Furthermore, there is no restriction to integer vectors anymore, i.e.,
supporting scalars of mode up to and including TI and TF and also
floating-point vectors.
Here are some numbers how often instructions are emitted for SPEC 2017:
I expect most (maybe even all) to save us a load from the literal pool.
gcc/ChangeLog:
* config/s390/2964.md: Remove extended mnemonics for vgm.
* config/s390/3906.md: Remove extended mnemonics for vgm.
* config/s390/3931.md: Remove extended mnemonics for vgm.
* config/s390/8561.md: Remove extended mnemonics for vgm.
* config/s390/constraints.md (jKK): Remove constraint.
(jzz): Add constraint.
* config/s390/s390-protos.h (s390_contiguous_bitmask_vector_p):
Add prototype.
(s390_constant_via_vgm_p): Add prototype.
(s390_constant_via_vrepi_p): Add prototype.
* config/s390/s390.cc (s390_contiguous_bitmask_vector_p): New
function.
(s390_constant_via_vgm_vrepi_helper): New function.
(s390_constant_via_vgm_p): New function.
(s390_constant_via_vgbm_p): For the sake of symmetry rename
s390_bytemask_vector_p into s390_constant_via_vgbm_p.
(s390_bytemask_vector_p): Deal with non-integer and partial
vectors.
(s390_constant_via_vrepi_p): New function.
(s390_legitimate_constant_p): Allow partial vectors.
(legitimate_reload_constant_p): Fix indentation.
(legitimate_reload_vector_constant_p): Restrict to constraints
j00, jm1, jxx, jyy, jzz only, i.e., allow partial vectors.
(s390_expand_vec_init): Also make use of vrepi if possible.
(print_operand): Add q,p,r for vgm,vrepi,vgbm, respectively.
Remove e,s,t for constant vectors.
* config/s390/s390.md (movti): Add variants utilizing
vgbm,vgm,vrepi.
* config/s390/vector.md (mov<mode><tf_vr>): Adapt variants
for vgbm,vgm,vrepi for the new scheme.
(mov<mode>): Adapt variants for vgbm,vgm for the new
scheme and add vrepi variant for modes V_8,V_16,V_32,V_64.
gcc/testsuite/ChangeLog:
* gcc.target/s390/vector/vec-copysign.c: Change to non-extended
mnemonic.
* gcc.target/s390/vector/vec-genmask-1.c: Change to non-extended
mnemonic.
* gcc.target/s390/vector/vec-init-1.c: Change to non-extended
mnemonic.
* gcc.target/s390/vector/vec-vrepi-1.c: Change to non-extended
mnemonic.
* gcc.target/s390/zvector/autovec-double-quiet-uneq.c: Change to
non-extended mnemonic.
* gcc.target/s390/zvector/autovec-float-quiet-uneq.c: Change to
non-extended mnemonic.
* gcc.target/s390/zvector/vec-genmask-1.c: Change to
non-extended mnemonic.
* gcc.target/s390/zvector/vec-splat-1.c: Change to non-extended
mnemonic.
* gcc.target/s390/zvector/vec-splat-2.c: Change to non-extended
mnemonic.
* gcc.target/s390/vector/vgbm-double-1.c: New test.
* gcc.target/s390/vector/vgbm-float-1.c: New test.
* gcc.target/s390/vector/vgbm-int128-1.c: New test.
* gcc.target/s390/vector/vgbm-integer-1.c: New test.
* gcc.target/s390/vector/vgbm-longdouble-1.c: New test.
* gcc.target/s390/vector/vgm-df-1.c: New test.
* gcc.target/s390/vector/vgm-di-1.c: New test.
* gcc.target/s390/vector/vgm-hi-1.c: New test.
* gcc.target/s390/vector/vgm-int128-1.c: New test.
* gcc.target/s390/vector/vgm-longdouble-1.c: New test.
* gcc.target/s390/vector/vgm-qi-1.c: New test.
* gcc.target/s390/vector/vgm-sf-1.c: New test.
* gcc.target/s390/vector/vgm-si-1.c: New test.
* gcc.target/s390/vector/vgm-tf-1.c: New test.
* gcc.target/s390/vector/vgm-ti-1.c: New test.
* gcc.target/s390/vector/vrepi-df-1.c: New test.
* gcc.target/s390/vector/vrepi-di-1.c: New test.
* gcc.target/s390/vector/vrepi-hi-1.c: New test.
* gcc.target/s390/vector/vrepi-int128-1.c: New test.
* gcc.target/s390/vector/vrepi-qi-1.c: New test.
* gcc.target/s390/vector/vrepi-sf-1.c: New test.
* gcc.target/s390/vector/vrepi-si-1.c: New test.
* gcc.target/s390/vector/vrepi-tf-1.c: New test.
* gcc.target/s390/vector/vrepi-ti-1.c: New test.
Although for instructions MVI and MVIY it does not make a difference
whether the immediate is interpreted as signed or unsigned, GAS expects
unsigned immediates for instruction format SI_URD.
gcc/ChangeLog:
* config/s390/vector.md (mov<mode>): Fix output template for
movv1qi.
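As an illustration of why the two spellings are equivalent, here is a
minimal C sketch. It is not the output template from vector.md;
toy_unsigned_imm8 is a hypothetical helper that maps a byte immediate to
the unsigned form that GAS is said to expect.

```c
#include <assert.h>

/* Map a byte immediate given as a signed value in [-128, 127] (or
   already unsigned in [0, 255]) to the unsigned spelling [0, 255].
   Both spellings denote the same 8-bit pattern.  */
static unsigned
toy_unsigned_imm8 (int imm)
{
  assert (imm >= -128 && imm <= 255);
  return (unsigned) imm & 0xffu;
}
```

For example, the signed immediate -1 and the unsigned immediate 255 both
describe the byte 0xff; only the second form is printed by this helper.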
Roger Sayle [Fri, 12 Jul 2024 11:30:56 +0000 (12:30 +0100)]
i386: Some AVX512 ternlog expansion refinements.
This patch replaces the calls to force_reg in ix86_expand_ternlog_binop
and ix86_expand_ternlog with gen_reg_rtx and emit_move_insn.
This patch also cleans up whitespace, consistently uses CONST_VECTOR_P
instead of GET_CODE and tweaks checks for ix86_ternlog_leaf_p (for
example where vpandn may take a memory operand).
2024-07-12 Roger Sayle <roger@nextmovesoftware.com>
Hongtao Liu <hongtao.liu@intel.com>
gcc/ChangeLog
* config/i386/i386-expand.cc (ix86_broadcast_from_constant):
Use CONST_VECTOR_P instead of comparison against GET_CODE.
(ix86_gen_bcst_mem): Likewise.
(ix86_ternlog_leaf_p): Likewise.
(ix86_ternlog_operand_p): ix86_ternlog_leaf_p is always true for
vector_all_ones_operand.
(ix86_expand_ternlog_bin_op): Use CONST_VECTOR_P instead of
equality comparison against GET_CODE. Replace call to force_reg
with gen_reg_rtx and emit_move_insn (for VEC_DUPLICATE broadcast).
Check for !register_operand instead of memory_operand.
Support CONST_VECTORs by calling force_const_mem.
(ix86_expand_ternlog): Fix indentation whitespace.
Allow ix86_ternlog_leaf_p as ix86_expand_ternlog_andnot's second
operand. Use CONST_VECTOR_P instead of equality against GET_CODE.
Use gen_reg_rtx and emit_move_insn for ~a, ~b and ~c cases.
The handling of the target attribute used alloca to allocate
a copy of unverified user input, which could exhaust the stack
if the input is too long. This patch converts it to auto_vecs
instead.
I wondered about converting it to use std::string, which we
already use elsewhere, but that would be more invasive and
controversial.
[libstdc++] [testsuite] require dfprt on some tests
On a target that doesn't enable decimal float components in libgcc
(because the libc doesn't define all required FE_* macros), but whose
compiler supports _Decimal* types, the effective target requirement
dfp passes, but several tests won't link because the runtime support
they depend on is missing. State their dfprt requirement.
[alpha] adjust MEM alignment for block move [PR115459]
Before issuing loads or stores for a block move, adjust the MEM
alignments if analysis of the addresses enabled the inference of
stricter alignment. This ensures that the MEMs are sufficiently
aligned for the corresponding insns, which avoids trouble in case of
e.g. substitutions into SUBREGs.
for gcc/ChangeLog
PR target/115459
* config/alpha/alpha.cc (alpha_expand_block_move): Adjust
MEMs to match inferred alignment.
YunQiang Su [Thu, 11 Jul 2024 12:43:54 +0000 (20:43 +0800)]
RISC-V: NO_WARNING preferred else value for RVV
PR target/115840.
In riscv_preferred_else_value, we create an uninitialized tmp var
for else value, instead of the 0 (as default_preferred_else_value)
or the pre-existing VAR (as aarch64 does), so that we can use agnostic
policy.
The problem is that `warn_uninit` will emit a warning:
'({anonymous})' may be used uninitialized
Let's mark this tmp var as NO_WARNING.
This problem was found when I tried to build glibc with the V extension.
gcc
PR target/115840
* config/riscv/riscv.cc (riscv_preferred_else_value): Mark
tmp_var as NO_WARNING.
gcc/testsuite
* gcc.dg/vect/pr115840.c: New testcase.
Mikael Morin [Thu, 11 Jul 2024 19:55:58 +0000 (21:55 +0200)]
fortran: Factor the evaluation of MINLOC/MAXLOC's BACK argument
Move the evaluation of the BACK argument out of the loop in the inline code
generated for MINLOC or MAXLOC. For that, add a new (scalar) element
associated with BACK to the scalarization loop chain, evaluate the argument
with the context of that element, and let the scalarizer do its job.
The problem was not only a missed optimisation, but also a wrong code
one in the cases where the expression associated with BACK is not free of
side-effects, making multiple evaluations observable.
The new tests check the evaluation count of the BACK argument, and try to
cover all the variations (integral or floating-point type, constant or
unknown shape, absent or scalar or array MASK) supported by the inline
implementation of the functions. Care has been taken to not check the case
of a constant .FALSE. MASK, for which the evaluation of BACK can be elided.
gcc/fortran/ChangeLog:
* trans-intrinsic.cc (gfc_conv_intrinsic_minmaxloc): Create a new
scalar scalarization chain element if BACK is present. Add it to
the loop. Set the scalarization chain before evaluating the
argument.
gcc/testsuite/ChangeLog:
* gfortran.dg/maxloc_5.f90: New test.
* gfortran.dg/minloc_5.f90: New test.
RISC-V: Disable misaligned vector access in hook riscv_slow_unaligned_access[PR115862]
The reason is that in the following code, icode = movmisalignv8si has
already been rejected by TARGET_VECTOR_MISALIGN_SUPPORTED, but it is
allowed by targetm.slow_unaligned_access, which is contradictory.
Kito Cheng [Tue, 9 Jul 2024 07:50:57 +0000 (15:50 +0800)]
RISC-V: Add SiFive extensions, xsfvcp and xsfcease
We have already upstreamed these extensions into binutils, and now we need GCC
to recognize these extensions and pass them to binutils as well. We also plan
to upstream intrinsics in the near future. :)
Kewen Lin [Fri, 12 Jul 2024 06:32:57 +0000 (01:32 -0500)]
rs6000: Remove vcond{,u} expanders
As PR114189 shows, middle-end will obsolete vcond, vcondu
and vcondeq optabs soon. This patch is to remove all
vcond{,u} expanders in rs6000 port and adjust the function
rs6000_emit_vector_cond_expr which is called by those
expanders as static.
PR target/115659
gcc/ChangeLog:
* config/rs6000/rs6000-protos.h (rs6000_emit_vector_cond_expr): Remove.
* config/rs6000/rs6000.cc (rs6000_emit_vector_cond_expr): Add static
qualifier as it is only called by rs6000_emit_swsqrt now.
* config/rs6000/vector.md (vcond<VEC_F:mode><VEC_F:mode>): Remove.
(vcond<VEC_I:mode><VEC_I:mode>): Remove.
(vcondv4sfv4si): Likewise.
(vcondv4siv4sf): Likewise.
(vcondv2dfv2di): Likewise.
(vcondv2div2df): Likewise.
(vcondu<VEC_I:mode><VEC_I:mode>): Likewise.
(vconduv4sfv4si): Likewise.
(vconduv2dfv2di): Likewise.
Richard Biener [Thu, 11 Jul 2024 08:18:55 +0000 (10:18 +0200)]
tree-optimization/115867 - ICE with simdcall vectorization in masked loop
When only a loop mask is to be supplied for the inbranch arg to a
simd function we fail to handle integer mode masks correctly. We
need to guess the number of elements represented by it. This assumes
that excess arguments are all for masks; I wasn't able to create
a simdclone with more than one integer mode mask argument.
The gcc.dg/vect/vect-simd-clone-20.c exercises this with -mavx512vl
PR tree-optimization/115867
* tree-vect-stmts.cc (vectorizable_simd_clone_call): Properly
guess the number of mask elements for integer mode masks.
Jeff Law [Fri, 12 Jul 2024 03:37:34 +0000 (21:37 -0600)]
[committed] Fix m68k bootstrap segfault with late-combine
So the m68k port has failed to bootstrap since the introduction of
late-combine. My suspicion has been this is a backend problem. Sure enough
after bisecting things down (thank goodness for the debug counter!) I'm happy
to report m68k (after this patch) has moved into its stage3 build for the first
time in a month.
Basically late-combine propagated an address calculation to its use points,
generating this insn (dwarf2out.c, I forget what function):
> (define_insn "extendsidi2"
> [(set (match_operand:DI 0 "nonimmediate_operand" "=d,o,o,<")
> (sign_extend:DI
> (match_operand:SI 1 "nonimmediate_src_operand" "rm,rm,r<Q>,rm")))
> (clobber (match_scratch:SI 2 "=X,&d,&d,&d"))]
> ""
> {
> if (which_alternative == 0)
> /* Handle alternative 0. */
> {
> if (TARGET_68020 || TARGET_COLDFIRE)
> return "move%.l %1,%R0\;smi %0\;extb%.l %0";
> else
> return "move%.l %1,%R0\;smi %0\;ext%.w %0\;ext%.l %0";
> }
>
> /* Handle alternatives 1, 2 and 3. We don't need to adjust address by 4
> in alternative 3 because autodecrement will do that for us. */
> operands[3] = adjust_address (operands[0], SImode,
> which_alternative == 3 ? 0 : 4);
> operands[0] = adjust_address (operands[0], SImode, 0);
>
> if (TARGET_68020 || TARGET_COLDFIRE)
> return "move%.l %1,%3\;smi %2\;extb%.l %2\;move%.l %2,%0";
> else
> return "move%.l %1,%3\;smi %2\;ext%.w %2\;ext%.l %2\;move%.l %2,%0";
> }
> [(set_attr "ok_for_coldfire" "yes,no,yes,yes")])
Note the smi/ext instruction pair in the case for alternatives 1..3. Those
clobber the scratch register before we're done consuming inputs. The scratch
register really needs to be marked as an earlyclobber.
That fixes the bootstrap problem, but a cursory review of m68k.md is not
encouraging. I will not be surprised at all if there's more of this kind of
problem lurking.
But happy to at least have m68k bootstrapping again. It's failing the
comparison test, but definitely progress.
* config/m68k/m68k.md (extendsidi2): Add missing early clobbers.
Ian Lance Taylor [Fri, 12 Jul 2024 02:29:04 +0000 (19:29 -0700)]
libbacktrace: avoid infinite recursion
We could get an infinite recursion in an odd case in which a
.gnu_debugdata section was added to a debug file, and mini_debuginfo
was put into the debug file, and the debug file was put into a
/usr/lib/debug directory to be found by build ID. This combination
doesn't really make sense but we shouldn't get an infinite recursion.
* elf.c (elf_add): Don't use .gnu_debugdata if we are already
reading a debuginfo file.
* Makefile.am (m2test_*): New test targets.
(CHECK_PROGRAMS): Add m2test.
(MAKETESTS): Add m2test_minidebug2.
(%_minidebug2): New pattern.
(CLEANFILES): Remove minidebug2 files.
* Makefile.in: Regenerate.
Jonathan Wakely [Thu, 11 Jul 2024 20:23:15 +0000 (21:23 +0100)]
libstdc++: Test that std::atomic_ref<bool> uses the primary template
The previous commit changed atomic_ref<bool> to not use the integral
specialization. This adds a test to verify that change. We can't
directly test that the primary template is used, but we can check that
the member functions of the integral specializations are not present.
libstdc++-v3/ChangeLog:
* testsuite/29_atomics/atomic_ref/bool.cc: New test.
libstdc++: the specialization atomic_ref<bool> should use the primary template
Per [atomics.ref.int] `bool` is excluded from the list of integral types
for which there is a specialization of the `atomic_ref` class template
and [Note 1] clearly states that `atomic_ref<bool>` "uses the primary
template" instead.
libstdc++-v3/ChangeLog:
* include/bits/atomic_base.h (__atomic_ref): Do not use integral
specialization for bool.
Jeff Law [Thu, 11 Jul 2024 18:05:56 +0000 (12:05 -0600)]
[to-be-committed,RISC-V] Eliminate unnecessary sign extension after inlined str[n]cmp
This patch eliminates an unnecessary sign extension for scalar inlined
string comparisons on rv64.
Conceptually this is pretty simple. Prove all the paths which "return"
a value from the inlined string comparison already have sign extended
values.
FINAL_LABEL is the point after the calculation of the return value. So
if we have a jump to FINAL_LABEL, we must have a properly extended
result value at that point.
Second we're going to arrange in the .md part of the expander to use an
X mode temporary for the result. After computing the result we will (if
necessary) extract the low part of the result using a SUBREG tagged with
the appropriate SUBREG_PROMOTED_* bits.
So with that background.
We find a jump to FINAL_LABEL in emit_strcmp_scalar_compare_byte. Since
we know the result is X mode, we can just emit the subtraction of the
two chars in X mode and we'll have a properly sign extended result.
There are 4 jumps to final_label in emit_strcmp_scalar.
The first is just returning zero and needs trivial simplification to not
force the result into SImode.
The second is after calling strcmp in the library. The ABI mandates
that value is sign extended, so there's nothing to do for that case.
The 3rd occurs after a call to
emit_strcmp_scalar_result_calculation_nonul. If we dive into that
routine it needs simplification similar to what we did in
emit_strcmp_scalar_compare_byte.
The 4th occurs after a call to emit_strcmp_scalar_result_calculation
which again needs trivial adjustment like we've done in the other routines.
Finally, at the end of expand_strcmp, just store the X mode result
sitting in SUB to RESULT.
The net of all that is we know every path has its result properly
extended to X mode. Standard redundant extension removal will take care
of the rest.
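The core argument, that subtracting two bytes in X mode already yields a
properly sign-extended value, can be sketched in plain C. This is only an
illustration with hypothetical helper names; int64_t stands in for X mode
on rv64 and the code is unrelated to the RTL the expander emits.

```c
#include <assert.h>
#include <stdint.h>

/* Difference of two bytes computed in a 64-bit integer, standing in
   for X mode on rv64.  The result always lies in [-255, 255].  */
static int64_t
toy_byte_diff_xmode (unsigned char a, unsigned char b)
{
  return (int64_t) a - (int64_t) b;
}

/* Return nonzero if sign-extending the low 32 bits of V reproduces V,
   i.e. if V is already a properly sign-extended 32-bit value.  */
static int
toy_is_sign_extended32 (int64_t v)
{
  return (int64_t) (int32_t) (uint32_t) (uint64_t) v == v;
}
```

Because every possible difference fits in [-255, 255], taking the low
32 bits loses nothing, which is why an explicit extension afterwards
would be redundant.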
We've been running this within Ventana for about 6 months, so naturally
it's been through various QA cycles, dhrystone, spec2017, etc. It's
also been through a build/test cycle in my tester. Waiting on results
from the pre-commit testing before moving forward.
gcc/
* config/riscv/riscv-string.cc
(emit_strcmp_scalar_compare_byte): Set RESULT directly rather
than using a new temporary.
(emit_strcmp_scalar_result_calculation_nonul): Likewise.
(emit_strcmp_scalar_result_calculation): Likewise.
(riscv_expand_strcmp_scalar): Use CONST0_RTX rather than
generating a new node.
(expand_strcmp): Copy directly from SUB to RESULT.
* config/riscv/riscv.md (cmpstrnsi, cmpstrsi): Pass an X
mode temporary to the expansion routines. If necessary
extract low part of the word to store in final result location.
Andrew Pinski [Sat, 22 Jun 2024 04:07:26 +0000 (21:07 -0700)]
Ranger: Mark a few classes as final
I noticed there was a warning from clang about int_range's
dtor being marked as final saying the class cannot be inherited from.
So let's mark the few ranger classes as final for those which we know
will be final.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* value-range.h (class int_range): Mark as final.
(class prange): Likewise.
(class frange): Likewise.
Signed-off-by: Andrew Pinski <quic_apinski@quicinc.com>
recog: Avoid validate_change shortcut for groups [PR115782]
In this PR, due to the -f flags, we ended up with:
bb1: r10=r10
...
bb2: r10=r10
...
bb3: ...=r10
with bb1->bb2 and bb1->bb3.
late-combine successfully combined the bb1->bb2 def-use and set
the insn code to NOOP_MOVE_INSN_CODE. The bb1->bb3 combination
then failed for... reasons. At this point, everything should have
been rewound to its original state.
However, substituting r10=r10 into r10=r10 gives r10=r10, and
validate_change had an early-out for no-op rtl changes. This meant
that validate_change did not register a change for the bb2 insn and
so did not save its old insn code. The NOOP_MOVE_INSN_CODE therefore
persisted even after the attempt had been rewound.
IMO it'd be too cumbersome and error-prone to expect all users of
validate_change to be aware of this possibility. If code is using
validate_change with in_group=1, I think it has a reasonable expectation
that a change will be registered and that the insn code will be saved
(and restored on cancel). This patch therefore limits the shortcut
to the !in_group case.
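The failure mode can be reproduced with a toy undo log. The sketch below
is not recog.cc: every name in it is hypothetical, patterns and insn codes
are plain integers, and only the grouped (in_group) path is modelled. The
noop_shortcut flag selects between the old early-out and the behaviour
described above.

```c
#include <assert.h>
#include <stdbool.h>

enum { TOY_ORIGINAL_CODE = 1, TOY_NOOP_MOVE_CODE = 2 };

struct toy_insn { int pattern; int code; };
struct toy_change { struct toy_insn *insn; int old_pattern; int old_code; };

static struct toy_change toy_changes[8];
static int toy_num_changes;

/* Queue a grouped change of INSN's pattern to NEW_PATTERN.  If
   NOOP_SHORTCUT is true, return early when the pattern would not
   change, without saving the old insn code.  */
static void
toy_validate_change_in_group (struct toy_insn *insn, int new_pattern,
                              bool noop_shortcut)
{
  if (noop_shortcut && insn->pattern == new_pattern)
    return;
  assert (toy_num_changes < 8);
  toy_changes[toy_num_changes++]
    = (struct toy_change) { insn, insn->pattern, insn->code };
  insn->pattern = new_pattern;
}

/* Undo every queued change, restoring patterns and insn codes.  */
static void
toy_cancel_changes (void)
{
  while (toy_num_changes > 0)
    {
      struct toy_change *c = &toy_changes[--toy_num_changes];
      c->insn->pattern = c->old_pattern;
      c->insn->code = c->old_code;
    }
}

/* Substitute a pattern into an identical one, let "recognition" set the
   no-op move code, then cancel.  Return the insn code left behind.  */
static int
toy_code_after_cancel (bool noop_shortcut)
{
  struct toy_insn insn = { 10, TOY_ORIGINAL_CODE };
  toy_validate_change_in_group (&insn, 10, noop_shortcut);
  insn.code = TOY_NOOP_MOVE_CODE;
  toy_cancel_changes ();
  return insn.code;
}
```

With the shortcut enabled the no-op move code survives the cancel; with
it disabled the change is registered and the original code is restored.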
gcc/
PR rtl-optimization/115782
* recog.cc (validate_change_1): Suppress early exit for no-op
changes that are part of a group.
gcc/testsuite/
PR rtl-optimization/115782
* gcc.dg/pr115782.c: New test.
Eric Botcazou [Thu, 11 Jul 2024 08:49:13 +0000 (10:49 +0200)]
Fix gimplification of ordering comparisons of arrays of bytes
The Ada compiler now defers to the gimplifier for ordering comparisons of
arrays of bytes (Ada parlance for <, >, <= and >=) because the gimplifier
in turn defers to memcmp for them, which implements the required semantics.
However, the gimplifier has a special processing for aggregate types whose
mode is not BLKmode and this processing deviates from the memcmp semantics
when the target is little-endian.
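A small C sketch shows the deviation. It is an illustration only, with
hypothetical helper names: the loads are written out byte by byte so the
result does not depend on the host's endianness.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read four bytes as a little-endian 32-bit integer: the first byte
   in memory is the least significant.  */
static uint32_t
toy_load_le32 (const unsigned char *p)
{
  assert (p != NULL);
  return ((uint32_t) p[0] | (uint32_t) p[1] << 8
          | (uint32_t) p[2] << 16 | (uint32_t) p[3] << 24);
}

/* Read four bytes as a big-endian 32-bit integer: the first byte in
   memory is the most significant.  */
static uint32_t
toy_load_be32 (const unsigned char *p)
{
  assert (p != NULL);
  return ((uint32_t) p[0] << 24 | (uint32_t) p[1] << 16
          | (uint32_t) p[2] << 8 | (uint32_t) p[3]);
}

/* Ordering comparison with memcmp semantics: the first differing byte
   in memory order decides.  */
static int
toy_bytes_less (const unsigned char *a, const unsigned char *b)
{
  return memcmp (a, b, 4) < 0;
}
```

For the arrays {1, 0, 0, 2} and {2, 0, 0, 1}, memcmp orders the first
before the second and so does the big-endian integer view, whereas the
little-endian integer view orders them the other way round.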
gcc/
* gimplify.cc (gimplify_scalar_mode_aggregate_compare): Add support
for ordering comparisons.
(gimplify_expr) <default>: Call gimplify_scalar_mode_aggregate_compare
only for integral scalar modes.
gcc/testsuite/
* gnat.dg/array42.adb, gnat.dg/array42_pkg.ads: New test.
There are these insns that subtract and zero-extend where
the subtrahend is zero-extended to the mode of the minuend.
This patch uses one insn (and split) with mode iterators
instead of spelling out each variant individually.
This has the additional benefit that u32 - u24 is also supported,
which previously wasn't.
c++/modules: Keep entity mapping info across duplicate_decls [PR99241]
When duplicate_decls finds a match with an existing imported
declaration, it clears DECL_LANG_SPECIFIC of the olddecl and replaces it
with the contents of newdecl; this clears DECL_MODULE_ENTITY_P causing
an ICE if the same declaration is imported again later.
This fixes the issue by ensuring that the flag is transferred to newdecl
before clearing so that it ends up on olddecl again.
For future-proofing we also do the same with DECL_MODULE_KEYED_DECLS_P,
though because we don't yet support textual redefinition merging we
can't yet test this works as intended. I don't expect it's possible for
a new declaration already to have extra keyed decls mismatching that of
the old declaration though, so I don't do anything with 'keyed_map' at
this time.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_arith.h: Add test
helper macros.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_data.h: Add test
data for .SAT_SUB in zip benchmark.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_binary_vx.h: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_zip-run.c: New test.
* gcc.target/riscv/rvv/autovec/binop/vec_sat_u_sub_zip.c: New test.
Fortran: Fix rejecting class arrays of different ranks as storage association argument and add un/pack_class. [PR96992]
Removing the assert in trans-expr led to initial strides not being set,
which is now fixed. When the array needs repacking, this is now done for
class arrays, too.
Packing class arrays was done using the regular internal pack
function in the past. But that does not use the vptr's copy
function and breaks OOP paradigms (e.g. deep copy). The new
un-/pack_class functions use the vptr's copy functionality to
implement OOP paradigms correctly.
PR fortran/96992
gcc/fortran/ChangeLog:
* trans-array.cc (gfc_trans_array_bounds): Set a starting
stride, when descriptor expects a variable for the stride.
(gfc_trans_dummy_array_bias): Allow storage association for
dummy class arrays, when they are not elemental.
(gfc_conv_array_parameter): Add more general class support
and packing for classes, too.
* trans-array.h (gfc_conv_array_parameter): Add lbound shift
for class arrays.
* trans-decl.cc (gfc_build_builtin_function_decls): Add decls
for internal_un-/pack_class.
* trans-expr.cc (gfc_reset_vptr): Allow supplying a type-tree
to generate the vtab from.
(gfc_class_set_vptr): Allow supplying a class-tree to take the
vptr from.
(class_array_data_assign): Rename to gfc_class_array_data_assign
and make usable from other compile units.
(gfc_class_array_data_assign): Renamed from class_array_data_
assign.
(gfc_conv_derived_to_class): Remove assert to
allow converting derived to class type arrays with assumed
rank. Reduce code base and use gfc_conv_array_parameter also
for classes.
(gfc_conv_class_to_class): Use gfc_class_data_assign.
(gfc_conv_procedure_call): Adapt to new signature of
gfc_conv_derived_to_class.
* trans-io.cc (transfer_expr): Same.
* trans-stmt.cc (trans_associate_var): Same.
* trans.h (gfc_conv_derived_to_class): Signature changed.
(gfc_class_array_data_assign): Made public.
(gfor_fndecl_in_pack_class): Added declaration.
(gfor_fndecl_in_unpack_class): Same.
libgfortran/ChangeLog:
* Makefile.am: Add in_un-/pack_class.c to build.
* Makefile.in: Regenerated from Makefile.am.
* gfortran.map: Added new functions and bumped ABI.
* libgfortran.h (GFC_CLASS_T): Added for generating class
representation at runtime.
* runtime/in_pack_class.c: New file.
* runtime/in_unpack_class.c: New file.
Jørgen Kvalsvik [Fri, 29 Mar 2024 12:01:37 +0000 (13:01 +0100)]
Add function filtering to gcov
Add the --include and --exclude flags to gcov to control what functions
to report on. This is meant to make gcov more practical when
writing test suites or performing other coverage experiments, which
tend to focus on a few functions at a time. This really shines in
combination with the -t/--stdout flag. With support for more expansive
metrics in gcov like modified condition/decision coverage (MC/DC) and
path coverage, output quickly gets overwhelming without filtering.
The approach is quite simple: filters are egrep regexes and are
evaluated left-to-right, and the last filter "wins", that is, if a
function matches an --include and a subsequent --exclude, it should not
be included in the output. All of the output machinery works on the
function table, so optionally not adding a function to it makes even
the json output work as expected, and only minor changes are needed to
suppress the filtered-out functions.
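The "last filter wins" rule can be sketched as below. This is only an
illustration, not the code in gcov.cc: Filter, make_filter and
should_report are hypothetical names, and the default used when no
filter matches (keep the function only if the list is empty or starts
with an --exclude) is an assumption chosen to agree with the demo output
in this message.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical filter record: a pattern plus whether it came from
// --include (true) or --exclude (false).
struct Filter {
  std::regex pattern;
  bool include;
};

// Build a filter using the POSIX extended (egrep-like) grammar.
Filter make_filter(const std::string& re, bool include) {
  return Filter{std::regex(re, std::regex::extended), include};
}

// Walk the filters left to right; the last one that matches decides.
// Assumption: with no match, keep the function only if the filter list
// is empty or its first entry is an --exclude.
bool should_report(const std::vector<Filter>& filters,
                   const std::string& name) {
  bool keep = filters.empty() || !filters.front().include;
  for (const Filter& f : filters)
    if (std::regex_search(name, f.pattern))
      keep = f.include;
  return keep;
}

// Usage: --include=su --exclude=sum keeps sub but drops sum and mul.
void filter_example() {
  std::vector<Filter> filters{make_filter("su", true),
                              make_filter("sum", false)};
  assert(should_report(filters, "sub"));
  assert(!should_report(filters, "sum"));
  assert(!should_report(filters, "mul"));
}
```

Deciding per function before anything is added to the function table is
what lets every output format share the same filtering.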
Demo: math.c
int mul (int a, int b) {
return a * b;
}
int sub (int a, int b) {
return a - b;
}
int sum (int a, int b) {
return a + b;
}
Plain matches:
$ gcov -t math --include=sum
-: 0:Source:math.c
-: 0:Graph:math.gcno
-: 0:Data:-
-: 0:Runs:0
#####: 9:int sum (int a, int b) {
#####: 10: return a + b;
-: 11:}
$ gcov -t math --include=mul
-: 0:Source:math.c
-: 0:Graph:math.gcno
-: 0:Data:-
-: 0:Runs:0
#####: 1:int mul (int a, int b) {
#####: 2: return a * b;
-: 3:}
Regex match:
$ gcov -t math --include=su
-: 0:Source:math.c
-: 0:Graph:math.gcno
-: 0:Data:-
-: 0:Runs:0
#####: 5:int sub (int a, int b) {
#####: 6: return a - b;
-: 7:}
#####: 9:int sum (int a, int b) {
#####: 10: return a + b;
-: 11:}
And similar for exclude:
$ gcov -t math --exclude=sum
-: 0:Source:math.c
-: 0:Graph:math.gcno
-: 0:Data:-
-: 0:Runs:0
#####: 1:int mul (int a, int b) {
#####: 2: return a * b;
-: 3:}
#####: 5:int sub (int a, int b) {
#####: 6: return a - b;
-: 7:}
Matching generally works well for mangled names, as the mangled names
also have the base symbol name in them. By default, functions are matched
by the mangled name, which means matching on base names always works as
expected. The -M flag makes the matching work on the demangled name
which is quite useful when you only want to report on specific
overloads and can use the full type names.
Why not just use grep? grep is not really sufficient as grep is very
line oriented, and the reports that benefit the most from filtering
often unpredictably span multiple lines based on the state of coverage.
For example, a condition coverage report for 3 terms/6 outcomes only
outputs 1 line when all conditions are covered, and 7 with no lines
covered.
* lib/gcov.exp: Add filtering test function.
* g++.dg/gcov/gcov-19.C: New test.
* g++.dg/gcov/gcov-20.C: New test.
* g++.dg/gcov/gcov-21.C: New test.
* gcc.misc-tests/gcov-25.c: New test.
* gcc.misc-tests/gcov-26.c: New test.
* gcc.misc-tests/gcov-27.c: New test.
* gcc.misc-tests/gcov-28.c: New test.
Ensure that function.end_line is in the lines vector for the source
file, even if it is not explicitly touched by a basic block. This
ensures consistency with what you would expect. For example, this file
has sources[sum.cc].lines.size () == 23 and main.end_line == 2 without
adjusting sources.lines, which in this case is a no-op.
#####: 17:int main ()
-: 18:{
#####: 19: sum (1, 2);
#####: 20: sum (1.1, 2);
#####: 21: sum (2.2, 2.3);
#####: 22:}
This is a useful property when combined with selective reporting.
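A minimal sketch of the adjustment, assuming the per-source line table
is a vector indexed by line number. LineInfo and ensure_end_line are
hypothetical names, not the ones used in gcov.cc.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for gcov's per-line record.
struct LineInfo {
  bool exists = false;
};

// Grow the line table so that index end_line is valid. This is a no-op
// when the table is already large enough.
void ensure_end_line(std::vector<LineInfo>& lines, std::size_t end_line) {
  if (lines.size() <= end_line)
    lines.resize(end_line + 1);
}

// Usage: a table with 3 slots grows to cover line 5, then stays put.
void end_line_example() {
  std::vector<LineInfo> lines(3);
  ensure_end_line(lines, 5);
  assert(lines.size() == 6);
  ensure_end_line(lines, 2);
  assert(lines.size() == 6);
}
```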
gcc/ChangeLog:
* gcov.cc (process_all_functions): Ensure fn.end_line is
included in source[fn].lines.
RISC-V: c implies zca, and conditionally zcf & zcd
According to Zc-1.0.4-3.pdf from
https://github.com/riscvarchive/riscv-code-size-reduction/releases/tag/v1.0.4-3
The rule is that:
- C always implies Zca
- C+F implies Zcf (RV32 only)
- C+D implies Zcd
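The three rules can be written down as a small function. This is a
sketch of the rule only, not the imply tables in riscv-common.cc;
implied_by_c and its parameters are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <string>

// Extensions implied by C, given whether F and D are present and the
// register width (32 or 64).
std::set<std::string> implied_by_c(bool has_f, bool has_d, int xlen) {
  std::set<std::string> implied{"zca"};  // C always implies Zca
  if (has_f && xlen == 32)
    implied.insert("zcf");               // C+F implies Zcf on RV32 only
  if (has_d)
    implied.insert("zcd");               // C+D implies Zcd
  return implied;
}

// Usage: rv64 with F and D gets Zca and Zcd but not Zcf.
void implied_example() {
  assert((implied_by_c(true, true, 64) ==
          std::set<std::string>{"zca", "zcd"}));
}
```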
After this patch:
test:
...
.L3:
vle16.v v3,0(a3)
vrsub.vx v5,v2,a6
mv a7,a4
addw a4,a4,t3
vrgather.vv v1,v3,v5
vssubu.vv v1,v1,v6
vrgather.vv v3,v1,v5
vse16.v v3,0(a3)
sub a3,a3,t1
bgtu t4,a4,.L3
...
The below test suites are passed for this patch:
1. The rv64gcv full regression tests.
2. The rv64gcv build with glibc.
3. The x86 bootstrap tests.
4. The x86 full regression tests.
gcc/ChangeLog:
* tree-vect-patterns.cc (vect_recog_sat_sub_pattern_transform):
New function to perform the truncation distribution.
(vect_recog_sat_sub_pattern): Perform the above optimization
before generating the .SAT_SUB call.
Jonathan Wakely [Wed, 10 Jul 2024 09:27:24 +0000 (10:27 +0100)]
libstdc++: Make std::basic_format_context non-copyable [PR114387]
Users are not supposed to create objects of this type, and there's no
reason it needs to be copyable. LWG 4061 makes it non-copyable and
non-default constructible.
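What "non-copyable and non-default constructible" means for a type can
be checked with type traits. The class below is a hypothetical stand-in,
not std::basic_format_context itself, and run_with_context plays the
role of the library code that is allowed to create the object.

```cpp
#include <type_traits>

class context_like {
  int state_;
  // Private: only the befriended library function may create one.
  explicit context_like(int state) : state_(state) {}
  friend int run_with_context(int);

public:
  context_like() = delete;
  context_like(const context_like&) = delete;
  context_like& operator=(const context_like&) = delete;

  int state() const { return state_; }
};

// Stand-in for the library internals that own the context.
int run_with_context(int state) {
  context_like ctx(state);
  return ctx.state();
}

static_assert(!std::is_default_constructible_v<context_like>);
static_assert(!std::is_copy_constructible_v<context_like>);
static_assert(!std::is_copy_assignable_v<context_like>);
```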
libstdc++-v3/ChangeLog:
PR libstdc++/114387
* include/std/format (basic_format_context): Define copy
operations as deleted, as per LWG 4061.
* testsuite/std/format/context.cc: New test.
Jonathan Wakely [Wed, 10 Jul 2024 16:47:56 +0000 (17:47 +0100)]
libstdc++: Minor optimization for std::locale::encoding()
For the C locale we know the encoding is ASCII, so we can avoid using
newlocale and nl_langinfo_l. Similarly, for an unnamed locale, we aren't
going to get a useful answer, so we can just use a default-constructed
std::text_encoding representing an unknown encoding.
libstdc++-v3/ChangeLog:
* src/c++26/text_encoding.cc (__locale_encoding): Add to unnamed
namespace.
(std::locale::encoding): Optimize for "C" and "*" names.
Jonathan Wakely [Wed, 10 Jul 2024 09:29:52 +0000 (10:29 +0100)]
libstdc++: Use direct-initialization for std::vector<bool>'s allocator [PR115854]
The consensus in the standard committee is that this change shouldn't be
necessary, and the Allocator requirements should require conversions
between rebound allocators to be implicit. But we can make it work for
now anyway.
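The difference between the two forms of initialization shows up with an
allocator-like type whose rebinding constructor is explicit. This is a
sketch with hypothetical names (explicit_alloc, rebind_for_storage); it
is not the code in stl_bvector.h and omits allocate/deallocate.

```cpp
#include <type_traits>

// Allocator-like type whose conversion from another specialization is
// explicit.
template <typename T>
struct explicit_alloc {
  using value_type = T;
  explicit_alloc() = default;
  template <typename U>
  explicit explicit_alloc(const explicit_alloc<U>&) {}
};

using bool_alloc = explicit_alloc<bool>;
using word_alloc = explicit_alloc<unsigned long>;

// Copy-initialization (word_alloc w = a;) needs an implicit conversion,
// which this type does not offer.
static_assert(!std::is_convertible_v<const bool_alloc&, word_alloc>);
// Direct-initialization (word_alloc w(a);) also considers explicit
// constructors, so it works.
static_assert(std::is_constructible_v<word_alloc, const bool_alloc&>);

// Convert to the rebound type explicitly.
word_alloc rebind_for_storage(const bool_alloc& a) {
  return word_alloc(a);
}
```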
libstdc++-v3/ChangeLog:
PR libstdc++/115854
* include/bits/stl_bvector.h (_Bvector_base): Convert allocator
to rebound type explicitly.
* testsuite/23_containers/vector/allocator/115854.cc: New test.
* testsuite/23_containers/vector/bool/allocator/115854.cc: New test.
Jonathan Wakely [Mon, 8 Jul 2024 09:45:52 +0000 (10:45 +0100)]
libstdc++: ranges::find needs explicit conversion to size_t [PR115799]
For an integer-class type we need to use an explicit conversion to size_t.
libstdc++-v3/ChangeLog:
PR libstdc++/115799
* include/bits/ranges_util.h (__find_fn): Make conversion
from difference type to size_t explicit.
* testsuite/25_algorithms/find/bytes.cc: Check ranges::find with
__gnu_test::test_contiguous_range.
* testsuite/std/ranges/range.cc: Adjust expected difference_type
for __gnu_test::test_contiguous_range.
* testsuite/util/testsuite_iterators.h (contiguous_iterator_wrapper):
Use __max_diff_type as difference type.
(test_range::sentinel, test_sized_range_sized_sent::sentinel):
Ensure that operator- returns difference_type.
Marek Polacek [Fri, 31 May 2024 12:54:00 +0000 (08:54 -0400)]
c++: remove Concepts TS code
In GCC 14 we deprecated Concepts TS and discussed removing the code
in GCC 15. This patch removes Concepts TS code from the front end,
including support for template-introductions, as in:
The biggest part of this patch is adjusting the testsuite. We don't
want to lose coverage so I've converted most of -fconcepts-ts tests
to C++20. That means they no longer have to be c++17_only. Mostly
this meant turning "concept bool" into "concept" and turning function
concepts into C++20 concepts. I've added missing "auto"s where
required, but "auto"s in template-argument-lists are not supported
anymore so I've removed some of the tests; some of them are still
present to verify we don't crash on such autos. I've also added ()
around "requires" expressions.
I plan to add a porting_to.html entry with a few hints.
I've rebased and tested the patch after the recent r15-1103.
Marek Polacek [Tue, 25 Jun 2024 18:55:08 +0000 (14:55 -0400)]
c: ICE with invalid sizeof [PR115642]
Here we ICE in c_expr_sizeof_expr on an erroneous expr.value. The
code checks for expr.value == error_mark_node but here the e_m_n is
wrapped in a C_MAYBE_CONST_EXPR. I don't think we should have created
such a tree, so let's return earlier in c_cast_expr.
PR c/115642
gcc/c/ChangeLog:
* c-typeck.cc (c_cast_expr): Return error_mark_node if build_c_cast
failed.
Marek Polacek [Thu, 27 Jun 2024 20:39:29 +0000 (16:39 -0400)]
c: ICE on invalid with attribute optimize [PR115549]
I had this PR in my open tabs so why not go ahead and fix it.
decl_attributes gets last_decl, the last already pushed declaration,
to be used in common_handle_aligned_attribute. In C++, we look up
the decl via find_last_decl, which returns NULL_TREE if it finds
a decl that had not been declared. In C, we look up the decl via
lookup_last_decl which returns error_mark_node rather than NULL_TREE
in that case.
The error_mark_node causes a crash in common_handle_aligned_attribute.
We can fix this on the C FE side like in the patch below.
PR c/115549
gcc/c/ChangeLog:
* c-decl.cc (c_decl_attributes): If lookup_last_decl returns
error_mark_node, use NULL_TREE as last_decl.
testsuite: Align testcase with implementation [PR105090]
Since r13-1006-g2005b9b888eeac, the test case copysign_softfloat_1.c
no longer contains any lsr instruction, so drop the check as per
comment 9 in PR105090.
gcc/testsuite/ChangeLog:
PR target/105090
* gcc.target/arm/copysign_softfloat_1.c: Drop check for lsr
Edwin Lu [Wed, 10 Jul 2024 16:44:48 +0000 (09:44 -0700)]
RISC-V: Add support for B standard extension
This patch adds support for recognizing the B standard extension to be the
collection of Zba, Zbb, Zbs extensions for consistency and conciseness
across toolchains.
https://github.com/riscv/riscv-b/tags
gcc/ChangeLog:
* common/config/riscv/riscv-common.cc: Add imply rules for B extension
* config/riscv/arch-canonicalize: Ditto
expand_fn_using_insn has code to handle SUBREG_PROMOTED_VAR_P
destinations. Specifically, for:
(subreg/v:M1 (reg:M2 R) ...)
it creates a new temporary register T, uses it for the output
operand, then sign- or zero-extends the M1 lowpart of T to M2,
storing the result in R.
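In scalar terms, and assuming for illustration that M1 is an 8-bit mode
and M2 a 32-bit mode, the extension step looks like the sketch below.
extend_low8 is a hypothetical helper, not one of the routines this patch
adds.

```cpp
#include <cassert>
#include <cstdint>

// Take the low 8 bits of the temporary T and extend them to 32 bits,
// either zero-extending (unsigned_p) or sign-extending.
std::uint32_t extend_low8(std::uint32_t t, bool unsigned_p) {
  const std::uint8_t low = static_cast<std::uint8_t>(t);
  if (unsigned_p)
    return low;
  const std::int32_t wide = static_cast<std::int8_t>(low);
  return static_cast<std::uint32_t>(wide);
}

// Usage: the upper bits of T are discarded, then rebuilt from bit 7.
void extend_example() {
  assert(extend_low8(0x12345680u, true) == 0x00000080u);
  assert(extend_low8(0x12345680u, false) == 0xFFFFFF80u);
}
```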
This patch splits this handling out into helper routines and
uses them for other instances of:
if (!rtx_equal_p (target, ops[0].value))
emit_move_insn (target, ops[0].value);
It's quite probable that this doesn't help any of the other cases;
in particular, it shouldn't affect vectors. But I think it could
be useful for the CRC work.
Marek Polacek [Tue, 2 Jul 2024 19:22:39 +0000 (15:22 -0400)]
c++: array new with value-initialization [PR115645]
This extends the r11-5179 fix which doesn't work with multidimensional
arrays. In particular,
struct S {
explicit S() { }
};
auto p = new S[1][1]();
should not say "converting to S from initializer list would use
explicit constructor" because there's no {}. However, since we
went into the block where we create a {}, we got confused. We
should not have gotten there but we did because array_p was true.
This patch refines the check once more.
PR c++/115645
gcc/cp/ChangeLog:
* init.cc (build_new): Don't do any deduction for arrays with
bounds if it's value-initialized.
recog: Handle some mode-changing hardreg propagations
insn_propagation would previously only replace (reg:M H) with X
for some hard register H if the uses of H were also in mode M.
This patch extends it to handle simple mode punning too.
The original motivation was to try to get rid of the execution
frequency test in aarch64_split_simd_shift_p, but doing that is
follow-up work.
I tried this on at least one target per CPU directory (as for
the late-combine patches) and it seems to be a small win for
all of them.
The patch includes a couple of updates to the ia32 results.
In pr105033.c, foo3 replaced:
In vect-bfloat16-2b.c, 5 of the vec_extract_v32bf_* routines
(specifically the ones with nonzero even indices) replaced
things like:
movl 28(%esp), %eax
vmovd %eax, %xmm0
with:
vpinsrw $0, 28(%esp), %xmm0, %xmm0
(These functions return a bf16, and so only the low 16 bits matter.)
gcc/
* recog.cc (insn_propagation::apply_to_rvalue_1): Handle simple
cases of hardreg propagation in which the register is set and
used in different modes.
gcc/testsuite/
* gcc.target/i386/pr105033.c: Expect vmovhps for the ia32 version
of foo.
* gcc.target/i386/vect-bfloat16-2b.c: Expect more vpinsrws.
change_insns is used to change multiple instructions at once, so that
the IR on return is valid & self-consistent. These changes can involve
moving instructions, and the new position for one instruction might
be expressed in terms of the old position of another instruction
that is changing at the same time.
change_insns therefore adds placeholder instructions to mark each
new instruction position, then replaces each placeholder with the
corresponding real instruction. This replacement was done in two
steps: removing the old placeholder instruction and inserting the new
real instruction. But it's more convenient for the upcoming fix for
PR115785 if we do the operation as a single step. That should also
be slightly more efficient, since e.g. no splay tree operations are
needed.
This operation happens purely on the rtl-ssa instruction chain.
The placeholders are never represented in rtl.
gcc/
PR rtl-optimization/115785
* rtl-ssa/functions.h (function_info::replace_nondebug_insn): Declare.
* rtl-ssa/insns.h (insn_info::order_node::set_uid): New function.
(insn_info::remove_note): Declare.
* rtl-ssa/insns.cc (insn_info::remove_note): New function.
(function_info::replace_nondebug_insn): Likewise.
* rtl-ssa/changes.cc (function_info::change_insns): Use
replace_nondebug_insn instead of remove_insn + add_insn.
c++, contracts: Fix ICE in create_tmp_var [PR113968]
During contract parsing, in grok_contract(), we proceed even if the
condition contains errors. This results in contracts with embedded errors
which eventually confuse gimplify. Checks for errors have been added in
grok_contract() to exit early if an error is encountered.
PR c++/113968
gcc/cp/ChangeLog:
* contracts.cc (grok_contract): Check for error_mark_node and
exit early.
Gaius Mulley [Wed, 10 Jul 2024 14:52:37 +0000 (15:52 +0100)]
PR modula2/115823 Wrong expansion of isnormal optab
The bug fix changes gcc/m2/gm2-gcc/m2builtins.cc:m2builtins_BuiltinExists
to recognise both __builtin_<functionname> and functionname as a builtin.
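The matching rule can be sketched as follows. builtin_name_matches is a
hypothetical helper for illustration, not the builtin_function_match or
builtin_macro_match functions added by this patch.

```cpp
#include <cassert>
#include <string_view>

// True if candidate is either name or "__builtin_" followed by name.
bool builtin_name_matches(std::string_view candidate,
                          std::string_view name) {
  constexpr std::string_view prefix = "__builtin_";
  if (candidate == name)
    return true;
  return candidate.size() == prefix.size() + name.size()
         && candidate.substr(0, prefix.size()) == prefix
         && candidate.substr(prefix.size()) == name;
}

// Usage: both spellings of isnormal are accepted.
void builtin_match_example() {
  assert(builtin_name_matches("isnormal", "isnormal"));
  assert(builtin_name_matches("__builtin_isnormal", "isnormal"));
  assert(!builtin_name_matches("__builtin_isfinite", "isnormal"));
}
```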
gcc/m2/ChangeLog:
PR modula2/115823
* gm2-gcc/m2builtins.cc (struct builtin_macro_definition): New
field builtinname.
(builtin_function_match): New function.
(builtin_macro_match): Ditto.
(m2builtins_BuiltinExists): Use builtin_function_match and
builtin_macro_match.
(lookup_builtin_macro): Use builtin_macro_match.
(lookup_builtin_function): Use builtin_function_match.
(define_builtin): Assign builtinname field.
gcc/testsuite/ChangeLog:
PR modula2/115823
* gm2/builtins/run/pass/testalloa.mod: New test.
middle-end: Fix stale swapped condition code value [PR115836]
emit_store_flag_1 calculates scode (swapped condition code) at the
beginning of the function from the value of code variable. However,
code variable may change before scode usage site, resulting in
invalid stale scode value.
Move calculation of scode value just before its only usage site to
avoid stale scode value.
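The problem and the fix can be reduced to the sketch below. Cmp,
rewrite, swapped_early and swapped_late are all hypothetical; rewrite
merely stands in for whatever changes code inside emit_store_flag_1.

```cpp
#include <cassert>

enum class Cmp { LT, GT, LE, GE };

// Condition obtained by swapping the two comparison operands.
Cmp swap_condition(Cmp c) {
  switch (c) {
    case Cmp::LT: return Cmp::GT;
    case Cmp::GT: return Cmp::LT;
    case Cmp::LE: return Cmp::GE;
    case Cmp::GE: return Cmp::LE;
  }
  return c;
}

// Hypothetical rewrite step: turns LT into LE, leaves the rest alone.
Cmp rewrite(Cmp c) { return c == Cmp::LT ? Cmp::LE : c; }

// Before: scode is derived from code, then code changes underneath it.
Cmp swapped_early(Cmp code) {
  Cmp scode = swap_condition(code);
  code = rewrite(code);
  (void)code;
  return scode;  // no longer matches the current code
}

// After: scode is derived right before its only use.
Cmp swapped_late(Cmp code) {
  code = rewrite(code);
  return swap_condition(code);
}

void scode_example() {
  assert(swapped_early(Cmp::LT) == Cmp::GT);  // swap of the old code
  assert(swapped_late(Cmp::LT) == Cmp::GE);   // swap of the new code
}
```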
PR middle-end/115836
gcc/ChangeLog:
* expmed.cc (emit_store_flag_1): Move calculation of
scode just before its only usage site.