diagnostics: SARIF output: potentially add escaped renderings of source (§3.3.4)
author David Malcolm <dmalcolm@redhat.com>
Wed, 24 Jul 2024 22:07:54 +0000 (18:07 -0400)
committer Thomas Koenig <tkoenig@gcc.gnu.org>
Sun, 28 Jul 2024 17:05:54 +0000 (19:05 +0200)
commit 6cce58792ba040b435c0943dff5f0781fb9b9e54
tree 5aaa6630ded94b04fa70875a3ce448e30f6feea5
parent da87cbedcdf76a6cbc7910fba604efd50e8cc48e

This patch adds support to our SARIF output for cases where
rich_loc.escape_on_output_p () is true, such as for -Wbidi-chars.

In such cases, the pertinent SARIF "location" object gains a property
bag with property "gcc/escapeNonAscii": true, and the "artifactContent"
within the location's physical location's "snippet" gains a "rendered"
property (§3.3.4) that escapes non-ASCII text in the snippet, such as:

  "rendered": {"text": "..."}

where "text" has a string value such as (for a "trojan source" attack):

  "9 |     /*<U+202E> } <U+2066>if (isAdmin)<U+2069> <U+2066> begin admins only */\n"
  "  |       ~~~~~~~~                                ~~~~~~~~                    ^\n"
  "  |       |                                       |                           |\n"
  "  |       |                                       |                           end of bidirectional context\n"
  "  |       U+202E (RIGHT-TO-LEFT OVERRIDE)         U+2066 (LEFT-TO-RIGHT ISOLATE)\n"

where the escaping is affected by -fdiagnostics-escape-format=; with
-fdiagnostics-escape-format=bytes, the rendered text of the above is:

  "9 |     /*<e2><80><ae> } <e2><81><a6>if (isAdmin)<e2><81><a9> <e2><81><a6> begin admins only */\n"
  "  |       ~~~~~~~~~~~~                                        ~~~~~~~~~~~~                    ^\n"
  "  |       |                                                   |                               |\n"
  "  |       U+202E (RIGHT-TO-LEFT OVERRIDE)                     U+2066 (LEFT-TO-RIGHT ISOLATE)  end of bidirectional context\n"
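
The difference between the two escape formats can be illustrated with a
minimal standalone sketch.  This is not the code in the patch: the names
escape_format and escape_non_ascii are hypothetical, and the UTF-8
decoding is deliberately simplified (it does not reject overlong
encodings or surrogates).  It only shows how the same source bytes could
become either <U+XXXX> or <xx> sequences:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical illustration only; not GCC's implementation.
enum class escape_format { unicode, bytes };

// Return a copy of the UTF-8 string IN with every non-ASCII character
// escaped, either as <U+XXXX> (one escape per code point) or as <xx>
// (one escape per byte).  Malformed bytes are always escaped as <xx>.
std::string
escape_non_ascii (const std::string &in, escape_format fmt)
{
  std::string out;
  std::size_t i = 0;
  while (i < in.size ())
    {
      const unsigned char lead = static_cast<unsigned char> (in[i]);
      if (lead < 0x80)
        {
          // Plain ASCII is copied through unchanged.
          out += in[i++];
          continue;
        }

      // Work out the sequence length and initial bits from the lead byte.
      std::size_t len = 0;
      std::uint32_t cp = 0;
      if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; }
      else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; }
      else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; }

      // Accumulate continuation bytes, each of which must be 10xxxxxx.
      bool valid = len != 0 && i + len <= in.size ();
      for (std::size_t k = 1; valid && k < len; ++k)
        {
          const unsigned char cont = static_cast<unsigned char> (in[i + k]);
          if ((cont & 0xC0) != 0x80)
            valid = false;
          else
            cp = (cp << 6) | (cont & 0x3F);
        }

      char buf[16];
      if (valid && fmt == escape_format::unicode)
        {
          std::snprintf (buf, sizeof buf, "<U+%04X>",
                         static_cast<unsigned> (cp));
          out += buf;
          i += len;
        }
      else
        {
          // Byte-wise escaping; a malformed lead byte is escaped alone.
          const std::size_t n = valid ? len : 1;
          for (std::size_t k = 0; k < n; ++k)
            {
              const unsigned char b
                = static_cast<unsigned char> (in[i + k]);
              std::snprintf (buf, sizeof buf, "<%02x>",
                             static_cast<unsigned> (b));
              out += buf;
            }
          i += n;
        }
    }
  return out;
}
```

Under these assumptions, passing the bytes e2 80 ae (U+202E) through
escape_non_ascii yields "<U+202E>" with escape_format::unicode and
"<e2><80><ae>" with escape_format::bytes, mirroring the two renderings
above.  The differing widths of the escapes are why the underlines and
labels are laid out differently in the two snippets.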

The patch also refactors/adds enough selftest machinery to be able to
test the snippet generation from within the selftest framework, rather
than just within DejaGnu (where the regex-based testing isn't
sophisticated enough to verify such properties as the above).

gcc/ChangeLog:
* Makefile.in (OBJS-libcommon): Add selftest-json.o.
* diagnostic-format-sarif.cc: Include "selftest.h",
"selftest-diagnostic.h", "selftest-diagnostic-show-locus.h",
"selftest-json.h", and "text-range-label.h".
(class content_renderer): New.
(sarif_builder::m_rules_arr): Convert to std::unique_ptr.
(sarif_builder::make_location_object): Add class
escape_nonascii_renderer.  If rich_loc.escape_on_output_p (),
pass a nonnull escape_nonascii_renderer to
maybe_make_physical_location_object as its snippet_renderer, and
add a property bag property "gcc/escapeNonAscii" to the SARIF
location object.  For other overloads of make_location_object,
pass nullptr for the snippet_renderer.
(sarif_builder::maybe_make_region_object_for_context): Add
"snippet_renderer" param and pass it to
maybe_make_artifact_content_object.
(sarif_builder::make_tool_object): Drop "const".
(sarif_builder::make_driver_tool_component_object): Likewise.
Use typesafe unique_ptr variant of object::set for setting "rules"
property on driver_obj.
(sarif_builder::maybe_make_artifact_content_object): Add param "r"
and use it to potentially set the "rendered" property (§3.3.4).
(selftest::test_make_location_object): New.
(selftest::diagnostic_format_sarif_cc_tests): New.
* diagnostic-show-locus.cc: Include "text-range-label.h" and
"selftest-diagnostic-show-locus.h".
(selftests::diagnostic_show_locus_fixture::diagnostic_show_locus_fixture):
New.
(selftests::test_layout_x_offset_display_utf8): Use
diagnostic_show_locus_fixture to simplify and consolidate setup
code.
(selftests::test_diagnostic_show_locus_one_liner): Likewise.
(selftests::test_one_liner_colorized_utf8): Likewise.
(selftests::test_diagnostic_show_locus_one_liner_utf8): Likewise.
* gcc-rich-location.h (class text_range_label): Move to new file
text-range-label.h.
* selftest-diagnostic-show-locus.h: New file, based on material in
diagnostic-show-locus.cc.
* selftest-json.cc: New file.
* selftest-json.h: New file.
* selftest-run-tests.cc (selftest::run_tests): Call
selftest::diagnostic_format_sarif_cc_tests.
* selftest.h (selftest::diagnostic_format_sarif_cc_tests): New decl.

gcc/testsuite/ChangeLog:
* c-c++-common/diagnostic-format-sarif-file-Wbidi-chars.c: Verify
that we have a property bag with property "gcc/escapeNonAscii": true.
Verify that we have a "rendered" property for a snippet.
* gcc.dg/plugin/diagnostic_plugin_test_show_locus.c: Include
"text-range-label.h".

gcc/ChangeLog:
* text-range-label.h: New file, taking class text_range_label from
gcc-rich-location.h.

libcpp/ChangeLog:
* include/rich-location.h
(semi_embedded_vec::semi_embedded_vec): Add copy ctor.
(rich_location::rich_location): Remove "= delete" from decl of
copy ctor.  Add deleted decl of move ctor.
(rich_location::operator=): Remove "= delete" from decl of
copy assignment.  Add deleted decl of move assignment.
(fixit_hint::fixit_hint): Add copy ctor decl.  Add deleted decl of
move.
(fixit_hint::operator=): Add copy assignment decl.  Add deleted
decl of move assignment.
* line-map.cc (rich_location::rich_location): New copy ctor.
(fixit_hint::fixit_hint): New copy ctor.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
14 files changed:
gcc/Makefile.in
gcc/diagnostic-format-sarif.cc
gcc/diagnostic-show-locus.cc
gcc/gcc-rich-location.h
gcc/selftest-diagnostic-show-locus.h [new file with mode: 0644]
gcc/selftest-json.cc [new file with mode: 0644]
gcc/selftest-json.h [new file with mode: 0644]
gcc/selftest-run-tests.cc
gcc/selftest.h
gcc/testsuite/c-c++-common/diagnostic-format-sarif-file-Wbidi-chars.c
gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
gcc/text-range-label.h [new file with mode: 0644]
libcpp/include/rich-location.h
libcpp/line-map.cc