From d205ff82d0f369f26973abe53a7beeebbdda11f6 Mon Sep 17 00:00:00 2001
From: Jeff Lucovsky
Date: Fri, 23 Feb 2024 08:51:56 -0500
Subject: [PATCH] doc/transform: Describe the from_base64 transform

Issue: 6487

Document the new transform and indicate that it's the preferred way to
perform base64 decoding (preferred over base64_decode)
---
 doc/userguide/rules/base64-keywords.rst |  2 +
 doc/userguide/rules/transforms.rst      | 53 +++++++++++++++++++++++++
 doc/userguide/upgrade.rst               |  2 +
 3 files changed, 57 insertions(+)

diff --git a/doc/userguide/rules/base64-keywords.rst b/doc/userguide/rules/base64-keywords.rst
index 81db203448..9e4f942e46 100644
--- a/doc/userguide/rules/base64-keywords.rst
+++ b/doc/userguide/rules/base64-keywords.rst
@@ -10,6 +10,8 @@ base64_decode
 
 Decodes base64 data from a buffer and makes it available for the base64_data function.
 
+We recommend using the ``from_base64`` transform instead; see :ref:`from_base64 <from_base64>`.
+
 Syntax::
 
     base64_decode:bytes <value>, offset <value>, relative;
diff --git a/doc/userguide/rules/transforms.rst b/doc/userguide/rules/transforms.rst
index f730f0d2dc..c20bd6e598 100644
--- a/doc/userguide/rules/transforms.rst
+++ b/doc/userguide/rules/transforms.rst
@@ -188,3 +188,56 @@ Example::
 
     alert http any any -> any any (msg:"HTTP ua only"; http.header_names; \
        bsize:16; content:"|0d 0a|User-Agent|0d 0a 0d 0a|"; nocase; sid:1;)
+
+.. _from_base64:
+
+from_base64
+-----------
+
+This transform is similar to the keyword ``base64_decode``: the buffer is decoded using
+the optional values for ``mode``, ``offset`` and ``bytes``, and the decoded data is then
+available for matching.
+
+After this transform completes, the buffer will contain only bytes that could be base64-decoded.
+If the decoding process encounters invalid bytes, those will not be included in the buffer.
+
+The option values must be ``,`` separated and can appear in any order.
+
+.. note:: ``from_base64`` follows RFC 4648 by default, i.e.,
+   encountering any character that is not in the base64 alphabet leads to rejection of
+   that character and the rest of the string.
+
+Format::
+
+    from_base64: [[bytes <value>] [, offset <value> [, mode: strict|rfc4648|rfc2045]]]
+
+There are defaults for each of the options:
+- ``bytes`` defaults to the length of the input buffer
+- ``offset`` defaults to ``0`` and must be less than ``65536``
+- ``mode`` defaults to ``rfc4648``
+
+Note that both ``bytes`` and ``offset`` may be variables from ``byte_extract`` and/or ``byte_math``.
+
+Mode ``rfc4648`` applies RFC 4648 decoding logic, which suits binary data encoded so that
+it can be safely sent by email, used in a URL, or included in HTTP POST requests.
+
+Mode ``rfc2045`` applies RFC 2045 decoding logic, which supports strings, including those with embedded spaces.
+
+Mode ``strict`` will fail if an invalid character is found in the encoded bytes.
+
+The following examples will alert when the buffer contents match (see the
+last ``content`` value for the expected strings).
+
+This example uses the defaults and transforms ``"VGhpcyBpcyBTdXJpY2F0YQ=="`` to ``"This is Suricata"``::
+
+    content:"VGhpcyBpcyBTdXJpY2F0YQ=="; from_base64; content:"This is Suricata";
+
+This example transforms ``"dGhpc2lzYXRlc3QK"`` to ``"thisisatest"`` followed by a newline::
+
+    content:"/?arg=dGhpc2lzYXRlc3QK"; from_base64: offset 6, mode rfc4648; \
+    content:"thisisatest";
+
+This example transforms ``"Zm 9v Ym Fy"`` to ``"foobar"``::
+
+    content:"/?arg=Zm 9v Ym Fy"; from_base64: offset 6, mode rfc2045; \
+    content:"foobar";
diff --git a/doc/userguide/upgrade.rst b/doc/userguide/upgrade.rst
index 4de3971c94..bafecb804c 100644
--- a/doc/userguide/upgrade.rst
+++ b/doc/userguide/upgrade.rst
@@ -60,6 +60,8 @@ Major changes
 - It is possible to see an increase of alerts, for the same rule-sets, if you use
   many stream/payload rules, due to Suricata triggering TCP stream reassembly
   earlier.
+- New transform ``from_base64`` that base64-decodes a buffer and makes the decoded
+  data available for matching. Use ``from_base64`` in preference to ``base64_decode``.
 
 Upgrading 6.0 to 7.0
 --------------------
-- 
2.47.2
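
As a sanity check on the three documented examples, the decoding they describe can be sketched in Python. This is only a model of the behavior described in the text above, not Suricata's implementation. The helper name `from_base64_sketch` and its parameters are hypothetical; `nbytes` is assumed to count encoded input bytes, `rfc2045` is assumed to skip bytes outside the alphabet, and `strict` is assumed to yield an empty result on failure.

```python
import base64

# Standard base64 alphabet (RFC 4648, section 4), stored as byte values.
B64_ALPHABET = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
)
MODES = ("rfc4648", "rfc2045", "strict")


def from_base64_sketch(buf, offset=0, nbytes=None, mode="rfc4648"):
    """Hypothetical model of the documented options; not Suricata code."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if not 0 <= offset < 65536:
        raise ValueError("offset must be less than 65536")

    # Assumption: `bytes` selects how many encoded input bytes are considered.
    window = buf[offset:] if nbytes is None else buf[offset:offset + nbytes]

    kept = bytearray()
    for byte in window:
        if byte in B64_ALPHABET:
            kept.append(byte)
        elif byte == ord("="):
            break  # padding marks the end of the encoded data
        elif mode == "rfc2045":
            continue  # assumption: bytes such as spaces are skipped
        elif mode == "rfc4648":
            break  # reject this byte and the rest of the string
        else:
            return b""  # strict: assumption that a failed decode yields nothing

    # A single leftover character cannot encode a whole byte, so drop it,
    # then restore the padding that the standard decoder expects.
    if len(kept) % 4 == 1:
        kept.pop()
    kept += b"=" * (-len(kept) % 4)
    return base64.b64decode(bytes(kept))


if __name__ == "__main__":
    print(from_base64_sketch(b"VGhpcyBpcyBTdXJpY2F0YQ=="))
    print(from_base64_sketch(b"/?arg=dGhpc2lzYXRlc3QK", offset=6))
    print(from_base64_sketch(b"/?arg=Zm 9v Ym Fy", offset=6, mode="rfc2045"))
```

Under these assumptions the second example decodes to `thisisatest` plus a newline, because the trailing `K` of the encoded string carries the byte 0x0a; a `content:"thisisatest"` match still succeeds on that buffer.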