git.ipfire.org Git - thirdparty/suricata.git/commitdiff
doc: add doc on internals of inspection of raw data
author Shivani Bhardwaj <shivanib134@gmail.com>
Fri, 20 Jun 2025 09:57:27 +0000 (15:27 +0530)
committer Victor Julien <victor@inliniac.net>
Sat, 13 Sep 2025 06:40:09 +0000 (08:40 +0200)
Explain briefly the internals of inspection of raw data in the following order:
- Stream Engine
- Stream reassembly
- Role of Detection Engine and Applayer Parsers
- High level communication between Stream and Detection Engine
- Relevant suricata.yaml settings

along with some diagrams.

Ticket 4351

doc/userguide/configuration/suricata-yaml.rst
doc/userguide/devguide/internals/engines/index.rst
doc/userguide/devguide/internals/engines/stream/inspection_raw_data.rst [new file with mode: 0644]
doc/userguide/devguide/internals/engines/stream/stream_engine/de_se_interaction.png [new file with mode: 0644]
doc/userguide/devguide/internals/engines/stream/stream_engine/inspection_window.png [new file with mode: 0644]
doc/userguide/devguide/internals/engines/stream/stream_engine/post_inspection.png [new file with mode: 0644]

index 94fa3bb488ca88109194d50e8bf23742950f79ef..3f9bff13b7455443d86a18f7754c93b71fee2f68 100644 (file)
@@ -1271,6 +1271,8 @@ UDP, ICMP and default (all other protocols).
       emergency-new: 10
       emergency-established: 100
 
+.. _stream-engine-yaml:
+
 Stream-engine
 ~~~~~~~~~~~~~
 
index dddd16ebaa915f18335f1998559b1105f38e3050..efc7cf1fbc9c6853d59a94aba55e05c1bb8bbc7e 100644 (file)
@@ -7,6 +7,11 @@ Flow
 Stream
 ------
 
+.. toctree::
+   :maxdepth: 2
+
+   stream/inspection_raw_data
+
 Defrag
 ------
 
diff --git a/doc/userguide/devguide/internals/engines/stream/inspection_raw_data.rst b/doc/userguide/devguide/internals/engines/stream/inspection_raw_data.rst
new file mode 100644 (file)
index 0000000..1869ef9
--- /dev/null
@@ -0,0 +1,117 @@
+Inspection of raw stream data
+#############################
+
+Stream Engine
+^^^^^^^^^^^^^
+
+Suricata's Stream Engine tracks and processes all TCP stream data, in IDS as well as inline mode.
+Its responsibilities include:
+
+* TCP segment reassembly
+* TCP data normalization
+* gap management and handling
+* maintaining internal caches
+* handling special cases such as the TCP urgent pointer (URG)
+* applying user-defined constraints such as the stream depth
+
+Internal storage of stream data
+===============================
+
+For a stream with small gaps, a red-black tree is used to store the streaming buffer blocks.
+
+For a stream with large gaps (>= 262144 bytes, i.e. 256 KiB), regions (a list of blocks of data) are used.
+
+For a stream without gaps, one continuous streaming buffer is used (i.e. just one region).
+
+These different data structures are used to make efficient use of memory under both regular and
+exceptional conditions.
+
+Role of stream reassembly
+=========================
+
+TCP stream data can be split into segments in many ways. For example, 100 bytes of data can arrive
+as one 100-byte segment or as 100 segments of 1 byte each. Even if the data arrives in order, each
+of the 99 boundaries between adjacent bytes may or may not be a segment boundary, so there are
+:math:`2^{99}` ways in which these 100 bytes can be received.
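+
+As a rough sketch of where this count comes from (an illustration, not a description of engine
+code): receiving :math:`n` in-order bytes as :math:`k` segments means choosing :math:`k-1` of the
+:math:`n-1` possible cut points between bytes, so summing over all :math:`k` gives
+
+.. math::
+
+   \sum_{k=1}^{n} \binom{n-1}{k-1} = 2^{n-1},
+   \qquad n = 100 \;\Rightarrow\; 2^{99} \approx 6.3 \times 10^{29}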
+
+Hence, it is important for the engine to reassemble the TCP stream data to avoid unnecessary
+inspection of incomplete data and to leave no room for evasion techniques based on small
+segments. Stream reassembly makes sure that the data to be matched upon is reliable.
+
+Role of Detection Engine and Applayer Parser
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+To inspect stream data, the Detection Engine has to request that data from the Stream Engine.
+Doing this for every piece of parseable data can be expensive and unreliable, so the engine
+requests data in chunks. The size of these chunks can be defined in :ref:`suricata.yaml <stream-engine-yaml>`.
+It is recommended to randomize the chunk size to avoid possible evasions on predictable boundaries.
+By default, the chunk size is randomized by Suricata.
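+
+As a worked example, assuming ``randomize-chunk-range`` is a percentage applied symmetrically
+around the configured size (an assumption based on the comments in the default ``suricata.yaml``,
+not something stated in this document), a chunk size of 2560 bytes with a range of 10 would vary
+between
+
+.. math::
+
+   2560 \times \left(1 - \tfrac{10}{100}\right) = 2304
+   \quad\text{and}\quad
+   2560 \times \left(1 + \tfrac{10}{100}\right) = 2816 \text{ bytes.}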
+
+Note that in some cases it may take a long time for a full chunk of data to arrive, resulting in
+delayed inspection of data. This can lead to issues such as the one described in `Bug 7004 <https://redmine.openinfosecfoundation.org/issues/7004>`_.
+To deal with this, most applayer parsers request inspection of data as soon as they have fully and
+reliably parsed an entity, such as a request or a response, in the respective direction.
+
+It is important to note that the inspection window can be limited by certain special conditions,
+such as the stream depth or the end of the stream being reached.
+
+Tracking of inspection
+^^^^^^^^^^^^^^^^^^^^^^
+
+The Stream Engine must keep track of the point up to which inspection has already been done.
+This lets the engine know which data has been consumed and can be slid out of the window.
+At a very high level, for a stream without gaps in IDS mode, and assuming there are no overlaps
+in the tracking/data and no special conditions at play, an oversimplified view of the tracking
+looks like the following.
+
+.. image:: stream_engine/inspection_window.png
+
+At a very high level, the communication that takes place between the Detection Engine and the
+Stream Engine about data inspection is as follows.
+
+.. image:: stream_engine/de_se_interaction.png
+
+This means that the Detection Engine also keeps its own copy of the raw progress, i.e. of the
+data it has consumed so far. After the inspection is completed, the streaming buffer window slides if
+the data was consumed successfully. Additionally, the relative raw tracker is updated.
+
+.. image:: stream_engine/post_inspection.png
+
+Relevant configuration
+^^^^^^^^^^^^^^^^^^^^^^
+
+The following ``suricata.yaml`` settings can impact the internal inspection of data.
+
+Stream Engine related settings:
+
+::
+
+  stream:
+    memcap: 64 MiB
+    #memcap-policy: ignore
+    checksum-validation: yes      # reject incorrect csums
+    #midstream: false
+    #midstream-policy: ignore
+    inline: auto                  # auto will use inline mode in IPS mode, yes#
+    reassembly:
+      urgent:
+        policy: oob              # drop, inline, oob (1 byte, see RFC 6093, 3.#
+        oob-limit-policy: drop
+      memcap: 256 MiB
+      #memcap-policy: ignore
+      depth: 1 MiB                # reassemble 1 MiB into a stream
+      toserver-chunk-size: 2560
+      toclient-chunk-size: 2560
+      randomize-chunk-size: yes
+      #randomize-chunk-range: 10
+      #raw: yes
+      #segment-prealloc: 2048
+      #check-overlap-different-data: true
+      #max-regions: 8
+
+Prefilter/MPM related settings:
+
+::
+
+  mpm-algo: hs
diff --git a/doc/userguide/devguide/internals/engines/stream/stream_engine/de_se_interaction.png b/doc/userguide/devguide/internals/engines/stream/stream_engine/de_se_interaction.png
new file mode 100644 (file)
index 0000000..a5015e7
Binary files /dev/null and b/doc/userguide/devguide/internals/engines/stream/stream_engine/de_se_interaction.png differ
diff --git a/doc/userguide/devguide/internals/engines/stream/stream_engine/inspection_window.png b/doc/userguide/devguide/internals/engines/stream/stream_engine/inspection_window.png
new file mode 100644 (file)
index 0000000..6dcc0d7
Binary files /dev/null and b/doc/userguide/devguide/internals/engines/stream/stream_engine/inspection_window.png differ
diff --git a/doc/userguide/devguide/internals/engines/stream/stream_engine/post_inspection.png b/doc/userguide/devguide/internals/engines/stream/stream_engine/post_inspection.png
new file mode 100644 (file)
index 0000000..65b5ea7
Binary files /dev/null and b/doc/userguide/devguide/internals/engines/stream/stream_engine/post_inspection.png differ