Victor Julien [Thu, 13 Jan 2022 11:13:43 +0000 (12:13 +0100)]
stream: fix stream pruning being too aggressive
Pruning of StreamBufferBlocks could remove blocks that fell entirely
after the target offset due to a logic error. This could lead to data
being evicted that was still meant to be processed in theapp-layer
parsers.
Jason Ish [Tue, 21 Dec 2021 22:34:05 +0000 (16:34 -0600)]
dns: create transaction even if z-bit was set
It appears that DNS servers will still process a DNS request even if the
z-bit is set, our parser will fail the transaction. So create the
transaction, but still set the event.
Victor Julien [Mon, 25 Oct 2021 17:14:49 +0000 (19:14 +0200)]
flow/worker: run housekeeping for bypassed packets
Run flow eviction and flow inject queues for bypassed packets as well,
to avoid a scenario where these won't get run at all if too much of the
traffic is bypassed.
Jason Ish [Fri, 16 Oct 2020 15:43:29 +0000 (09:43 -0600)]
af-packet: use configured cluster-id when checking for fanout
When testing for fanout support a cluster-id of 1 was always being
used instead of the configured cluster-id. This limited fanout
support to only one Suricata instance.
Instead of hardcoding an ID of 1, use the configured cluster-id.
Also make cluster_id a uint16_t instead of an int in AFPThreadVars.
Eric Leblond [Wed, 17 Nov 2021 15:43:23 +0000 (16:43 +0100)]
profiling: fix profiling with sample rate
Rules profiling was returning invalid results when used with sample
rate. The problem was that the sample condition was run twice in the
packet flow. As a result, the second pass was not initializing the
variable storing the initial CPU ticks and the resulting performance
counters were reporting invalid values.
Jason Ish [Wed, 10 Nov 2021 22:38:35 +0000 (16:38 -0600)]
smtp: log transaction even if no email present
The SMTP transaction logger was not writing the log if the email
portion of the logger failed, such as in the case of STARTTLS
where this is no email decoded.
ThresholdHandlePacketRule may take ownership of an allocated
DetectThresholdEntry, and places it in a position of the
array th_entry. But it never got released
A transaction to client is always considered
complete in the direction to server and vice versa.
Otherwise, transactions are never complete for
AppLayerParserTransactionsCleanup
Jason Ish [Wed, 17 Nov 2021 01:34:11 +0000 (19:34 -0600)]
dhcp: fix url in comment
rustdoc was complaining about the format of the URL in a comment
while trying to generate documentation. Convert the comment to a
non-rustdoc comment for now to satisfy rustdoc.
Philippe Antoine [Thu, 16 Sep 2021 14:54:37 +0000 (16:54 +0200)]
tcp: rejects FIN+SYN packets as invalid
Ticket: #4569
If a FIN+SYN packet is sent, the destination may keep the
connection alive instead of starting to close it.
In this case, a later SYN packet will be ignored by the
destination.
Previously, Suricata considered this a session reuse, and thus
used the sequence number of the last SYN packet, instead of
using the one of the live connection, leading to evasion.
This commit errors on FIN+SYN so that they do not get
processed as regular FIN packets.
Victor Julien [Tue, 5 Oct 2021 12:48:27 +0000 (14:48 +0200)]
stream/tcp: handle RST with MD5 or AO header
Special handling for RST packets if they have an TCP MD5 or AO header option.
The options hash can't be validated. The end host might be able to validate
it, as it can have a key/password that was communicated out of band.
The sender could use this to move the TCP state to 'CLOSED', leading to
a desync of the TCP session.
This patch builds on top of 843d0b7a10bb ("stream: support RST getting lost/ignored")
It flags the receiver as having received an RST and moves the TCP state
into the CLOSED state. It then reverts this if the sender continues to
send traffic. In this case it sets the following event:
Philippe Antoine [Mon, 13 Sep 2021 10:18:34 +0000 (12:18 +0200)]
detect: fixes InspectionBuffer id with transforms
When InspectionBufferGet gets called with base_id
Later InspectionBufferSetup must also be called with base_id
In case there were transforms, we had base_id != list_id
Not calling InspectionBufferSetup with the right id
resulted in leaving a dangling pointer,
because it was not added to det_ctx->inspect.to_clear_queue
Victor Julien [Wed, 20 Oct 2021 11:20:32 +0000 (13:20 +0200)]
flow/bypass: use_cnt desync'd on bypassed flows
Locally bypassed flows had unsafe updates to `Flow::use_cnt` leading to a race
issue. For a packet it would do the flow lookup, attach the flow to the packet,
increment the `use_cnt`. Then it would detect that the flow is in the bypass
state, and unlock it while holding a reference (so alos not decrementing the
`use_cnt`). When the packet was then returned to the packet pool, the flow would
be disconnected from the packet, which would decrement `use_cnt` without holding
the flow lock.
This patch addresses this issue by disconnecting the flow from the packet
immediately when the bypassed state is detected. This moves the `use_cnt`
decrement to within the lock.
Victor Julien [Fri, 5 Nov 2021 19:05:43 +0000 (20:05 +0100)]
packetpool: reset PacketRelease on return to pool
Reset PacketRelease callback to make sure its not set to a capture
specific callback.
As an example:
0x000055e00af09d35 in AFPReleaseDataFromRing (p=0x7f1d884cb830) at source-af-packet.c:653
0x000055e00af09dd0 in AFPReleasePacket (p=0x7f1d884cb830) at source-af-packet.c:678
0x000055e00ab53d7e in TmqhOutputPacketpool (t=0x55e00fb79250, p=0x7f1d884cb830) at tmqh-packetpool.c:465
0x000055e00af08dec in TmThreadsSlotProcessPkt (tv=0x55e00fb79250, s=0x55e012134790, p=0x7f1d884cb830) at tm-threads.h:201
0x000055e00af08e70 in TmThreadsCaptureInjectPacket (tv=0x55e00fb79250, p=0x7f1d884cb830) at tm-threads.h:221
0x000055e00af08f2e in TmThreadsCaptureHandleTimeout (tv=0x55e00fb79250, p=0x0) at tm-threads.h:245
0x000055e00af0ba76 in ReceiveAFPLoop (tv=0x55e00fb79250, data=0x7f1d884ccb60, slot=0x55e01198e4b0) at source-af-packet.c:1321
0x000055e00ab55257 in TmThreadsSlotPktAcqLoop (td=0x55e00fb79250) at tm-threads.c:312
0x00007f1dca9d5609 in start_thread (arg=<optimized out>) at pthread_create.c:477
0x00007f1dca7c6293 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Here the packet was a pseudo packet to handle a timeout condition. But
the ReleasePacket callback was still set to AFPReleasePacket from a
previous use of the Packet.
Victor Julien [Sun, 31 Oct 2021 21:13:19 +0000 (22:13 +0100)]
af-packet: fix soft lockup issues
The Suricata AF_PACKET code opens a socket per thread, then after some minor
setup enters a loop where the socket is poll()'d with a timeout. When the
poll() call returns a non zero positive value, the AF_PACKET ring will be
processed.
The ringbuffer processing logic has a pointer into the ring where we last
checked the ring. From this position we will inspect each frame until we
find a frame with tp_status == TP_STATUS_KERNEL (so essentially 0). This
means the frame is currently owned by the kernel.
There is a special case handling for starting the ring processing but
finding a TP_STATUS_KERNEL immediately. This logic then skip to the next
frame, rerun the check, etc until it either finds an initialized frame or
the last frame of the ringbuffer.
The problem was, however, that the initial uninitialized frame was possibly
(likely?) still being initialized by the kernel. A data race between the
notification through the socket (the poll()) and the updating of the
`tp_status` field in the frame could lead to a valid frame getting skipped.
Of note is that for example libpcap does not do frame scanning. Instead it
simply exits it ring processing loop. Also interesting is that libpcap uses
atomic loads and stores on the tp_status field.
This skipping of frames had 2 bad side effects:
1. in most cases, the buffer would be full enough that the frame would
be processed in the next pass of the ring, but now the frame would
out of order. This might have lead to packets belong to the same
flow getting processed in the wrong order.
2. more severe is the soft lockup case. The skipped frame sits at ring
buffer index 0. The rest of the ring has been cleared, after the
initial frame was skipped. As our pass of the ring stops at the end
of the ring (ptv->frame_offset + 1 == ptv->req.v2.tp_frame_nr) the code
exits the ring processing loop at goes back to poll(). However, poll()
will not indicate that there is more data, as the stale frame in the
ring blocks the kernel from populating more frames beyond it. This
is now a dead lock, as the kernel waits for Suricata and Suricata
never touches the ring until it hears from the kernel.
The scan logic will scan the whole ring at most once, so it won't
reconsider the stale frame either.
This patch addresses the issues in several ways:
1. the startup "discard" logic was fixed to not skip over kernel
frames. Doing so would get us in a bad state at start up.
2. Instead of scanning the ring, we now enter a busy wait loop
when encountering a kernel frame where we didn't expect one. This
means that if we got a > 0 poll() result, we'll busy wait until
we get at least one frame.
3. Error handling is unified and cleaned up. Any frame error now
returns the frame to the kernel and progresses the frame pointer.
4. If we find a frame that is owned by us (TP_STATUS_USER_BUSY) we
yield to poll() immediately, as the next expected status of that
frame is TP_STATUS_KERNEL.
5. the ring is no longer processed until the "end" of the ring (so
highest index), but instead we process at most one full ring size
per run.
6. Work with a copy of `tp_status` instead of accessing original touched
also by the kernel.
Victor Julien [Sun, 7 Nov 2021 05:25:31 +0000 (06:25 +0100)]
flow/manager: fix flows not evicted & freed in time
Flows have been shown to linger for a long time w/o giving up their
resources. This would lead to higher memory use and memcaps getting
reached.
Three main causes have been identified:
Slow passes hash passes. By default the flow manager will scan the
flow hash slowly. It is based on the flow timeout settings, and with
the default config it will take 4 minutes for a full scan to be
complete. This leaves a window for flows that are timed out to linger
for minutes longer than expected.
Flow Manager yields under pressure. The per row TryLock causes work
to be delayed more. The Flow manager will use trylock on a hash row
and will yield immediately if the row is busy. This means that it will
take a full pass before the row is revisited again. If the row holds
busy flows, this could happen many times in a row.
Flow Manager favors evicted flows over active flows. The Flow Manager
will only process the evicted flows if they are present. These flows
have been evicted by workers. The active flows on that hash row will
have to wait until the next hash pass. Of course by then there could
be more evicted flows.
Combined these factors could lead to flows not being considered for
freeing and logging for a very long time, potentially even indefinitly.
The patch addresses the latter two flow manager issues by no longer
using TryLock. It will now simply wait for the lock to be released and
then do its work on it. Additionally for each row both the evicted list
and the active flow list will be processed.
Jason Ish [Tue, 5 Oct 2021 16:44:03 +0000 (10:44 -0600)]
github-ci: pin macos build to 10.15
There is currently a build failure with macos-latest (recently updated)
to 11 in the libhtp test suite code. Not sure if there are other
build issues in libhtp or Suricata at this time.
Victor Julien [Wed, 21 Apr 2021 13:20:49 +0000 (15:20 +0200)]
app-layer/pd: only consider actual available data
For size limit checks consider only available data at the stream start
and before any GAPS.
The old check would consider too much data if there were temporary gaps,
like when a data packet was in-window but (far) ahead of the expected
segment.
Victor Julien [Mon, 4 Oct 2021 14:01:47 +0000 (16:01 +0200)]
flow: free spare pool more aggressively
The flows exceeding the spare pools config setting would be freed
per at max 100 flows a second. After a high speed test this would
lead to excessive memory use for a long time.
This patch updates the logic to free 10% of the excess flows per
run, freeing multiple blocks of flows as needed.
Victor Julien [Fri, 1 Oct 2021 11:20:02 +0000 (13:20 +0200)]
flow: process evicted flows on low/no traffic
In a scenario where there was suddenly no more traffic flowing, flows
in a threads `flow_queue` would not be processed. The easiest way to
see this would be in a traffic replay scenario. After the replay is done
no more packets come in and these evicted flows got stuck.
In workers mode, the capture part handles timeout this was updated to
take the `ThreadVars::flow_queue` into account.
The autofp mode the logic that puts a flow into a threads `flow_queue`
would already wake a thread up, but the `flow_queue` was then ignored.
This has been updated to take the `flow_queue` into account.
In both cases a "capture timeout" packet is pushed through the pipeline
to "flush" the queues.
Victor Julien [Tue, 14 Sep 2021 08:35:18 +0000 (10:35 +0200)]
detect: track prefilter by progress, not engine
Fix FNs in case of too many prefilter engines. A transaction was tracking
which engines have run using a u64 bit array. The engines 'local_id' was
used to set and check this bit. However the bit checking code didn't
handle int types correctly, leading to an incorrect left shift result of
a u32 to a u64 bit value.
This commit addresses that by fixing the int handling, but also by
changing how the engines are tracked.
To avoid wasting prefilter engine tracking bit space, track what
ran by the progress they are registered at, instead of the individual
engine id's. While we can have many engines, the protocols use far
fewer unique progress values. So instead of tracking for dozens of
prefilter id's, we track for the handful of progress values.
To allow for this the engine array is sorted by tx_min_progress, then
app_proto and finally local_id. A new field is added to "know" when
the last relevant engine for a progress value is reached, so that we
can set the prefilter bit then.
A consquence is that the progress values have a ceiling now that
needs to fit in a 64 bit bitarray. The values used by parsers currently
does not exceed 5, so that seems to be ok.
Victor Julien [Fri, 3 Sep 2021 15:04:02 +0000 (17:04 +0200)]
detect: unify alert handling; fix bugs
Unify handling of signature matches between various rule types and
between noalert and regular rules.
"noalert" sigs are added to the alert queue initially, but removed
from it after handling their actions. This way all actions are applied
from a single place.
Make sure flow drop and pass are mutually exclusive.
The above addresses issue with pass and drops not getting applied
correctly in various cases.
If an HTTP2 file was within only ont DATA frame, the filetracker
would open it and close it in the same call, preventing the
firther call to incr_files_opened
Victor Julien [Thu, 18 Mar 2021 10:05:35 +0000 (11:05 +0100)]
nfs: don't reuse file transactions
After a file has been closed (CLOSE, COMMIT command or EOF/SYNC part of
READ/WRITE data block) mark it as such so that new file commands on that
file do not reuse the transaction.
When a file transfer is completed it will be flagged as such and not be
found anymore by the NFSState::get_file_tx_by_handle() method. This forces
a new transaction to be created.
Victor Julien [Thu, 18 Mar 2021 13:38:33 +0000 (14:38 +0100)]
filestore: store chunks in packet direction
Storing too early can lead to files being considered TRUNCATED if the
TCP state is not yet CLOSED when logging is triggered. This has been
observed with FTP-DATA and might also be an issue with simple HTTP.