For logging streaming TCP data, so far the individual segments were
used. However, since the last big stream changes, the segments are
no longer the proper place for this: segments can now have overlaps,
etc.
This patch introduces a new tracker. Next to the existing 'app' and
'raw' trackers, the new tracker is 'log'. When TCP logging is
used, a flag in the config is set and the log tracker is used to
determine how much of the stream window can be moved.
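As an illustration, a minimal sketch of the tracker idea (struct and
function names here are hypothetical, not the actual implementation):

    #include <stdint.h>

    /* per direction: how far each consumer has progressed */
    struct StreamProgress {
        uint64_t app_progress;  /* consumed by app-layer parsing */
        uint64_t raw_progress;  /* consumed by raw/detect inspection */
        uint64_t log_progress;  /* consumed by TCP data logging */
        int log_enabled;        /* set from the config if logging is on */
    };

    /* the window may only slide up to the slowest consumer */
    static uint64_t StreamSlideOffset(const struct StreamProgress *s)
    {
        uint64_t min = s->app_progress < s->raw_progress ?
                           s->app_progress : s->raw_progress;
        if (s->log_enabled && s->log_progress < min)
            min = s->log_progress;
        return min;
    }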
Now that detect moves the raw progress forward, it's important
to deal with the case where detect doesn't consider raw inspection.
If no 'stream' rules are active, disable raw. For this the disable
raw flag is now per stream.
Remove the 'StreamMsg' approach from the engine. In this approach the
stream engine would create a list of chunks for inspection by the
detection engine. There were several issues:
1. the messages had a fixed size, so blocks of data bigger than ~4k
would be cut into multiple messages
2. it led to lots of data copying and unnecessary memory use
3. the StreamMsgs used a central pool
The Stream engine switched over to the streaming buffer API, which
means that the reassembled data is always available. This made the
StreamMsg approach even clunkier.
The new approach exposes the streaming buffer data to the detection
engine. It has to pay attention to an important issue though: packet
loss. The data may have gaps. The streaming buffer API tracks the
blocks of contiguous data.
To access the data for inspection a callback approach is used. The
'StreamReassembleRaw' function is called with a callback and data.
This way it runs the MPM and individual rule inspection code. At
the end of each detection run, the stream engine is notified that
it can move its 'progress' forward.
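As a sketch of this flow (the callback signature here is an assumption
for illustration, not the exact prototype):

    #include <stdint.h>

    /* callback invoked per contiguous block of reassembled data */
    typedef int (*StreamReassembleRawFunc)(void *cb_data,
            const uint8_t *data, uint32_t data_len);

    static int DetectRawCallback(void *cb_data, const uint8_t *data,
            uint32_t data_len)
    {
        /* MPM and individual rule inspection would run on 'data' */
        (void)cb_data; (void)data; (void)data_len;
        return 0;
    }

The stream engine walks the streaming buffer blocks, invokes the
callback for each, and afterwards detect reports how far it got so the
'raw' progress can be moved forward.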
Make the stream engine use the streaming buffer API for its data
storage. This means that the data is stored in a single reassembled
sliding buffer. The subtleties of the reassembly, e.g. overlap
handling, are taken care of at segment insertion.
The TcpSegments now have a StreamingBufferSegment that contains an
offset and a length. Using this the segment data can be retrieved
per segment.
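Sketched roughly (field names assumed), the record only needs an
offset and a length:

    #include <stdint.h>

    typedef struct StreamingBufferSegment_ {
        uint64_t stream_offset; /* where the segment sits in the stream */
        uint32_t segment_len;   /* how many bytes it covers */
    } StreamingBufferSegment;

Looking up a segment's data is then a matter of reading segment_len
bytes at stream_offset from the streaming buffer.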
Redo segment insertion. The insertion code is moved to its own file
and is simplified a lot.
A major difference from the previous implementation is that the segment
list now contains overlapping segments if the traffic contains them.
Previously there could be more, and smaller, segments in the memory list
than what was seen on the wire.
Because the in-memory segments now match the on-wire segments, detecting
overlaps with different data (potential MOTS, man-on-the-side, attacks)
is much more accurate.
Raw and App reassembly progress is no longer tracked per segment using
flags, but there is now a progress tracker in the TcpStream for each.
When pruning we make sure we don't slide beyond in-use segments. When
both app-layer and raw inspection are beyond the start of the segment
list, the segments might not be freed even though the data in the
streaming buffer is already gone. This is caused by the 'in-use' status
that the segments can implicitly have. This patch accounts for that
when calculating the 'left_edge' of the streaming window.
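A sketch of that calculation (helper and variable names hypothetical):

    #include <stdint.h>

    /* the window may only slide to the lowest of the app/raw progress
     * and the start of the oldest segment that is still in use */
    static uint64_t StreamLeftEdge(uint64_t app_progress,
            uint64_t raw_progress, uint64_t oldest_in_use_offset)
    {
        uint64_t edge = app_progress < raw_progress ?
                            app_progress : raw_progress;
        if (oldest_in_use_offset < edge)
            edge = oldest_in_use_offset;
        return edge;
    }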
Raw reassembly still sets up 'StreamMsg' objects for content
inspection. They are set up based on either the full StreamingBuffer,
or based on the StreamingBufferBlocks if there are gaps in the data.
Reworked 'stream needs work' logic. When a flow times out the flow
engine checks whether a TCP flow still needs work. The
StreamNeedsReassembly function is used to test if a stream still has
unreassembled segments or uninspected stream chunks.
This patch updates the function to consider the app and/or raw
progress. It also cleans the function up and adds more meaningful
debug messages. Finally it makes it non-inline.
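In pseudo-C, the updated check boils down to something like this
(names hypothetical):

    #include <stdbool.h>
    #include <stdint.h>

    static bool StreamNeedsWork(uint64_t last_ack_abs, /* highest ACK'd offset */
            uint64_t app_progress, uint64_t raw_progress, bool raw_enabled)
    {
        /* ACK'd data not yet consumed by the app-layer parsers? */
        if (app_progress < last_ack_abs)
            return true;
        /* ACK'd data not yet inspected by detect? */
        if (raw_enabled && raw_progress < last_ack_abs)
            return true;
        return false;
    }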
Unittests have been overhauled, and partly moved into their own files.
Remove lots of dead code.
When the session/flow was flagged as 'no applayer inspect', which
could happen for various reasons, packets would still be
considered by the app layer reassembly.
When ACK'd, they would be removed again, depending also on the raw
reassembly.
In very long sessions however, this mechanism could fail, leading to
virtually endlessly growing segment lists.
This patch makes sure that segments that come in on a 'no app layer'
session are tagged properly or even not added at all.
Use a new ssn flag instead of flow flag for no app tracking.
StreamIterator implementation for iterating over ACKed segments.
Flag each segment as logged when the log function has been called for it.
Set a 'OPEN' flag for the first segment in both directions.
Set a 'CLOSE' flag when the stream ends. If the last segment was already
logged, an empty CLOSE call is performed with NULL data.
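A sketch of how a logger might honor these flags (flag names and
values assumed for illustration):

    #include <stdint.h>
    #include <stdio.h>

    #define STREAM_LOG_OPEN  0x01  /* first segment in this direction */
    #define STREAM_LOG_CLOSE 0x02  /* stream ended; data may be NULL */

    static void LogTcpData(const uint8_t *data, uint32_t len, uint8_t flags)
    {
        if (flags & STREAM_LOG_OPEN)
            printf("stream open\n");
        if (data != NULL && len > 0)
            fwrite(data, 1, len, stdout);
        if (flags & STREAM_LOG_CLOSE)
            printf("stream close\n"); /* possibly an empty call, NULL data */
    }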
For netflow logging track TCP flags per stream direction. As the struct
had no more space left without expanding it, the flags and wscale
fields are now compressed.
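Sketched with assumed field widths, the compression looks like this:

    #include <stdint.h>

    struct TcpStreamSketch {
        uint16_t flags:12;  /* per-stream flags */
        uint16_t wscale:4;  /* window scale; the max of 14 fits in 4 bits */
        uint8_t tcp_flags;  /* TCP flags seen in this direction, tracked
                             * for netflow logging */
    };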
Implement StreamTcpSetDisableRawReassemblyFlag() which stops raw
reassembly for _NEW_ segments in a stream direction.
It is used only by TLS/SSL now, to flag the streams as encrypted.
Existing segments will still be reassembled and inspected, while
new segments won't be. This allows for pattern based inspection
of the TLS handshake.
As is the case with completely disabled 'raw' reassembly, the
logic is that the segments are flagged as completed for 'raw' right
away, so they are not considered in raw reassembly anymore.
As no new segments will be considered, the chunk limit check will
return true on the next call.
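A sketch of the insertion-time check this implies (flag names and
values assumed):

    #include <stdint.h>

    #define STREAM_FLAG_NEW_RAW_DISABLED 0x0100 /* assumed value */
    #define SEGMENT_FLAG_RAW_PROCESSED   0x01   /* assumed value */

    struct StreamSketch  { uint16_t flags; };
    struct SegmentSketch { uint8_t flags; };

    static void InsertSegment(struct StreamSketch *stream,
            struct SegmentSketch *seg)
    {
        /* new segments in a disabled direction are completed for
         * 'raw' right away and thus skipped by raw reassembly */
        if (stream->flags & STREAM_FLAG_NEW_RAW_DISABLED)
            seg->flags |= SEGMENT_FLAG_RAW_PROCESSED;
        /* ... regular insertion continues ... */
    }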
Raw reassembly is used only by the detection engine. For users only
caring about logging, it is significant overhead, both in CPU and
memory usage.
The option is called 'raw' and lives under the stream.reassembly
options.
stream:
  memcap: 32mb
  checksum-validation: yes      # reject wrong csums
  inline: auto                  # auto will use inline mode in IPS mode, yes or no set it statically
  reassembly:
    memcap: 64mb
    depth: 1mb                  # reassemble 1mb into a stream
    toserver-chunk-size: 2560
    toclient-chunk-size: 2560
    randomize-chunk-size: yes
    #randomize-chunk-range: 10
    raw: false                  # <- new option
In case of a recursive call to protocol detection from within protocol
detection, when the recursively invoked stream still hasn't been ack'd
yet, protocol detection doesn't take place. In such cases we would end
up still calling the app layer with the wrong direction data. Introduce
a check to not call the app layer with wrong direction data.
When sockets are re-used, reset all relevant vars correctly.
This commit fixes a bug where we were not resetting app proto detection
vars.
While fixing #989, we discovered some other bugs which have also been
fixed, or rather some features which are now updated. One of the feature
updates: if we receive wrong direction data first, we don't reset the
protocol values for the flow; we let the flow retain the detected
values.
Unittests have been modified to accommodate the above change.
Use per-thread pools to store and retrieve SSNs from, through the
PoolThread API.
Remove the max-sessions setting. Pools are set to unlimited, but the
TCP memcap limits the number of sessions.
The prealloc_session setting now applies to each thread, so the default
was lowered from 32k to 2k.
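As a generic sketch of the per-thread pool idea (not the actual
PoolThread API):

    #include <stdlib.h>

    struct SsnSketch { char data[128]; }; /* placeholder session */

    struct SsnPoolSketch {
        struct SsnSketch **free_list; /* per-thread preallocated SSNs */
        int free_cnt;                 /* e.g. 2k entries per thread */
    };

    static struct SsnSketch *SsnGet(struct SsnPoolSketch *p)
    {
        if (p->free_cnt > 0)
            return p->free_list[--p->free_cnt]; /* lock-free fast path */
        /* pool 'unlimited': fall back to alloc; the TCP memcap is what
         * actually bounds the session count */
        return malloc(sizeof(struct SsnSketch));
    }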
Until now, when processing the TCP 3-way handshake (3whs), retransmissions
of SYN/ACKs were silently accepted, unless they were different somehow. If
the SEQ or ACK values are different they are considered wrong and events
are set. The stream events rules will match on this.
In some cases, this is wrong. If the client missed the SYN/ACK, the server
may send a different one with a different SEQ. This commit deals with this.
As it is impossible to predict which one the client will accept, each is
added to a list. Then on receiving the final ACK from the 3whs, the list
is checked and the state is updated according to the queued SYN/ACK.
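A sketch of the queue-then-match logic (struct and function names
hypothetical):

    #include <stddef.h>
    #include <stdint.h>

    struct SynAckCandidate {
        uint32_t seq;   /* server ISN offered by this SYN/ACK */
        uint32_t ack;
        struct SynAckCandidate *next;
    };

    /* on the final ACK, find the SYN/ACK the client actually accepted:
     * the ACK value acknowledges the chosen server ISN + 1 */
    static const struct SynAckCandidate *
    FindAcceptedSynAck(const struct SynAckCandidate *list, uint32_t final_ack)
    {
        for (const struct SynAckCandidate *c = list; c != NULL; c = c->next) {
            if (final_ack == c->seq + 1)
                return c;
        }
        return NULL;
    }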
The ssn flag STREAMTCP_FLAG_ZERO_TIMESTAMP was used in the stream only.
Due to its value it did not conflict with a real stream flag. Renamed it
to STREAMTCP_STREAM_FLAG_ZERO_TIMESTAMP.
The STREAMTCP_FLAG_TIMESTAMP flag is a ssn flag, however it was used in
the stream flag field. As it has the same value as
STREAMTCP_STREAM_FLAG_DEPTH_REACHED it's possible that stream reassembly
got confused by the timestamp.
Removal of the per-flow 'aldata' array. It contained a ptr for each
ALPROTO. Instead we now have 2 ptrs in the flow: alparser and alstate.
Various cleanups and dead code removal from the app layer API.
Should save 100+ bytes of memory per flow on 64 bit.
Updated lots of unittests to reflect these changes.
Split stream reassembly into 2 parts: a part that sends ack'd data to the app
layer parsers as soon as it's available, and another part that queues up
data into larger chunks for raw inspection.