Now that MPM runs when the TX progress is right, stateful detection
operates differently.
Changes:

1. raw stream inspection is now also an inspect engine

   Since this engine doesn't take the transactions into account, it
   could potentially run multiple times on the same data. To avoid
   this, basic result caching is in place.

2. the engines are sorted by progress, but the 'MPM' engine is first
   even if the progress is higher

   If MPM flags a rule to be inspected, the inspect engine for that
   buffer runs first. If this step fails, the rule is no longer
   evaluated. No state is stored.
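
The ordering in point 2 could be sketched roughly as below. This is an
illustration only, not the engine's actual code: the SketchEngine type,
its fields and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified inspect engine descriptor. */
typedef struct SketchEngine {
    const char *name;
    int progress; /* TX progress at which the engine can run */
    bool mpm;     /* true for the engine tied to the rule's MPM buffer */
} SketchEngine;

/* qsort comparator: the MPM engine sorts first, regardless of its
 * progress; the remaining engines sort by ascending progress. */
static int SketchEngineCmp(const void *a, const void *b)
{
    const SketchEngine *ea = a;
    const SketchEngine *eb = b;
    if (ea->mpm != eb->mpm)
        return ea->mpm ? -1 : 1;
    return (ea->progress > eb->progress) - (ea->progress < eb->progress);
}

static void SketchSortEngines(SketchEngine *engines, size_t cnt)
{
    qsort(engines, cnt, sizeof(*engines), SketchEngineCmp);
}
```

With the MPM engine in front, a failed match on the prefiltered buffer
ends evaluation of the rule before any other engine runs.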
Previously the MPM/Prefilter engines would suggest the same rule
candidates multiple times.
For example, while processing the request body, the http headers
would be inspected by MPM multiple times.
The mask check was one way to quickly decide which rules could be
skipped.
Now that the MPM engines normally return a rule just once, this
mask check no longer makes sense. If the rule meets the ip/port/
direction based conditions, it needs to be evaluated if the MPM
said so, even if not all conditions are true yet.
WIP disable mask as it no longer makes sense
WIP redo mask match
In various scenarios buffers would be checked by MPM more than
once. This was because the buffers would be inspected for a
certain progress value or higher.
For example, for each packet in a file upload, the engine would
not just rerun the 'http client body' MPM on the new data, it
would also rerun the method, uri, headers, cookie, etc MPMs.
This was obviously inefficient, so this patch changes the logic.
The patch only runs the MPM engines when the progress is exactly
the intended progress. If the progress has already moved beyond
the desired value, the engine is run just once. A tracker is added
to the app layer API to record which MPMs have completed.
Implemented for HTTP, TLS and SSH.
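
A minimal sketch of the run-once logic, assuming a per-transaction
bitmask as the tracker. The SketchTx type, its fields and
SketchMpmShouldRun() are hypothetical names used for illustration; the
real tracker lives in the app layer API and may look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-transaction tracker: one bit per progress value
 * whose MPM engines have already completed. */
typedef struct SketchTx {
    int progress;      /* current TX progress */
    uint64_t mpm_done; /* bit N set: MPMs for progress N completed */
} SketchTx;

/* Decide whether the MPM engine registered for progress 'want'
 * should run on this transaction now. */
static bool SketchMpmShouldRun(SketchTx *tx, int want)
{
    assert(want >= 0 && want < 64);
    const uint64_t bit = UINT64_C(1) << want;

    if (tx->progress < want)
        return false;    /* buffer not available yet */
    if (tx->progress == want)
        return true;     /* exactly the intended progress */
    if (tx->mpm_done & bit)
        return false;    /* beyond it, and the catch-up run happened */
    tx->mpm_done |= bit; /* beyond the desired value: run once */
    return true;
}
```

In the file upload example, the body MPM keeps running while the
progress stays at the body state, and the method, uri, headers and
cookie MPMs are skipped once their bit is set.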
Instead of killing all reassembly instantly do things slightly more
gracefully:
1. disable app-layer reassembly immediately
2. flag raw reassembly not to accept new data
This will allow the current data to be inspected still.
After detect has run, the raw reassembly will be fully disabled and
thus all reassembly will be as well.
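
The two-stage shutdown could look roughly like this. The flag names,
the SketchShutStream type and the helper functions are all hypothetical
and only illustrate the sequence described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-stream flags. */
#define SKETCH_NO_APP_REASSEMBLY 0x01 /* app-layer reassembly disabled */
#define SKETCH_RAW_NO_NEW_DATA   0x02 /* raw accepts no new data */
#define SKETCH_NO_RAW_REASSEMBLY 0x04 /* raw fully disabled */

typedef struct SketchShutStream {
    uint8_t flags;
} SketchShutStream;

/* Steps 1 and 2: stop app-layer reassembly right away and stop
 * feeding raw, but keep already queued raw data inspectable. */
static void SketchDisableReassembly(SketchShutStream *s)
{
    s->flags |= SKETCH_NO_APP_REASSEMBLY | SKETCH_RAW_NO_NEW_DATA;
}

/* Called after a detection run: finish the shutdown if one is pending. */
static void SketchDetectDone(SketchShutStream *s)
{
    if (s->flags & SKETCH_RAW_NO_NEW_DATA)
        s->flags |= SKETCH_NO_RAW_REASSEMBLY;
}

static bool SketchRawAcceptsData(const SketchShutStream *s)
{
    return (s->flags &
            (SKETCH_RAW_NO_NEW_DATA | SKETCH_NO_RAW_REASSEMBLY)) == 0;
}

static bool SketchRawCanInspect(const SketchShutStream *s)
{
    return (s->flags & SKETCH_NO_RAW_REASSEMBLY) == 0;
}
```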
If raw reassembly falls behind, for example because no raw mpm is
active, then we need to sync up to the app progress if that is
available, or to the generic tcp tracking otherwise.
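
As a sketch of that sync step, assuming all three positions are kept as
absolute stream offsets (the SketchSyncStream type and its field names
are hypothetical):

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-stream progress tracking, as stream offsets. */
typedef struct SketchSyncStream {
    uint64_t raw_progress; /* how far raw inspection has come */
    uint64_t app_progress; /* how far the app-layer parser has come */
    uint64_t tcp_progress; /* generic TCP tracking */
    bool app_active;       /* is app progress available? */
} SketchSyncStream;

/* Pull raw progress forward if it fell behind. Progress never moves
 * backwards. */
static void SketchSyncRawProgress(SketchSyncStream *s)
{
    const uint64_t target =
        s->app_active ? s->app_progress : s->tcp_progress;
    if (s->raw_progress < target)
        s->raw_progress = target;
}
```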
Now that detect moves the raw progress forward, it's important
to deal with the case where detect doesn't consider raw inspection.
If no 'stream' rules are active, disable raw. For this the disable
raw flag is now per stream.
Implement the inline mode for raw content inspection. Packets
are leading, and when a packet's payload has been added to the
stream, the packet is inspected in the context of the stream.
Reassembly will return a buffer with the packet data with older
data in front of it and after it, if available.
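
One way to picture the returned buffer is as a window around the packet
payload. The sketch below assumes a single contiguous stream buffer and
a fixed amount of surrounding context; the 'context' parameter, the
SketchWindow type and SketchInlineWindow() are hypothetical and not
taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical inspection window, as absolute stream offsets. */
typedef struct SketchWindow {
    uint64_t offset;
    uint64_t len;
} SketchWindow;

/* Window around a packet at [pkt_off, pkt_off + pkt_len) inside a
 * buffer covering [buf_off, buf_off + buf_len), with up to 'context'
 * bytes of data before and after the packet, if available. */
static SketchWindow SketchInlineWindow(uint64_t buf_off, uint64_t buf_len,
        uint64_t pkt_off, uint64_t pkt_len, uint64_t context)
{
    const uint64_t buf_end = buf_off + buf_len;
    const uint64_t pkt_end = pkt_off + pkt_len;
    assert(pkt_off >= buf_off && pkt_end <= buf_end);

    /* older data in front of the packet, clamped to the buffer start */
    const uint64_t before =
        (pkt_off - buf_off < context) ? pkt_off - buf_off : context;
    /* data after the packet, clamped to the buffer end */
    const uint64_t after =
        (buf_end - pkt_end < context) ? buf_end - pkt_end : context;

    SketchWindow w = { pkt_off - before, before + pkt_len + after };
    return w;
}
```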
At flow timeout, we no longer need to first run reassembly in
one direction, then inspection in the other. We can do both in a
single packet now.
Disable pseudo packets when receiving stream end packets. Instead
call the app-layer parser in the packet direction for stream end
packets and flow end packets.
These changes in handling of those stream end packets make the
pseudo packets unnecessary.
Remove the 'StreamMsg' approach from the engine. In this approach the
stream engine would create a list of chunks for inspection by the
detection engine. There were several issues:
1. the messages had a fixed size, so blocks of data bigger than ~4k
would be cut into multiple messages
2. it led to lots of data copying and unnecessary memory use
3. the StreamMsgs used a central pool
The Stream engine switched over to the streaming buffer API, which
means that the reassembled data is always available. This made the
StreamMsg approach even clunkier.
The new approach exposes the streaming buffer data to the detection
engine. It has to pay attention to an important issue though: packet
loss. The data may have gaps. The streaming buffer API tracks the
blocks of continuous data.
To access the data for inspection a callback approach is used. The
'StreamReassembleRaw' function is called with a callback and data.
The detection engine runs the MPM and the individual rule inspection
code this way. At the end of each detection run the stream engine is
notified that it can move its 'progress' forward.
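
The callback approach, including the handling of gaps, could be
sketched as follows. This is not the signature of the real
'StreamReassembleRaw'; SketchReassembleRaw(), the SketchBlock type and
the callback type are hypothetical, and the sketch assumes the buffer
starts at stream offset 0 and that the blocks are sorted and do not
overlap.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical block of continuous data, as tracked per stream. */
typedef struct SketchBlock {
    uint64_t offset;
    uint32_t len;
} SketchBlock;

/* Callback type: gets one run of continuous data, returns non-zero
 * to stop the walk. */
typedef int (*SketchRawFunc)(void *cb_data, const uint8_t *data,
        uint32_t len);

/* Hand every continuous block beyond 'progress' to the callback, so
 * no call ever spans a gap. Returns the progress the stream engine
 * could move to after the detection run. */
static uint64_t SketchReassembleRaw(const uint8_t *buf,
        const SketchBlock *blocks, size_t nblocks, uint64_t progress,
        SketchRawFunc cb, void *cb_data)
{
    for (size_t i = 0; i < nblocks; i++) {
        const uint64_t end = blocks[i].offset + blocks[i].len;
        if (end <= progress)
            continue; /* already inspected */
        const uint64_t start =
            blocks[i].offset > progress ? blocks[i].offset : progress;
        const int r = cb(cb_data, buf + start, (uint32_t)(end - start));
        progress = end;
        if (r != 0)
            break;
    }
    return progress;
}

/* Example callback: counts the bytes it was handed. */
static int SketchCountBytes(void *cb_data, const uint8_t *data,
        uint32_t len)
{
    (void)data;
    *(uint32_t *)cb_data += len;
    return 0;
}
```

Since the callback works directly on the streaming buffer, no data is
copied into fixed-size messages first.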