Use per tx detect_flags to track prefilter. Detect flags are used for 2
things:
1. marking tx as fully inspected
2. tracking already run prefilter (incl mpm) engines
This supercedes the MpmIDs API for directionless tracking
of the prefilter engines.
When we have no SGH we have to flag the txs that are 'complete'
as inspected as well.
Special handling for the stream engine:
If a rule mixes TX inspection and STREAM inspection, we can encounter
the case where the rule is evaluated against multiple transactions
during a single inspection run. As the stream data is exactly the same
for each of those runs, it's wasteful to rerun inspection of the stream
portion of the rule.
This patch enables caching of the stream 'inspect engine' result in
the local 'RuleMatchCandidateTx' array. This is valid only during the
live of a single inspection run.
Remove stateful inspection from 'mask' (SignatureMask). The mask wasn't
used in most cases for those rules anyway, as there we rely on the
prefilter. Add a alproto check to catch the remaining cases.
When building the active non-mpm/non-prefilter list check not just
the mask, but also the alproto. This especially helps stateful rules
with negated mpm.
Simplify AppLayerParserHasDecoderEvents usage in detection to only
return true if protocol detection events are set. Other detection is done
in inspect engines.
Move rule group lookup and handling into it's own function. Handle
'post lookup' tasks immediately, instead of after the first detect
run. The tasks were independent of the initial detection.
Many cleanups and much refactoring.
Analyze flowbits to find which bits are only checked.
Track whether they are set and checked on the same level of 'statefulness'
for later used.
Dump flowbits to json including the sids that set/check etc the bit.
Use expectation to be able to identify connections that are
ftp data. It parses the PASV response, STOR message and the
RETR message to provide extraction of files.
Implementation in Rust of FTP messages parsing is available.
Also this patch changes some var name prefixed by ssh to ftp.
Reimplement keyword to match on SHA-1 fingerprint of TLS
certificate as a mpm keyword.
alert tls any any -> any (msg:"TLS cert fingerprint test";
tls_cert_fingerprint;
content:"4a:a3:66:76:82:cb:6b:23:bb:c3:58:47:23:a4:63:a7:78:a4:a1:18";
sid:12345;)
Since the parser now also does nfs2, the name nfs3 became confusing.
As it's still in beta, we can rename so this patch renames all 'nfs3'
logic to simply 'nfs'.
The target keyword allows rules writer to specify information about
target of the attack. Using this keyword in a signature causes
some fields to be added in the EVE output. It also fixes ambiguity
in the Prelude output.
Set flags by default:
-Wmissing-prototypes
-Wmissing-declarations
-Wstrict-prototypes
-Wwrite-strings
-Wcast-align
-Wbad-function-cast
-Wformat-security
-Wno-format-nonliteral
-Wmissing-format-attribute
-funsigned-char
Fix minor compiler warnings for these new flags on gcc and clang.
In preparation of turning input to keyword parsers to const add
options to the common rule parser to enforce and strip double
quotes and parse negation support.
At registration, the keyword can register 3 extra flags:
SIGMATCH_QUOTES_MANDATORY: value to keyword must be quoted
SIGMATCH_QUOTES_OPTIONAL: value to keyword may be quoted
SIGMATCH_HANDLE_NEGATION: leading ! is parsed
In all cases leading spaces are removed. If the 'quote' flags are
set, the quotes are removed from the input as well.
Now that MPM runs when the TX progress is right, stateful detection
operates differently.
Changes:
1. raw stream inspection is now also an inspect engine
Since this engine doesn't take the transactions into account, it
could potentially run multiple times on the same data. To avoid
this, basic result caching is in place.
2. the engines are sorted by progress, but the 'MPM' engine is first
even if the progress is higher
If MPM flags a rule to be inspected, the inspect engine for that
buffer runs first. If this step fails, the rule is no longer
evaluated. No state is stored.
Now that detect moves the raw progress forward, it's important
to deal with the case where detect don't consider raw inspection.
If no 'stream' rules are active, disable raw. For this the disable
raw flag is now per stream.
Remove the 'StreamMsg' approach from the engine. In this approach the
stream engine would create a list of chunks for inspection by the
detection engine. There were several issues:
1. the messages had a fixed size, so blocks of data bigger than ~4k
would be cut into multiple messages
2. it lead to lots of data copying and unnecessary memory use
3. the StreamMsgs used a central pool
The Stream engine switched over to the streaming buffer API, which
means that the reassembled data is always available. This made the
StreamMsg approach even clunkier.
The new approach exposes the streaming buffer data to the detection
engine. It has to pay attention to an important issue though: packet
loss. The data may have gaps. The streaming buffer API tracks the
blocks of continuous data.
To access the data for inspection a callback approach is used. The
'StreamReassembleRaw' function is called with a callback and data.
This way it runs the MPM and individual rule inspection code. At
the end of each detection run the stream engine is notified that it
can move forward it's 'progress'.
Implement common code to easily add more per HTTP header detection
keywords.
Implement http_accept sticky buffer. It operates on the HTTP Accept
header.
When detection is running flags are set on flows to indicate if file
hashing is needed. This is based on global output settings and rules.
In the case of --disable-detection this was not happening, so all
files where hashed with all methods. This has a significant
performance impact.
This patch adds logic to set the flow flags in --disable-detect mode.
Match on TLS certificate serial number using tls_cert_serial
keyword, e.g.:
alert tls any any -> any any (msg:"TLS cert serial test";
tls_cert_serial; content:"5C:19:B7:B1:32:3B:1C:A1";
sid:12345;)
Flowvars were already using a temporary store in the detect thread
ctx.
Use the same facility for pktvars. The reasons are:
1. packet is not always available, e.g. when running pcre on http
buffers.
2. setting of vars should be done post match. Until now it was also
possible that it is done on a partial match.
Until now variable names, such as flowbit names, were local to a detect
engine. This made sense as they were only ever used in that context.
For the purpose of logging these names, this needs a different approach.
The loggers live outside of the detect engine. Also, in the case of
reloads and multi-tenancy, there are even multiple detect engines, so
it would be even more tricky to access them from the outside.
This patch brings a new approach. A any time, there is a single active
hash table mapping the variable names and their id's. For multiple
tenants the table is shared between tenants.
The table is set up in a 'staging' area, where locking makes sure that
multiple loading threads don't mess things up. Then when the preparing
of a detection engine is ready, but before the detect threads are made
aware of the new detect engine, the active varname hash is swapped with
the staging instance.
For this to work, all the mappings from the 'current' or active mapping
are added to the staging table.
After the threads have reloaded and the new detection engine is active,
the old table can be freed.
For multi tenancy things are similar. The staging area is used for
setting up until the new detection engines / tenants are applied to
the system.
This patch also changes the variable 'id'/'idx' field to uint32_t. Due
to data structure padding and alignment, this should have no practical
drawback while allowing for a lot more vars.
Matches on the start of a HTTP request or response.
Uses a buffer constructed from the request line and normalized request
headers, including the Cookie header.
Or for the response side, it uses the response line plus the
normalized response headers, including the Set-Cookie header.
Both buffers are terminated by an extra \r\n.