The 'tx_id' variable was used to be passed into the IterFunc as a
minumum tx to return. The IterFunc could then return either the tx
for that id, or a later one if that turned out to be the first available
tx.
The tx_id however, was still used for some things as if it was the
current tx id. Most importantly for setting the tx id for alert
ammending. So this could lead to alerts with missing or wrong
applayer records.
As we can have multiple files per TX we use the multi inspect
buffer support.
By using this API file_data supports transforms.
Redo part of the flash decompression as a hard coded built-in sort
of transform.
Introduce InspectionBuffer a structure for passing data between
prefilters, transforms and inspection engines.
At rule parsing time, we'll register new unique 'DetectBufferType's
for a 'parent' buffer (e.g. pure file_data) with its transformations.
Each unique combination of buffer with transformations gets it's
own buffer id.
Similarly, mpm registration and inspect engine registration will be
copied from the 'parent' (again, e.g. pure file_data) to the new id's.
The transforms are called from within the prefilter engines themselves.
Provide generic MPM matching and setup callbacks. Can be used by
keywords to avoid needless code duplication. Supports transformations.
Use unique name for profiling, to distinguish between pure buffers
and buffers with transformation.
Add new registration calls for mpm/prefilters and inspect engines.
Inspect engine api v2: Pass engine to itself. Add generic engine that
uses GetData callback and other registered settings.
The generic engine should be usable for every 'simple' case where
there is just a single non-streaming buffer. For example HTTP uri.
The v2 API assumes that registered MPM implements transformations.
Add util func to set new transform in rule and add util funcs for rule
parsing.
Until now, the transaction space is assumed to be terse. Transactions
are handled sequentially so the difference between the lowest and highest
active tx id's is small. For this reason the logic of walking every id
between the 'minimum' and max id made sense. The space might look like:
[..........TTTT]
Here the looping starts at the first T and loops 4 times.
This assumption isn't a great fit though. A protocol like NFS has 2 types
of transactions. Long running file transfer transactions and short lived
request/reply pairs are causing the id space to be sparse. This leads to
a lot of unnecessary looping in various parts of the engine, but most
prominently: detection, tx house keeping and tx logging.
[.T..T...TTTT.T]
Here the looping starts at the first T and loops for every spot, even
those where no tx exists anymore.
Cases have been observed where the lowest tx id was 2 and the highest
was 50k. This lead to a lot of unnecessary looping.
This patch add an alternative approach. It allows a protocol to register
an iterator function, that simply returns the next transaction until
all transactions are returned. To do this it uses a bit of state the
caller must keep.
The registration is optional. If no iterator is registered the old
behaviour will be used.
The detect engine would bypass packets that are set as dropped. This
seems sane, as these packets are going to be dropped anyway.
However, it lead to the following corner case: stream events that
triggered the drop could not be matched on the rules. The packet
with the event wouldn't make it to the detect engine due to the bypass.
This patch changes the logic to not bypass DROP packets anymore.
Packets that are dropped by the stream engine will set the no payload
inspection flag, so avoid needless cost.
Fix the inspection of multiple files in a single TX, where new files
may be added to the TX after inspection started.
Assign the hard coded id DE_STATE_FLAG_FILE_INSPECT to the file
inspect engine.
Make sure that sigs that do file inspection and don't match on the
current file always store a detailed state. This state will include
the DE_STATE_FLAG_FILE_INSPECT flag.
When the app-layer indicates a new file is available, for each sig
that has the DE_STATE_FLAG_FILE_INSPECT flag set, reset part of the
state so that the sig is evaluated again.
Use per tx detect_flags to track prefilter. Detect flags are used for 2
things:
1. marking tx as fully inspected
2. tracking already run prefilter (incl mpm) engines
This supercedes the MpmIDs API for directionless tracking
of the prefilter engines.
When we have no SGH we have to flag the txs that are 'complete'
as inspected as well.
Special handling for the stream engine:
If a rule mixes TX inspection and STREAM inspection, we can encounter
the case where the rule is evaluated against multiple transactions
during a single inspection run. As the stream data is exactly the same
for each of those runs, it's wasteful to rerun inspection of the stream
portion of the rule.
This patch enables caching of the stream 'inspect engine' result in
the local 'RuleMatchCandidateTx' array. This is valid only during the
live of a single inspection run.
Remove stateful inspection from 'mask' (SignatureMask). The mask wasn't
used in most cases for those rules anyway, as there we rely on the
prefilter. Add a alproto check to catch the remaining cases.
When building the active non-mpm/non-prefilter list check not just
the mask, but also the alproto. This especially helps stateful rules
with negated mpm.
Simplify AppLayerParserHasDecoderEvents usage in detection to only
return true if protocol detection events are set. Other detection is done
in inspect engines.
Move rule group lookup and handling into it's own function. Handle
'post lookup' tasks immediately, instead of after the first detect
run. The tasks were independent of the initial detection.
Many cleanups and much refactoring.
Propagate inspection limits from anchered keywords to the rest of
a rule.
Examples:
content:"A"; depth:1; is anchored, it can only match in the first byte
content:"A"; depth:1; content:"BC"; distance:0; within:2;
"BC" can only be in the 2nd and 3rd byte of the payload. So effectively
it has an implicite offset of 1 and an implicit depth of 3.
content:"A"; depth:1; content:"BC"; distance:0; can assume offset:1; for
the 2nd content.
content:"A"; depth:1; pcre:"/B/R"; content:"C"; distance:0; can assume
at least offset:1; for content "C". We can't analyzer the pcre pattern
(yet), so we assume it matches with 0 bytes.
Add lots of test cases.
Reimplement keyword to match on SHA-1 fingerprint of TLS
certificate as a mpm keyword.
alert tls any any -> any (msg:"TLS cert fingerprint test";
tls_cert_fingerprint;
content:"4a:a3:66:76:82:cb:6b:23:bb:c3:58:47:23:a4:63:a7:78:a4:a1:18";
sid:12345;)
Rules can contain conflicting statements and lead to a unmatchable rule.
2 examples are rejected by this patch:
1. dsize < content
2. dsize < content@offset
Bug #2187
Since the parser now also does nfs2, the name nfs3 became confusing.
As it's still in beta, we can rename so this patch renames all 'nfs3'
logic to simply 'nfs'.
The target keyword allows rules writer to specify information about
target of the attack. Using this keyword in a signature causes
some fields to be added in the EVE output. It also fixes ambiguity
in the Prelude output.
Set flags by default:
-Wmissing-prototypes
-Wmissing-declarations
-Wstrict-prototypes
-Wwrite-strings
-Wcast-align
-Wbad-function-cast
-Wformat-security
-Wno-format-nonliteral
-Wmissing-format-attribute
-funsigned-char
Fix minor compiler warnings for these new flags on gcc and clang.
Now that MPM runs when the TX progress is right, stateful detection
operates differently.
Changes:
1. raw stream inspection is now also an inspect engine
Since this engine doesn't take the transactions into account, it
could potentially run multiple times on the same data. To avoid
this, basic result caching is in place.
2. the engines are sorted by progress, but the 'MPM' engine is first
even if the progress is higher
If MPM flags a rule to be inspected, the inspect engine for that
buffer runs first. If this step fails, the rule is no longer
evaluated. No state is stored.
Previously the MPM/Prefilter engines would suggest the same rule
candidates multiple times.
For example, while processing the request body, the http headers
would be inspected by MPM multiple times.
The mask check was one way to quickly decide which rules could be
skipped.
Now that the MPM engines normally return a rule just once, this
mask check no longer makes sense. If the rule meets the ip/port/
direction based conditions, it needs to be evaluated if the MPM
said so. Even if not all conditions are yet true.
WIP disable mask as it no longer makes sense
WIP redo mask match