Fix non-continious matches with content and pcre modifiers setting up
multiple buffers.
To address this store whether a buffer is multi-capable and if not reuse
an earlier buffer if possible.
Bug: #6397.
Fixes: ad88efc2d8 ("detect: support multi buffer matching")
Ticket: #6104
The approach in master branch is to change the prototype of
SigMatchAppendSMToList so that it allocates itself the new SigMatch
This approach requires to change all the 100-ish calls to
SigMatchAppendSMToList and is thus quite a big change.
For branch 7, we still wanted to avoid the buffer overflow, but
did not want such an intrusive change, and still wanted to make
the signature invalid. Instead of changing the prototype of the
function, we make it return early, and set a flag in the signature
which can be later checked by SigValidate
Issue: 6239
This commit moves the global variables associated with engine analysis
into the detect engine context. Doing so provides encapsulation of the
analysis components as well as thread-safe operation in a multi-tenant
(context) deployment.
Move pcre2 data structures used for parsing into the detect engine
context, so that multiple tenant loading threads don't use the same
data structures.
Bug: #4797.
Instead of using flags to indicate a rule type, use an explicit `type`
field.
This will make it more clean in code paths what paths a rule is taking,
and will allow easier debugging as well as analyzer output.
Define the following fields:
- SIG_TYPE_IPONLY: sig meets IP-only criteria and is handled by the IP-only
engine.
- SIG_TYPE_PDONLY: sig inspects protocol detection results only.
- SIG_TYPE_DEONLY: sig inspects decoder events only.
- SIG_TYPE_PKT: sig is inspected per packet.
- SIG_TYPE_PKT_STREAM: sig is inspected against either packet payload or
stream payload.
- SIG_TYPE_STREAM: sig is inspected against the reassembled stream
- SIG_TYPE_APPLAYER: sig is inspected against an app-layer property, but not
against a tx engine.
- SIG_TYPE_APP_TX: sig is inspected the tx aware inspection engine(s).
Ticket: #6085.
Performance measurement of rules is important on live Suricata
as bad rules can cause severe performance regression. This patch
introduces the --enable-profiling-rules that activate profiling
for the rules. This reduces the performance impact of full
profiling and provide visiblity on the rules performance at
the same time.
Multi buffer matching is implemented as a way for a rule to match
on multiple buffers within the same transaction.
Before this patch a rule like:
dns.query; content:"example"; dns.query; content:".com";
would be equivalent to:
dns.query; content:"example"; content:".com";
If a DNS query would request more than one name, e.g.:
DNS: [example.net][something.com]
Eeach would be inspected to have both patterns present. Otherwise,
it would not be a match. So the rule above would not match, as neither
example.net and somthing.com satisfy both conditions at the same time.
This patch changes this behavior. Instead of the above, each time the
sticky buffer is specified, it creates a separate detection unit. Each
buffer is a "multi buffer" sticky buffer will now be evaluated against
each "instance" of the sticky buffer.
To continue with the above example:
DNS: [example.net] <- matches 'dns.query; content:"example";'
DNS: [something.com] <- matches 'dns.query; content:".com"'
So this would now be a match.
To make sure both patterns match in a single query string, the expression
'dns.query; content:"example"; content:".com";' still works for this.
This patch doesn't yet enable the behavior for the keywords. That is
done in a follow up patch.
To be able to implement this the internal storage of parsed rules
is changed. Until this patch and array of lists was used, where the
index was the buffer id (e.g. http_uri, dns_query). Therefore there
was only one list of matches per buffer id. As a side effect this
array was always very sparsely populated as many buffers could not
be mixed.
This patch changes the internal representation. The new array is densely
packed:
dns.query; content:"1"; dns.query; bsize:1; content:"2";
[type: dns_query][list: content:"1";]
[type: dns_query][list: bsize:1; content:"2";]
The new scheme allows for multiple instances of the same buffer.
These lists are then translated into multiple inspection engines
during the final setup of the rule.
Ticket: #5784.
Instead of tracking ip only rules by the internal signum, track them by
a separate counter that starts at zero. This results in dense
SigNumArrays instead of sparse ones and a much smaller max_idx.
Issue: 4578
Instead of a shared mpm context for just "file.data" or "file.magic"
use per alproto mpms. This way http file.data rules won't affect smb
file.data performance.
Ticket: #4378.
Inspect individual chunks in lossy traffic.
Don't use the frame idx as the inspection buffer idx. Engines are running
per frame, so multi inspection can be used for stream chunks instead.
Ticket: #4977.
Use the lzma-rs crate for decompressing swf/lzma files instead of
the lzma decompressor in libhtp. This decouples suricata from libhtp
except for actual http parsing, and means libhtp no longer has to
export a lzma decompression interface.
Ticket: #5638
Update APIs to store files in transactions instead of the per flow state.
Goal is to avoid the overhead of matching up files and transactions in
cases where there are many of both.
Update all protocol implementations to support this.
Update file logging logic to account for having files in transactions. Instead
of it acting separately on file containers, it is now tied into the
transaction logging.
Update the filestore keyword to consider a match if filestore output not
enabled.
Rules that look like they should be IP-only but contain a negated rule
address are now marked with an LIKE_IPONLY flag. This is so they are
treated like IPONLY rules with respect to flow action, but don't
interfere with other IPONLY processing like using the radix tree.
Ticket: #5361