Instead of the linked list of engines setup an array
with the engines. This should provide better locality.
Also shrink the engine structure so that we can fit
2 on a cacheline.
Remove the FreeFunc from the runtime engines. Engines
now have a 'gid' (global id) that can be used to look
up the registered Free function.
Inspect engines are called per signature per sigmatch list. Most
wrap around DetectEngineContentInspection, but it's more generic.
Until now, the inspect engines were setup in a large per ipproto,
per alproto, per direction table. For stateful inspection each
engine needed a global flag.
This approach had a number of issues:
1. inefficient: each inspection round walked the table and then
checked if the inspect engine was even needed for the current
rule.
2. clumsy registration with global flag registration.
3. global flag space was approaching the need for 64 bits
4. duplicate registration for alprotos supporting both TCP and
TCP (DNS).
This patch introduces a new approach.
First, it does away with the per ipproto engines. This wasn't used.
Second, it adds a per signature list of inspect engine containing
only those engines that actually apply to the rule.
Third, it gets rid of the global flags and replaces it with flags
assigned per rule per engine.
Register keywords globally at start up.
Create a map of the registery per detection engine. This we need because
the sgh_mpm_context value is set per detect engine.
Remove APP_MPMS_MAX.
If there are many rules for TCP flags these rules would be inspected
against each TCP packet. Even though the flags check is not expensive,
the combined cost of inspecting multiple rules against each and every
packet is high.
This patch implements a prefilter engine for flags. If a rule group
has rules looking for specific flags and engine for that flag or
flags combination is set up. This way those rules are only inspected
if the flag is actually present in the packet.
Introduce prefilter keyword to force a keyword to be used as prefilter.
e.g.
alert tcp any any -> any any (content:"A"; flags:R; prefilter; sid:1;)
alert tcp any any -> any any (content:"A"; flags:R; sid:2;)
alert tcp any any -> any any (content:"A"; dsize:1; prefilter; sid:3;)
alert tcp any any -> any any (content:"A"; dsize:1; sid:4;)
In sid 2 and 4 the content keyword is used in the MPM engine.
In sid 1 and 3 the flags and dsize keywords will be used.
In preparation of the introduction of more general purpose prefilter
engines, rename PatternMatcherQueue to PrefilterRuleStore. The new
engines will fill this structure a similar way to the current mpm
prefilters.
This adds a new keyword which permits to call the
bypass callback when a sig is matched.
The callback must be called when the match of the sig
is complete.
Detection plugin for TLS certificate fields notBefore and notAfter.
Supports equal to, less than, greater than, and range operations
for both keywords. Dates can be represented as either ISO 8601 or
epoch (Unix time).
Examples:
alert tls [...] tls_cert_notafter:1445852105; [...]
alert tls [...] tls_cert_notbefore:<2015-10-22T23:59:59; [...]
alert tls [...] tls_cert_notbefore:>2015-10-22; [...]
alert tls [...] tls_cert_notafter:2000-10-22<>2020-05-15; [...]
Moved and adapted code from detect-filemd5 to util-detect-file-hash,
generalised code to work with SHA-1 and SHA-256 and added necessary
flags and other constants.
Many rules have the same address vars, so instead of parsing them
each time use a hash to store the string and the parsed result.
Rules now reference the stored result in the hash table.
Make the file storage use the streaming buffer API.
As the individual file chunks were not needed by themselves, this
approach uses a chunkless implementation.
Convert HTTP body handling to use the Streaming Buffer API. This means
the HtpBodyChunks no longer maintain their own data segments, but
instead add their data to the StreamingBuffer instance in the HtpBody
structure.
In case the HtpBodyChunk needs to access it's data it can do so still
through the Streaming Buffer API.
Updates & simplifies the various users of the reassembled bodies:
multipart parsing and the detection engine.
Match on server name indication (SNI) extension in TLS using tls_sni
keyword, e.g:
alert tls any any -> any any (msg:"SNI test"; tls_sni;
content:"example.com"; sid:12345;)
This new API allows for different SPM implementations, using a function
pointer table like that used for MPM.
This change also switches over the paths that make use of
DetectContentData (which previously used BoyerMoore directly) to the new
API.
Make the port grouping whitelisting configurable. A whitelisted port
ends up in it's own port group.
detect:
grouping:
tcp-whitelist: 80, 443
udp-whitelist: 53, 5060
No portranges are allowed at this point.
Create a hash table of unique DetectPort objects before trying to
create a unique list of these objects. This safes a lot of cycles
in the creation of the list.
SGH's for tcp and udp are now always only per proto and per direction.
This means we can simply reuse the packet and stream mpm pointers.
The SGH's for the other protocols already used a directionless catch
all mpm pointer.
Dump a json record containing all sigs that need to be inspected after
prefilter. Part of profiling. Only dump if threshold is met, which is
currently set by:
--set detect.profiling.inspect-logging-threshold=200
A file called packet_inspected_rules.json is created in the default
log dir.
Per rule group tracking of checks, use of lists, mpm matches,
post filter counts.
Logs SGH id so it can be compared with the rule_group.json output.
Implemented both in a human readable text format and a JSON format.
Turn list of mpm_ctx pointers into a union so that we don't waste
space. The sgh's for tcp and udp are in one direction only, so the
ts and tc ones are now in the union.
Instead of the binary yes/no whitelisting used so far, use different
values for different sorts of whitelist reasons. The port list will
be sorted by whitelist value first, then by rule count.
The goal is to whitelist groups that have weak sigs:
- 1 byte pattern groups
- SYN sigs
Rules that check for SYN packets are mostly scan detection rules.
They will be checked often as SYN packets are very common.
e.g. alert tcp any any -> any 22 (flags:S,12; sid:123;)
This patch adds whitelisting for SYN-sigs, so that the sigs end up
in as unique groups as possible.
- negated mpm sigs
Currently negated mpm sigs are inspected often, so they are quite
expensive. For this reason, try to whitelist them.
These values are set during 'stage 1', rule preprocessing.
Since SYN inspecting rules are expensive, this patch splits the
'non-mpm' list (i.e. the rules that are always considered) into
a 'syn' and 'non-syn' list. The SYN list is only inspected if the
packet has the SYN flag set, otherwise the non-syn list is used.
The syn-list contains _all_ rules. The non-syn list contains all
minus the rules requiring the SYN bit in a packet.
Replace tree based approach for rule grouping with a per port (tcp/udp)
and per protocol approach.
Grouping now looks like:
+----+
|icmp+--->
+----+
|gre +--->
+----+
|esp +--->
+----+
other|... |
+----->-----+
| |N +--->
| +----+
|
| tcp +----+ +----+
+----->+ 80 +-->+ 139+-->
| +----+ +----+
|
| udp +----+ +----+
+---+----->+ 53 +-->+ 135+-->
| +----+ +----+
|toserver
+--->
|toclient
|
+--->
So the first 'split' in the rules is the direction: toserver or toclient.
Rules that don't have a direction, are in both branches.
Then the split is between tcp/udp and the other protocols. For tcp and
udp port lists are used. For the other protocols, grouping is simply per
protocol.
The ports used are the destination ports for toserver sigs and source
ports for toclient sigs.
Denotes the max detection list so that rule validation can
allow post-detection lists to come after base64_data, but
disallow detection lists to come after it.
The mpm_uricontent_maxlen logic was meant to track the shortest
possible pattern in the MPM of a SGH. So a minlen more than a maxlen.
This patch replaces the complicated tracking logic by a simpler
scheme. When the SGH's are finalize, the minlen is calculated.
It also fixes a small corner case where the calculated "maxlen" could
be wrong. This would require a smaller pattern in a rule to be forced
as fast pattern.
To speed up startup with many tenants, tenant loading will be parallelized.
As no tempary threads should be used for these memory allocation heavy
tasks, this patch adds new type of 'command' thread that can be used to
load and reload tenants.
This patch hardcodes the number of loaders to 4. Future work will make it
dynamic.
The loader thread essentially sleeps constantly. When a tasks is sent to
it, it will wake up and execute it.
Make available to live mode and unix socket mode.
register-tenant:
Loads a new YAML, does basic validation.
Loads a new detection engine
Loads rules
Add new de_ctx to master store and stores tenant id in the de_ctx so
we can look it up by tenant id later.
unregister-tenant:
Gets the de_ctx, moves it to the freelist
Removes config
Introduce DetectEngineGetByTenantId, which gets a reference to the
detect engine by tenant id.
This commit do a find and replace of the following:
- DETECT_SM_LIST_HSBDMATCH by DETECT_SM_LIST_FILEDATA
sed -i 's/DETECT_SM_LIST_HSBDMATCH/DETECT_SM_LIST_FILEDATA/g' src/*
- HSBD by FILEDATA:
sed -i 's/HSBDMATCH/FILEDATA/g' src/*