Many rules have the same address vars, so instead of parsing them
each time use a hash to store the string and the parsed result.
Rules now reference the stored result in the hash table.
Make the file storage use the streaming buffer API.
As the individual file chunks were not needed by themselves, this
approach uses a chunkless implementation.
Convert HTTP body handling to use the Streaming Buffer API. This means
the HtpBodyChunks no longer maintain their own data segments, but
instead add their data to the StreamingBuffer instance in the HtpBody
structure.
In case the HtpBodyChunk needs to access it's data it can do so still
through the Streaming Buffer API.
Updates & simplifies the various users of the reassembled bodies:
multipart parsing and the detection engine.
Match on server name indication (SNI) extension in TLS using tls_sni
keyword, e.g:
alert tls any any -> any any (msg:"SNI test"; tls_sni;
content:"example.com"; sid:12345;)
This new API allows for different SPM implementations, using a function
pointer table like that used for MPM.
This change also switches over the paths that make use of
DetectContentData (which previously used BoyerMoore directly) to the new
API.
Make the port grouping whitelisting configurable. A whitelisted port
ends up in it's own port group.
detect:
grouping:
tcp-whitelist: 80, 443
udp-whitelist: 53, 5060
No portranges are allowed at this point.
Create a hash table of unique DetectPort objects before trying to
create a unique list of these objects. This safes a lot of cycles
in the creation of the list.
SGH's for tcp and udp are now always only per proto and per direction.
This means we can simply reuse the packet and stream mpm pointers.
The SGH's for the other protocols already used a directionless catch
all mpm pointer.
Dump a json record containing all sigs that need to be inspected after
prefilter. Part of profiling. Only dump if threshold is met, which is
currently set by:
--set detect.profiling.inspect-logging-threshold=200
A file called packet_inspected_rules.json is created in the default
log dir.
Per rule group tracking of checks, use of lists, mpm matches,
post filter counts.
Logs SGH id so it can be compared with the rule_group.json output.
Implemented both in a human readable text format and a JSON format.
Turn list of mpm_ctx pointers into a union so that we don't waste
space. The sgh's for tcp and udp are in one direction only, so the
ts and tc ones are now in the union.
Instead of the binary yes/no whitelisting used so far, use different
values for different sorts of whitelist reasons. The port list will
be sorted by whitelist value first, then by rule count.
The goal is to whitelist groups that have weak sigs:
- 1 byte pattern groups
- SYN sigs
Rules that check for SYN packets are mostly scan detection rules.
They will be checked often as SYN packets are very common.
e.g. alert tcp any any -> any 22 (flags:S,12; sid:123;)
This patch adds whitelisting for SYN-sigs, so that the sigs end up
in as unique groups as possible.
- negated mpm sigs
Currently negated mpm sigs are inspected often, so they are quite
expensive. For this reason, try to whitelist them.
These values are set during 'stage 1', rule preprocessing.
Since SYN inspecting rules are expensive, this patch splits the
'non-mpm' list (i.e. the rules that are always considered) into
a 'syn' and 'non-syn' list. The SYN list is only inspected if the
packet has the SYN flag set, otherwise the non-syn list is used.
The syn-list contains _all_ rules. The non-syn list contains all
minus the rules requiring the SYN bit in a packet.
Replace tree based approach for rule grouping with a per port (tcp/udp)
and per protocol approach.
Grouping now looks like:
+----+
|icmp+--->
+----+
|gre +--->
+----+
|esp +--->
+----+
other|... |
+----->-----+
| |N +--->
| +----+
|
| tcp +----+ +----+
+----->+ 80 +-->+ 139+-->
| +----+ +----+
|
| udp +----+ +----+
+---+----->+ 53 +-->+ 135+-->
| +----+ +----+
|toserver
+--->
|toclient
|
+--->
So the first 'split' in the rules is the direction: toserver or toclient.
Rules that don't have a direction, are in both branches.
Then the split is between tcp/udp and the other protocols. For tcp and
udp port lists are used. For the other protocols, grouping is simply per
protocol.
The ports used are the destination ports for toserver sigs and source
ports for toclient sigs.
Denotes the max detection list so that rule validation can
allow post-detection lists to come after base64_data, but
disallow detection lists to come after it.