The order of keyword registration currently affects inspect engine
registration order and ultimately the order of inspect engines per
rule. Which in turn affects state keeping.
This patch makes sure the ordering is the same as with older
releases.
Inspect engines are called per signature per sigmatch list. Most
wrap around DetectEngineContentInspection, but it's more generic.
Until now, the inspect engines were setup in a large per ipproto,
per alproto, per direction table. For stateful inspection each
engine needed a global flag.
This approach had a number of issues:
1. inefficient: each inspection round walked the table and then
checked if the inspect engine was even needed for the current
rule.
2. clumsy registration with global flag registration.
3. global flag space was approaching the need for 64 bits
4. duplicate registration for alprotos supporting both TCP and
TCP (DNS).
This patch introduces a new approach.
First, it does away with the per ipproto engines. This wasn't used.
Second, it adds a per signature list of inspect engine containing
only those engines that actually apply to the rule.
Third, it gets rid of the global flags and replaces it with flags
assigned per rule per engine.
Register keywords globally at start up.
Create a map of the registery per detection engine. This we need because
the sgh_mpm_context value is set per detect engine.
Remove APP_MPMS_MAX.
Introduce prefilter keyword to force a keyword to be used as prefilter.
e.g.
alert tcp any any -> any any (content:"A"; flags:R; prefilter; sid:1;)
alert tcp any any -> any any (content:"A"; flags:R; sid:2;)
alert tcp any any -> any any (content:"A"; dsize:1; prefilter; sid:3;)
alert tcp any any -> any any (content:"A"; dsize:1; sid:4;)
In sid 2 and 4 the content keyword is used in the MPM engine.
In sid 1 and 3 the flags and dsize keywords will be used.
This adds a new keyword which permits to call the
bypass callback when a sig is matched.
The callback must be called when the match of the sig
is complete.
To be able to add a transaction counter we will need a ThreadVars
in the AppLayerParserParse function.
This function is massively used in unittests
and this result in an long commit.
Detection plugin for TLS certificate fields notBefore and notAfter.
Supports equal to, less than, greater than, and range operations
for both keywords. Dates can be represented as either ISO 8601 or
epoch (Unix time).
Examples:
alert tls [...] tls_cert_notafter:1445852105; [...]
alert tls [...] tls_cert_notbefore:<2015-10-22T23:59:59; [...]
alert tls [...] tls_cert_notbefore:>2015-10-22; [...]
alert tls [...] tls_cert_notafter:2000-10-22<>2020-05-15; [...]
The first packet in both directions of a flow looks up the rule group
(sgh) and stores it in the flow. This makes sure the lookup doesn't
have to be performed for each packet.
ICMPv4 error messages are connected to the TCP or UDP flow they apply
to. In the case of such an ICMP error being the first packet in a
flow's direction, this would lead to an issue.
The packet would look up the rule group based on the ICMP protocol,
not based on the embedded TCP/UDP. This makes sense, as the ICMP
packet is inspected as ICMP packet. The consequence however, was that
this rule group pointer (sgh) would be stored in the flow. This is
wrong, as TCP/UDP packets that follow the ICMP packet would have no sgh
or the wrong sgh.
In normal traffic this shouldn't normally happen, but it could be
used to evade Suricata's inspection.
Initial version of the 'FlowWorker' thread module. This module
combines Flow handling, TCP handling, App layer handling and
Detection in a single module. It does all flow related processing
under a single flow lock.
To simplify locking, move all locking out of the individual detect
code. Instead at the start of detection lock the flow, and at the
end of detection unlock it.
The lua code can be called without a lock still (from the output
code paths), so still pass around a lock hint to take care of this.
Match on server name indication (SNI) extension in TLS using tls_sni
keyword, e.g:
alert tls any any -> any any (msg:"SNI test"; tls_sni;
content:"example.com"; sid:12345;)
detect.c:3801:13: warning: Value stored to 'tmplist2_tail' is never read
tmplist2_tail = joingr;
^ ~~~~~~
detect.c:3804:13: warning: Value stored to 'tmplist2_tail' is never read
tmplist2_tail = joingr;
^ ~~~~~~
2 warnings generated.
Make the rule grouping dump to rule_group.json configurable.
detect:
profiling:
grouping:
dump-to-disk: false
include-rules: false # very verbose
include-mpm-stats: false
Make the port grouping whitelisting configurable. A whitelisted port
ends up in it's own port group.
detect:
grouping:
tcp-whitelist: 80, 443
udp-whitelist: 53, 5060
No portranges are allowed at this point.
Create a hash table of unique DetectPort objects before trying to
create a unique list of these objects. This safes a lot of cycles
in the creation of the list.
Dump a json record containing all sigs that need to be inspected after
prefilter. Part of profiling. Only dump if threshold is met, which is
currently set by:
--set detect.profiling.inspect-logging-threshold=200
A file called packet_inspected_rules.json is created in the default
log dir.
Per rule group tracking of checks, use of lists, mpm matches,
post filter counts.
Logs SGH id so it can be compared with the rule_group.json output.
Implemented both in a human readable text format and a JSON format.
The idea is: if mpm is negated, it's both on mpm and nonmpm sid lists
and we can kick it out in that case during the merge sort.
It only works for patterns that are 'independent'. This means that the
rule doesn't need to only match if the negated mpm pattern is limited
to the first 10 bytes for example.
Or more generally, an negated mpm pattern that has depth, offset,
distance or within settings can't be handled this way. These patterns
are not added to the mpm at all, but just to to non-mpm list. This
makes sense as they will *always* need manual inspection.
Similarly, a pattern that is 'chopped' always needs validation. This
is because in this case we only inspect a part of the final pattern.
Instead of the binary yes/no whitelisting used so far, use different
values for different sorts of whitelist reasons. The port list will
be sorted by whitelist value first, then by rule count.
The goal is to whitelist groups that have weak sigs:
- 1 byte pattern groups
- SYN sigs
Rules that check for SYN packets are mostly scan detection rules.
They will be checked often as SYN packets are very common.
e.g. alert tcp any any -> any 22 (flags:S,12; sid:123;)
This patch adds whitelisting for SYN-sigs, so that the sigs end up
in as unique groups as possible.
- negated mpm sigs
Currently negated mpm sigs are inspected often, so they are quite
expensive. For this reason, try to whitelist them.
These values are set during 'stage 1', rule preprocessing.
Since SYN inspecting rules are expensive, this patch splits the
'non-mpm' list (i.e. the rules that are always considered) into
a 'syn' and 'non-syn' list. The SYN list is only inspected if the
packet has the SYN flag set, otherwise the non-syn list is used.
The syn-list contains _all_ rules. The non-syn list contains all
minus the rules requiring the SYN bit in a packet.
Update port grouping logic. Previously it would create one consistent
list w/o overlap. It largely still does this, except for the 'catch
all' port group at the end of the list. This port group contains all
the sigs that didn't fit into the other groups.
Replace tree based approach for rule grouping with a per port (tcp/udp)
and per protocol approach.
Grouping now looks like:
+----+
|icmp+--->
+----+
|gre +--->
+----+
|esp +--->
+----+
other|... |
+----->-----+
| |N +--->
| +----+
|
| tcp +----+ +----+
+----->+ 80 +-->+ 139+-->
| +----+ +----+
|
| udp +----+ +----+
+---+----->+ 53 +-->+ 135+-->
| +----+ +----+
|toserver
+--->
|toclient
|
+--->
So the first 'split' in the rules is the direction: toserver or toclient.
Rules that don't have a direction, are in both branches.
Then the split is between tcp/udp and the other protocols. For tcp and
udp port lists are used. For the other protocols, grouping is simply per
protocol.
The ports used are the destination ports for toserver sigs and source
ports for toclient sigs.
Cppcheck 1.72 gives a warning on the following code pattern:
char blah[32] = "";
snprintf(blah, sizeof(blah), "something");
The warning is:
(error) Buffer is accessed out of bounds.
While this appears to be a FP, in most cases the initialization to ""
was unnecessary as the snprintf statement immediately follows the
variable declaration.
The mpm_uricontent_maxlen logic was meant to track the shortest
possible pattern in the MPM of a SGH. So a minlen more than a maxlen.
This patch replaces the complicated tracking logic by a simpler
scheme. When the SGH's are finalize, the minlen is calculated.
It also fixes a small corner case where the calculated "maxlen" could
be wrong. This would require a smaller pattern in a rule to be forced
as fast pattern.
Due to a broken sequence number check, detect could fail to process
smsgs in case of a sequence wrap. This could lead to excessive use
of smsg's but also of segments, since these aren't cleared until the
smsg containing them is.
Store the tenant id in the flow and use the stored id when setting
up pesudo packets.
For tunnel and defrag packets, get tenant from parent. This will only
pass tenant_id's set at capture time.
For defrag packets, the tenant selector based on vlan id will still
work as the vlan id(s) are stored in the defrag tracker before being
passed on.
If a flow was 'pass'd, it means that no packet of it will flow be handled
by the detection engine. A side effect of this was that the per flow
inspect_id would never be moved forward. This in turn lead to a situation
where transactions wouldn't be freed.
This patch addresses this case by incrementing the inspect_id anyway for
the pass case.
Stream GAPs and stream reassembly depth are tracked per direction. In
many cases they will happen in one direction, but not in the other.
Example:
HTTP requests a generally smaller than responses. So on the response
side we may hit the depth limit, but not on the request side.
The asynchronious 'disruption' has a side effect in the transaction
engine. The 'progress' tracking would never mark such transactions
as complete, and thus some inspection and logging wouldn't happen
until the very last moment: when EOF's are passed around.
Especially in proxy environments with _very_ many transactions in a
single TCP connection, this could lead to serious resource issues. The
EOF handling would suddenly have to handle thousands or more
transactions. These transactions would have been stored for a long time.
This patch introduces the concept of disruption flags. Flags passed to
the tx progress logic that are and indication of disruptions in the
traffic or the traffic handling. The idea is that the progress is
marked as complete on disruption, even if a tx is not complete. This
allows the detection and logging engines to process the tx after which
it can be cleaned up.
The app layer state 'version' field is incremented with each update
to the state. It is used by the detection engine to see if the current
version of the state has already been inspected. Since app layer and
detect always run closely together there is no need for a big number
here. The detect code really only checks for equal/not-equal, so wrap
arounds are not an issue.
Instead, intruduce StreamTcpDisableAppLayer to disable app layer
tracking and reassembly. StreamTcpAppLayerIsDisabled can be used
to check it.
Replace all uses of FlowSetSessionNoApplayerInspectionFlag and
the FLOW_NO_APPLAYER_INSPECTION.
If for a packet we have a TX N that has detect state and a TX N+1 that
has no detect state, but does have 'progress', we have a corner case
in stateful detection.
ContinueDetection inspects TX N, but cannot flag the rule in the
de_state_sig_array as the next (TX N+1) has already started and needs
to be inspected. 'StartDetection' however, is then unaware of the fact
that ContinueDetection already inspected the rule. It uses the per
session 'inspect_id' that is only moved forward at the end of the
detection run.
This patch adds a workaround. It uses the DetectEngineThreadCtx::
de_state_sig_array to store an offset between the 'base' inspect_id
and the inspect_id that StartDetection should use. The data type is
limited, so if the offset would be too big, a search based fall back
is implemented as well.
This commit do a find and replace of the following:
- DETECT_SM_LIST_HSBDMATCH by DETECT_SM_LIST_FILEDATA
sed -i 's/DETECT_SM_LIST_HSBDMATCH/DETECT_SM_LIST_FILEDATA/g' src/*
- HSBD by FILEDATA:
sed -i 's/HSBDMATCH/FILEDATA/g' src/*
Initalize detection engine by configuration prefix.
DetectEngineCtxInitWithPrefix(const char *prefix)
Takes the detection engine configuration from:
<prefix>.<config>
If prefix is NULL the regular config will be used.
Update sure that DetectLoadCompleteSigPath considers the prefix when
retrieving the configuration.
Use separate data structures for storing TX and FLOW (AMATCH) detect
state.
- move state storing into util funcs
- remove de_state_m
- simplify reset state logic on reload
Set actions that are set directly from Signatures using the new
utility function DetectSignatureApplyActions. This will apply
the actions and also store info about the 'drop' that first made
the rule drop.
When reaching the end of either list, merging is no longer required,
simply walk down the other list.
If the non-MPM list can't have duplicates, it would be worth removing
the duplicate check for the non-MPM list when it is the only non-empty list
remaining.
Create a copy of the SigMatch data in the sig_lists linked-lists and store
it in an array for faster access and not next and previous pointers. The
array is then used when calling the Match() functions.
Gives a 7.7% speed up on one test.
The Match functions don't need a pointer to the SigMatch object, just the
context pointer contained inside, so pass the Context to the Match function
rather than the SigMatch object. This allows for further optimization.
Change SigMatch->ctx to have type SigMatchCtx* rather than void* for better
type checking. This requires adding type casts when using or assigning it.
The SigMatch contex should not be changed by the Match() funciton, so pass it
as a const SigMatchCtx*.
Read one signature pointer ahead to prefetch the value.
Use a variable, sflags, for s->flags, since it is used many times and the
compiles doesn't know that the signatures structure doesn't change, so it
will reload s->flags.
Use MPM and non-MPM lists to build our match array. Both lists are
sorted, and are merged and sorted into the match array.
This disables the old match array building code and thus also bypasses
the mask checking.
Array of rule id's that are not using MPM prefiltering. These will be
merged with the MPM results array. Together these should lead to a
list of all the rules that can possibly match.
Pmq add rule list: Array of uint32_t's to store (internal) sids from the MPM.
AC: store sids in the pattern list, append to Pmq::rule_id_array on match.
Detect: sort rule_id_array after it was set up by the MPM. Rule id's
(Signature::num) are ordered, and the rule's with the lowest id are to
be inspected first. As the MPM doesn't fill the array in order, but instead
'randomly' we need this sort step to assure proper inspection order.
Have -T / --init-errors-fatal process all rules so that it's easier
to debug problems in ruleset. Otherwise it can be a lengthy fix, test
error cycle if multiple rules have issues.
Convert empty rulefile error into a warning.
Bug #977
Add the modbus.function and subfunction) keywords for public function match in rules (Modbus layer).
Matching based on code function, and if necessary, sub-function code
or based on category (assigned, unassigned, public, user or reserved)
and negation is permitted.
Add the modbus.access keyword for read/write Modbus function match in rules (Modbus layer).
Matching based on access type (read or write),
and/or function type (discretes, coils, input or holding)
and, if necessary, read or write address access,
and, if necessary, value to write.
For address and value matching, "<", ">" and "<>" is permitted.
Based on TLS source code and file size source code (address and value matching).
Signed-off-by: David DIALLO <diallo@et.esia.fr>
doesn't need the gpu results and to release the packet for the next run.
Previously the inspection engine wouldn't inform the cuda module, if it
didn't need the results. As a consequence, when the packet is next taken
for re-use, and if the packet is still being processed by the cuda module,
the engine would wait till the cuda module frees the packet.
This commits updates this functionality to inform the cuda module to
release the packet for the afore-mentioned case.