Initial version of the 'FlowWorker' thread module. This module
combines Flow handling, TCP handling, App layer handling and
Detection in a single module. It does all flow related processing
under a single flow lock.
To simplify locking, move all locking out of the individual detect
code. Instead at the start of detection lock the flow, and at the
end of detection unlock it.
The lua code can be called without a lock still (from the output
code paths), so still pass around a lock hint to take care of this.
Match on server name indication (SNI) extension in TLS using tls_sni
keyword, e.g:
alert tls any any -> any any (msg:"SNI test"; tls_sni;
content:"example.com"; sid:12345;)
detect.c:3801:13: warning: Value stored to 'tmplist2_tail' is never read
tmplist2_tail = joingr;
^ ~~~~~~
detect.c:3804:13: warning: Value stored to 'tmplist2_tail' is never read
tmplist2_tail = joingr;
^ ~~~~~~
2 warnings generated.
Make the rule grouping dump to rule_group.json configurable.
detect:
profiling:
grouping:
dump-to-disk: false
include-rules: false # very verbose
include-mpm-stats: false
Make the port grouping whitelisting configurable. A whitelisted port
ends up in it's own port group.
detect:
grouping:
tcp-whitelist: 80, 443
udp-whitelist: 53, 5060
No portranges are allowed at this point.
Create a hash table of unique DetectPort objects before trying to
create a unique list of these objects. This safes a lot of cycles
in the creation of the list.
Dump a json record containing all sigs that need to be inspected after
prefilter. Part of profiling. Only dump if threshold is met, which is
currently set by:
--set detect.profiling.inspect-logging-threshold=200
A file called packet_inspected_rules.json is created in the default
log dir.
Per rule group tracking of checks, use of lists, mpm matches,
post filter counts.
Logs SGH id so it can be compared with the rule_group.json output.
Implemented both in a human readable text format and a JSON format.
The idea is: if mpm is negated, it's both on mpm and nonmpm sid lists
and we can kick it out in that case during the merge sort.
It only works for patterns that are 'independent'. This means that the
rule doesn't need to only match if the negated mpm pattern is limited
to the first 10 bytes for example.
Or more generally, an negated mpm pattern that has depth, offset,
distance or within settings can't be handled this way. These patterns
are not added to the mpm at all, but just to to non-mpm list. This
makes sense as they will *always* need manual inspection.
Similarly, a pattern that is 'chopped' always needs validation. This
is because in this case we only inspect a part of the final pattern.
Instead of the binary yes/no whitelisting used so far, use different
values for different sorts of whitelist reasons. The port list will
be sorted by whitelist value first, then by rule count.
The goal is to whitelist groups that have weak sigs:
- 1 byte pattern groups
- SYN sigs
Rules that check for SYN packets are mostly scan detection rules.
They will be checked often as SYN packets are very common.
e.g. alert tcp any any -> any 22 (flags:S,12; sid:123;)
This patch adds whitelisting for SYN-sigs, so that the sigs end up
in as unique groups as possible.
- negated mpm sigs
Currently negated mpm sigs are inspected often, so they are quite
expensive. For this reason, try to whitelist them.
These values are set during 'stage 1', rule preprocessing.
Since SYN inspecting rules are expensive, this patch splits the
'non-mpm' list (i.e. the rules that are always considered) into
a 'syn' and 'non-syn' list. The SYN list is only inspected if the
packet has the SYN flag set, otherwise the non-syn list is used.
The syn-list contains _all_ rules. The non-syn list contains all
minus the rules requiring the SYN bit in a packet.
Update port grouping logic. Previously it would create one consistent
list w/o overlap. It largely still does this, except for the 'catch
all' port group at the end of the list. This port group contains all
the sigs that didn't fit into the other groups.
Replace tree based approach for rule grouping with a per port (tcp/udp)
and per protocol approach.
Grouping now looks like:
+----+
|icmp+--->
+----+
|gre +--->
+----+
|esp +--->
+----+
other|... |
+----->-----+
| |N +--->
| +----+
|
| tcp +----+ +----+
+----->+ 80 +-->+ 139+-->
| +----+ +----+
|
| udp +----+ +----+
+---+----->+ 53 +-->+ 135+-->
| +----+ +----+
|toserver
+--->
|toclient
|
+--->
So the first 'split' in the rules is the direction: toserver or toclient.
Rules that don't have a direction, are in both branches.
Then the split is between tcp/udp and the other protocols. For tcp and
udp port lists are used. For the other protocols, grouping is simply per
protocol.
The ports used are the destination ports for toserver sigs and source
ports for toclient sigs.
Cppcheck 1.72 gives a warning on the following code pattern:
char blah[32] = "";
snprintf(blah, sizeof(blah), "something");
The warning is:
(error) Buffer is accessed out of bounds.
While this appears to be a FP, in most cases the initialization to ""
was unnecessary as the snprintf statement immediately follows the
variable declaration.
The mpm_uricontent_maxlen logic was meant to track the shortest
possible pattern in the MPM of a SGH. So a minlen more than a maxlen.
This patch replaces the complicated tracking logic by a simpler
scheme. When the SGH's are finalize, the minlen is calculated.
It also fixes a small corner case where the calculated "maxlen" could
be wrong. This would require a smaller pattern in a rule to be forced
as fast pattern.
Due to a broken sequence number check, detect could fail to process
smsgs in case of a sequence wrap. This could lead to excessive use
of smsg's but also of segments, since these aren't cleared until the
smsg containing them is.
Store the tenant id in the flow and use the stored id when setting
up pesudo packets.
For tunnel and defrag packets, get tenant from parent. This will only
pass tenant_id's set at capture time.
For defrag packets, the tenant selector based on vlan id will still
work as the vlan id(s) are stored in the defrag tracker before being
passed on.
If a flow was 'pass'd, it means that no packet of it will flow be handled
by the detection engine. A side effect of this was that the per flow
inspect_id would never be moved forward. This in turn lead to a situation
where transactions wouldn't be freed.
This patch addresses this case by incrementing the inspect_id anyway for
the pass case.
Stream GAPs and stream reassembly depth are tracked per direction. In
many cases they will happen in one direction, but not in the other.
Example:
HTTP requests a generally smaller than responses. So on the response
side we may hit the depth limit, but not on the request side.
The asynchronious 'disruption' has a side effect in the transaction
engine. The 'progress' tracking would never mark such transactions
as complete, and thus some inspection and logging wouldn't happen
until the very last moment: when EOF's are passed around.
Especially in proxy environments with _very_ many transactions in a
single TCP connection, this could lead to serious resource issues. The
EOF handling would suddenly have to handle thousands or more
transactions. These transactions would have been stored for a long time.
This patch introduces the concept of disruption flags. Flags passed to
the tx progress logic that are and indication of disruptions in the
traffic or the traffic handling. The idea is that the progress is
marked as complete on disruption, even if a tx is not complete. This
allows the detection and logging engines to process the tx after which
it can be cleaned up.
The app layer state 'version' field is incremented with each update
to the state. It is used by the detection engine to see if the current
version of the state has already been inspected. Since app layer and
detect always run closely together there is no need for a big number
here. The detect code really only checks for equal/not-equal, so wrap
arounds are not an issue.
Instead, intruduce StreamTcpDisableAppLayer to disable app layer
tracking and reassembly. StreamTcpAppLayerIsDisabled can be used
to check it.
Replace all uses of FlowSetSessionNoApplayerInspectionFlag and
the FLOW_NO_APPLAYER_INSPECTION.
If for a packet we have a TX N that has detect state and a TX N+1 that
has no detect state, but does have 'progress', we have a corner case
in stateful detection.
ContinueDetection inspects TX N, but cannot flag the rule in the
de_state_sig_array as the next (TX N+1) has already started and needs
to be inspected. 'StartDetection' however, is then unaware of the fact
that ContinueDetection already inspected the rule. It uses the per
session 'inspect_id' that is only moved forward at the end of the
detection run.
This patch adds a workaround. It uses the DetectEngineThreadCtx::
de_state_sig_array to store an offset between the 'base' inspect_id
and the inspect_id that StartDetection should use. The data type is
limited, so if the offset would be too big, a search based fall back
is implemented as well.
This commit do a find and replace of the following:
- DETECT_SM_LIST_HSBDMATCH by DETECT_SM_LIST_FILEDATA
sed -i 's/DETECT_SM_LIST_HSBDMATCH/DETECT_SM_LIST_FILEDATA/g' src/*
- HSBD by FILEDATA:
sed -i 's/HSBDMATCH/FILEDATA/g' src/*
Initalize detection engine by configuration prefix.
DetectEngineCtxInitWithPrefix(const char *prefix)
Takes the detection engine configuration from:
<prefix>.<config>
If prefix is NULL the regular config will be used.
Update sure that DetectLoadCompleteSigPath considers the prefix when
retrieving the configuration.
Use separate data structures for storing TX and FLOW (AMATCH) detect
state.
- move state storing into util funcs
- remove de_state_m
- simplify reset state logic on reload
Set actions that are set directly from Signatures using the new
utility function DetectSignatureApplyActions. This will apply
the actions and also store info about the 'drop' that first made
the rule drop.
When reaching the end of either list, merging is no longer required,
simply walk down the other list.
If the non-MPM list can't have duplicates, it would be worth removing
the duplicate check for the non-MPM list when it is the only non-empty list
remaining.
Create a copy of the SigMatch data in the sig_lists linked-lists and store
it in an array for faster access and not next and previous pointers. The
array is then used when calling the Match() functions.
Gives a 7.7% speed up on one test.
The Match functions don't need a pointer to the SigMatch object, just the
context pointer contained inside, so pass the Context to the Match function
rather than the SigMatch object. This allows for further optimization.
Change SigMatch->ctx to have type SigMatchCtx* rather than void* for better
type checking. This requires adding type casts when using or assigning it.
The SigMatch contex should not be changed by the Match() funciton, so pass it
as a const SigMatchCtx*.
Read one signature pointer ahead to prefetch the value.
Use a variable, sflags, for s->flags, since it is used many times and the
compiles doesn't know that the signatures structure doesn't change, so it
will reload s->flags.
Use MPM and non-MPM lists to build our match array. Both lists are
sorted, and are merged and sorted into the match array.
This disables the old match array building code and thus also bypasses
the mask checking.
Array of rule id's that are not using MPM prefiltering. These will be
merged with the MPM results array. Together these should lead to a
list of all the rules that can possibly match.
Pmq add rule list: Array of uint32_t's to store (internal) sids from the MPM.
AC: store sids in the pattern list, append to Pmq::rule_id_array on match.
Detect: sort rule_id_array after it was set up by the MPM. Rule id's
(Signature::num) are ordered, and the rule's with the lowest id are to
be inspected first. As the MPM doesn't fill the array in order, but instead
'randomly' we need this sort step to assure proper inspection order.
Have -T / --init-errors-fatal process all rules so that it's easier
to debug problems in ruleset. Otherwise it can be a lengthy fix, test
error cycle if multiple rules have issues.
Convert empty rulefile error into a warning.
Bug #977
Add the modbus.function and subfunction) keywords for public function match in rules (Modbus layer).
Matching based on code function, and if necessary, sub-function code
or based on category (assigned, unassigned, public, user or reserved)
and negation is permitted.
Add the modbus.access keyword for read/write Modbus function match in rules (Modbus layer).
Matching based on access type (read or write),
and/or function type (discretes, coils, input or holding)
and, if necessary, read or write address access,
and, if necessary, value to write.
For address and value matching, "<", ">" and "<>" is permitted.
Based on TLS source code and file size source code (address and value matching).
Signed-off-by: David DIALLO <diallo@et.esia.fr>
doesn't need the gpu results and to release the packet for the next run.
Previously the inspection engine wouldn't inform the cuda module, if it
didn't need the results. As a consequence, when the packet is next taken
for re-use, and if the packet is still being processed by the cuda module,
the engine would wait till the cuda module frees the packet.
This commits updates this functionality to inform the cuda module to
release the packet for the afore-mentioned case.
Instead of error phrone externs with macro's, use functions with a local
static enum var instead.
- EngineModeIsIPS(): in IPS mode
- EngineModeIsIDS(): in IDS mode
To set the modes:
- EngineModeSetIDS(): IDS mode (default)
- EngineModeSetIPS(): IPS mode
Bug #1177.
Previously, the alstate use in the main detect loop was unsafe. The
alstate pointer would be set duing a lock, but it would again be used
after one or more lock/unlock cycles. If the data pointed to would
disappear, a dangling pointer would be the result.
Due to they way flows are cleaned up using reference counting and
such, changes of this happening were very small. However, at least
one path can lead to this situation. So it had to be fixed.
Make DeStateDetectContinueDetection get it's own alstate pointer instead
of using the one that was passed to it. We now get and use it only
inside a flow lock.
Make DeStateDetectStartDetection get it's own alstate pointer instead
of using the one that was passed to it. We now get and use it only
inside a flow lock.
In case of 'alert ip' rules that have ports, the port checks would
be bypassed for non-port protocols, such as ICMP. This would lead to
a rule matching: a false positive.
This patch adds a check. If the rule has a port setting other than
'any' and the protocol is not TCP, UDP or SCTP, then we rule won't
match.
Rules with 'alert ip' and ports are rare, so the impact should be
minimal.
Bug #611.
A number of unittests would lead to clang build errors because
of unsafe det_ctx ptr usage. This patch fixes these and inits
det_ctx to NULL in the other detect tests.
app-layer.[ch], app-layer-detect-proto.[ch] and app-layer-parser.[ch].
Things addressed in this commit:
- Brings out a proper separation between protocol detection phase and the
parser phase.
- The dns app layer now is registered such that we don't use "dnstcp" and
"dnsudp" in the rules. A user who previously wrote a rule like this -
"alert dnstcp....." or
"alert dnsudp....."
would now have to use,
alert dns (ipproto:tcp;) or
alert udp (app-layer-protocol:dns;) or
alert ip (ipproto:udp; app-layer-protocol:dns;)
The same rules extend to other another such protocol, dcerpc.
- The app layer parser api now takes in the ipproto while registering
callbacks.
- The app inspection/detection engine also takes an ipproto.
- All app layer parser functions now take direction as STREAM_TOSERVER or
STREAM_TOCLIENT, as opposed to 0 or 1, which was taken by some of the
functions.
- FlowInitialize() and FlowRecycle() now resets proto to 0. This is
needed by unittests, which would try to clean the flow, and that would
call the api, AppLayerParserCleanupParserState(), which would try to
clean the app state, but the app layer now needs an ipproto to figure
out which api to internally call to clean the state, and if the ipproto
is 0, it would return without trying to clean the state.
- A lot of unittests are now updated where if they are using a flow and
they need to use the app layer, we would set a flow ipproto.
- The "app-layer" section in the yaml conf has also been updated as well.
of the archaic features we use in the app layer. We will reintroduce this
parser shortly. Also do note that keywords that rely on the ssh parser
would now be disabled.
Detect when default_packet_size is zero, which enables zero-copy mode for
pfring and in that case, do what AF Packet does and set pkt_ext pointer to
the data and set PKT_ZERO_COPY flag.
The uint8_t *pkt in the Packet structure always points to the memory
immediately following the Packet structure. It is better to simply
calculate that value every time than store the 8 byte pointer.
In SigMatchSignatures, the value p->flow doens't change, but GCC can't
figure that out, so it reloads p->flow many times during the function.
When p->flow is loaded into the variable pflow once at the start of the
function, the compile then doesn't need to reload it.
Fix sigatures not supporting [10.0.0.0/24, !10.1.1.1] notation when
used directly in a rule instead of through a variable.
Add tests for Bugs #815 and #920.
When the PKT_NOPAYLOAD_INSPECTION flag is set, don't apply it to smsgs.
This way we can still inspect the outstanding smsgs.
The PKT_NOPAYLOAD_INSPECTION is set for encrypted traffic, and is combined
with disabling stream reassembly. So we only inspect the smsgs up to the
point of the disable detection point.
Keep a separate checksum for IPV4, since a packet can have both an IPV4
checksum and a TCPV4 checksum, or IPV4 and UDPV4 checksum.
This will allow future sharing of more values.
Use PACKET_RESET_CHECKSUMS() in Unit Tests in place of setting the
individual checksum values.
Packets that are rejected by the stream engine are not considered
part of an established tcp session. By allowing them to inspect
an smsg, some smsgs would not be properly inspected.
Copied SigTest26TCPV4Keyword and added check for invalid IPV4 checksums.
Created new SigTest26TCPV4AndIPV4Keyword test with a new packet with valid
IPV4 checksums.
When generating an alert and storing it in the packet, store the tx_id
as well. This way the output modules can log the tx_id and access the
proper tx for logging.
Issue #904.
Move SIMD the implementations of SigMatchSignaturesBuildMatchArray()
for SSE3 and Tile out of detect.c to reduce the size of the file.
Also moved SIMD unit tests to detect-simd.c
Makes use of 8-wide byte compare instructions in signature matching.
For allocating aligned memory, _mm_malloc() is SSE only, so added
check for __tile__ to use memalign() instead.
Shows a 13% speed up.
In debug validation mode, it is required to call application layer
parsing and other functions with a lock on flow. This patch updates
the code to do so.
Use test macro instead of direct access to action field.
This patch has been obtained by using the following
spatch file:
@@
Packet *p;
expression E;
@@
- p->action & E
+ TEST_PACKET_ACTION(p, E)
The action field in Packet structure should not be accessed
directly as the tunneled packet needs to update the root packet
and not the initial packet.
This patch is fixing issue #819 where suricata was not able to
drop fragmented packets in AF_PACKET IPS mode. It also fixes
drop capability for tunneled packets.
sigs(priority wise) inside staging.
Previously we would assign signums before sig ordering, and hence the
order didn't actually reflect the order of the sig in the
sig_list(assuming sig reordering changed the sig_list). Staging would
use the old sig_nums to decide the priority of sigs.
2. Fix sig ordering for flowvar, flowbits, flowint, pktvar sigs. We have
introduced a new priority to treat sigs with set + read as lower
priority compared to set only sigs.
3. Previously we treated sigs with a "priority(keyword)" > another sig's
priority, as a sig with greater priority than the later. We have
reversed it. Now the sig priority ordering is 1,2,.etc. Updated
sigordering unittests to reflect the same.
Improved accuracy, improved performance. Performance improvement
noticeable with http heavy traffic and ruleset.
A lot of other cosmetic changes carried out as well. Wrappers introduced
for a lot of app layer functions.
Failing dce unittests disabled. Will be reintroduced in the updated dce
engine.
Cross transaction matching taken care of. FPs emanating from these
matches have now disappeared. Double inspection of transactions taken
care of as well.
Bug #802
Flowvars are set from pcre, and lock the flow when being set. However
when HTTP buffers were inspected, flow was already locked: deadlock.
This patch introduces a post-match list in the detection engine thread
ctx, where store candidates are kept. Then a post-match function is used
to finalize the storing if the rule matches.
Solves the deadlock and brings the handling of flowvars more in line
with flowbits and flowints.
Rename struct DetectFigureFPAndId_t_ to DetectFPAndItsId_ and move it's
definition from inside the function where it's used to the global namespace,
as requested on #suricata.
Rename DetectEngineContentModifiedBufferSetup to DetectEngineContentModifierBufferSetup.
Also rename DetectFigureFPAndId() to DetectSetFastPatternAndItsId().
Updated DetectSetFastPatternAndItsId() to not exit on failure and return error.
All fp id assignment now happens in one go.
Also noticing a slight perf increase, probably emanating from improved cache
perf.
Removed irrelevant unittests as well.
Adds support for match-on conditions (src, dst, any, both)
Uses GEOIP_MEMORY_CACHE for performance reasons
Adds support for negation and multiple countries in the same rule
Bug fixes
Changed to take flow direction from rule, if present
Comments addressed. Unit tests added.
Treat sigs with negated addresses as non ip-only.
This fix exposes bug #608, which results in 2 failed unittest which
have now been disabled by this commit. Would be reenabled when we
have #608 fix in.
This patch update the glafs list to be able to indicate that a
flag is not supported. This information is used by list-keyword to
display information to the user.
The output of the list-keyword is modified to include the url to
the keyword documentation when this is available. All documented
keywords should have their link set.
list-keyword can be used with an optional value:
no option or short: display list of keywords
csv: display a csv output on info an all keywords
all: display a human readable output of keywords info
$KWD: display the info about one keyword.
This patch update the list-keyword command. Without any option,
the previous behavior is conserved. If 'all' is used as option,
suricata print a csv formatted output of keyword information:
name;features;description
If a keyword name is used as argument, suricata print a readable
message:
tls.subject
Features: state inspecting
Description: Match TLS/SSL certificate Subject field
When handling error case on SCMallog, SCCalloc or SCStrdup
we are in an unlikely case. This patch adds the unlikely()
expression to indicate this to gcc.
This patch has been obtained via coccinelle. The transformation
is the following:
@istested@
identifier x;
statement S1;
identifier func =~ "(SCMalloc|SCStrdup|SCCalloc)";
@@
x = func(...)
... when != x
- if (x == NULL) S1
+ if (unlikely(x == NULL)) S1