For SIEM analysis it is often useful to refer to the actual rules to
find out why a specific alert has been triggered when the signature
message does not convey enough information.
Turn on the new rule flag to include the rule text in eve alert output.
The feature is turned off by default.
With a rule like this:
alert dns $HOME_NET any -> 8.8.8.8 any (msg:"Google DNS server contacted"; sid:42;)
The eve alert output might look something like this (pretty-printed for
readability):
{
"timestamp": "2017-08-14T12:35:05.830812+0200",
"flow_id": 1919856770919772,
"in_iface": "eth0",
"event_type": "alert",
"src_ip": "10.20.30.40",
"src_port": 50968,
"dest_ip": "8.8.8.8",
"dest_port": 53,
"proto": "UDP",
"alert": {
"action": "allowed",
"gid": 1,
"signature_id": 42,
"rev": 0,
"signature": "Google DNS server contacted",
"category": "",
"severity": 3,
"rule": "alert dns $HOME_NET any -> 8.8.8.8 any (msg:\"Google DNS server contacted\"; sid:43;)"
},
"app_proto": "dns",
"flow": {
"pkts_toserver": 1,
"pkts_toclient": 0,
"bytes_toserver": 81,
"bytes_toclient": 0,
"start": "2017-08-14T12:35:05.830812+0200"
}
}
Feature #2020
This patch updates the Signature structure so it contains the
metadata under a key value form.
Later patch will make that dictionary available in the events.
Use per tx detect_flags to track prefilter. Detect flags are used for 2
things:
1. marking tx as fully inspected
2. tracking already run prefilter (incl mpm) engines
This supercedes the MpmIDs API for directionless tracking
of the prefilter engines.
When we have no SGH we have to flag the txs that are 'complete'
as inspected as well.
Special handling for the stream engine:
If a rule mixes TX inspection and STREAM inspection, we can encounter
the case where the rule is evaluated against multiple transactions
during a single inspection run. As the stream data is exactly the same
for each of those runs, it's wasteful to rerun inspection of the stream
portion of the rule.
This patch enables caching of the stream 'inspect engine' result in
the local 'RuleMatchCandidateTx' array. This is valid only during the
live of a single inspection run.
Remove stateful inspection from 'mask' (SignatureMask). The mask wasn't
used in most cases for those rules anyway, as there we rely on the
prefilter. Add a alproto check to catch the remaining cases.
When building the active non-mpm/non-prefilter list check not just
the mask, but also the alproto. This especially helps stateful rules
with negated mpm.
Simplify AppLayerParserHasDecoderEvents usage in detection to only
return true if protocol detection events are set. Other detection is done
in inspect engines.
Move rule group lookup and handling into it's own function. Handle
'post lookup' tasks immediately, instead of after the first detect
run. The tasks were independent of the initial detection.
Many cleanups and much refactoring.
Analyze flowbits to find which bits are only checked.
Track whether they are set and checked on the same level of 'statefulness'
for later used.
Dump flowbits to json including the sids that set/check etc the bit.
Use expectation to be able to identify connections that are
ftp data. It parses the PASV response, STOR message and the
RETR message to provide extraction of files.
Implementation in Rust of FTP messages parsing is available.
Also this patch changes some var name prefixed by ssh to ftp.
Reimplement keyword to match on SHA-1 fingerprint of TLS
certificate as a mpm keyword.
alert tls any any -> any (msg:"TLS cert fingerprint test";
tls_cert_fingerprint;
content:"4a:a3:66:76:82:cb:6b:23:bb:c3:58:47:23:a4:63:a7:78:a4:a1:18";
sid:12345;)
Since the parser now also does nfs2, the name nfs3 became confusing.
As it's still in beta, we can rename so this patch renames all 'nfs3'
logic to simply 'nfs'.
The target keyword allows rules writer to specify information about
target of the attack. Using this keyword in a signature causes
some fields to be added in the EVE output. It also fixes ambiguity
in the Prelude output.
Set flags by default:
-Wmissing-prototypes
-Wmissing-declarations
-Wstrict-prototypes
-Wwrite-strings
-Wcast-align
-Wbad-function-cast
-Wformat-security
-Wno-format-nonliteral
-Wmissing-format-attribute
-funsigned-char
Fix minor compiler warnings for these new flags on gcc and clang.
In preparation of turning input to keyword parsers to const add
options to the common rule parser to enforce and strip double
quotes and parse negation support.
At registration, the keyword can register 3 extra flags:
SIGMATCH_QUOTES_MANDATORY: value to keyword must be quoted
SIGMATCH_QUOTES_OPTIONAL: value to keyword may be quoted
SIGMATCH_HANDLE_NEGATION: leading ! is parsed
In all cases leading spaces are removed. If the 'quote' flags are
set, the quotes are removed from the input as well.
Now that MPM runs when the TX progress is right, stateful detection
operates differently.
Changes:
1. raw stream inspection is now also an inspect engine
Since this engine doesn't take the transactions into account, it
could potentially run multiple times on the same data. To avoid
this, basic result caching is in place.
2. the engines are sorted by progress, but the 'MPM' engine is first
even if the progress is higher
If MPM flags a rule to be inspected, the inspect engine for that
buffer runs first. If this step fails, the rule is no longer
evaluated. No state is stored.
Now that detect moves the raw progress forward, it's important
to deal with the case where detect don't consider raw inspection.
If no 'stream' rules are active, disable raw. For this the disable
raw flag is now per stream.
Remove the 'StreamMsg' approach from the engine. In this approach the
stream engine would create a list of chunks for inspection by the
detection engine. There were several issues:
1. the messages had a fixed size, so blocks of data bigger than ~4k
would be cut into multiple messages
2. it lead to lots of data copying and unnecessary memory use
3. the StreamMsgs used a central pool
The Stream engine switched over to the streaming buffer API, which
means that the reassembled data is always available. This made the
StreamMsg approach even clunkier.
The new approach exposes the streaming buffer data to the detection
engine. It has to pay attention to an important issue though: packet
loss. The data may have gaps. The streaming buffer API tracks the
blocks of continuous data.
To access the data for inspection a callback approach is used. The
'StreamReassembleRaw' function is called with a callback and data.
This way it runs the MPM and individual rule inspection code. At
the end of each detection run the stream engine is notified that it
can move forward it's 'progress'.
Implement common code to easily add more per HTTP header detection
keywords.
Implement http_accept sticky buffer. It operates on the HTTP Accept
header.
When detection is running flags are set on flows to indicate if file
hashing is needed. This is based on global output settings and rules.
In the case of --disable-detection this was not happening, so all
files where hashed with all methods. This has a significant
performance impact.
This patch adds logic to set the flow flags in --disable-detect mode.
Match on TLS certificate serial number using tls_cert_serial
keyword, e.g.:
alert tls any any -> any any (msg:"TLS cert serial test";
tls_cert_serial; content:"5C:19:B7:B1:32:3B:1C:A1";
sid:12345;)
Flowvars were already using a temporary store in the detect thread
ctx.
Use the same facility for pktvars. The reasons are:
1. packet is not always available, e.g. when running pcre on http
buffers.
2. setting of vars should be done post match. Until now it was also
possible that it is done on a partial match.
Until now variable names, such as flowbit names, were local to a detect
engine. This made sense as they were only ever used in that context.
For the purpose of logging these names, this needs a different approach.
The loggers live outside of the detect engine. Also, in the case of
reloads and multi-tenancy, there are even multiple detect engines, so
it would be even more tricky to access them from the outside.
This patch brings a new approach. A any time, there is a single active
hash table mapping the variable names and their id's. For multiple
tenants the table is shared between tenants.
The table is set up in a 'staging' area, where locking makes sure that
multiple loading threads don't mess things up. Then when the preparing
of a detection engine is ready, but before the detect threads are made
aware of the new detect engine, the active varname hash is swapped with
the staging instance.
For this to work, all the mappings from the 'current' or active mapping
are added to the staging table.
After the threads have reloaded and the new detection engine is active,
the old table can be freed.
For multi tenancy things are similar. The staging area is used for
setting up until the new detection engines / tenants are applied to
the system.
This patch also changes the variable 'id'/'idx' field to uint32_t. Due
to data structure padding and alignment, this should have no practical
drawback while allowing for a lot more vars.
Matches on the start of a HTTP request or response.
Uses a buffer constructed from the request line and normalized request
headers, including the Cookie header.
Or for the response side, it uses the response line plus the
normalized response headers, including the Set-Cookie header.
Both buffers are terminated by an extra \r\n.
A sticky buffer that allows content inspection on a contructed buffer
of HTTP header names. The buffer starts with \r\n, the names are
separated by \r\n and the end of the buffer contains an extra \r\n.
E.g. \r\nHost\r\nUser-Agent\r\n\r\n
The leading \r\n is to make sure one can match on a full name in all
cases.
Some keywords need a scratch space where they can do store the results
of expensive operations that remain valid for the time of a packets
journey through the detection engine.
An example is the reconstructed 'http_header' field, that is needed
in MPM, and then for each rule that manually inspects it. Storing this
data in the flow is a waste, and reconstructing multiple times on
demand as well.
This API allows for registering a keyword with an init and free function.
It it mean to be used an initialization time, when the keyword is
registered.
Introduce 'Protocol detection'-only rules. These rules will only be
fully evaluated when the protocol detection completed. To allow
mixing of the app-layer-protocol keyword with other types of matches
the keyword can also inspect the flow's app-protos per packet.
Implement prefilter for the 'PD-only' rules.
Add support for the ENIP/CIP Industrial protocol
This is an app layer implementation which uses the "enip" protocol
and "cip_service" and "enip_command" keywords
Implements AFL entry points
Instead of the linked list of engines setup an array
with the engines. This should provide better locality.
Also shrink the engine structure so that we can fit
2 on a cacheline.
Remove the FreeFunc from the runtime engines. Engines
now have a 'gid' (global id) that can be used to look
up the registered Free function.
Inspect engines are called per signature per sigmatch list. Most
wrap around DetectEngineContentInspection, but it's more generic.
Until now, the inspect engines were setup in a large per ipproto,
per alproto, per direction table. For stateful inspection each
engine needed a global flag.
This approach had a number of issues:
1. inefficient: each inspection round walked the table and then
checked if the inspect engine was even needed for the current
rule.
2. clumsy registration with global flag registration.
3. global flag space was approaching the need for 64 bits
4. duplicate registration for alprotos supporting both TCP and
TCP (DNS).
This patch introduces a new approach.
First, it does away with the per ipproto engines. This wasn't used.
Second, it adds a per signature list of inspect engine containing
only those engines that actually apply to the rule.
Third, it gets rid of the global flags and replaces it with flags
assigned per rule per engine.
Register keywords globally at start up.
Create a map of the registery per detection engine. This we need because
the sgh_mpm_context value is set per detect engine.
Remove APP_MPMS_MAX.
If there are many rules for TCP flags these rules would be inspected
against each TCP packet. Even though the flags check is not expensive,
the combined cost of inspecting multiple rules against each and every
packet is high.
This patch implements a prefilter engine for flags. If a rule group
has rules looking for specific flags and engine for that flag or
flags combination is set up. This way those rules are only inspected
if the flag is actually present in the packet.
Introduce prefilter keyword to force a keyword to be used as prefilter.
e.g.
alert tcp any any -> any any (content:"A"; flags:R; prefilter; sid:1;)
alert tcp any any -> any any (content:"A"; flags:R; sid:2;)
alert tcp any any -> any any (content:"A"; dsize:1; prefilter; sid:3;)
alert tcp any any -> any any (content:"A"; dsize:1; sid:4;)
In sid 2 and 4 the content keyword is used in the MPM engine.
In sid 1 and 3 the flags and dsize keywords will be used.
In preparation of the introduction of more general purpose prefilter
engines, rename PatternMatcherQueue to PrefilterRuleStore. The new
engines will fill this structure a similar way to the current mpm
prefilters.
This adds a new keyword which permits to call the
bypass callback when a sig is matched.
The callback must be called when the match of the sig
is complete.
Detection plugin for TLS certificate fields notBefore and notAfter.
Supports equal to, less than, greater than, and range operations
for both keywords. Dates can be represented as either ISO 8601 or
epoch (Unix time).
Examples:
alert tls [...] tls_cert_notafter:1445852105; [...]
alert tls [...] tls_cert_notbefore:<2015-10-22T23:59:59; [...]
alert tls [...] tls_cert_notbefore:>2015-10-22; [...]
alert tls [...] tls_cert_notafter:2000-10-22<>2020-05-15; [...]
Moved and adapted code from detect-filemd5 to util-detect-file-hash,
generalised code to work with SHA-1 and SHA-256 and added necessary
flags and other constants.
Many rules have the same address vars, so instead of parsing them
each time use a hash to store the string and the parsed result.
Rules now reference the stored result in the hash table.
Make the file storage use the streaming buffer API.
As the individual file chunks were not needed by themselves, this
approach uses a chunkless implementation.
Convert HTTP body handling to use the Streaming Buffer API. This means
the HtpBodyChunks no longer maintain their own data segments, but
instead add their data to the StreamingBuffer instance in the HtpBody
structure.
In case the HtpBodyChunk needs to access it's data it can do so still
through the Streaming Buffer API.
Updates & simplifies the various users of the reassembled bodies:
multipart parsing and the detection engine.
Match on server name indication (SNI) extension in TLS using tls_sni
keyword, e.g:
alert tls any any -> any any (msg:"SNI test"; tls_sni;
content:"example.com"; sid:12345;)
This new API allows for different SPM implementations, using a function
pointer table like that used for MPM.
This change also switches over the paths that make use of
DetectContentData (which previously used BoyerMoore directly) to the new
API.
Make the port grouping whitelisting configurable. A whitelisted port
ends up in it's own port group.
detect:
grouping:
tcp-whitelist: 80, 443
udp-whitelist: 53, 5060
No portranges are allowed at this point.
Create a hash table of unique DetectPort objects before trying to
create a unique list of these objects. This safes a lot of cycles
in the creation of the list.
SGH's for tcp and udp are now always only per proto and per direction.
This means we can simply reuse the packet and stream mpm pointers.
The SGH's for the other protocols already used a directionless catch
all mpm pointer.