Reimplement keyword to match on SHA-1 fingerprint of TLS
certificate as a mpm keyword.
alert tls any any -> any (msg:"TLS cert fingerprint test";
tls_cert_fingerprint;
content:"4a:a3:66:76:82:cb:6b:23:bb:c3:58:47:23:a4:63:a7:78:a4:a1:18";
sid:12345;)
Rules can contain conflicting statements and lead to a unmatchable rule.
2 examples are rejected by this patch:
1. dsize < content
2. dsize < content@offset
Bug #2187
Since the parser now also does nfs2, the name nfs3 became confusing.
As it's still in beta, we can rename so this patch renames all 'nfs3'
logic to simply 'nfs'.
The target keyword allows rules writer to specify information about
target of the attack. Using this keyword in a signature causes
some fields to be added in the EVE output. It also fixes ambiguity
in the Prelude output.
Set flags by default:
-Wmissing-prototypes
-Wmissing-declarations
-Wstrict-prototypes
-Wwrite-strings
-Wcast-align
-Wbad-function-cast
-Wformat-security
-Wno-format-nonliteral
-Wmissing-format-attribute
-funsigned-char
Fix minor compiler warnings for these new flags on gcc and clang.
Now that MPM runs when the TX progress is right, stateful detection
operates differently.
Changes:
1. raw stream inspection is now also an inspect engine
Since this engine doesn't take the transactions into account, it
could potentially run multiple times on the same data. To avoid
this, basic result caching is in place.
2. the engines are sorted by progress, but the 'MPM' engine is first
even if the progress is higher
If MPM flags a rule to be inspected, the inspect engine for that
buffer runs first. If this step fails, the rule is no longer
evaluated. No state is stored.
Previously the MPM/Prefilter engines would suggest the same rule
candidates multiple times.
For example, while processing the request body, the http headers
would be inspected by MPM multiple times.
The mask check was one way to quickly decide which rules could be
skipped.
Now that the MPM engines normally return a rule just once, this
mask check no longer makes sense. If the rule meets the ip/port/
direction based conditions, it needs to be evaluated if the MPM
said so. Even if not all conditions are yet true.
WIP disable mask as it no longer makes sense
WIP redo mask match
If raw reassembly falls behind, for example because no raw mpm is
active, then we need to sync up to the app progress if that is
available, or to the generic tcp tracking otherwise.
Now that detect moves the raw progress forward, it's important
to deal with the case where detect don't consider raw inspection.
If no 'stream' rules are active, disable raw. For this the disable
raw flag is now per stream.
Remove the 'StreamMsg' approach from the engine. In this approach the
stream engine would create a list of chunks for inspection by the
detection engine. There were several issues:
1. the messages had a fixed size, so blocks of data bigger than ~4k
would be cut into multiple messages
2. it lead to lots of data copying and unnecessary memory use
3. the StreamMsgs used a central pool
The Stream engine switched over to the streaming buffer API, which
means that the reassembled data is always available. This made the
StreamMsg approach even clunkier.
The new approach exposes the streaming buffer data to the detection
engine. It has to pay attention to an important issue though: packet
loss. The data may have gaps. The streaming buffer API tracks the
blocks of continuous data.
To access the data for inspection a callback approach is used. The
'StreamReassembleRaw' function is called with a callback and data.
This way it runs the MPM and individual rule inspection code. At
the end of each detection run the stream engine is notified that it
can move forward it's 'progress'.
Make stream engine use the streaming buffer API for it's data storage.
This means that the data is stored in a single reassembled sliding
buffer. The subleties of the reassembly, e.g. overlap handling, are
taken care of at segment insertion.
The TcpSegments now have a StreamingBufferSegment that contains an
offset and a length. Using this the segment data can be retrieved
per segment.
Redo segment insertion. The insertion code is moved to it's own file
and is simplified a lot.
A major difference with the previous implementation is that the segment
list now contains overlapping segments if the traffic is that way.
Previously there could be more and smaller segments in the memory list
than what was seen on the wire.
Due to the matching of in memory segments and on the wire segments,
the overlap with different data detection (potential mots attacks)
is much more accurate.
Raw and App reassembly progress is no longer tracked per segment using
flags, but there is now a progress tracker in the TcpStream for each.
When pruning we make sure we don't slide beyond in-use segments. When
both app-layer and raw inspection are beyond the start of the segment
list, the segments might not be freed even though the data in the
streaming buffer is already gone. This is caused by the 'in-use' status
that the segments can implicitly have. This patch accounts for that
when calculating the 'left_edge' of the streaming window.
Raw reassembly still sets up 'StreamMsg' objects for content
inspection. They are set up based on either the full StreamingBuffer,
or based on the StreamingBufferBlocks if there are gaps in the data.
Reworked 'stream needs work' logic. When a flow times out the flow
engine checks whether a TCP flow still needs work. The
StreamNeedsReassembly function is used to test if a stream still has
unreassembled segments or uninspected stream chunks.
This patch updates the function to consider the app and/or raw
progress. It also cleans the function up and adds more meaningful
debug messages. Finally it makes it non-inline.
Unittests have been overhauled, and partly moved into their own files.
Remove lots of dead code.
Implement common code to easily add more per HTTP header detection
keywords.
Implement http_accept sticky buffer. It operates on the HTTP Accept
header.
Issue:
https://redmine.openinfosecfoundation.org/issues/2041
One approach to fixing this issue to just validate the
checksum instead of regenerating it and comparing it. This
method is used in some kernels and other network tools.
When validating, the current checksum is passed in as an
initial argument which will cause the final checksum to be 0
if OK. If generating a checksum, 0 is passed and the result
is the generated checksum.
When detection is running flags are set on flows to indicate if file
hashing is needed. This is based on global output settings and rules.
In the case of --disable-detection this was not happening, so all
files where hashed with all methods. This has a significant
performance impact.
This patch adds logic to set the flow flags in --disable-detect mode.
Match on TLS certificate serial number using tls_cert_serial
keyword, e.g.:
alert tls any any -> any any (msg:"TLS cert serial test";
tls_cert_serial; content:"5C:19:B7:B1:32:3B:1C:A1";
sid:12345;)
Flowvars were already using a temporary store in the detect thread
ctx.
Use the same facility for pktvars. The reasons are:
1. packet is not always available, e.g. when running pcre on http
buffers.
2. setting of vars should be done post match. Until now it was also
possible that it is done on a partial match.
Until now variable names, such as flowbit names, were local to a detect
engine. This made sense as they were only ever used in that context.
For the purpose of logging these names, this needs a different approach.
The loggers live outside of the detect engine. Also, in the case of
reloads and multi-tenancy, there are even multiple detect engines, so
it would be even more tricky to access them from the outside.
This patch brings a new approach. A any time, there is a single active
hash table mapping the variable names and their id's. For multiple
tenants the table is shared between tenants.
The table is set up in a 'staging' area, where locking makes sure that
multiple loading threads don't mess things up. Then when the preparing
of a detection engine is ready, but before the detect threads are made
aware of the new detect engine, the active varname hash is swapped with
the staging instance.
For this to work, all the mappings from the 'current' or active mapping
are added to the staging table.
After the threads have reloaded and the new detection engine is active,
the old table can be freed.
For multi tenancy things are similar. The staging area is used for
setting up until the new detection engines / tenants are applied to
the system.
This patch also changes the variable 'id'/'idx' field to uint32_t. Due
to data structure padding and alignment, this should have no practical
drawback while allowing for a lot more vars.
Matches on the start of a HTTP request or response.
Uses a buffer constructed from the request line and normalized request
headers, including the Cookie header.
Or for the response side, it uses the response line plus the
normalized response headers, including the Set-Cookie header.
Both buffers are terminated by an extra \r\n.
A sticky buffer that allows content inspection on a contructed buffer
of HTTP header names. The buffer starts with \r\n, the names are
separated by \r\n and the end of the buffer contains an extra \r\n.
E.g. \r\nHost\r\nUser-Agent\r\n\r\n
The leading \r\n is to make sure one can match on a full name in all
cases.
Some keywords need a scratch space where they can do store the results
of expensive operations that remain valid for the time of a packets
journey through the detection engine.
An example is the reconstructed 'http_header' field, that is needed
in MPM, and then for each rule that manually inspects it. Storing this
data in the flow is a waste, and reconstructing multiple times on
demand as well.
This API allows for registering a keyword with an init and free function.
It it mean to be used an initialization time, when the keyword is
registered.
To replace the hardcoded SigMatch list id's, use this API to register
and query lists by name.
Also allow for registering descriptions and whether mpm is supported.
Registration is only allowed at startup.
The code to get the rule group (sgh) would return the group for
IP proto 0 instead of nothing. This lead to certain types of rules
unintentionally matching (False Positive).
Since the packets weren't actually IP, the logged alert records
were missing the IP header.
Bug #2017.
Luajit has a strange memory requirement, it's 'states' need to be in the
first 2G of the process' memory.
This patch improves the pool approach by moving it to the front of the
start up.
A new config option 'luajit.states' is added to control how many states
are preallocated. It defaults to 128.
Add a warning when more states are used then preallocated. This may fail
if flow/stream/detect engines use a lot of memory. Add hint at exit that
gives the max states in use if it's higher than the default.
Introduce 'Protocol detection'-only rules. These rules will only be
fully evaluated when the protocol detection completed. To allow
mixing of the app-layer-protocol keyword with other types of matches
the keyword can also inspect the flow's app-protos per packet.
Implement prefilter for the 'PD-only' rules.
Add support for the ENIP/CIP Industrial protocol
This is an app layer implementation which uses the "enip" protocol
and "cip_service" and "enip_command" keywords
Implements AFL entry points
The order of keyword registration currently affects inspect engine
registration order and ultimately the order of inspect engines per
rule. Which in turn affects state keeping.
This patch makes sure the ordering is the same as with older
releases.
Inspect engines are called per signature per sigmatch list. Most
wrap around DetectEngineContentInspection, but it's more generic.
Until now, the inspect engines were setup in a large per ipproto,
per alproto, per direction table. For stateful inspection each
engine needed a global flag.
This approach had a number of issues:
1. inefficient: each inspection round walked the table and then
checked if the inspect engine was even needed for the current
rule.
2. clumsy registration with global flag registration.
3. global flag space was approaching the need for 64 bits
4. duplicate registration for alprotos supporting both TCP and
TCP (DNS).
This patch introduces a new approach.
First, it does away with the per ipproto engines. This wasn't used.
Second, it adds a per signature list of inspect engine containing
only those engines that actually apply to the rule.
Third, it gets rid of the global flags and replaces it with flags
assigned per rule per engine.
Register keywords globally at start up.
Create a map of the registery per detection engine. This we need because
the sgh_mpm_context value is set per detect engine.
Remove APP_MPMS_MAX.