Buffers with transforms are based on the non-transformed "base"
buffer, with a new ID assigned and the transform callbacks added.
This patch stores the id of the original buffer in the new buffer
inspect and prefilter structures. This way the buffers with and
without transforms can share some of the logic and the progression
of file and body inspection trackers.
Related tickets: #4361 #4199 #3616
In some cases, the InspectionBufferGet function would be followed by
a failure to set the buffer up, for example due to an HTTP body limit
not yet being reached. Yet each call to InspectionBufferGet would lead
to the matching list_id being added to the
DetectEngineThreadCtx::inspect.to_clear_queue. This array is sized to
add each list only once, but in this case the same id could be added
multiple times, potentially overflowing the array.
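A minimal sketch of the once-only queuing this fix calls for. The names
ClearQueue, QueueForClear and MAX_LISTS are hypothetical and not taken
from the Suricata source; the real structures differ:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_LISTS 4 /* hypothetical: one slot per registered list */

typedef struct ClearQueue_ {
    uint32_t ids[MAX_LISTS]; /* sized to hold each list id once */
    uint32_t cnt;
    bool queued[MAX_LISTS];  /* true once a list id has been queued */
} ClearQueue;

/* Queue list_id for cleanup only if it is not queued yet. Returns true
 * if the id was added, false if it was already present. */
static bool QueueForClear(ClearQueue *q, uint32_t list_id)
{
    assert(list_id < MAX_LISTS);
    if (q->queued[list_id])
        return false;
    q->queued[list_id] = true;
    q->ids[q->cnt++] = list_id;
    return true;
}
```

With the per-id flag, repeated Get calls for the same list cannot push
cnt past the size of the array.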
This commit adds support for transform-specific options. During Setup,
transforms have the signature string available for options detection.
When a transform detects an option, it should convert the option into an
internal format and supply a pointer to this format as the last argument
to DetectSignatureAddTransform.
Transforms that support options must provide a function in their
Sigmatch table entry. When the transform is freed, a pointer to the
internal format of the option is passed to this function.
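A rough sketch of the Setup/Free pairing described above. All names
(TransformEntry, XorOpts, XorSetup, XorFree) are hypothetical and the
real Sigmatch table entry and DetectSignatureAddTransform signature are
not reproduced here:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical internal format for a transform's option. */
typedef struct XorOpts_ {
    unsigned char key;
} XorOpts;

/* Hypothetical table entry: Free is called with the option pointer. */
typedef struct TransformEntry_ {
    const char *name;
    void *(*Setup)(const char *optstr); /* parse option -> internal format */
    void (*Free)(void *opts);           /* release internal format */
} TransformEntry;

static int g_freed = 0; /* counts Free calls, for illustration only */

static void *XorSetup(const char *optstr)
{
    if (optstr == NULL || strlen(optstr) != 1)
        return NULL; /* invalid option: reject at Setup time */
    XorOpts *o = malloc(sizeof(*o));
    if (o == NULL)
        return NULL;
    o->key = (unsigned char)optstr[0];
    return o;
}

static void XorFree(void *opts)
{
    assert(opts != NULL);
    free(opts);
    g_freed++;
}

static const TransformEntry xor_transform = { "xor", XorSetup, XorFree };
```

The pointer returned by Setup is what would be handed over as the last
argument when adding the transform, and later passed back to Free.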
atoi() and related functions lack a mechanism for reporting errors for
invalid values. Replace them with calls to the appropriate
ByteExtractString* functions.
Partially closes redmine ticket #3053.
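The ByteExtractString* signatures are not shown in this message, so the
sketch below uses only the standard strtoul() to illustrate the kind of
checking atoi() cannot provide. ParseUint16 is a hypothetical helper:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a base-10 uint16 from str. Returns true on success. Unlike
 * atoi(), this rejects empty input, trailing garbage and out-of-range
 * values instead of silently returning a partial or undefined result. */
static bool ParseUint16(const char *str, uint16_t *out)
{
    char *end = NULL;

    assert(out != NULL);
    /* strtoul() skips whitespace and accepts a sign; require a digit. */
    if (str == NULL || str[0] < '0' || str[0] > '9')
        return false;
    errno = 0;
    unsigned long v = strtoul(str, &end, 10);
    if (errno != 0 || *end != '\0' || v > UINT16_MAX)
        return false;
    *out = (uint16_t)v;
    return true;
}
```

For comparison, atoi("80abc") returns 80 with no indication that the
input was malformed.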
In case of transform issues (transform not consumed before pkt_data
for example), the code would hit an ugly BUG_ON.
Address this with a more graceful error message that will still
invalidate the sig but not crash the engine.
When registering a detection engine, check that the app-layer
protocol supports tx detect flags. Exit with a fatal
error if it does not as this is a code implementation
error that should be resolved during development.
Instead of the hardcoded L4 matching in MPM that was recently introduced,
add an API similar to the AppLayer MPM and inspect engines.
Share part of the registration code with the AppLayer.
Implement for the tcp.hdr and udp.hdr keywords.
Prepare MPM part of the detection engine for a new type of per
packet matching, where the L4 header will be inspected.
Preparation for TCP header inspection keyword.
Instead of hard coded calls to the inspection logic for
payload inspection and 'MATCH'-list inspection use a callback
approach. This will register a callback per 'sm_list' much like
how app-layer inspect engines are registered.
This will allow for adding more types later without adding
runtime overhead.
Implement the callback for the PMATCH and MATCH logic.
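The callback approach could look roughly like this. PktInspectEngine,
PktInspectRegister and PktInspectRun are hypothetical names, and the
callback signature is simplified to a single opaque packet pointer:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PKT_ENGINES 8 /* hypothetical limit */

/* Hypothetical callback type: returns 1 on match, 0 otherwise. */
typedef int (*PktInspectFn)(const void *pkt);

typedef struct PktInspectEngine_ {
    int sm_list;
    PktInspectFn Callback;
} PktInspectEngine;

static PktInspectEngine g_engines[MAX_PKT_ENGINES];
static size_t g_engine_cnt = 0;

/* Register one callback per sm_list. Returns 0 on success, -1 if full. */
static int PktInspectRegister(int sm_list, PktInspectFn fn)
{
    assert(fn != NULL);
    if (g_engine_cnt == MAX_PKT_ENGINES)
        return -1;
    g_engines[g_engine_cnt].sm_list = sm_list;
    g_engines[g_engine_cnt].Callback = fn;
    g_engine_cnt++;
    return 0;
}

/* Run all registered engines; the result is a match only if all match. */
static int PktInspectRun(const void *pkt)
{
    for (size_t i = 0; i < g_engine_cnt; i++) {
        if (g_engines[i].Callback(pkt) == 0)
            return 0;
    }
    return 1;
}

/* Trivial stand-ins for the PMATCH and MATCH callbacks. */
static int AlwaysMatch(const void *pkt) { (void)pkt; return 1; }
static int NeverMatch(const void *pkt) { (void)pkt; return 0; }
```

Adding a new type then means registering one more callback; the run
loop itself does not change.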
Fix and Optimize cleanup. For the simple single inspect buffer optimize
the cleanup by keeping track of the actually used buffers. This avoids
looping over unused buffers.
Fix the case of cleaning not being done after a tx if the next tx is
also inspected in the context of the same packet.
Fix cleanup of the multi-inspect buffers. Optimize in 2 ways. First,
as with single, keep track of which multi-inspect buffers have been
used. Second, keep a max of the buffers within a multi-inspect buffer.
Use this max to limit (nested) looping.
Add device to tenant mapping support:
    mappings:
      - device: ens5f0
        tenant-id: 1
      - device: ens5f1
        tenant-id: 23
Implemented by assigning the tenant id to the 'livedev', which means
it's only supported for capture methods that use the livedev API.
It's also currently not supported for IPS. In a case like 'eth0 -> eth1'
it's unclear which tenant should be used for the return traffic in a
flow, where the incoming device is 'eth1'.
Last multi-detect changes broke delayed-detect by refusing to reload
a 'stub' detect engine. This patch distinguishes between a stub for
multi-tenancy and for delayed detect.
There are 3 types of detect engine objects:
1. normal
   The normal detection engine if no multi-tenancy is in use
2. tenant
   A per tenant detection engine
3. stub
   A stub (or minimal as it was called before) detect engine
   that is needed to have something in place when there are
   only tenants.
   A stub is also used in case of 'delayed detect', where we
   need a minimal detect engine to start up which is replaced
   by a full (normal type) detect engine after startup.
This patch adds a new field 'type' to the DetectEngineCtx object
to distinguish between the types. This replaces the boolean 'minimal'.
The global keyword registration and per thread init handling used
the lock from the DetectEngineMasterCtx. This led to a deadlock
situation at multi-tenancy tenant reloads.
The lock was unnecessary however, as the only time the registration
list is updated is at engine initialization. At that time Suricata
is still running in a single thread. After this, the data structure
doesn't change anymore.
Bug #2516.
As we can have multiple files per TX we use the multi inspect
buffer support.
By using this API file_data supports transforms.
Redo part of the flash decompression as a hard coded built-in sort
of transform.
Move previously global table into detect engine ctx. Now that we
can register buffers at rule loading time we need to take concurrency
into account.
Move DetectBufferType to detect.h and update DetectBufferCtx API calls
to include a detect engine ctx reference.
Introduce InspectionBuffer, a structure for passing data between
prefilters, transforms and inspection engines.
At rule parsing time, we'll register new unique 'DetectBufferType's
for a 'parent' buffer (e.g. pure file_data) with its transformations.
Each unique combination of buffer with transformations gets its
own buffer id.
Similarly, mpm registration and inspect engine registration will be
copied from the 'parent' (again, e.g. pure file_data) to the new id's.
The transforms are called from within the prefilter engines themselves.
Provide generic MPM matching and setup callbacks. Can be used by
keywords to avoid needless code duplication. Supports transformations.
Use unique name for profiling, to distinguish between pure buffers
and buffers with transformation.
Add new registration calls for mpm/prefilters and inspect engines.
Inspect engine api v2: Pass engine to itself. Add generic engine that
uses GetData callback and other registered settings.
The generic engine should be usable for every 'simple' case where
there is just a single non-streaming buffer. For example HTTP uri.
The v2 API assumes that registered MPM implements transformations.
Add util func to set new transform in rule and add util funcs for rule
parsing.
During the inspection phase it is currently not possible to catch
an error if one occurs.
This patch makes it possible to store events in the detection engine
such that we can match on events and catch them.
Metadata of the signature can now conditionally be put in the alert
events. This will allow users to get more context about the events
generated by the alert.
detect-metadata: conditional parsing
Only parses metadata if an output module will use the information.
Patch also adds a unittest to check metadata is not parsed if not
asked to.
output-json-alert: optional output keys as array
Update rule metadata configuration to have an option to output
values as an array. Also adds an option to log only a series of keys
as arrays. This is useful in the case of some rulesets where, for
instance, the `tag` key is used multiple times.
(Jason Ish) rule metadata: always log as lists
After review of rule metadata, we can't make assumptions
on what should be a list or not. So log everything as a list.
Fix the inspection of multiple files in a single TX, where new files
may be added to the TX after inspection started.
Assign the hard coded id DE_STATE_FLAG_FILE_INSPECT to the file
inspect engine.
Make sure that sigs that do file inspection and don't match on the
current file always store a detailed state. This state will include
the DE_STATE_FLAG_FILE_INSPECT flag.
When the app-layer indicates a new file is available, for each sig
that has the DE_STATE_FLAG_FILE_INSPECT flag set, reset part of the
state so that the sig is evaluated again.
Use per tx detect_flags to track prefilter. Detect flags are used for 2
things:
1. marking tx as fully inspected
2. tracking already run prefilter (incl mpm) engines
This supersedes the MpmIDs API for directionless tracking
of the prefilter engines.
When we have no SGH we have to flag the txs that are 'complete'
as inspected as well.
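The two uses of the per tx detect flags can be sketched with a single
64 bit word. The bit layout here (top bit for "fully inspected", one
low bit per prefilter engine) is an assumption for illustration, as are
all the names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: the top bit marks the tx as fully inspected, the
 * remaining bits each track one prefilter engine that has already run. */
#define TX_INSPECTED_FLAG (UINT64_C(1) << 63)

static bool PrefilterAlreadyRan(uint64_t flags, unsigned engine_id)
{
    assert(engine_id < 63);
    return (flags & (UINT64_C(1) << engine_id)) != 0;
}

static uint64_t PrefilterMarkRan(uint64_t flags, unsigned engine_id)
{
    assert(engine_id < 63);
    return flags | (UINT64_C(1) << engine_id);
}

static bool TxIsInspected(uint64_t flags)
{
    return (flags & TX_INSPECTED_FLAG) != 0;
}
```

A prefilter engine would check PrefilterAlreadyRan() before running on
a tx and call PrefilterMarkRan() afterwards, so it is not run twice on
the same tx.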
Special handling for the stream engine:
If a rule mixes TX inspection and STREAM inspection, we can encounter
the case where the rule is evaluated against multiple transactions
during a single inspection run. As the stream data is exactly the same
for each of those runs, it's wasteful to rerun inspection of the stream
portion of the rule.
This patch enables caching of the stream 'inspect engine' result in
the local 'RuleMatchCandidateTx' array. This is valid only during the
life of a single inspection run.
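The caching idea, reduced to its core. RuleCandidate, InspectStream and
InspectStreamCached are hypothetical stand-ins; the layout of the real
'RuleMatchCandidateTx' entries is not given in this message:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-rule candidate state, valid for one inspection run. */
typedef struct RuleCandidate_ {
    bool stream_stored; /* true once the stream engine ran for this rule */
    bool stream_result; /* cached match result */
} RuleCandidate;

static int g_stream_runs = 0; /* counts real inspections, for illustration */

/* Stand-in for the expensive stream inspection. */
static bool InspectStream(void)
{
    g_stream_runs++;
    return true;
}

/* Inspect the stream once per rule per run; reuse the result for each
 * further transaction evaluated in the same run. */
static bool InspectStreamCached(RuleCandidate *rc)
{
    assert(rc != NULL);
    if (!rc->stream_stored) {
        rc->stream_result = InspectStream();
        rc->stream_stored = true;
    }
    return rc->stream_result;
}
```

The cached fields would have to be reset at the start of every
inspection run, since the stream data may differ between runs.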
Remove stateful inspection from 'mask' (SignatureMask). The mask wasn't
used in most cases for those rules anyway, as there we rely on the
prefilter. Add an alproto check to catch the remaining cases.
When building the active non-mpm/non-prefilter list check not just
the mask, but also the alproto. This especially helps stateful rules
with negated mpm.
Simplify AppLayerParserHasDecoderEvents usage in detection to only
return true if protocol detection events are set. Other detection is done
in inspect engines.
Move rule group lookup and handling into its own function. Handle
'post lookup' tasks immediately, instead of after the first detect
run. The tasks were independent of the initial detection.
Many cleanups and much refactoring.
Remove the DONE state to fix a problem with state not being
changed correctly when multiple reloads were done. As DONE was
not really useful, we can remove it.
Set flags by default:
-Wmissing-prototypes
-Wmissing-declarations
-Wstrict-prototypes
-Wwrite-strings
-Wcast-align
-Wbad-function-cast
-Wformat-security
-Wno-format-nonliteral
-Wmissing-format-attribute
-funsigned-char
Fix minor compiler warnings for these new flags on gcc and clang.
Now that MPM runs when the TX progress is right, stateful detection
operates differently.
Changes:
1. raw stream inspection is now also an inspect engine
   Since this engine doesn't take the transactions into account, it
   could potentially run multiple times on the same data. To avoid
   this, basic result caching is in place.
2. the engines are sorted by progress, but the 'MPM' engine is first
   even if the progress is higher
   If MPM flags a rule to be inspected, the inspect engine for that
   buffer runs first. If this step fails, the rule is no longer
   evaluated. No state is stored.
Until now variable names, such as flowbit names, were local to a detect
engine. This made sense as they were only ever used in that context.
For the purpose of logging these names, this needs a different approach.
The loggers live outside of the detect engine. Also, in the case of
reloads and multi-tenancy, there are even multiple detect engines, so
it would be even more tricky to access them from the outside.
This patch brings a new approach. At any time, there is a single active
hash table mapping the variable names and their id's. For multiple
tenants the table is shared between tenants.
The table is set up in a 'staging' area, where locking makes sure that
multiple loading threads don't mess things up. Then when the preparing
of a detection engine is ready, but before the detect threads are made
aware of the new detect engine, the active varname hash is swapped with
the staging instance.
For this to work, all the mappings from the 'current' or active mapping
are added to the staging table.
After the threads have reloaded and the new detection engine is active,
the old table can be freed.
For multi tenancy things are similar. The staging area is used for
setting up until the new detection engines / tenants are applied to
the system.
This patch also changes the variable 'id'/'idx' field to uint32_t. Due
to data structure padding and alignment, this should have no practical
drawback while allowing for a lot more vars.
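The staging/active swap can be sketched as follows. This is a heavily
simplified illustration: a fixed array stands in for the hash table,
locking is omitted, and all names (VarTable, VarStagingSetup,
VarStagingGetId, VarActivate) are hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_VARS 32 /* hypothetical fixed capacity instead of a hash */

typedef struct VarTable_ {
    const char *names[MAX_VARS]; /* borrowed pointers, not copies */
    uint32_t cnt;
} VarTable;

static VarTable *g_active = NULL;  /* the single active table */
static VarTable *g_staging = NULL; /* filled while rules load */

/* Start staging; carry over all mappings from the active table so that
 * existing ids stay stable. */
static int VarStagingSetup(void)
{
    g_staging = calloc(1, sizeof(*g_staging));
    if (g_staging == NULL)
        return -1;
    if (g_active != NULL)
        *g_staging = *g_active;
    return 0;
}

/* Look up or add a name in staging; ids start at 1, 0 means failure. */
static uint32_t VarStagingGetId(const char *name)
{
    assert(g_staging != NULL);
    for (uint32_t i = 0; i < g_staging->cnt; i++) {
        if (strcmp(g_staging->names[i], name) == 0)
            return i + 1;
    }
    if (g_staging->cnt == MAX_VARS)
        return 0;
    g_staging->names[g_staging->cnt] = name;
    return ++g_staging->cnt;
}

/* Make staging the active table; return the old one so the caller can
 * free it once all threads have reloaded. */
static VarTable *VarActivate(void)
{
    VarTable *old = g_active;
    g_active = g_staging;
    g_staging = NULL;
    return old;
}
```

Because the staging table starts as a copy of the active one, a name
that already had an id keeps it across reloads.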
Some keywords need a scratch space where they can store the results
of expensive operations that remain valid for the time of a packet's
journey through the detection engine.
An example is the reconstructed 'http_header' field, that is needed
in MPM, and then for each rule that manually inspects it. Storing this
data in the flow is a waste, and reconstructing multiple times on
demand as well.
This API allows for registering a keyword with an init and free function.
It is meant to be used at initialization time, when the keyword is
registered.
To replace the hardcoded SigMatch list id's, use this API to register
and query lists by name.
Also allow for registering descriptions and whether mpm is supported.
Registration is only allowed at startup.
For lists that are registered multiple times, like http_header and
http_cookie, making the engines owner of the lists is complicated.
Multiple engines in a sig may be pointing to the same list. To
address this the 'free' code needs to be extra careful about not
double freeing, so it takes an approach to first fill an array
of the to-free pointers before freeing them.
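The two-pass free described above, as a small sketch. FreeEngineLists
and MAX_ENGINES are hypothetical, and plain malloc'd blocks stand in
for the SigMatch lists:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_ENGINES 8 /* hypothetical upper bound */

/* Free the list each engine points to, freeing a shared list only once.
 * Returns the number of distinct lists freed. */
static size_t FreeEngineLists(void **lists, size_t engine_cnt)
{
    void *to_free[MAX_ENGINES];
    size_t n = 0;

    assert(engine_cnt <= MAX_ENGINES);
    /* Pass 1: collect unique pointers into the to-free array. */
    for (size_t i = 0; i < engine_cnt; i++) {
        bool seen = false;
        for (size_t j = 0; j < n; j++) {
            if (to_free[j] == lists[i]) {
                seen = true;
                break;
            }
        }
        if (!seen && lists[i] != NULL)
            to_free[n++] = lists[i];
        lists[i] = NULL; /* the engine no longer references the list */
    }
    /* Pass 2: free each unique pointer exactly once. */
    for (size_t j = 0; j < n; j++)
        free(to_free[j]);
    return n;
}
```

Three engines where two share one list result in two calls to free(),
not three.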
Add support for the ENIP/CIP Industrial protocol
This is an app layer implementation which uses the "enip" protocol
and "cip_service" and "enip_command" keywords
Implements AFL entry points
Move engine and registration into the keyword file.
Register as 'ALPROTO_UNKNOWN' instead of per alproto. The
registration will only apply it to those rules that have
events set.
Inspect engines are called per signature per sigmatch list. Most
wrap around DetectEngineContentInspection, but it's more generic.
Until now, the inspect engines were setup in a large per ipproto,
per alproto, per direction table. For stateful inspection each
engine needed a global flag.
This approach had a number of issues:
1. inefficient: each inspection round walked the table and then
checked if the inspect engine was even needed for the current
rule.
2. clumsy registration with global flag registration.
3. global flag space was approaching the need for 64 bits
4. duplicate registration for alprotos supporting both TCP and
UDP (DNS).
This patch introduces a new approach.
First, it does away with the per ipproto engines. This wasn't used.
Second, it adds a per signature list of inspect engine containing
only those engines that actually apply to the rule.
Third, it gets rid of the global flags and replaces it with flags
assigned per rule per engine.
Register keywords globally at start up.
Create a map of the registry per detection engine. This is needed
because the sgh_mpm_context value is set per detect engine.
Remove APP_MPMS_MAX.
Many rules have the same address vars, so instead of parsing them
each time use a hash to store the string and the parsed result.
Rules now reference the stored result in the hash table.
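A sketch of such a cache, assuming a small chained hash keyed by the
address string. AddrEntry, AddrLookupOrParse and the ParseAddr stand-in
are hypothetical; the real parsed result is an address structure, not
an int:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define BUCKETS 64

typedef struct AddrEntry_ {
    char *str;   /* the address string as written in the rule */
    int parsed;  /* stand-in for the parsed address structure */
    struct AddrEntry_ *next;
} AddrEntry;

static AddrEntry *g_buckets[BUCKETS];
static int g_parse_calls = 0; /* counts real parses, for illustration */

/* Stand-in for the expensive address parsing. */
static int ParseAddr(const char *s)
{
    g_parse_calls++;
    return (int)strlen(s);
}

static uint32_t HashStr(const char *s)
{
    uint32_t h = 5381;
    while (*s != '\0')
        h = h * 33 + (unsigned char)*s++;
    return h % BUCKETS;
}

/* Return the cached entry for s, parsing and storing it on first use. */
static const AddrEntry *AddrLookupOrParse(const char *s)
{
    assert(s != NULL);
    uint32_t b = HashStr(s);
    for (AddrEntry *e = g_buckets[b]; e != NULL; e = e->next) {
        if (strcmp(e->str, s) == 0)
            return e;
    }
    AddrEntry *e = malloc(sizeof(*e));
    if (e == NULL)
        return NULL;
    size_t len = strlen(s) + 1;
    e->str = malloc(len);
    if (e->str == NULL) {
        free(e);
        return NULL;
    }
    memcpy(e->str, s, len);
    e->parsed = ParseAddr(s);
    e->next = g_buckets[b];
    g_buckets[b] = e;
    return e;
}
```

Rules using the same address var string get the same entry back, so
the string is parsed once no matter how many rules use it.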
When running in live mode, the new default 'auto' value of
unix-command.enabled causes unix-command to be activated. This
will allow users of live capture to benefit from the feature and
result in no side effect for users running in offline capture.
Make the file storage use the streaming buffer API.
As the individual file chunks were not needed by themselves, this
approach uses a chunkless implementation.
Convert HTTP body handling to use the Streaming Buffer API. This means
the HtpBodyChunks no longer maintain their own data segments, but
instead add their data to the StreamingBuffer instance in the HtpBody
structure.
In case the HtpBodyChunk needs to access its data it can do so still
through the Streaming Buffer API.
Updates & simplifies the various users of the reassembled bodies:
multipart parsing and the detection engine.
Initial version of the 'FlowWorker' thread module. This module
combines Flow handling, TCP handling, App layer handling and
Detection in a single module. It does all flow related processing
under a single flow lock.
Match on server name indication (SNI) extension in TLS using tls_sni
keyword, e.g:
alert tls any any -> any any (msg:"SNI test"; tls_sni;
content:"example.com"; sid:12345;)
This new API allows for different SPM implementations, using a function
pointer table like that used for MPM.
This change also switches over the paths that make use of
DetectContentData (which previously used BoyerMoore directly) to the new
API.
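A function pointer table for single pattern matchers might be shaped
like this. SpmEntry, SpmScan and the naive matcher are hypothetical and
only illustrate the dispatch; they are not the actual Suricata SPM API:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: one row per single pattern matcher. */
typedef struct SpmEntry_ {
    const char *name;
    const unsigned char *(*Scan)(const unsigned char *needle, size_t nlen,
            const unsigned char *hay, size_t hlen);
} SpmEntry;

/* Naive matcher used as a stand-in for a real algorithm. */
static const unsigned char *NaiveScan(const unsigned char *needle, size_t nlen,
        const unsigned char *hay, size_t hlen)
{
    if (nlen == 0 || nlen > hlen)
        return NULL;
    for (size_t i = 0; i + nlen <= hlen; i++) {
        if (memcmp(hay + i, needle, nlen) == 0)
            return hay + i;
    }
    return NULL;
}

enum { SPM_NAIVE = 0, SPM_TABLE_SIZE };

static const SpmEntry spm_table[SPM_TABLE_SIZE] = {
    [SPM_NAIVE] = { "naive", NaiveScan },
};

/* Callers select a matcher by id and never name the algorithm directly. */
static const unsigned char *SpmScan(int matcher, const char *needle,
        const char *hay)
{
    assert(matcher >= 0 && matcher < SPM_TABLE_SIZE);
    return spm_table[matcher].Scan((const unsigned char *)needle,
            strlen(needle), (const unsigned char *)hay, strlen(hay));
}
```

A second implementation would be added as another row in the table,
leaving the calling code untouched.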