This patch introduces a new option to the dataset keyword.
Where a regular dataset allows matching against sets, a dataset with
json format allows the same but also adds JSON data to the alert
event. This data comes from the set definition itself.
For example, an ipv4 set will look like:
[{"ip": "10.16.1.11", "test": "success", "context": 3}]
The syntax is a JSON array, but it can also be a JSON object
with an array inside. The idea is to directly use data coming
from the API of threat intel management software.
The syntax of the keyword is the following:
dataset:isset,src_ip,type ip,load src.lst,format json, \
enrichment_key src_ip, value_key ip;
Compared to a regular dataset, it just has a supplementary key option
that is used to indicate in which subobject the JSON value
should be added.
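As an illustration, a full rule using it could look like the following
(assuming the ip.src sticky buffer is used to feed the source address
to the set; msg and sid are made up):
alert ip any any -> any any (msg:"Match on IOC source IP"; \
    ip.src; dataset:isset,src_ip,type ip,load src.lst,format json, \
    enrichment_key src_ip, value_key ip; sid:1; rev:1;)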
The information is added in the event under the alert.extra
subobject:
  "alert": {
    "extra": {
      "src_ip": {
        "ip": "10.6.1.11",
        "test": "success",
        "context": 3
      },
The main interest of the feature is the ability to contextualize
a match. For example, if you have an IOC source, you can do:
[
  {"buffer": "value1", "actor": "APT28", "Country": "FR"},
  {"buffer": "value2", "actor": "APT32", "Country": "NL"}
]
This way, a single dataset is able to add context to the
event, where previously this was not possible and multiple
signatures had to be used.
The format introduced with datajson is an evolution of the
historical datarep format, which has some limitations. For example,
if a user fetches IOCs from a threat intel server, there is a good
chance that the format will be JSON or XML. Suricata has no support
for the latter but can support the former.
Keeping the key value may seem redundant, but it is useful to have it
directly accessible in the extra data so it can be queried
independently of the signature (where it can be one of multiple
metadata fields or even a transformed one).
In some cases, when interacting with data (mostly coming from
threat intel servers), the JSON array containing the data
to use is not at the root of the object and it is necessary
to access a subobject.
This patch implements this with support for keys of the form
level1.level2. This is done via the `array_key` option that
contains the path to the data.
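As an illustration, assuming a (hypothetical) server response where
the array sits under data.iocs, the option could be used as follows:
{
  "data": {
    "iocs": [
      {"ip": "10.16.1.11", "test": "success", "context": 3}
    ]
  }
}
dataset:isset,src_ip,type ip,load src.lst,format json, \
    enrichment_key src_ip, value_key ip, array_key data.iocs;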
Ticket: #7372
If the connection is lost (for example, Suricata is restarted), try to
re-open the connection and re-execute the command.
This was the behavior of the Python implementation.
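A minimal sketch of that retry logic, assuming a Unix-socket client
(path handling, buffer size and the helper name are illustrative, not
the actual implementation):
use std::io::{Read, Write};
use std::os::unix::net::UnixStream;

// Send a command; if the stream is broken (for example because
// Suricata was restarted), re-open the connection once and
// re-execute the command.
fn send_with_retry(
    path: &str,
    stream: &mut UnixStream,
    cmd: &[u8],
) -> std::io::Result<Vec<u8>> {
    if stream.write_all(cmd).is_err() {
        *stream = UnixStream::connect(path)?;
        stream.write_all(cmd)?;
    }
    let mut buf = [0u8; 4096];
    let n = stream.read(&mut buf)?;
    Ok(buf[..n].to_vec())
}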
Ticket: #7746
For example:
error: lifetime flowing from input to output with different syntax can be confusing
--> htp/src/headers.rs:475:16
|
475 | fn null(input: &[u8]) -> IResult<&[u8], ParsedBytes> {
| ^^^^^ ----- ----------- the lifetimes get resolved as `'_`
| | |
| | the lifetimes get resolved as `'_`
| this lifetime flows to the output
|
note: the lint level is defined here
--> htp/src/lib.rs:3:9
This currently only happens when using the Rust nightly compiler, which
we use for our fuzz builds.
Update all deps with cargo update. Additionally, apply the updated
versions to the Cargo.toml, which while not strictly required, does
make it clearer which versions are in use.
There shouldn't be duplicated messages in the requests Vec, and thus
the parser shouldn't log duplicated keys or messages. Add debug
validations to ensure this.
With PGSQL's current state machine, most frontend/client messages will
lead to the creation of a new transaction - which would prevent
duplicated messages from being pushed to the requests array and
reaching the logger.
The current exceptions to that are:
- CopyDataIn
- CopyDone
- CopyFail
Thus, debug statements were added for those cases.
CopyDone and CopyFail, per the documentation, shouldn't be seen
duplicated on the wire for the same transaction. CopyDataIn -- yes, but
we consolidate those, so the expectation is that they won't be
duplicated in the requests array or when reaching the logger either.
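A minimal sketch of such a validation (the enum and function are
simplified stand-ins, not the actual parser types):
#[derive(Debug, PartialEq)]
enum FrontendMessage {
    CopyDataIn(Vec<u8>),
    CopyDone,
    CopyFail(String),
}

fn push_request(requests: &mut Vec<FrontendMessage>, msg: FrontendMessage) {
    // Debug-only check, compiled out in release builds: the requests
    // Vec should never end up holding a duplicated message.
    debug_assert!(
        !requests.contains(&msg),
        "duplicated PGSQL request pushed to the requests Vec: {:?}",
        msg
    );
    requests.push(msg);
}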
Related to
Task #7645
We used `copy_column_count`, while just `columns` is more in line with
what PostgreSQL describes, and what Wireshark shows.
Related to
Task #7644
Task #7645
While this could be considered minor, they were not just bad but
misleading names, as the variables weren't really `dummy` responses,
but rather consolidations of several messages.
This sub-protocol covers messages sent mainly from the frontend to
the backend after a 'COPY FROM STDIN' has been processed by the
backend.
Parses new messages:
- CopyInResponse -- initiates copy-in mode/sub-protocol
- CopyData (In) -- data transfer message, from frontend to backend
- CopyDone -- signals that no more CopyData messages will be seen from
the frontend, for the current transaction
- CopyFail -- used by the frontend to signal some failure to proceed
with sending CopyData messages
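For reference, the frontend CopyData message on the wire is the type
byte 'd', an Int32 length that includes the length field itself, and
then the raw data. A minimal nom sketch of that framing (illustrative
only, not the actual parser):
use nom::bytes::streaming::{tag, take};
use nom::number::streaming::be_u32;
use nom::IResult;

// Parse a frontend CopyData message: 'd' + Int32 length (which
// includes the 4 length bytes themselves) + payload.
fn parse_copy_data(i: &[u8]) -> IResult<&[u8], &[u8]> {
    let (i, _) = tag("d")(i)?;
    let (i, len) = be_u32(i)?;
    let (i, data) = take(len.saturating_sub(4))(i)?;
    Ok((i, data))
}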
Task #7645
Important for the CopyIn mode/sub-protocol, where the frontend is the
one
sending 0 or more messages to the backend as part of a transaction.
Related to
Task #7645
As SCDetectTransformFromBase64Data is not a flat structure,
because it has pointers to other buffers, we cannot use it as-is
for TransformId.
We need to compute a serialization of the data held by
SCDetectTransformFromBase64Data and own it.
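A minimal sketch of the approach, with a hypothetical, flattened view
of the options (the real structure and its field names differ):
// Hypothetical, simplified view of the transform options.
struct FromBase64Options {
    mode: u8,
    offset: u32,
    nbytes: u32,
    custom_alphabet: Option<Vec<u8>>,
}

// Serialize the referenced data into one owned, flat byte buffer
// that can then be used as the transform identifier.
fn serialize_id(opts: &FromBase64Options) -> Vec<u8> {
    let mut id = Vec::new();
    id.push(opts.mode);
    id.extend_from_slice(&opts.offset.to_be_bytes());
    id.extend_from_slice(&opts.nbytes.to_be_bytes());
    if let Some(alphabet) = &opts.custom_alphabet {
        id.extend_from_slice(alphabet);
    }
    id
}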
The mDNS support is based heavily on the DNS support, reusing the
existing DNS parser where possible. This meant adding variations on
DNS, as mDNS is a little different. The main one being that *all* mDNS
traffic is to_server, yet there is still the concept of requests and
responses.
Keywords added are:
- mdns.queries.rrname
- mdns.answers.rrname
- mdns.additionals.rrname
- mdns.authorities.rrname
- mdns.response.rrname
They are mostly in line with the DNS keywords, except
mdns.answers.rdata, which is better than mdns.response.rrname,
as it actually looks at the rdata, and not the rrnames.
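As an illustration, a rule using one of these buffers could look like
this (msg, content and sid are made up):
alert mdns any any -> any any (msg:"mDNS IPP printer query"; \
    mdns.queries.rrname; content:"_ipp._tcp.local"; sid:1; rev:1;)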
mDNS has its own logger that differs from the DNS logger:
- No grouped logging
- In answers/additionals/authorities, the rdata is logged in a field
that is named after the rdata type. For example, "txt" data is no
longer logged in the "rdata" field, but in a "txt" field instead. We
already did this in DNS for fields that were not a single
buffer, like SOA, SRV, etc., so this makes things more consistent, and
gives query-like semantics that the "grouped" object was trying to
provide.
- Types are logged in lower case ("txt" instead of "TXT")
- Flags are logged as an array: "flags": ["aa", "z"]
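As an illustration, a logged response could then contain something
like this (field layout and values are made up, not the exact schema):
"mdns": {
  "type": "response",
  "flags": ["aa"],
  "answers": [
    {"rrname": "printer.local", "rrtype": "txt", "txt": ["papersize=A4"]}
  ]
}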
Ticket: #3952
A DNS TXT answer record can actually be made up of multiple TXT
entries in a single record. Suricata currently expands these into
multiple TXT records, but that is not very representative of the
actual DNS message.
Instead, if a TXT record contains multiple labels, parse them into an
array.
We still expand multiple TXT segments into multiple TXT records at
logging time for compatibility, but this will allow something like
mDNS to log more accurately to the protocol.
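A minimal sketch of the parsing idea (names are illustrative): each
entry in the TXT rdata is a length-prefixed character-string, so the
segments can be collected into an array:
// Parse the length-prefixed character-strings of a TXT rdata blob
// into an array of byte strings.
fn parse_txt_segments(mut rdata: &[u8]) -> Vec<Vec<u8>> {
    let mut segments = Vec::new();
    while let Some((&len, rest)) = rdata.split_first() {
        let len = len as usize;
        if rest.len() < len {
            break; // truncated record, stop
        }
        segments.push(rest[..len].to_vec());
        rdata = &rest[len..];
    }
    segments
}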