The log API calls thread modules directly, so the TMM profiling logic
can be applied to it. This patch does so.
The "Thread Module" out now again lists the individual loggers. As the
module are normally called much less frequently the numbers are hard to
compare to pre-log-api numbers.
Add option "profiling.sample-rate":
# Run profiling for every xth packet. The default is 1, which means we
# profile every packet. If set to 1000, one packet is profiled for every
# 1000 received.
#sample-rate: 1000
This allows for configuration of the sample rate.
Instead of a large (6k+) structure in the Packet, make the profiling
storage dynamic. To do this, Packet->profile is now a pointer.
Initial support for selective sampling, e.g. only profile every
1000th packet.
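A minimal sketch of how the sampling and the on-demand allocation can fit
together; the identifiers below (sample_rate, PktProfiling,
PacketProfilingStart) are illustrative, not the exact Suricata names:

    #include <stdint.h>
    #include <stdlib.h>

    /* illustrative stand-ins for the real Suricata types */
    typedef struct PktProfiling_ { uint64_t ticks_start, ticks_end; } PktProfiling;
    typedef struct Packet_ { PktProfiling *profile; /* now a pointer */ } Packet;

    static uint64_t packet_cnt = 0;
    static int sample_rate = 1000;          /* from profiling.sample-rate */

    static void PacketProfilingStart(Packet *p)
    {
        if ((++packet_cnt % sample_rate) != 0)
            return;                         /* packet not sampled */

        /* storage is allocated on demand instead of sitting inline
         * in the already large Packet structure */
        p->profile = calloc(1, sizeof(*p->profile));
    }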
A negated match currently matches if the tested field is NULL. But if the
field is not set, neither the negated nor the normal test should match.
Without this patch, a rule like:
alert tls any any -> any any (msg:"negated match"; tls.subject:!"CN=home.regit.org"; sid:1; rev:1;)
is alerting for all connections, even if they are made with a certificate
whose subject matches. This is due to the fact that the tls protocol
is detected before the handshake is complete, so the condition on tls
is true while tls.subject is still NULL. And the code was returning a
positive match for a NULL subject when the signature used a negated
match.
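A minimal sketch of the corrected negation handling (the function name and
plumbing are hypothetical; the real logic lives in the tls detection code):

    #include <string.h>

    /* hypothetical helper: returns 1 on match, 0 otherwise */
    static int TlsSubjectMatch(const char *conn_subject,
                               const char *sig_subject, int negated)
    {
        if (conn_subject == NULL)
            return 0;       /* field not set yet: no match either way */

        int equal = (strcmp(conn_subject, sig_subject) == 0);
        return negated ? !equal : equal;
    }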
Introduce a utility function to convert an array of bytes into a
null-terminated string:
char *BytesToString(const uint8_t *bytes, size_t nbytes);
All non-printables are copied over, except for '\0', which is
turned into literal '\' '0' in the string. So the resulting string
may be bigger than the input.
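A sketch of one possible implementation; the real version may differ in
allocation and error handling:

    #include <stdint.h>
    #include <stdlib.h>

    char *BytesToString(const uint8_t *bytes, size_t nbytes)
    {
        /* worst case every byte is '\0' and expands to two chars */
        char *out = calloc(1, nbytes * 2 + 1);
        if (out == NULL)
            return NULL;

        size_t j = 0;
        for (size_t i = 0; i < nbytes; i++) {
            if (bytes[i] == '\0') {
                out[j++] = '\\';
                out[j++] = '0';
            } else {
                out[j++] = (char)bytes[i];
            }
        }
        return out;     /* calloc already NUL-terminated the buffer */
    }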
In various places SCStrndup was used to 'dup' a bstr string, but libhtp
provides bstr_util_strdup_to_c for this. As this is a cleaner
interface, it's preferred.
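For illustration, assuming an htp_tx_t *tx with a request_uri bstr, the
change looks roughly like this:

    /* before:
     *   char *uri = SCStrndup((char *)bstr_ptr(tx->request_uri),
     *                         bstr_len(tx->request_uri));
     * after: */
    char *uri = bstr_util_strdup_to_c(tx->request_uri);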
This patch separates the http keys from the file keys in file events,
giving each its own object:
{
"time":"01\/31\/2014-12:04:52.837245","event_type":"file","src_ip":"5.3.1.1","src_port":80,"dest_ip":"1.8.1.9","dest_port":9539,"proto":"TCP",
"http":{"url":"/foo/","hostname":"bar.com","http_refer":"http:\/\/bar.org","http_user_agent":"Mozilla\/5.0"},
"file":{"filename":"bar","magic":"unknown","state":"CLOSED","stored":false,"size":21}
}
One benefit of this modification is that it is possible to use the
same keys as the ones used in http events. Thus correlating both types
of events is trivial. On the code side, this makes it possible to
factor the code by simply asking the underlying protocol to output its
info into a json object (sketched below).
A second benefit is that adding file extraction for a new protocol
only requires changing the protocol-specific json list.
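A rough sketch of that factoring using the jansson API; the
JsonHttpAddMetadata helper name is an assumption for illustration:

    #include <jansson.h>

    /* hypothetical per-protocol hook filling in its own sub-object */
    extern json_t *JsonHttpAddMetadata(const void *tx);

    static json_t *BuildFileEvent(const void *tx)
    {
        json_t *js = json_object();
        json_object_set_new(js, "event_type", json_string("file"));

        /* protocol-specific block, produced by the protocol's logger */
        json_object_set_new(js, "http", JsonHttpAddMetadata(tx));

        /* the file block stays the same for every protocol */
        json_t *fjs = json_object();
        json_object_set_new(fjs, "filename", json_string("bar"));
        json_object_set_new(fjs, "stored", json_false());
        json_object_set_new(fjs, "size", json_integer(21));
        json_object_set_new(js, "file", fjs);
        return js;
    }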
This patch adds an event_type key to the generated events. The current
value is one of "dns", "alert", "file", "tls", "http", "drop". It is
then easy to differentiate the events in log analysis tools based on
their source inside Suricata.
Without this patch DNS answers for a single query are stored in a
single json event. The result is an array in the object like this one:
{"type":"answer","id":45084,"rrname":"s-static.ak.facebook.com","rrtype":"CNAME","ttl":734},
{"type":"answer","id":45084,"rrname":"s-static.ak.facebook.com.edgekey.net","rrtype":"CNAME","ttl":1710},
This type of output is not well supported in logstash. It is
displayed as it is written above and it is not possible to
query the fields.
I think the reason is that this is not logical from a search-query
point of view. For example, if we search for "rrname" equal to
"s-static.ak.facebook.com", we get one entry with two values in it.
That's against the logic of an event. Furthermore, if we want to
reconstruct a complete query, we can use the id.
This patch splits the answer part into multiple messages. The result
is then accepted by logstash and the fields can be queried easily.
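After the split, each answer from the example above becomes its own
event, roughly:
{"type":"answer","id":45084,"rrname":"s-static.ak.facebook.com","rrtype":"CNAME","ttl":734}
{"type":"answer","id":45084,"rrname":"s-static.ak.facebook.com.edgekey.net","rrtype":"CNAME","ttl":1710}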
This patch updates the DNS field names to be in sync with the Passive
DNS Common Output Format draft:
https://github.com/adulau/pdns-qof
This will make it easy to use Suricata together with other passive DNS
tools.
This patch synchronizes key names with the Common Information Model.
It updates key names following what is proposed in:
http://docs.splunk.com/Documentation/PCI/2.0/DataSource/CommonInformationModelFieldReference
The benefit of these modifications is that using the same key names as
other software makes the data easy to correlate and enrich. For
example, the geoip filter in logstash can be applied to all src_ip
fields, allowing geoip tagging of the data.
Both the drop and tls logs are currently not designed to have multiple
instances running. So until that is changed, error out if more than one
instance is started.
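A minimal sketch of such a guard, assuming a module-level flag in the log
module's init function (names and error code are illustrative and this
relies on Suricata's util headers):

    static int tls_log_running = 0;

    static OutputCtx *LogTlsLogInitCtx(ConfNode *conf)
    {
        if (tls_log_running) {
            SCLogError(SC_ERR_INVALID_ARGUMENT,
                       "only one 'tls' log instance is supported");
            return NULL;
        }
        tls_log_running = 1;

        OutputCtx *output_ctx = SCCalloc(1, sizeof(OutputCtx));
        if (output_ctx == NULL)
            return NULL;
        /* attach the LogFileCtx and do the remaining setup as before */
        return output_ctx;
    }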
To support the 'eve-log' idea, we need to be able to force all log
modules to be enabled by the master eve-log module, and need to be
able to make all logs go into a single file. This didn't fit the
API so far, so the sub-module concept was added.
A sub-module is a regular module, that registers itself as a sub-
module of another module:
OutputRegisterTxSubModule("eve-log", "JsonHttpLog", "http",
OutputHttpLogInitSub, ALPROTO_HTTP, JsonHttpLogger);
The first argument is the name of the parent. The 4th argument is
the OutputCtx init function. It differs slightly from the non-sub
one: in addition to its ConfNode, it gets the OutputCtx from the
parent. This way it can set the parent's LogFileCtx in its own
OutputCtx.
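A sketch of what such a sub-module init function can look like; error
handling is trimmed and the parent's data layout is simplified for
illustration:

    static OutputCtx *OutputHttpLogInitSub(ConfNode *conf, OutputCtx *parent_ctx)
    {
        OutputCtx *output_ctx = SCCalloc(1, sizeof(OutputCtx));
        if (output_ctx == NULL)
            return NULL;

        /* reuse the parent's LogFileCtx so this sub-module writes
         * into the single eve-log file */
        output_ctx->data = parent_ctx->data;
        return output_ctx;
    }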
The runmode setup code will take care of all the extra setup. It's
possible to register a module both as a normal module and as a sub-
module, which can operate at the same time.
Only the TxLogger API is handled in this patch, the rest will be
updated later.
- restore json output-file.[ch] as output-json-file.[ch] after rebase conflict
- fix Makefile.am after merge conflict
- clean up some json fallout from the dev-log-api-v4.0 rebase