If a keyword like filemd5 was being used without a filestore,
or a file output enabled, it would be pruned before detection
had a chance to match.
Consolidate file pruning to the end of the flow worker so files
are available for detection even when a file output is not
enabled.
Redmine issue:
https://redmine.openinfosecfoundation.org/issues/2490
If a file transfer stops on flow timeout, it won't be closed or
truncated. This patch makes sure that in such cases the files
are indeed truncated. This fixes the filestore-v2 output module,
as that requires a sha256 for storing the partial file correctly.
Current file storing approach is using a open file, write data,
close file logic. If this technic is fixing the problem of getting
too much open files in Suricata it is not optimal.
Test on a loop shows that open, write, close on a single file is
two time slower than a single open, loop of write, close.
This patch updates the logic by storing the fd in the File structure.
This is done for a certain number of files. If this amount is exceeded
then the previous logic is used.
This patch also adds two counters. First is the number of
currently open files. The second one is the number of time
the open, write, close sequence has been used due to too much
open files.
In EVE, the entries are:
stats {file_store: {"open_files_max_hit":0,"open_files":5}}
When looping available files 'flags' misuse would lead to all files
being closed after the first close.
This patch separates per file and per call flags.
Set flags by default:
-Wmissing-prototypes
-Wmissing-declarations
-Wstrict-prototypes
-Wwrite-strings
-Wcast-align
-Wbad-function-cast
-Wformat-security
-Wno-format-nonliteral
-Wmissing-format-attribute
-funsigned-char
Fix minor compiler warnings for these new flags on gcc and clang.
This patch introduces the FileDataSize and FileTrackedSize functions.
The first one is just a renaming of the initial FilSize function
whereas the other one is using the newly introduced size field as
value.
As the logging modules are no longer threading modules, rename
them so they don't look like they are being registered as
threading modules.
Also, move the registration to the output.c which will handle
registration of the loggers.
Introduces a new thread module, TMM_LOGGER, which is the
root most logger.
Only handles loggers in the packet path, stats and flow
logging are not included.
The loggers are made up of a hierarchy of loggers. At the top we
have the root logger which is the main entry point to
logging. Under the root there exists parent loggers that are the
entry point for specific types of loggers such as packet logger,
transaction loggers, etc. Each parent logger may have 0 or more
loggers that actual handle the job of producing output to something
like a file.
Make the file storage use the streaming buffer API.
As the individual file chunks were not needed by themselves, this
approach uses a chunkless implementation.
This fixes:
Direct leak of 31792 byte(s) in 3974 object(s) allocated from:
#0 0x4c396b in malloc (/opt/suricata-asan/bin/suricata+0x4c396b)
#1 0xd86ce2 in OutputFiledataLogThreadInit /home/pmanev/sandnet-qa/stage/oisf/src/output-filedata.c:308:34
#2 0x106c255 in TmThreadsSlotPktAcqLoop /home/pmanev/sandnet-qa/stage/oisf/src/tm-threads.c:295:17
#3 0x7fbc9fcb3181 in start_thread /build/eglibc-3GlaMS/eglibc-2.19/nptl/pthread_create.c:312
The log API calls thread modules directly, so the TMM profiling logic
can be applied to it. This patch does so.
The "Thread Module" out now again lists the individual loggers. As the
module are normally called much less frequently the numbers are hard to
compare to pre-log-api numbers.
A new logger API for registering file storage handlers. Where the
FileLog handler is called once per file, this handler will be called
for each data chunk so that storing the entire file is possible.
The logger call in the API is as follows:
typedef int (*FiledataLogger)(ThreadVars *, void *thread_data,
const Packet *, const File *, const FileData *, uint8_t flags);
All data is const, thus should be read only. The final flags field
is used to indicate to the caller that the file is new, or if it's
being closed.
Files use an internal unique id 'file_id' which can be used by the
loggers to create unique file names. This id can use the 'waldo'
feature of the log-filestore module. This patch moves that waldo
loading and storing logic to this API's implementation. A new
configuration directive 'file-store-waldo: <filename>' is added,
but the existing waldo settings will also continue to work.