When the flow engine enters emergency mode, 3 things happen:
1. a different set of (lower) timeout values is applied
2. the flow manager runs more often
3. worker threads get a used flow directly from the hash table themselves
Testing showed that performance went down significantly due to concurrency
issues:
1. worker threads would fight each other over access to the hash
2. the flow manager would get in the way of the workers
This patch changes the behavior in 2 ways:
1. it makes the flow manager slightly less aggressive. It will still
try to run ~3 times per second, but no longer 10 times.
This should reduce the contention. At the same time flows
won't time out faster just because they are checked many times per second.
2. The 'get a used flow' logic optimizes the use of atomics by only
doing an atomic operation once, and while doing so reserving
a slice of the hash per worker (a sketch of this follows below).
The worker will also give up much more quickly, to avoid the overhead
of hash walking and of taking and releasing locks.
These combined changes show much better 'under stress' behavior,
especially on multi-NUMA systems.
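
A minimal sketch of that per-worker reservation idea, in plain C11 and
with made-up names: flow_prune_idx, flow_hash_size, FlowGetUsedTryLock()
and the slice/retry constants are all assumptions for illustration, not
the actual Suricata code.

    /* Sketch only: symbols below are stand-ins, not Suricata's. */
    #include <stdatomic.h>
    #include <stdint.h>

    #define USED_FLOW_SLICE 100  /* rows reserved per attempt (assumption) */
    #define USED_FLOW_TRIES   5  /* give up quickly to limit lock overhead */

    extern uint32_t flow_hash_size;          /* number of hash rows */
    extern _Atomic uint32_t flow_prune_idx;  /* shared scan position */

    struct Flow;                                    /* opaque here */
    struct Flow *FlowGetUsedTryLock(uint32_t row);  /* try-lock row, return an
                                                       evictable flow or NULL */

    struct Flow *GetUsedFlow(void)
    {
        /* One atomic op reserves a private slice of the hash for this
         * worker, so workers no longer race each other row by row. */
        uint32_t start =
            atomic_fetch_add(&flow_prune_idx, USED_FLOW_SLICE) % flow_hash_size;

        for (uint32_t i = 0; i < USED_FLOW_TRIES; i++) {
            uint32_t row = (start + i) % flow_hash_size;
            struct Flow *f = FlowGetUsedTryLock(row);
            if (f != NULL)
                return f;
            /* row busy or nothing evictable: try the next row, but only
             * a few times before giving up entirely */
        }
        return NULL;
    }

The single fetch-add hands each worker its own range of rows, and the
small retry budget bounds the time spent taking and releasing row locks
under pressure.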
Flag the last flow timeout pseudo packet so that we can force
TX logging without setting both app-layer EOF flags.
Case this fixes:
1. flow times out when only TS TCP data was received, and none of it is ACK'd.
So there is no app-layer proto yet, no app state and no Flow::alparser. So
EOF flags can't be set.
2. Flow timeout sees no reason to create pseudo packet in TC direction.
3. TS pseudo packet finds HTTP, creates HTTP state, flag EOF TS.
4. TX logging skips HTTP logging because:
- TC progress not reached
- EOF TC flag not set.
The solution has been to flag the very last packet for the flow as such
and use it as a master-EOF flag.
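
A rough illustration of that master-EOF idea; the flag name
PKT_LAST_PSEUDO and both helpers are hypothetical placeholders, not
Suricata's actual packet flags or logger code.

    /* Sketch only: flag and helpers are hypothetical. */
    #include <stdbool.h>
    #include <stdint.h>

    #define PKT_LAST_PSEUDO (1U << 20)  /* hypothetical flag bit */

    typedef struct Packet_ {
        uint32_t flags;
        /* ... */
    } Packet;

    /* flow timeout: mark the final pseudo packet created for this flow */
    static void FlowTimeoutFlagLastPacket(Packet *p, bool is_last)
    {
        if (is_last)
            p->flags |= PKT_LAST_PSEUDO;
    }

    /* TX logger: the flagged packet acts as EOF for both directions, so
     * the HTTP tx is logged even though the TC EOF flag was never set */
    static bool TxLogEofReached(const Packet *p, bool eof_ts, bool eof_tc)
    {
        if (p->flags & PKT_LAST_PSEUDO)
            return true;
        return eof_ts && eof_tc;
    }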
When the stream engine has data ready for the app-layer it will call
this API from a loop instead of just once. The loop ensures that in
a very lossy stream, where between 'app_progress' and 'last_ack'
there are multiple chunks of data and multiple gaps, we process all
of the chunks.
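
The shape of that loop could look like the sketch below;
GetNextChunk(), AppLayerHandleData() and AppLayerHandleGap() are
assumed helper names, not the actual stream engine API.

    /* Sketch only: helpers and struct are assumptions. */
    #include <stdbool.h>
    #include <stdint.h>

    typedef struct StreamState_ {
        uint64_t app_progress;   /* offset handed to the app-layer so far */
        uint64_t last_ack;       /* highest ACK'd offset */
    } StreamState;

    /* fill data/len/is_gap for the next region after 'offset' */
    bool GetNextChunk(StreamState *s, uint64_t offset,
                      const uint8_t **data, uint32_t *len, bool *is_gap);
    void AppLayerHandleData(const uint8_t *data, uint32_t len);
    void AppLayerHandleGap(uint32_t len);

    void StreamUpdateAppLayer(StreamState *s)
    {
        const uint8_t *data;
        uint32_t len;
        bool is_gap;

        /* keep going until everything up to last_ack is consumed, so
         * multiple data chunks separated by gaps are all handled in
         * one call instead of only the first chunk */
        while (s->app_progress < s->last_ack &&
               GetNextChunk(s, s->app_progress, &data, &len, &is_gap)) {
            if (is_gap)
                AppLayerHandleGap(len);
            else
                AppLayerHandleData(data, len);
            s->app_progress += len;
        }
    }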
Elasticsearch didn't accept the 'hassh' and 'hassh.string' keys. It
would see the first 'hassh' as a string and turn the second key into
an object 'hassh' with a string member 'string'. That meant two
different types for 'hassh', so it rejected the records.
This patch mimics the ja3(s) logging by creating a 'hassh' object
with 2 members: 'hash', which holds the md5 representation, and
'string', which holds the string representation.
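
For illustration, building that object with jansson could look roughly
like the following; the function name LogHassh and how the values are
obtained are assumptions, not the actual SSH logger code.

    /* Sketch only: producing "hassh": { "hash": "...", "string": "..." } */
    #include <jansson.h>

    static void LogHassh(json_t *ssh_js, const char *hassh_md5,
                         const char *hassh_string)
    {
        /* one 'hassh' object with two string members, mirroring ja3/ja3s,
         * so Elasticsearch maps 'hassh' with one consistent type */
        json_t *hassh_js = json_object();
        if (hassh_js == NULL)
            return;
        json_object_set_new(hassh_js, "hash", json_string(hassh_md5));
        json_object_set_new(hassh_js, "string", json_string(hassh_string));
        json_object_set_new(ssh_js, "hassh", hassh_js);
    }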
In case of lossy connections the NFS state would properly clean up
transactions, including file transactions. However, for files the
state was never set to 'truncated', so the files stayed 'active'.
This caused these files to remain in the NFS state. In long running
sessions with lots of files this would lead to performance and memory
use issues.
This patch truncates the file that was being transmitted when
a file transaction is closed.
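
A hypothetical sketch of that cleanup step; the FileTx/File structs and
the FileTruncate()/NFSFreeFileTx() names are illustrative only (the
real NFS parser is implemented in Rust and uses Suricata's file API).

    /* Sketch only: structs and helpers are hypothetical. */
    #include <stdbool.h>
    #include <stddef.h>

    typedef struct File_ {
        bool active;
        bool truncated;
    } File;

    typedef struct FileTx_ {
        File *file;
    } FileTx;

    static void FileTruncate(File *f)
    {
        /* mark the partially transferred file as done so it can be pruned
         * instead of staying 'active' in the state forever */
        f->truncated = true;
        f->active = false;
    }

    /* called when a file transaction is closed on a lossy connection */
    static void NFSFreeFileTx(FileTx *tx)
    {
        if (tx->file != NULL && tx->file->active)
            FileTruncate(tx->file);
        /* ... free the transaction itself ... */
    }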
Based on 65e9a7c31c