In TCP, large gaps in the data could lead to an extremely poor utilization
of the streaming buffer memory. This was caused by the implementation using
a single continues memory allocation from the "stream offset" to the
current data. If a 100 byte segment was inserted for ISN + 20MiB, we would
allocate 20MiB, even if only 100 bytes were actually used.
This patch addresses the issue by implementing a list of memory regions.
The StreamingBuffer structure holds a static "main" region, which can be
extended in the form of a simple list of regions.
[ main region ] [ gap ] [ aux region ]
[ sbb ] [ sbb ]
On insert, find the correct region and see if the new data fits. If it
doesn't, see if we can expand the current region, or than we need to add
a new region. If expanding the current region means we overlap or get
too close to the next region, we merge them.
On sliding, we free any regions that slide out of window and consolidate
auxilary regions into main where needed.
Bug: #4580.
Work towards making `suricata-common.h` only introduce system headers
and other things that are independent of complex internal Suricata
data structures.
Update files to compile after this.
Remove special DPDK handling for strlcpy and strlcat, as this caused
many compilation failures w/o including DPDK headers for all files.
Remove packet macros from decode.h and move them into their own file,
turn them into functions and rename them to match our function naming
policy.
Pruning of StreamBufferBlocks could remove blocks that fell entirely
after the target offset due to a logic error. This could lead to data
being evicted that was still meant to be processed in theapp-layer
parsers.
Bug: #4953.
In some cases a SBB could be seen as overlapping with the requested
offset, when it was in fact precisely before it. In some special cases
this could lead to the stream engine not progressing the 'raw' progress.
Switch StreamBufferBlocks implementation to use RBTREE instead of
a list. This makes inserts/removals and lookups a lot cheaper if
the number of data gaps is large.
Use separate compare functions for inserts and regular lookups.
Inserts care about the offset, while lookups care about the blocks
right edge as well.
Set flags by default:
-Wmissing-prototypes
-Wmissing-declarations
-Wstrict-prototypes
-Wwrite-strings
-Wcast-align
-Wbad-function-cast
-Wformat-security
-Wno-format-nonliteral
-Wmissing-format-attribute
-funsigned-char
Fix minor compiler warnings for these new flags on gcc and clang.
Add list of 'blocks'. This list contains offsets and lengths to
continuous data blocks. This is useful for TCP tracking where we
can have data gaps.
The blocks don't contain any data themselves, instead they contain
lenght and offsets. This way no extra copying is needed.
On inserting new data, existing blocks are expanded instead of
having multiple neighbouring blocks.
When memory allocations happened in HTTP body and general file
tracking, malloc/realloc errors (most likely in the form of memcap
reached conditions) could lead to an endless loop in the buffer
grow logic.
This patch implements proper error handling for all Append/Insert
functions for the streaming API, and it explicitly enables compiler
warnings if the results are ignored.
Add a new API to store data from streaming sources, like HTTP body
processing or TCP data.
Currently most of the code uses a pattern of list of data chunks
(e.g. TcpSegment) that is reassembled into a large buffer on-demand.
The Streaming Buffer API changes the logic to store the data in
reassembled form from the start, with the segments/chunks pointing
to the reassembled data.
The main buffer storing the data slides forward, automatically or
manually. The *NoTrack calls allows for a segmentless mode of
operation.
This approach has two main advantages:
1. accessing the reassembled data is virtually cost-free
2. reduction of allocations and memory management