The input data received in DATA and BDAT command modes can be huge and
may carry important content, such as a legitimately large email. Therefore,
exempt these modes from the line buffering limits which were introduced to
regulate the size of lines that we buffer at any point in time.
As part of this patch, anything that arrives in DATA or BDAT mode is
processed early, without buffering, as and when it arrives. The way it is
processed remains the same as before.
Issue
-----
So far, with the SMTP parser, we would buffer data until an LF char
was found, indicating the end of one line. This would happen in the case of
fragmented data, where a line might arrive broken into multiple chunks.
This was problematic if there was a really long line without any LF
character: we would keep buffering data until we encountered such an LF
char, which might be many bytes of data later.
Fix
---
Fix this issue by setting an upper limit of 4KB on the buffering of
lines. If the limit is reached, we save the data into the current line
and process it as if it were a regular request/response, up to 4KB
only. Any data beyond 4KB is discarded until a new LF char is found in
the received input.
Cases
-----
1. Fragmentation
The limit is enforced for cases where a line of >= 4KB arrives split
across multiple fragments, each (or some) of which is < 4KB.
2. Single too-long line
The limit is also enforced for cases where a single line exceeds the
buffer limit.
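A minimal, self-contained sketch of the bounded line buffering described in
the Fix section above; the names, structure and return convention are
illustrative, not the actual SMTP parser code:

    #include <stdint.h>
    #include <string.h>

    #define LINE_BUFFER_LIMIT 4096   /* the 4KB cap described above */

    typedef struct LineState_ {
        uint8_t  buf[LINE_BUFFER_LIMIT];
        uint32_t len;        /* bytes currently buffered for the line */
        int      discarding; /* cap reached: skip input until the next LF */
    } LineState;

    /* Feed one (possibly fragmented) chunk of input. Returns 1 when a
     * complete line, truncated to at most LINE_BUFFER_LIMIT bytes, is ready
     * in s->buf, 0 when more input is needed. *consumed reports how many
     * input bytes were used. */
    static int LineFeed(LineState *s, const uint8_t *input, uint32_t input_len,
                        uint32_t *consumed)
    {
        const uint8_t *lf = memchr(input, '\n', input_len);
        uint32_t take = (lf != NULL) ? (uint32_t)(lf - input) : input_len;

        if (!s->discarding) {
            uint32_t space = LINE_BUFFER_LIMIT - s->len;
            uint32_t copy = (take < space) ? take : space;
            memcpy(s->buf + s->len, input, copy);
            s->len += copy;
            if (copy < take)
                s->discarding = 1;   /* anything past the cap is dropped */
        }

        if (lf != NULL) {
            *consumed = take + 1;    /* include the LF; caller processes
                                      * buf/len, then resets len and
                                      * discarding for the next line */
            return 1;
        }
        *consumed = take;
        return 0;
    }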
Reported by Victor Julien.
Ticket 5023
FTP control commands will be buffered forever until a new line is seen;
this can lead to memory exhaustion in Suricata.
To fix this, set an upper bound of 4096 bytes on the size of the command
that is saved in the transaction. The input continues to be parsed to find
the end of the command, so the parser can move on to the next command.
The result is that the command data in the transaction is truncated,
which also shows up in the ftp transaction logs.
This value is configurable with the max-line-length field in the ftp
app-layer.protocols section.
As FTP doesn't have events at this time, add new fields to eve-log
that specify whether the request or the response has been truncated.
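A rough sketch of the truncation approach; the struct, function name and
variable standing in for the configured max-line-length are illustrative,
not the actual FTP parser code:

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    /* stand-in for the configured app-layer.protocols.ftp.max-line-length */
    static uint32_t ftp_max_line_len = 4096;

    typedef struct FTPTransaction_ {
        uint8_t *command;
        uint32_t command_len;
        int      truncated;  /* surfaced in the eve ftp record */
    } FTPTransaction;

    /* Store one command line in the transaction. The parser has already
     * walked the input to the end of the line, so it moves on to the next
     * command no matter how much of this one we keep. */
    static int FTPTxStoreCommand(FTPTransaction *tx, const uint8_t *line,
                                 uint32_t line_len)
    {
        uint32_t keep = line_len;
        if (keep > ftp_max_line_len) {
            keep = ftp_max_line_len;
            tx->truncated = 1;
        }
        tx->command = malloc(keep);
        if (tx->command == NULL)
            return -1;
        memcpy(tx->command, line, keep);
        tx->command_len = keep;
        return 0;
    }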
Ticket #5024
Ticket: 5243
When switching from SMTP to TLS and getting HTTP1 instead of the
expected TLS, with HTTP1 then requesting an upgrade to HTTP2, we do not
overwrite the alproto_orig value, so as not to have type confusion
in AppLayerParserStateProtoCleanup.
Move raw detection logic out of the main StreamReassembleRawDo() so that
it can be reused by other parts of the engine.
The caller now has to specify a right edge of the data.
Ticket: 4972
As is done in detect-lua-extensions: we can have a flow with an unknown
alproto and no state, and therefore cannot run AppLayerParserGetTx, which
could end up calling a NULL function.
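A sketch of the guard involved; the Flow type is reduced to the fields that
matter here and AppLayerParserGetTx is only declared to show the call being
avoided:

    #include <stdint.h>
    #include <stddef.h>

    #define ALPROTO_UNKNOWN 0          /* stand-in for Suricata's AppProto value */

    typedef struct Flow_ {             /* reduced to the relevant fields */
        uint8_t  proto;                /* IP protocol */
        uint16_t alproto;              /* detected app-layer protocol */
        void    *alstate;              /* app-layer state, may be NULL */
    } Flow;

    void *AppLayerParserGetTx(uint8_t ipproto, uint16_t alproto, void *alstate,
                              uint64_t tx_id);

    /* With an unknown alproto there is no parser registered for the flow, so
     * dispatching the lookup could call a NULL function pointer. Guard the
     * call the same way detect-lua-extensions does. */
    static void *GetTxSafe(const Flow *f, uint64_t tx_id)
    {
        if (f->alproto == ALPROTO_UNKNOWN || f->alstate == NULL)
            return NULL;
        return AppLayerParserGetTx(f->proto, f->alproto, f->alstate, tx_id);
    }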
Since 9551cd0535 ("threading: don't pass locked flow between threads"),
`MoveToWorkQueue()` unconditionally unlocks the flow. This allows simpler
locking handling, including for TCP reuse flows.
The simpler logic also fixes a scenario where TCP reuse flows got "unlocked"
twice, once in `FlowGetFlowFromHash()` and once in `MoveToWorkQueue()`.
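A sketch of the lock-ownership convention this establishes; pthread mutexes
stand in for Suricata's flow locks and the function bodies are illustrative:

    #include <pthread.h>

    typedef struct Flow_ {
        pthread_mutex_t m;   /* stand-in for the flow lock */
    } Flow;

    /* The helper always releases the flow lock before returning: once the
     * flow is handed to the work queue, the caller no longer owns it. */
    static void MoveToWorkQueue(Flow *f)
    {
        /* ... enqueue f for the worker thread ... */
        pthread_mutex_unlock(&f->m);
    }

    static void HandleFlow(Flow *f)
    {
        pthread_mutex_lock(&f->m);
        /* ... timeout / tcp reuse handling ... */
        MoveToWorkQueue(f);
        /* no unlock here: a second unlock would be the double "unlock" on
         * TCP reuse flows that this change removes */
    }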
Bug: #5248.
Coverity: 1494354.
Some unittests used SCMalloc for allocating new Packets.
While this is valid, it leads to segmentation faults when we move to
dynamic allocation of the maximum alerts allowed to be triggered by a
single packet.
This massive patch uses PacketGetFromAlloc, which initializes a Packet
in such a way that any dynamically allocated structures within will also be
initialized.
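To illustrate the failure mode, with made-up types rather than Suricata's:
once a struct gains a member that must itself be allocated, a bare
malloc/SCMalloc leaves that member as garbage, while a constructor-style
helper (the role PacketGetFromAlloc plays here) sets everything up:

    #include <stdlib.h>

    typedef struct Alert_ {
        unsigned sid;
    } Alert;

    typedef struct Packet_ {
        unsigned alert_cnt;
        Alert   *alert_queue;   /* now dynamically allocated, sized at runtime */
    } Packet;

    /* Constructor-style allocation: every embedded dynamic structure is set
     * up, so later code can use p->alert_queue without crashing. */
    static Packet *PacketNew(unsigned max_alerts)
    {
        Packet *p = calloc(1, sizeof(*p));
        if (p == NULL)
            return NULL;
        p->alert_queue = calloc(max_alerts, sizeof(*p->alert_queue));
        if (p->alert_queue == NULL) {
            free(p);
            return NULL;
        }
        return p;
    }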
Related to
Task #4207
If there is a space following a keyword that does not expect a value,
the rule fails to load due to improper value evaluation.
e.g. a space after the "set" command:
alert http any any -> any any (http.user_agent; dataset:set ,ua-seen,type string,save datasets.csv; sid:1;)
gives the error:
[ERRCODE: SC_ERR_UNKNOWN_VALUE(129)] - dataset action "" is not supported.
Fix this by handling values correctly for such cases.
A lot of time was spent in `SigMatchListSMBelongsTo` for the `mpm_sm`.
Optimize this by keeping the value at hand during Signature parsing and
detection engine setup.
The current code doesn't cover all rows when more than one flow manager is
used. It leaves a single row between ftd->max and ftd->min of the next
manager orphaned. As an example:
hash_size=1000
flowmgr_number=3
range=333
instance  ftd->min  ftd->max
0         0         333
1         334       666
2         667       1000
Rows not covered: 333, 666
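One way to carve up the rows so that every row is covered exactly once is to
treat the upper bound as exclusive and give the last instance the remainder;
this is a self-contained sketch of the idea, not the exact code:

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        const uint32_t hash_size = 1000;
        const uint32_t instances = 3;
        const uint32_t range = hash_size / instances;

        for (uint32_t i = 0; i < instances; i++) {
            uint32_t min = i * range;
            /* exclusive upper bound; the last instance absorbs the remainder,
             * so no row between two instances is left orphaned */
            uint32_t max = (i == instances - 1) ? hash_size : (i + 1) * range;
            printf("instance %u: rows [%u, %u)\n", i, min, max);
        }
        return 0;
    }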
With this check, on the first packet of a certificate presenting
a length of 16MB, we only allocate up to 65KB.
When we get to the point where we need more than 65KB, we realloc
to the true size.
With this check, it becomes more expensive for an attacker to use
this allocation as a way to trigger resource exhaustion.
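A self-contained sketch of the capped first allocation; the field names and
structure are illustrative of the approach rather than the TLS parser code,
with the cap mirroring the 65KB figure above:

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    #define CERT_FIRST_ALLOC_LIMIT 65536  /* cap on what we allocate up front */

    typedef struct CertBuffer_ {
        uint8_t *data;
        uint32_t size;       /* bytes allocated */
        uint32_t len;        /* bytes actually received so far */
        uint32_t announced;  /* length claimed on the wire, possibly bogus */
    } CertBuffer;

    /* Append one chunk of certificate data. The announced length alone never
     * drives the allocation: we grow toward it only once the received data
     * truly needs more than the initial cap. */
    static int CertAppend(CertBuffer *c, const uint8_t *chunk, uint32_t chunk_len)
    {
        uint32_t needed = c->len + chunk_len;
        if (needed > c->size) {
            uint32_t newsize;
            if (needed <= CERT_FIRST_ALLOC_LIMIT) {
                newsize = CERT_FIRST_ALLOC_LIMIT;
            } else {
                newsize = c->announced;      /* realloc to the true size */
                if (newsize < needed)
                    newsize = needed;
            }
            uint8_t *tmp = realloc(c->data, newsize);
            if (tmp == NULL)
                return -1;
            c->data = tmp;
            c->size = newsize;
        }
        memcpy(c->data + c->len, chunk, chunk_len);
        c->len += chunk_len;
        return 0;
    }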