Due to the use of AFL_LOOP and initialization/deinit outside of it,
part of the fuzzing relied on the global 'state' in flow and defrag.
Because of this, crashes that were found could not be reproduced: the
saved crash input was only the last one in the series.
This patch addresses that. It requires a new output directory 'dump'
where the packet fuzzers will store all their input. If the AFL_LOOP
fails, the files will not be removed, and this series can be read
back to reproduce the issue.
e.g.: AFL would work with:
--afl-decoder-ppp=@@
and after a crash is found the produced serie can be read with:
--afl-decoder-ppp-serie=1486656919-514163
Each series is named after a timestamp, with a suffix that controls the
order in which the files will be 'replayed' in Suricata.
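Roughly, the dump side could look like this (a sketch; names and
layout are illustrative, not the actual harness code):

    #include <stdint.h>
    #include <stdio.h>
    #include <time.h>

    /* Sketch: write every input fed through the AFL loop into 'dump'
     * as <timestamp>-<sequence>; after a crash the remaining files
     * form a replayable series. */
    static void DumpFuzzInput(const uint8_t *data, size_t len,
                              time_t series_start, int *seq)
    {
        char path[64];
        snprintf(path, sizeof(path), "dump/%ld-%d",
                 (long)series_start, (*seq)++);
        FILE *fp = fopen(path, "wb");
        if (fp != NULL) {
            fwrite(data, 1, len, fp);
            fclose(fp);
        }
    }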
If the segments section in the yaml is omitted (the default), or when the
pool size is set to 'from_mtu', the segment size of the pool will be the
MTU minus 40. If the MTU couldn't be determined, it is assumed to be
1500, so the segment size for the pool will be 1460.
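A minimal sketch of that calculation (helper name hypothetical):

    #include <stdint.h>

    #define DEFAULT_MTU 1500

    /* Segment pool size = MTU minus 40 bytes of IPv4 + TCP headers;
     * fall back to the default when the MTU can't be determined. */
    static uint16_t SegmentPoolSizeFromMtu(int mtu)
    {
        if (mtu <= 0)
            mtu = DEFAULT_MTU;       /* MTU unknown: assume 1500 */
        return (uint16_t)(mtu - 40); /* 1500 -> 1460 */
    }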
Fix rate_filter issues: if the action was modified, it wouldn't be logged
in EVE. To address this, pass the PacketAlert structure to the threshold
code so it can flag the PacketAlert as modified, and use this flag in
logging. Update the API to use const where possible. Fix a timeout issue
that this uncovered.
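A sketch of the flagging idea (struct fields and flag value assumed):

    #include <stdint.h>

    #define PACKET_ALERT_RATE_FILTER_MODIFIED 0x10 /* assumed value */

    typedef struct PacketAlert_ { /* minimal stand-in for the real one */
        uint8_t action;
        uint8_t flags;
    } PacketAlert;

    /* Sketch: the threshold code flags the alert when it changes the
     * action, so EVE logging can report the modified action. */
    static void ThresholdApplyNewAction(PacketAlert *pa, uint8_t action)
    {
        if (pa->action != action) {
            pa->action = action;
            pa->flags |= PACKET_ALERT_RATE_FILTER_MODIFIED;
        }
    }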
Add negated matches to match list instead of amatch.
Allow matching on 'failed'.
Introduce per-packet flags for protocol detection. The flags are used to
inspect each direction only once. Flag the packet on PD failure too.
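Illustrated with assumed flag names:

    /* Sketch: one 'inspected' flag per direction plus a failure flag,
     * so proto detection runs at most once per direction. */
    #define PKT_PROTO_DETECT_TS_DONE (1 << 0) /* to-server inspected */
    #define PKT_PROTO_DETECT_TC_DONE (1 << 1) /* to-client inspected */
    #define PKT_PROTO_DETECT_FAILED  (1 << 2) /* detection gave up */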
Previously the detect and stream code lived in their own thread
modules, so profiling reported their cost as part of the thread
module profiling logic. Now that only the flow worker is a thread
module, this no longer works.
This patch introduces profiling for the 3 current flow worker
steps: flow, stream, detect.
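For illustration, the step ids could be an enum along these lines
(names assumed):

    /* Sketch: one profiling id per flow worker step, used to account
     * per-step cost now that the steps are no longer thread modules. */
    enum ProfileFlowWorkerId {
        PROFILE_FLOWWORKER_FLOW,
        PROFILE_FLOWWORKER_STREAM,
        PROFILE_FLOWWORKER_DETECT,
        PROFILE_FLOWWORKER_SIZE,
    };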
Instead of handling the packet update during flow lookup, handle
it in the stream/detect threads. This lowers the load of the
capture thread(s) in autofp mode.
The decoders now set a flag in the packet if the packet needs a
flow lookup; the workers then take care of this. The decoders also
already calculate the raw flow hash value, so that it can be used
for flow balancing in autofp.
Because the flow lookup/creation is now done in the worker threads,
the flow balancing can no longer use the flow, as it is not yet
available; autofp load balancing uses the raw hash values instead.
Along the same lines, move the UDP AppLayer handling out of the
DecodeUDP module and into the stream/detect threads as well.
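A sketch of the decoder-side change (flag/field names assumed from the
description):

    /* Sketch: the decoder marks the packet and precomputes the raw
     * hash; autofp balances on the raw hash because the Flow itself
     * doesn't exist yet at balancing time. */
    static void DecodeMarkPacketForFlow(Packet *p)
    {
        p->flags |= PKT_WANTS_FLOW;    /* worker will do the lookup */
        p->flow_hash = FlowGetHash(p); /* raw hash for autofp */
    }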
Handle TCP session reuse inside the flow engine itself. If a looked-up
flow matches the packet, but the packet is a TCP stream starter, check
whether the ssn needs to be reused. If that is the case, handle it within
the lookup function. This simplifies the locking and removes potential
race conditions.
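The control flow inside the lookup, sketched with assumed function
names:

    /* Sketch: all of this happens while the hash row lock is held. */
    if (FlowCompare(f, p)) {
        if (PKT_IS_TCP(p) &&
            TcpSessionPacketSsnReuse(p, f, f->protoctx)) {
            /* old ssn is done: time the old flow out and put a fresh
             * flow in its place in the same hash row */
            f = FlowGetNewFromReuse(f, p);
        }
        return f;
    }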
We want to add counters in order to track the number of times we hit a
decode event. A decode event is related to an error in the protocol
decoding of a certain packet.
This patch first modifies the decode-event list, reordering it in order
to separate single-packet events from stream-related events and adding
the prefix "decoder" to decode events.
The counters are created during the decode setup, and the corresponding
event counter is increased every time a packet with the flag
PKT_IS_INVALID is finalized in the decode phase.
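Sketched with the stats API names used for illustration (array and
field names assumed):

    /* Sketch: register one counter per decoder event at setup... */
    void DecodeRegisterEventCounters(ThreadVars *tv, DecodeThreadVars *dtv)
    {
        for (int i = 0; i < DECODE_EVENT_MAX; i++) {
            dtv->counter_engine_events[i] =
                StatsRegisterCounter(DEvents[i].event_name, tv);
        }
    }

    /* ... and bump the matching counters when an invalid packet is
     * finalized in the decode phase. */
    void DecodeUpdateEventCounters(ThreadVars *tv, DecodeThreadVars *dtv,
                                   const Packet *p)
    {
        if (!(p->flags & PKT_IS_INVALID))
            return;
        for (int i = 0; i < p->events.cnt; i++)
            StatsIncr(tv, dtv->counter_engine_events[p->events.events[i]]);
    }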
This adds a counter indicating how many times
the flow max memcap has been reached.
Since there is not always a reference to FlowManagerThreadData,
the counter is put in DecodeThreadVars.
Currently there is no counter increase in one call of FlowGetNew,
because we don't have tv or dtv at the time of the call.
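A sketch of the increment on the common path (counter field name
assumed):

    /* Sketch: in FlowGetNew, bump the memcap counter when the memcap
     * check fails; only possible when tv/dtv were passed in. */
    if (!(FLOW_CHECK_MEMCAP(sizeof(Flow)))) {
        if (tv != NULL && dtv != NULL)
            StatsIncr(tv, dtv->counter_flow_memcap);
        return NULL;
    }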
The following is a snippet of the generated EVE entry:
"flow":{"memcap":0,"spare":10000,"emerg_mode_entered":0,"emerg_mode_over":0,"tcp_reuse":0,"memuse":7085248}
Put the common counters on the first cache line. Place the flow output
pointer last, as its use depends on flow logging being enabled, and
even then it is only used very rarely.
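Schematically (fields abbreviated):

    /* Sketch: hot counters share the first cache line; the rarely used
     * flow logger pointer goes last. */
    typedef struct DecodeThreadVars_ {
        uint16_t counter_pkts;    /* hot: touched for every packet */
        uint16_t counter_bytes;
        uint16_t counter_invalid;
        /* ... remaining counters ... */
        void *output_flow_thread_data; /* cold: only with flow logging */
    } DecodeThreadVars;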
Implement LINKTYPE_NULL for pcap live and pcap file.
From: http://www.tcpdump.org/linktypes.html
"BSD loopback encapsulation; the link layer header is a 4-byte field,
in host byte order, containing a PF_ value from socket.h for the
network-layer protocol of the packet.
Note that ``host byte order'' is the byte order of the machine on
which the packets are captured, and the PF_ values are for the OS
of the machine on which the packets are captured; if a live capture
is being done, ``host byte order'' is the byte order of the machine
capturing the packets, and the PF_ values are those of the OS of
the machine capturing the packets, but if a ``savefile'' is being
read, the byte order and PF_ values are not necessarily those of
the machine reading the capture file."
Feature ticket #1445
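A minimal decoding sketch (function name hypothetical; the PF_ values
are those of the capturing machine's OS):

    #include <stdint.h>
    #include <string.h>
    #include <sys/socket.h>

    /* Sketch: read the 4-byte, host-byte-order PF_ value that prefixes
     * each packet on the NULL/loopback link type and dispatch on it. */
    static int DecodeNullSketch(const uint8_t *pkt, uint32_t len)
    {
        if (len < 4)
            return -1;                /* truncated loopback header */
        uint32_t pf;
        memcpy(&pf, pkt, sizeof(pf)); /* capturing host's byte order */
        switch (pf) {
            case PF_INET:             /* IPv4 */
                /* hand the rest off to the IPv4 decoder */
                break;
            default:                  /* e.g. PF_INET6, OS dependent */
                break;
        }
        return 0;
    }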
Set actions that are set directly from Signatures using the new
utility function DetectSignatureApplyActions. This will apply
the actions and also store info about the 'drop' rule that first
made the packet drop.
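Roughly, a sketch of the described semantics (field names assumed):

    /* Sketch: apply the signature's action to the packet; on the first
     * 'drop', record which signature caused it. */
    void DetectSignatureApplyActions(Packet *p, const Signature *s)
    {
        PACKET_UPDATE_ACTION(p, s->action);

        if ((s->action & ACTION_DROP) && p->alerts.drop.action == 0) {
            p->alerts.drop.num = s->num;
            p->alerts.drop.action = s->action;
            p->alerts.drop.s = s;
        }
    }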
In flow timeout handling we need a function that allocates and blanks
a buffer that will be used to hold constructed packet data. This new
function has no other purpose.
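A sketch of such an allocator (simplified):

    /* Sketch: allocate and zero a Packet that will carry constructed
     * pseudo packet data during flow timeout handling. */
    Packet *PacketGetFromAlloc(void)
    {
        Packet *p = SCMalloc(SIZE_OF_PACKET);
        if (unlikely(p == NULL))
            return NULL;

        memset(p, 0, SIZE_OF_PACKET); /* 'blank' the storage */
        PACKET_INITIALIZE(p);
        p->flags |= PKT_ALLOC;        /* freed directly, not recycled */
        return p;
    }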
The field ext_pkt was cleaned before calling the release function.
The result was that IPS modes such as AF_PACKET's were no longer
working, because they were not able to send the data initially
pointed to by ext_pkt.
This patch moves the ext_pkt cleaning to the cleaning macro. This
ensures that the cleaning is done for both allocated and pool packets.
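Sketched (macro name and contents simplified):

    /* Sketch: clearing ext_pkt happens in the shared cleanup macro,
     * after the capture method's release function had its chance to
     * send the data, covering allocated and pool packets alike. */
    #define PACKET_RELEASE_REFS(p) do {  \
            FlowDeReference(&(p)->flow); \
            (p)->ext_pkt = NULL;         \
        } while (0)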
Most flows are marked for clean up by the flow manager, which then
passes them to the recycler. The recycler logs and cleans up. However,
under resource stress conditions, the packet threads can recycle an
existing flow directly. In that case the recycler has no role to play,
as the flow is immediately reused.
For this reason, the packet threads need to be able to invoke the
flow logger directly.
The flow logging thread ctx will be stored in the DecodeThreadVars
structure. Therefore, this patch makes the DecodeThreadVars an argument
to FlowHandlePacket.
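The resulting signature (sketch):

    /* Sketch: dtv travels into the flow engine so the direct-recycle
     * path can reach the flow logging thread ctx stored in it. */
    void FlowHandlePacket(ThreadVars *tv, DecodeThreadVars *dtv, Packet *p);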
Using a stack for free Packet storage causes recently freed Packets to be
reused quickly, while their data is more likely to still be in cache.
The new structure has a per-thread private stack for allocating Packets
which does not need any locking. Since Packets can be freed by any thread,
there is a second stack (return stack) for freeing packets by other threads.
The return stack is protected by a mutex. Packets are moved from the return
stack to the private stack when the private stack is empty.
Returning packets back to their "home" stack keeps the stacks from getting out
of balance.
The PacketPoolInit() function is now called by each thread that will be
allocating packets. Each thread allocates max_pending_packets, which is a
change from before, where that was the total number of packets across all
threads.
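A condensed sketch of the two-stack scheme (simplified, not the full
implementation):

    /* Sketch: a per-thread private stack needs no locking; frees from
     * other threads land on a mutex-guarded return stack, which is
     * drained only when the private stack runs empty. */
    typedef struct PktPool_ {
        Packet *head;          /* private stack: owner thread only */
        SCMutex return_mutex;
        Packet *return_head;   /* packets freed by other threads */
    } PktPool;

    static Packet *PacketPoolGetPacket(PktPool *pool)
    {
        if (pool->head == NULL) {
            /* private stack empty: take over the whole return stack */
            SCMutexLock(&pool->return_mutex);
            pool->head = pool->return_head;
            pool->return_head = NULL;
            SCMutexUnlock(&pool->return_mutex);
        }
        Packet *p = pool->head;
        if (p != NULL)
            pool->head = p->next;
        return p; /* NULL: caller falls back to alloc or waits */
    }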
For packets that were freed, not recycled, profiling memory wasn't
freed:
==15745== 13,312 bytes in 8 blocks are definitely lost in loss record 611 of 615
==15745== at 0x4C2C494: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==15745== by 0xA190D5: SCProfilePacketStart (util-profiling.c:963)
==15745== by 0x4E4345: PacketGetFromAlloc (decode.c:134)
==15745== by 0x83FE75: FlowForceReassemblyPseudoPacketGet (flow-timeout.c:276)
==15745== by 0x8413BF: FlowForceReassemblyForHash (flow-timeout.c:588)
==15745== by 0x841897: FlowForceReassembly (flow-timeout.c:716)
==15745== by 0x9540F6: main (suricata.c:2296)
==15745==
==15745== 14,976 bytes in 9 blocks are definitely lost in loss record 612 of 615
==15745== at 0x4C2C494: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==15745== by 0xA190D5: SCProfilePacketStart (util-profiling.c:963)
==15745== by 0x4E4345: PacketGetFromAlloc (decode.c:134)
==15745== by 0x83FE75: FlowForceReassemblyPseudoPacketGet (flow-timeout.c:276)
==15745== by 0x841508: FlowForceReassemblyForHash (flow-timeout.c:620)
==15745== by 0x841897: FlowForceReassembly (flow-timeout.c:716)
==15745== by 0x9540F6: main (suricata.c:2296)
This patch addresses that.
This patch introduces a new counter "decoder.vlan_qinq". It counts
packets that have two or more stacked vlan layers.
Packets with 2 vlan layers will increment both "decoder.vlan" and
"decoder.vlan_qinq".
Instead of a large (6k+) structure in the Packet, make the profiling
storage dynamic. To do this, Packet->profile is now a pointer.
Also add initial support for selective sampling, e.g. only profiling
every 1000th packet.
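A sketch of the sampling path (sampling variable hypothetical):

    /* Sketch: profiling storage is only allocated for sampled packets;
     * later profiling calls check the pointer and stay cheap. */
    void SCProfilePacketStart(Packet *p)
    {
        if (profiling_sample_rate > 1 &&
            (p->pcap_cnt % profiling_sample_rate) != 0)
            return;            /* not sampled: p->profile stays NULL */
        p->profile = SCCalloc(1, sizeof(PktProfiling));
    }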
Move app layer event handling into app-layer-event.[ch].
Convert 'Set' macros to functions.
Get rid of the duplication in Set and SetRaw; Set now calls SetRaw.
Fix a potential int overflow condition in the event storage.
Update callers.
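A sketch of the SetRaw side with the overflow guard (struct fields
assumed):

    #include <limits.h>

    /* Sketch: SetRaw appends a raw event id and refuses to wrap the
     * uint8_t counter; Set resolves the event id and calls SetRaw. */
    static void AppLayerDecoderEventsSetEventRaw(AppLayerDecoderEvents *e,
                                                 uint8_t event)
    {
        if (e->cnt == UCHAR_MAX)
            return;          /* storage full: avoid the overflow */
        e->events[e->cnt++] = event;
    }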
To be able to register counters from AppLayerGetCtxThread, the
ThreadVars pointer needs to be available in it, and thus in its
callers (see the signature sketch after this list):
- AppLayerGetCtxThread
- DecodeThreadVarsAlloc
- StreamTcpReassembleInitThreadCtx
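The signatures then become (sketch):

    /* Sketch: tv is threaded through so per-thread counters can be
     * registered from AppLayerGetCtxThread. */
    AppLayerThreadCtx *AppLayerGetCtxThread(ThreadVars *tv);
    DecodeThreadVars *DecodeThreadVarsAlloc(ThreadVars *tv);
    TcpReassemblyThreadCtx *StreamTcpReassembleInitThreadCtx(ThreadVars *tv);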