A crash can occurs in the following conditions:
* Suricata running in other mode than "workers"
* Kernel fill in the ring buffer
Under this conditions, it is possible that the capture thread reads
a packet that has not yet released by one of the treatment threads
because there is no modification done on the ring buffer entry when
a packet is read. Doing, this it access to memory which can be
released to the kernel and modified. This results in a kind of memory
corruption.
This bug has only been seen recently and this has to be linked with the
read speed improvement recently made in AF_PACKET support.
The patch fixes the issue by modifying the tp_status bitmask in the
ring buffer. It sets the TP_STATUS_USER_BUSY flag when it is confirmed
that the packet will be treated. And at the start of the read, it exits
from the reading loop (returning to poll) when it reaches a packet with
the flag set. As tp_status is set to 0 during packet release the flag
is destroyed when releasing the packet.
Regarding concurrency, we've got a sequence of modification. The
capture thread read the packet and set the flag, then it passes the
queue and the packet get processed by other threads. The change on
tp_status are thus made at different time.
Regarding the value of the flag, the patch uses the last bit of
tp_status to avoid be impacting by a change in kernel. I will
propose a patch to have TP_STATUS_USER_BUSY included in kernel
as this is a generic issue for multithreading application using
AF_PACKET mechanism.
Coverity pointed out that PoolReturn is almost like free and detected
a use after free when accessing to tracker->af (issue 720339).
This patch fixes this by storing the value in a local variable.
If a capture loop does exit, the thread needs to start without
synchronization with the other threads. This patch fixes this
by resetting the turn count on the peerslist structure and
adding a test on this condition in the wait function.
It seems that, in some case, there is a read waiting but the
offset in the ring buffer is not correct and Suricata need to
walk the ring to find the correct place and make the read.
This patch fixes emergency mode by setting the variable even if we
have a non kernel checksum check. It also does a call to
AFPDUmpCounters() as it seems to improve thing to do it ASAP.
This patch implements "late open". On high performance system, it
is needed to create the AF_PACKET just before reading to avoid
overflow. Socket creation has to be done with respect to the order
of thread creation to respect affinity settings.
This patch adds a counter to AFPPeer to be ale to synchronize the
initial socket creation.
Suricata was not able to start cleanly in AF_PACKET with default
suricata.yaml file if there was no eth1 on the system. This patch
fixes this issue and rework the socket transition phase to fix
some serious issues (file descriptor leak) found when fixing this
problem.
Every 20 seconds it displays a message to the user to warn him about
the interface not being accessible:
[ERRCODE: SC_ERR_AFP_CREATE(196)] - Can not open iface 'eth1'
DecodeTeredo, DecodeIPv6InIPv6 and DecodeIPv4inIPv6 were calling
DecodeTunnel with packet being a pseudo packet and data being
data from initial packet:
DecodeTunnel(tv, dtv, tp, start, blen,
pq, IPPROTO_IPV6);
In decoding functions, arithmetic was done on pkt to set some values?
It was resulting in field of packet pointing outside of the scope of
packet data.
This patch switch to what has been done in DecodeGre(), I mean:
DecodeTunnel(tv, dtv, tp, GET_PKT_DATA(tp),
GET_PKT_LEN(tp), pq, IPPROTO_IP);
Data buffer is then relative to the packet and the arithmetic is
correct.
Some detection keywords need thread local ctx storage. Example is the
filemagic keyword that has a ctx that is modified with each call. That
is not thread safe. This functionality allows registration of thread
local ctxs so that each detect thread works on it's own copy.
This patch sets some define to generate doc for the acquisition
modules. It also suppress the doc generation for unittests which
was polluting the output.
This patch required a evolution of Pool API as it is needed to
proceed to alloc or init separetely. The PoolInit has been changed
with a new Init function parameter.
This patch adds a series of unit tests. First two check test the keyword
by checking packet on signatures using it. Last one adds is here to check
that there is no interaction of l3_proto and ip_proto.
This patch adds a l3_proto keyword to the signature language. It
can be used to specify if the signature has to match on IPv4, IPv6
or both. For example, one can write:
alert http any any -> any 22 (msg: "HTTP v6"; l3_proto:ip6; sid:14;)
This should close#494.
If the MTU on the reception interface and the one on the transmission
interface are different, this will result in an error at transmission
when sending packet to the wire.
Flush all waiting packets to be in sync with kernel when drop
occurs. This mode can be activated by setting use-emergency-flush
to yes in the interface configuration.
This patch moves raw socket binding at the end of init code to
avoid to have a flow of packets reaching the socket before we
start to read them.
The socket creation is now made in the loop function to avoid
any timing issue between init function and the call of the loop.
This patch adds a new feature to AF_PACKET capture mode. It is now
possible to use AF_PACKET in IPS and TAP mode: all traffic received
on a interface will be forwarded (at the Ethernet level) to an other
interface. To do so, Suricata create a raw socket and sends the receive
packets to a interface designed in the configuration file.
This patch adds two variables to the configuration of af-packet
interface:
copy-mode: ips or tap
copy-iface: eth1 #the interface where packet are copied
If copy-mode is set to ips then the packet wth action DROP are not
copied to the destination interface. If copy-mode is set to tap,
all packets are copied to the destination interface.
Any other value of copy-mode results in the feature to be unused.
There is no default interface for copy-iface and the variable has
to be set for the ids or tap mode to work.
For now, this feature depends of the release data system. This
implies you need to activate the ring mode and zero copy. Basically
use-mmap has to be set to yes.
This patch adds a peering of AF_PACKET sockets from the thread on
one interface to the threads on another interface. Peering is
necessary as if we use an other socket the capture socket receives
all emitted packets. This is made using a new AFPPeer structure to
avoid direct interaction between AFPTreadVars.
There is currently a bug in Linux kernel (prior to 3.6) and it is
not possible to use multiple threads.
You need to setup two interfaces with equality on the threads
variable. copy-mode variable must be set on the two interfaces
and use-mmap must be set to activated.
A valid configuration for an IPS using eth0 and vboxnet1 interfaces
will look like:
af-packet:
- interface: eth0
threads: 1
defrag: yes
cluster-type: cluster_flow
cluster-id: 98
copy-mode: ips
copy-iface: vboxnet1
buffer-size: 64535
use-mmap: yes
- interface: vboxnet1
threads: 1
cluster-id: 97
defrag: yes
cluster-type: cluster_flow
copy-mode: ips
copy-iface: eth0
buffer-size: 64535
use-mmap: yes
This patch adds a data release mechanism. If the capture module
has a call to indicate that userland has finished with the data,
it is possible to use this system. The data will then be released
when the treatment of the packet is finished.
To do so the Packet structure has been modified:
+ TmEcode (*ReleaseData)(ThreadVars *, struct Packet_ *);
If ReleaseData is null, the function is called when the treatment
of the Packet is finished.
Thus it is sufficient for the capture module to code a function
wrapping the data release mechanism and to assign it to ReleaseData
field.
This patch also includes an implementation of this mechanism for
AF_PACKET.