AF_XDP
======

AF_XDP (eXpress Data Path) is a high speed capture framework for Linux that was
introduced in Linux v4.18. AF_XDP aims at improving capture performance by
redirecting ingress frames to user-space memory rings, thus bypassing the network
stack.

Note that during ``af_xdp`` operation the selected interface cannot be used for
regular network usage.

Further reading:

- https://www.kernel.org/doc/html/latest/networking/af_xdp.html

Compiling Suricata
------------------

Linux
~~~~~

libxdp and libbpf are required for this feature. When building from source the
development files will also be required.

Example::

  dnf -y install libxdp-devel libbpf-devel

This feature is enabled automatically when the libraries above are installed;
no additional ``configure`` options are needed.

The ``configure`` option ``--disable-af-xdp`` can be used to disable this
feature.

Example::

  ./configure --disable-af-xdp

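
Whether AF_XDP support was compiled in can be checked from the build summary.
The following is a minimal sketch. It assumes that ``suricata --build-info``
prints a line mentioning ``AF_XDP`` together with ``yes`` or ``no``; the helper
name ``build_has_af_xdp`` is hypothetical and not part of Suricata.

.. code-block:: shell

  # Hypothetical helper: reads a build summary on stdin and succeeds only if
  # a line mentioning AF_XDP also reports "yes" (assumed output format).
  build_has_af_xdp() {
    grep -i 'af_xdp' | grep -qi 'yes'
  }

  # Usage (requires an installed suricata binary):
  #   suricata --build-info | build_has_af_xdp && echo "AF_XDP enabled"
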
Starting Suricata
-----------------

IDS
~~~

Suricata can be started as follows to use af-xdp:

::

  af-xdp:
    suricata --af-xdp=<interface>
    suricata --af-xdp=igb0

In the above example Suricata will start reading from the ``igb0`` network interface.

AF_XDP Configuration
--------------------

Each of these settings can be configured under ``af-xdp`` within the "Configure
common capture settings" section of the suricata.yaml configuration file.

The number of threads created can be configured in the suricata.yaml configuration
file. It is recommended to match the number of threads to the number of NIC
queues or CPU cores.

Another option is to select ``auto``, in which case Suricata spawns one receive
thread per RSS queue configured on the interface.

::

  af-xdp:
    threads: <number>
    threads: auto
    threads: 8

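
When choosing a thread count by hand, it helps to know how many receive queues
the interface currently exposes. The following is a minimal sketch, assuming
the kernel lists one ``rx-<n>`` directory per receive queue under
``/sys/class/net/<interface>/queues``. The function name ``count_rx_queues``
and the ``SYSFS_NET`` override are hypothetical and exist only for
illustration.

.. code-block:: shell

  # Hypothetical helper: print the number of RX queues of an interface.
  # SYSFS_NET may be overridden for testing; it defaults to /sys/class/net.
  count_rx_queues() {
    iface="$1"
    base="${SYSFS_NET:-/sys/class/net}/${iface}/queues"
    count=0
    for q in "${base}"/rx-*; do
      # Skip the unexpanded glob when no queue directory exists.
      if [ -d "$q" ]; then
        count=$((count + 1))
      fi
    done
    echo "$count"
  }

  # Usage: count_rx_queues eth3

``ethtool -l eth3`` reports the channel counts configured on the NIC and can be
used as a cross-check.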
Advanced setup
--------------

The af-xdp capture source operates with default settings unless told otherwise.
These defaults can be overridden in the suricata.yaml configuration file.

Available configuration options are:

force-xdp-mode
~~~~~~~~~~~~~~

There are two operating modes that can be used when loading the XDP program:

- XDP_DRV: Mode chosen when the driver supports AF_XDP
- XDP_SKB: Mode chosen when driver support for AF_XDP is unavailable

XDP_DRV mode is the preferred mode, used to ensure best performance.

::

  af-xdp:
    force-xdp-mode: <value> where: value = <skb|drv|none>
    force-xdp-mode: drv

force-bind-mode
~~~~~~~~~~~~~~~

During binding the kernel will first attempt to use zero-copy (preferred). If
zero-copy support is unavailable it will fall back to copy mode, copying all
packets out to user space.

::

  af-xdp:
    force-bind-mode: <value> where: value = <copy|zero|none>
    force-bind-mode: zero

For both options, the kernel will attempt the 'preferred' option first and
fall back upon failure. The default (none) therefore leaves the kernel in
control of which option to apply. Configuring either option forces the
selected value: the bind will attempt only that value and, if it fails,
the bind fails with no fallback.

mem-unaligned
~~~~~~~~~~~~~

AF_XDP can operate in two memory alignment modes:

- Aligned chunk mode
- Unaligned chunk mode

Aligned chunk mode is the default option which ensures alignment of the
data within the UMEM.

Unaligned chunk mode uses hugepages for the UMEM.
Hugepages start at the size of 2MB but they can be as large as 1GB.
A lower count of pages (memory chunks) allows faster lookup of page entries.
The hugepages need to be allocated on the NUMA node where the NIC and CPU reside.
Otherwise, if the hugepages are allocated only on NUMA node 0 and the NIC is
connected to NUMA node 1, then the application will fail to start.
Therefore, it is recommended to first find out which NUMA node the NIC is
connected to and only then allocate hugepages and set CPU core affinity
to the given NUMA node.

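
The NUMA lookup described above can be sketched as follows. This is a minimal
illustration, assuming the NIC is a PCI device whose node is exposed in
``/sys/class/net/<interface>/device/numa_node`` and that 2MB hugepages are
managed per node under ``/sys/devices/system/node``. The function names and the
``SYSFS_NET`` override are hypothetical.

.. code-block:: shell

  # Hypothetical helper: print the NUMA node a NIC is attached to.
  # A value of -1 means the kernel reports no NUMA affinity for the device.
  nic_numa_node() {
    cat "${SYSFS_NET:-/sys/class/net}/$1/device/numa_node"
  }

  # Hypothetical helper: print the sysfs file that controls the number of
  # 2MB hugepages reserved on a given NUMA node.
  hugepage_file_for_node() {
    echo "/sys/devices/system/node/node$1/hugepages/hugepages-2048kB/nr_hugepages"
  }

  # Usage (as root), reserving 64 pages on the node of eth3:
  #   node=$(nic_numa_node eth3)
  #   echo 64 > "$(hugepage_file_for_node "$node")"
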
Memory assigned per socket/thread is 16MB, so each worker thread requires at least
16MB of free space. As stated above hugepages can be of various sizes; check the
size configured on the system with ``cat /proc/meminfo``.

Example ::

  8 worker threads * 16MB = 128MB
  hugepages = 2048 kB
  so: pages required = 128MB / 2MB = 64 pages

See https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt for a detailed
description.

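
The page count can also be computed for other thread counts and page sizes.
The following is a minimal sketch that takes 1MB as 1024 kB and rounds up to
a whole page; for 8 threads, 16MB per thread and 2048 kB pages it yields 64.
The function name ``hugepages_needed`` is hypothetical.

.. code-block:: shell

  # Hypothetical helper: hugepages needed for a number of worker threads.
  # Arguments: threads, memory per thread in MB, hugepage size in kB.
  hugepages_needed() {
    threads="$1"
    mb_per_thread="$2"
    page_kb="$3"
    total_kb=$((threads * mb_per_thread * 1024))
    # Integer ceiling division: round up to a whole page.
    echo $(( (total_kb + page_kb - 1) / page_kb ))
  }

  hugepages_needed 8 16 2048    # prints 64

  # The page size in kB can be read from the Hugepagesize line:
  #   awk '/^Hugepagesize:/ {print $2}' /proc/meminfo
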
To enable unaligned chunk mode:

::

  af-xdp:
    mem-unaligned: <yes/no>
    mem-unaligned: yes

Linux v5.11 added the ``SO_PREFER_BUSY_POLL`` socket option, which allows true
polling of the AF_XDP socket queues. This feature was introduced to reduce
context switching and improve CPU reaction time during traffic reception.

This feature is enabled by default and can be disabled (see below). The
following options are used to configure it.

enable-busy-poll
~~~~~~~~~~~~~~~~

Enables or disables busy polling.

::

  af-xdp:
    enable-busy-poll: <yes/no>
    enable-busy-poll: yes

busy-poll-time
~~~~~~~~~~~~~~

Sets the approximate time in microseconds to busy poll on a ``blocking receive``
when there is no data.

::

  af-xdp:
    busy-poll-time: <time>
    busy-poll-time: 20

busy-poll-budget
~~~~~~~~~~~~~~~~

Budget allowed for batching of ingress frames. Larger values mean more
frames can be stored/read. It is recommended to test this for performance.

::

  af-xdp:
    busy-poll-budget: <budget>
    busy-poll-budget: 64

Linux tunables
~~~~~~~~~~~~~~

The ``SO_PREFER_BUSY_POLL`` option works in concert with the following two Linux
knobs to ensure best capture performance. These are not socket options:

- gro-flush-timeout
- napi-defer-hard-irq

The purpose of these two knobs is to defer interrupts and to allow the
NAPI context to be scheduled from a watchdog timer instead.

The ``gro-flush-timeout`` indicates the timeout period for the watchdog
timer. When no traffic is received for ``gro-flush-timeout`` the timer will
exit and softirq handling will resume.

The ``napi-defer-hard-irq`` indicates the number of queue scan attempts
before exiting to interrupt context. When enabled, the softirq NAPI context will
exit early, allowing busy polling.

::

  af-xdp:
    gro-flush-timeout: 2000000
    napi-defer-hard-irq: 2

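
The values currently in effect on an interface can be inspected from sysfs.
The following is a minimal sketch, assuming the two knobs correspond to the
kernel's per-device files ``gro_flush_timeout`` and ``napi_defer_hard_irqs``
under ``/sys/class/net/<interface>``. The function name
``show_busy_poll_tunables`` and the ``SYSFS_NET`` override are hypothetical.

.. code-block:: shell

  # Hypothetical helper: print the busy-poll related tunables of an
  # interface as exposed by the kernel in sysfs.
  show_busy_poll_tunables() {
    dir="${SYSFS_NET:-/sys/class/net}/$1"
    echo "gro_flush_timeout=$(cat "${dir}/gro_flush_timeout")"
    echo "napi_defer_hard_irqs=$(cat "${dir}/napi_defer_hard_irqs")"
  }

  # Usage: show_busy_poll_tunables eth3
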
Hardware setup
--------------

Intel NIC setup
~~~~~~~~~~~~~~~

Intel network cards don't support symmetric hashing but it is possible to emulate
it by using a specific hashing function.

Follow these instructions closely for the desired result::

  ifconfig eth3 down

Enable symmetric hashing ::

  ifconfig eth3 down
  ethtool -L eth3 combined 16 # if you have at least 16 cores
  ethtool -K eth3 rxhash on
  ethtool -K eth3 ntuple on
  ifconfig eth3 up
  ./set_irq_affinity 0-15 eth3
  ethtool -X eth3 hkey 6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A:6D:5A equal 16
  ethtool -x eth3
  ethtool -n eth3

In the above setup you are free to use any recent ``set_irq_affinity`` script. It is available in any Intel x520/710 NIC driver source download.

**NOTE:**
We use a special low entropy key for the symmetric hashing. `More info about the research for symmetric hashing set up <http://www.ndsl.kaist.edu/~kyoungsoo/papers/TR-symRSS.pdf>`_

Disable any NIC offloading
~~~~~~~~~~~~~~~~~~~~~~~~~~

Suricata disables NIC offloading based on the ``disable-offloading`` configuration parameter, which is enabled by default.
See the ``capture`` section of the yaml file.

::

  capture:
    # disable NIC offloading. It's restored when Suricata exits.
    # Enabled by default.
    #disable-offloading: false

Balance as much as you can
~~~~~~~~~~~~~~~~~~~~~~~~~~

Try to use the network card's flow balancing as much as possible ::

  for proto in tcp4 udp4 ah4 esp4 sctp4 tcp6 udp6 ah6 esp6 sctp6; do
     /sbin/ethtool -N eth3 rx-flow-hash $proto sd
  done

This command triggers load balancing using only source and destination IPs. This may not be optimal
in terms of load balancing fairness, but it ensures all packets of a flow will reach the same thread
even in the case of IP fragmentation (where source and destination ports will not be available for
some fragmented packets).