suricata/src/flow-hash.h


/* Copyright (C) 2007-2012 Open Information Security Foundation
 *
 * You can copy, redistribute or modify this Program under the terms of
 * the GNU General Public License version 2 as published by the Free
 * Software Foundation.
 *
 * This program is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 * GNU General Public License for more details.
 *
 * You should have received a copy of the GNU General Public License
 * version 2 along with this program; if not, write to the Free Software
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
 * 02110-1301, USA.
 */
/**
 * \file
 *
 * \author Victor Julien <victor@inliniac.net>
 */
#ifndef SURICATA_FLOW_HASH_H
#define SURICATA_FLOW_HASH_H

#include "flow.h"

/** Spinlocks or Mutex for the flow buckets. */
//#define FBLOCK_SPIN
#define FBLOCK_MUTEX

#ifdef FBLOCK_SPIN
#ifdef FBLOCK_MUTEX
#error Cannot enable both FBLOCK_SPIN and FBLOCK_MUTEX
#endif
#endif
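
/**
 * Note (illustrative, not part of the original header): the bucket lock type
 * is a compile-time choice. To build with spinlocks instead of mutexes, flip
 * the two defines above:
 *
 * \code
 * #define FBLOCK_SPIN
 * //#define FBLOCK_MUTEX
 * \endcode
 *
 * The guard above rejects builds that enable both at once.
 */
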
/* flow hash bucket -- the hash is basically an array of these buckets.
* Each bucket contains a flow or list of flows. All these flows have
* the same hashkey (the hash is a chained hash). When doing modifications
* to the list, the entire bucket is locked. */
typedef struct FlowBucket_ {
    /** head of the list of active flows for this row. */
    Flow *head;
    /** head of the list of evicted flows for this row. Waiting to be
     * collected by the Flow Manager. */
    Flow *evicted;
#ifdef FBLOCK_MUTEX
    SCMutex m;
#elif defined FBLOCK_SPIN
    SCSpinlock s;
#else
#error Enable FBLOCK_SPIN or FBLOCK_MUTEX
#endif
    /** timestamp in seconds of the earliest possible moment a flow
     * will time out in this row. Set by the flow manager. Cleared
     * to 0 by workers, either when new flows are added or when a
     * flow state changes. The flow manager sets this to UINT_MAX for
     * empty buckets. */
    SC_ATOMIC_DECLARE(uint32_t, next_ts);
} __attribute__((aligned(CLS))) FlowBucket;
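
/**
 * Illustrative sketch (not in the original header): next_ts lets the flow
 * manager skip an entire row without taking the bucket lock. Here `fb` is
 * the bucket under consideration and `now` is assumed to hold the current
 * time in seconds.
 *
 * \code
 * if (SC_ATOMIC_GET(fb->next_ts) > (uint32_t)now)
 *     continue;   // earliest possible timeout is in the future: skip row
 * \endcode
 */
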
#ifdef FBLOCK_SPIN
#define FBLOCK_INIT(fb) SCSpinInit(&(fb)->s, 0)
#define FBLOCK_DESTROY(fb) SCSpinDestroy(&(fb)->s)
#define FBLOCK_LOCK(fb) SCSpinLock(&(fb)->s)
#define FBLOCK_TRYLOCK(fb) SCSpinTrylock(&(fb)->s)
#define FBLOCK_UNLOCK(fb) SCSpinUnlock(&(fb)->s)
#elif defined FBLOCK_MUTEX
#define FBLOCK_INIT(fb) SCMutexInit(&(fb)->m, NULL)
#define FBLOCK_DESTROY(fb) SCMutexDestroy(&(fb)->m)
#define FBLOCK_LOCK(fb) SCMutexLock(&(fb)->m)
#define FBLOCK_TRYLOCK(fb) SCMutexTrylock(&(fb)->m)
#define FBLOCK_UNLOCK(fb) SCMutexUnlock(&(fb)->m)
#else
#error Enable FBLOCK_SPIN or FBLOCK_MUTEX
#endif
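
/**
 * Illustrative sketch (not in the original header): walking a row's flow
 * list under the bucket lock using the macros above. `MatchFlow` is a
 * hypothetical predicate standing in for real lookup logic.
 *
 * \code
 * FBLOCK_LOCK(fb);
 * for (Flow *f = fb->head; f != NULL; f = f->next) {
 *     if (MatchFlow(f)) {
 *         // found our flow; lock/ref it as needed before unlocking the row
 *         break;
 *     }
 * }
 * FBLOCK_UNLOCK(fb);
 * \endcode
 */
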
/* prototypes */
Flow *FlowGetFlowFromHash(ThreadVars *tv, FlowLookupStruct *tctx, Packet *, Flow **);
Flow *FlowGetFromFlowKey(FlowKey *key, struct timespec *ttime, const uint32_t hash);
Flow *FlowGetExistingFlowFromFlowId(int64_t flow_id);
uint32_t FlowKeyGetHash(FlowKey *flow_key);
uint32_t FlowGetIpPairProtoHash(const Packet *p);
/** \note f->fb must be locked */
static inline void RemoveFromHash(Flow *f, Flow *prev_f)
{
    FlowBucket *fb = f->fb;

    /* remove from the hash */
    if (prev_f != NULL) {
        prev_f->next = f->next;
    } else {
        fb->head = f->next;
    }

    f->next = NULL;
    f->fb = NULL;
}
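
/**
 * Illustrative sketch (not in the original header): unlinking flows while
 * walking a row's chain. The caller tracks the previous element so
 * RemoveFromHash() can relink the list; the bucket must already be locked.
 * `ShouldRemove` is a hypothetical predicate.
 *
 * \code
 * Flow *prev = NULL;
 * for (Flow *f = fb->head; f != NULL; ) {
 *     Flow *next = f->next;           // saved: RemoveFromHash clears f->next
 *     if (ShouldRemove(f)) {
 *         RemoveFromHash(f, prev);    // prev is unchanged for the next step
 *     } else {
 *         prev = f;
 *     }
 *     f = next;
 * }
 * \endcode
 */
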
#endif /* SURICATA_FLOW_HASH_H */