The flow manager scans the hash table in chunks based on the flow timeout
settings. In the default config this will lead to a full hash pass every
240 seconds. Under pressure, this leads to a large amount of memory
being held by flows waiting to be evicted, or by evicted flows waiting
to be freed.
This patch adds new adaptive logic that governs the timing and the
amount of work done by the flow manager. It calculates what proportion
of each memcap budget is in use, takes the maximum in-use percentage,
and adapts the flow manager behavior based on that.
The memcaps considered are:
flow, stream, stream-reassembly and app-layer-http
The in-use percentage is inversely applied to the time the flow
manager takes for a full hash pass. In addition, it is also applied to
the chunk size and the sleep time.
Example: tcp.reassembly_memuse is at 90% of its memcap and the normal
flow hash pass is 240s. The hash pass time will then be:
240 * (100 - 90) / 100 = 24s
Chunk size and sleep time will automatically be updated for this.
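A minimal sketch of this scaling, with a hypothetical helper name
(AdaptPassTime is illustrative, not the actual function):

    #include <stdint.h>

    /* scale the full hash pass time by the unused share of the most
     * used memcap; AdaptPassTime(240, 90) == 24 */
    static uint32_t AdaptPassTime(uint32_t base_pass_s, uint32_t max_inuse_pct)
    {
        if (max_inuse_pct > 100)
            max_inuse_pct = 100;
        uint32_t pass_s = base_pass_s * (100 - max_inuse_pct) / 100;
        return pass_s > 0 ? pass_s : 1; /* never drop to 0 */
    }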
Adds various counters.
Bug: #4650.
Bug: #4808.
The current code doesn't cover all rows when more than one flow manager
is used. It orphans a single row between one manager's ftd->max and the
next manager's ftd->min. As an example:

  hash_size      = 1000
  flowmgr_number = 3
  range          = 333

  instance  ftd->min  ftd->max
         0         0       333
         1       334       666
         2       667      1000

Rows not covered: 333, 666
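One way to close the gap is to hand out half-open ranges [min, max)
derived from the instance number. A sketch with illustrative names:

    #include <stdint.h>

    /* divide hash_size rows over 'total' instances without gaps; the
     * last instance also picks up the remainder rows */
    static void FlowMgrSetRange(uint32_t instance, uint32_t total,
            uint32_t hash_size, uint32_t *min, uint32_t *max)
    {
        const uint32_t range = hash_size / total;
        *min = instance * range;
        *max = (instance == total - 1) ? hash_size : *min + range;
    }
    /* hash_size=1000, total=3 gives [0,333) [333,666) [666,1000) */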
Flows have been shown to linger for a long time without giving up
their resources. This would lead to higher memory use and memcaps
getting reached.
Three main causes have been identified:
Slow hash passes. By default the flow manager will scan the flow hash
slowly. The pace is based on the flow timeout settings, and with the
default config it will take 4 minutes for a full scan to complete.
This leaves a window in which timed-out flows linger for minutes
longer than expected.
Flow Manager yields under pressure. The per-row TryLock causes work to
be delayed further. The flow manager will use trylock on a hash row
and will yield immediately if the row is busy. This means that a full
pass will go by before the row is revisited. If the row holds busy
flows, this can happen many times in a row.
Flow Manager favors evicted flows over active flows. If evicted flows
(flows aged out by the workers) are present on a hash row, the flow
manager will process only those. The active flows on that row have to
wait until the next hash pass, by which time there may be more evicted
flows again.
Combined, these factors could keep flows from being considered for
freeing and logging for a very long time, potentially even
indefinitely.
The patch addresses the latter two flow manager issues by no longer
using TryLock. It will now simply wait for the lock to be released and
then do its work on it. Additionally for each row both the evicted list
and the active flow list will be processed.
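In outline, the per-row handling after the patch (helper names are
illustrative, not the exact code):

    /* block on the row lock instead of trylock + yield, then walk
     * both the evicted list and the active list */
    FBLOCK_LOCK(fb);
    if (fb->evicted != NULL)
        ProcessEvictedFlows(fb, ts);   /* illustrative helper */
    for (Flow *f = fb->head; f != NULL; ) {
        Flow *next = f->next;
        CheckFlowTimeout(f, ts);       /* illustrative helper */
        f = next;
    }
    FBLOCK_UNLOCK(fb);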
Bug: #4650.
Adds a container, i.e. a thread-safe hash table whose key is the
filename.
Keeps a tree of unordered ranges, up to a memcap limit.
Adds HTPFileOpenWithRange to handle ranges like HTPFileOpen does:
if there is a range, open 2 files, one for the whole reassembled file
and one only for the current range.
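The rough shape of the container, as an illustrative sketch rather
than the actual definitions:

    #include <stdint.h>

    /* out-of-order byte range kept until it connects with its
     * neighbors, all accounted against a memcap */
    typedef struct RangeBuffer {
        uint64_t start, end;
        uint8_t *data;
        struct RangeBuffer *next;
    } RangeBuffer;

    /* entry in the thread safe hash table, keyed by filename */
    typedef struct FileRangeEntry {
        char *filename;
        RangeBuffer *unordered;
    } FileRangeEntry;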
As the FlowBypassedTimeout function interacts with the capture method,
it is possible that its return value changes between the call that
triggered the timeout and the actual state (i.e. if packets arrive
between the two calls). So we should not use a call to
FlowBypassedTimeout in the assert.
Don't increment the flow timeout counter for flows that are not really
timed out (their use_cnt is non-zero). Also don't count bypassed flows
in the 'flow timeout in use' counter.
Previously the flow manager would share evicted flows with the workers
while keeping the flows mutex locked. This reduced the number of unlock/
lock cycles while there was guaranteed to be no contention.
This turns out to be undefined behavior: a lock is supposed to be
locked and unlocked from the same thread. FreeBSD appears to be
stricter about this than Linux.
This patch addresses the issue by unlocking before handing a flow off
to another thread, and locking again from the new thread.
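A sketch of the corrected handoff, with illustrative queue helper
names; the point is that a pthread mutex is unlocked by the same
thread that locked it:

    /* flow manager side: release the flow before handing it off */
    FLOWLOCK_UNLOCK(f);
    AppendFlowToWorkQueue(q, f);   /* illustrative helper */

    /* receiving thread: take the lock again before touching the flow */
    FLOWLOCK_WRLOCK(f);
    /* ... timeout/logging work ... */
    FLOWLOCK_UNLOCK(f);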
Issue was reported and largely analyzed by Bill Meeks.
Bug: #4478
Sleep 250 microseconds instead of 100, as running in KVM caused the
old value to use 100% CPU for these threads.
Perf testing suggests no measurable impact for the non-KVM case.
Ticket: #4096
Goals:
- reduce locking
- take advantage of 'hot' caches
- better locality
Locking reduction
New flow spare pool. The global pool is implemented as a list of
blocks, where each block holds 100 spare flows. Worker threads fetch a
block at a time, storing the block in local thread storage.
Flow Recycler now returns flows to the pool in blocks as well.
Flow Recycler fetches all flows to be processed in one step instead of
one at a time.
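The pool could be shaped roughly like this (illustrative sketch):

    /* global pool: a locked list of blocks; each block is an unlocked
     * private queue of ~100 flows that moves around as a unit */
    typedef struct FlowSparePool {
        FlowQueuePrivate queue;
        struct FlowSparePool *next;
    } FlowSparePool;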
Cache 'hot'ness
Worker threads now check the timeout of flows they evaluate during lookup.
The worker will have to read the flow into cache anyway, so the added
overhead of checking the timeout value is minimal. When a flow is considered
timed out, one of 2 things happens:
- if the flow is 'owned' by the thread it is handled locally. Handling means
checking if the flow needs 'timeout' work.
- otherwise, the flow is added to a special 'evicted' list in the flow
bucket where it will be picked up by the flow manager.
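As a sketch, with hypothetical helper and field names:

    /* the lookup already pulled the flow into cache, so the timeout
     * check is nearly free */
    if (FlowIsTimedOut(f, ts)) {
        if (f->thread_id == my_thread_id) {
            /* owned by this worker: do the timeout work locally */
        } else {
            /* park it on the bucket's evicted list for the manager */
            f->next = fb->evicted;
            fb->evicted = f;
        }
    }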
Flow Manager timing
By default the flow manager now tries to do passes of the flow hash in
smaller steps, where the goal is to do a full pass in 8 x the lowest
timeout value it has to enforce. So if the lowest timeout value is 30s,
a full pass will take 4 minutes. The goal here is to reduce locking
overhead and not get in the way of the workers.
In emergency mode each pass is full, and lower timeouts are used.
Timing of the flow manager also no longer relies on pthread condition
variables, as these generally wake up much quicker than the desired
timeout. Instead a simple (u)sleep loop is used.
Both changes reduce the number of hash passes a lot.
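The wait can be as simple as (illustrative):

    /* a short usleep loop wakes up close to the intended time, where
     * a pthread condition variable tends to fire early */
    uint64_t left_usec = sleep_usec;
    while (left_usec > 0 && !shutdown_requested) {
        usleep(250);
        left_usec -= (left_usec > 250) ? 250 : left_usec;
    }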
Emergency behavior
In emergency mode there are a number of changes to the workers. In
this scenario the flow memcap is fully used up and it is unavoidable
that some flows won't be tracked.
1. flow spare pool fetches are reduced to once a second. This avoids
   locking overhead while the chance of success is very low anyway.
2. getting an active flow directly from the hash skips flows that had
   very recent activity, to avoid the scenario where all flows get
   only into the NEW state before getting reused. Instead, allow some
   flows a chance of completing.
3. TCP packets that are not SYN packets will not get a used flow,
   unless stream.midstream is enabled (see the sketch below). The goal
   here is again to avoid evicting active flows unnecessarily.
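Point 3, sketched with generic names (simplified; the real check
lives in the flow lookup path):

    /* don't hand a used flow to mid-session TCP packets unless
     * midstream pickups are enabled */
    if (proto == IPPROTO_TCP && (tcp_flags & TH_SYN) == 0 &&
            midstream_enabled == false) {
        return NULL;   /* keep active flows alive */
    }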
Better Locality
Flow Manager now injects flows into the worker threads, instead of one
or two packets. The advantage is that the worker threads can get
packets from their local packet pools, avoiding the constant overhead
of packets returning to 'foreign' pools.
Counters
A lot of flow counters have been added and some have been renamed.
Overall the worker threads increment 'flow.wrk.*' counters, while the flow
manager increments 'flow.mgr.*'.
Additionally, none of the counters are snapshots anymore, they all increment
over time. The flow.memuse and flow.spare counters are exceptions.
Misc
FlowQueue has been split into a FlowQueuePrivate (unlocked) and
FlowQueue. Flow no longer has 'prev' pointers and uses a unified
'next' pointer for both hash and queue use.
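Illustrative shape of the linkage change:

    /* one pointer serves both uses: a flow sits either in a hash
     * bucket chain or in a (private) queue, never in both at once */
    typedef struct Flow_ {
        /* ... */
        struct Flow_ *next;   /* hash or queue linkage */
    } Flow;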
Call Defrag and others only once per second. The Flow Manager may wake
up (much) more often when the flow engine is under resource pressure.
As this does not affect Defrag and the others, waking them equally
often only adds unnecessary load.
When the flow engine enters emergency mode, 3 things happen:
1. a different set of (lower) timeout values are applied
2. the flow manager runs more often
3. worker threads go get a flow directly from the hash table
Testing showed that performance went down significantly due to concurrency
issues:
1. worker threads would fight each other over the hash access
2. flow manager would get in the way of workers
This patch changes the behavior in 2 ways:
1. it makes the flow manager slightly less aggressive. It will still
   try to run ~3 times per second, but no longer 10 times. This should
   reduce the contention. At the same time, flows won't time out
   faster just because they are checked many times per second.
2. The 'get a used flow' logic optimizes the use of atomics by doing
   an atomic operation only once, reserving a slice of the hash per
   worker as it does so (sketched below).
   The worker will also give up much quicker, to avoid the overhead of
   hash walking and of taking and releasing locks.
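The slice reservation of point 2, sketched in plain C11 atomics
rather than the engine's macros (names and sizes illustrative):

    #include <stdatomic.h>
    #include <stdint.h>

    static _Atomic uint32_t prune_idx;   /* shared scan position */

    /* one fetch_add reserves a disjoint slice of rows per worker */
    const uint32_t slice = 100;
    uint32_t start = atomic_fetch_add(&prune_idx, slice) % hash_size;
    for (uint32_t i = 0; i < slice; i++) {
        uint32_t row = (start + i) % hash_size;
        /* trylock the row; on success look for an evictable flow,
         * otherwise move on and give up quickly */
    }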
These combined changes show much better 'under stress' behavior,
especially on multi-NUMA systems.
Until now both atomic ints and pointers were initialized by
SC_ATOMIC_INIT by setting them to 0. However, C11's atomic pointer
type cannot be initialized this way without causing compiler warnings.
As a preparation to supporting C11's atomics, this patch introduces a
new macro to initialize atomic pointers and updates the relevant callers
to use it.
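A sketch of how such a macro pair can look (the real definitions live
in util-atomic.h):

    /* integers keep initializing to 0; pointers get NULL so C11's
     * _Atomic pointer types don't trigger conversion warnings */
    #define SC_ATOMIC_INIT(name)    (name) = 0
    #define SC_ATOMIC_INITPTR(name) (name) = NULL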
Until this point the SC_ATOMIC_ADD macro pointed to an 'add_fetch'
intrinsic. This patch changes it to a 'fetch_add'.
There are 2 reasons for this:
1. C11's stdatomic.h has only 'atomic_fetch_add' and no 'add_fetch',
   so this patch prepares for adding support for C11 atomics.
2. It was not consistent with SC_ATOMIC_SUB, which did use 'fetch_sub'
and not 'sub_fetch'.
Most callers are not using the return value, so these are unaffected.
The callers that do use the return value are updated.
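The difference, in plain C11 atomics:

    #include <stdatomic.h>

    _Atomic int x = 5;
    int before = atomic_fetch_add(&x, 3); /* before == 5, x is now 8 */
    /* the old add_fetch behavior returned the new value instead:
     * __atomic_add_fetch(&y, 3, __ATOMIC_SEQ_CST) would yield 8 */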
This patch addresses two problems.
First, various parts of the engine, but most notably the flow manager
(FM), use the minimum of the packet threads' notion of time. This did
not, however, take into account the scenario where one or more of
these threads are inactive for prolonged periods. As a result, the
time used by the FM could go stale.
This is addressed by keeping track of the last time the per thread packet
timestamp was updated, and only considering it for the 'minimum' when it
is reasonably current.
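As a sketch, with illustrative field names:

    /* only fold a thread's packet timestamp into the minimum when it
     * was updated recently enough to be trusted */
    if (now - t->last_update <= STALE_LIMIT && t->pkt_ts < min_ts)
        min_ts = t->pkt_ts;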
Second, there was a minor race condition at start up, where the FM
would already inspect the hash table(s) while the packet threads
weren't active yet. Since the FM gets its time from the packet
threads, it would use a bogus time of 0.
This is addressed by adding a wait loop to the start of the FM that waits
for 'time' to get ready.
This define is used to remove references to capture bypass in case no
capture method implementing it is active.
This patch also introduces CAPTURE_OFFLOAD_MANAGER, which is defined
if we need the flow bypass manager code.
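Usage is plain conditional compilation (illustrative):

    #ifdef CAPTURE_OFFLOAD
        /* capture bypass handling compiled in */
    #endif
    #ifdef CAPTURE_OFFLOAD_MANAGER
        /* flow bypass manager code compiled in */
    #endif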
This patch introduces and uses a new bypass strategy based on a
callback. The eBPF bypass implementation is updated to use this new
strategy.
Once the flow manager detects that a flow should be timed out, it asks
the capture method whether it has seen packets in the interval. If so,
the lastts of the flow is updated and the timeout is postponed.
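The callback could have a shape like this (hypothetical signature,
not the exact one):

    /* returns true when the capture method saw packets for the flow
     * since the last check, refreshing f->lastts as a side effect */
    typedef bool (*BypassCheckFunc)(Flow *f, struct timeval *ts);

    /* flow manager side, when a bypassed flow reaches its timeout */
    if (check != NULL && check(f, &ts)) {
        /* lastts was refreshed: postpone the timeout */
    } else {
        /* really idle: evict the flow */
    }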