Commit Graph

168 Commits (2c8e8c2516742a100875a4b9392bd889e4506a00)

Author SHA1 Message Date
Victor Julien bec59f426e Fix sanity check in AppInspectionEngine registration code 12 years ago
Victor Julien a26243a23c Clean up rule reload logging 12 years ago
Anoop Saldanha f592c481dc Introduce a separate inspection engine for app events. 12 years ago
Victor Julien 8080494e9a counters: consolidate counters after all ThreadInit functions of a thread have run. This prevents duplicate and overwriting memory allocations. 12 years ago
Victor Julien dd76e679fe mpm: clean up stream thread ctx 12 years ago
Ken Steele e05034f5dd New Multi-pattern matcher, ac-tile, optimized for Tile architecture.
Aho-Corasick mpm optimized for Tilera Tile-Gx architecture. Based on the
util-mpm-ac.c code base. The primary optimizations are:
1) Matching function used Tilera specific instructions.
2) Alphabet compression to reduce delta table size to increase cache
   utilization  and performance.

The basic observation is that not all 256 ASCII characters are used by
the set of multiple patterns in a group for which a DFA is
created. The first reason is that Suricata's pattern matching is
case-insensitive, so all uppercase characters are converted to
lowercase, leaving a hole of 26 characters in the
alphabet. Previously, this hole was simply left in the middle of the
alphabet and thus in the generated Next State (delta) tables.

A new, smaller, alphabet is created using a translation table of 256
bytes per mpm group. Previously, there was one global translation
table for converting upper case to lowercase.

Additional, unused characters are found by creating a histogram of all
the characters in all the patterns. Then all the characters with zero
counts are mapped to one character (0) in the new alphabet. Since
These characters appear in no pattern, they can all be mapped to a
single character and still result in the same matches being
found. Zero was chosen for the value in the new alphabet since this
"character" is more likely to appear in the input. The unused
character always results in the next state being state zero, but that
fact is not currently used by the code, since special casing takes
additional instructions.

The characters that do appear in some pattern are mapped to
consecutive characters in the new alphabet, starting at 1. This
results in a dense packing of next state values in the delta tables
and additionally can allow for a smaller number of columns in that
table, thus using less memory and better packing into the cache. The
size of the new alphabet is the number of used characters plus 1 for
the unused catch-all character.

The alphabet size is rounded up to the next larger power-of-2 so that
multiplication by the alphabet size can be done with a shift.  It
might be possible to use a multiply instruction, so that the exact
alphabet size could be used, which would further reduce the size of
the delta tables, increase cache density and not require the
specialized search functions. The multiply would likely add 1 cycle to
the inner search loop.

Since the multiply by alphabet-size is cleverly merged with a mask
instruction (in the SINDEX macro), specialized versions of the
SCACSearch function are generated for alphabet sizes 256, 128, 64, 32
and 16.  This is done by including the file util-mpm-ac-small.c
multiple times with a redefined SINDEX macro. A function pointer is
then stored in the mpm context for the search function. For alpha bit
sizes of 8 or smaller, the number of states usually small, so the DFA
is already very small, so there is little difference using the 16
state search function.

The SCACSearch function is also specialized by the size of the value
stored in the next state (delta) tables, either 16-bits or 32-bits.
This removes a conditional inside the Search function. That
conditional is only called once, but doesn't hurt to remove
it. 16-bits are used for up to 32K states, with the sign bit set for
states with matches.

Future optimization:

The state-has-match values is only needed per state, not per next
state, so checking the next-state sign bit could be replaced with
reading a different value, at the cost of an additional load, but
increasing the 16-bit next state span to 64K.

Since the order of the characters in the new alphabet doesn't matter,
the new alphabet could be sorted by the frequency of the characters in
the expected input stream for that multi-pattern matcher. This would
group more frequent characters into the same cache lines, thus
increasing the probability of reusing a cache-line.

All the next state values for each state live in their own set of
cache-lines. With power-of-two sizes alphabets, these don't overlap.
So either 32 or 16 character's next states are loaded in each cache
line load. If the alphabet size is not an exact power-of-2, then the
last cache-line is not completely full and up to 31*2 bytes of that
line could be wasted per state.

The next state table could be transposed, so that all the next states
for a specific character are stored sequentially, this could be better
if some characters, for example the unused character, are much more
frequent.
12 years ago
Anoop Saldanha a44d42b124 Fixes segv inside rule swap under low mem conditions.
We now gracefully exit rule swap on any allocation or other failures.
12 years ago
Victor Julien 43ba5a677e DNS: enable mpm/fast_pattern support for dns_query 12 years ago
Victor Julien f10dd603ff DNS: adding dns_request content modifier 12 years ago
Anoop Saldanha 17c763f855 Version 1 of AC Cuda. 12 years ago
Anoop Saldanha d4d18e3136 Transaction engine redesigned.
Improved accuracy, improved performance.  Performance improvement
noticeable with http heavy traffic and ruleset.

A lot of other cosmetic changes carried out as well.  Wrappers introduced
for a lot of app layer functions.

Failing dce unittests disabled.  Will be reintroduced in the updated dce
engine.

Cross transaction matching taken care of.  FPs emanating from these
matches have now disappeared.  Double inspection of transactions taken
care of as well.
12 years ago
Victor Julien eb11280888 Use define instead of magic number for pmq's per detect thread 12 years ago
Victor Julien 0fa38c13d1 detection engine: consolidate thread setup
DetectEngineThreadCtxInit and DetectEngineThreadCtxInitForLiveRuleSwap did
pretty much the same thing, except for a counters registration. As can be
predicted with code duplication like this, things got out of sync. To make
sure this doesn't happen again, I created a helper function that does the
heavy lifting in this function.
12 years ago
Victor Julien 73158fea33 Fix PmqSetup calls in Liveswap thread init. Func was out of sync with normal thread init. 12 years ago
Anoop Saldanha 8bf034e8c4 Live rule swap logs added to report SigLoadSignatures() failure. Also set
thread_closed flag on exit for live swap thread.
13 years ago
Anoop Saldanha 4c6efa2d40 Update content id assignment.
All fp id assignment now happens in one go.
Also noticing a slight perf increase, probably emanating from improved cache
perf.
Removed irrelevant unittests as well.
13 years ago
Anoop Saldanha 75130f9702 fix for #769.
Packet inserted by live swap flagged as pseudo packet.
13 years ago
Anoop Saldanha 3511f91bba Add support for the new keyword - http_raw_host header.
The corresponding pcre modifier would be 'Z'.
13 years ago
Anoop Saldanha c4ce19a1be Add support for a new keyword to inspect http_host header.
The corresponding content keyword would now be - http_host.
The corresponding pcre modifier would be W.
13 years ago
Anoop Saldanha 464ed95f71 fix for bug #526.
Insert pseudo packet under low load conditions to complete rule swap.
This is necessary when we use autofp active packets where most packets
would be sent to the first queue under low load conditions.
13 years ago
Victor Julien e30b1bfe64 Simple IP reputation implementation 13 years ago
Victor Julien 84bad6db77 Silence compiler warnings found by clang 13 years ago
Anoop Saldanha bca1b7c52a change default mpm to ac. Also default sgh-mpm-context is full. 13 years ago
Victor Julien fd6df00684 Bug 585: use per detect thread libmagic ctx 13 years ago
Anoop Saldanha b99f9fe890 New app inspection engine introduced. Moved existing inspecting engines to use it. 13 years ago
Victor Julien a0c43a8a1c Minor parsing cleanups in detect-engine options. 13 years ago
Eric Leblond e176be6fcc Use unlikely for error treatment.
When handling error case on SCMallog, SCCalloc or SCStrdup
we are in an unlikely case. This patch adds the unlikely()
expression to indicate this to gcc.

This patch has been obtained via coccinelle. The transformation
is the following:

@istested@
identifier x;
statement S1;
identifier func =~ "(SCMalloc|SCStrdup|SCCalloc)";
@@

x = func(...)
... when != x
- if (x == NULL) S1
+ if (unlikely(x == NULL)) S1
13 years ago
Victor Julien 6f7e527e92 luajit: fix crash at shutdown / rule reload if lua script didn't properly init. 13 years ago
Victor Julien ec7e79c748 Rule profiling update
- Remove usage of counters api.
- Store stats in detect engine thread ctx to remove locking
- Support rule reloads
13 years ago
Eric Leblond d51dd6a30e Fix warning about unused return of SC_ATOMIC func. 13 years ago
Eric Leblond d1569337a7 affinity: add call to setup function in threads
Threads created through TMThreadSpawn need to call the affinity
function by themselves.
13 years ago
Eric Leblond 0eeccb4b17 affinity: tag management threads as such
The management threads were not tagged for CPU affinity and thus
the setting was not applied.
13 years ago
Victor Julien ba3260ed38 Thread local ctx for detection keywords
Some detection keywords need thread local ctx storage. Example is the
filemagic keyword that has a ctx that is modified with each call. That
is not thread safe. This functionality allows registration of thread
local ctxs so that each detect thread works on it's own copy.
13 years ago
Victor Julien 408548c2c4 rule reloads: don't lock up main thread so clean shutdown is impossible 13 years ago
Eric Leblond 92679442ca Convert to atomic and disable check on HTP config change.
This patch converts the series of variable to an atomic.

Furthermore, as the callbacks are now always run, it is not
necessary anymore to refuse a ruleswap if HTP parameters are
changing.
13 years ago
Anoop Saldanha b2f589527a Set thread name Suricata-Main for main thread and LiveRuleSwap for live swap thread 13 years ago
Victor Julien c7af0589bc Fix a reload memleak in thread local detection engine ctx. 13 years ago
Anoop Saldanha 0c24bbab0c code cleanup for live swap 13 years ago
Anoop Saldanha 2bc7d0792d update clean up of old detection engine contexts for live rule swap 13 years ago
Anoop Saldanha eee33866df DetectEngineCtxFree() cleanup, also in main 13 years ago
Anoop Saldanha c3eab5cf4e Replace the old atomic sets using cas with the new sc_atomic_set macro 13 years ago
Anoop Saldanha 8fb2040eee disable live rule swap when -s or -S option's used at startup 13 years ago
Anoop Saldanha 31eb5fa2f6 Introduce util-signal.[ch]. Move our signal setup functions here 13 years ago
Victor Julien 4cde2355bd Simplify flow resetting on de_ctx update. Detect ctx id starts at 1. So in a flow 0 means uninitialized (thus set) and if we detect flow is not equal to detect id, we reset the sgh storage and de_state. 13 years ago
Anoop Saldanha 6fa46d7526 If new ruleset requires any htp callbacks that aren't already set, don't load new ruleset; request user to restart suricata + disable setting fileinsepection flags unconditionally in main 13 years ago
Anoop Saldanha e5edcfaca8 add unittest for atomic operation with void * 13 years ago
Anoop Saldanha ecad4a24fa live rule support added
To reload ruleset during engine runtime, send the USR2 signal to the engine, and the ruleset would be reloaded from the same yaml file supplied at engine startup
13 years ago
Anoop Saldanha 5878d83174 byte_extract_id var now a non-global de_ctx specific var 13 years ago
Anoop Saldanha 7acf5ad38e clean reference config API 13 years ago
Anoop Saldanha 6003c7cb6b clean classification config API 13 years ago
Anoop Saldanha f2dd61868d variable names global vars, global no more. Moved to detection engine ctx, a place it belongs 13 years ago
Anoop Saldanha 55d4e9518e Kill engine during init stage if it fails to load valid value for sgh-mpm-context 13 years ago
Anoop Saldanha 09ec7ec728 bug 456 fix for byte_extract to have array of the right size to update values with 13 years ago
Victor Julien d378b76c04 http: body inspection improvement
Improve http_client_body and file_data performance when request and
response body limits are set to high values.
13 years ago
Victor Julien da3c5bf84d Minor error message cleanups 14 years ago
Nikolay Denev 139768dd58 Do not use underscored config vars internally. 14 years ago
Anoop Saldanha db859cc56e treate ac-bs auto as single context 14 years ago
Anoop Saldanha 4d38a571cc smtp reply code mpm phase support added 14 years ago
Victor Julien 820b0ded82 Add per packet profiling.
Per packet profiling uses tick based accounting. It has 2 outputs, a summary
and a csv file that contains per packet stats.

Stats per packet include:
 1) total ticks spent
 2) ticks spent per individual thread module
 3) "threading overhead" which is simply calculated by subtracting (2) of (1).

A number of changes were made to integrate the new code in a clean way:
a number of generic enums are now placed in tm-threads-common.h so we can
include them from any part of the engine.

Code depends on --enable-profiling just like the rule profiling code.

New yaml parameters:

profiling:
  # packet profiling
  packets:

    # Profiling can be disabled here, but it will still have a
    # performance impact if compiled in.
    enabled: yes
    filename: packet_stats.log
    append: yes

    # per packet csv output
    csv:

      # Output can be disabled here, but it will still have a
      # performance impact if compiled in.
      enabled: no
      filename: packet_stats.csv

Example output of summary stats:

IP ver   Proto   cnt        min      max          avg
------   -----   ------     ------   ----------   -------
 IPv4       6     19436      11448      5404365     32993
 IPv4     256         4      11511        49968     30575

Per Thread module stats:

Thread Module              IP ver   Proto   cnt        min      max          avg
------------------------   ------   -----   ------     ------   ----------   -------
TMM_DECODEPCAPFILE          IPv4       6     19434       1242        47889      1770
TMM_DETECT                  IPv4       6     19436       1107       137241      1504
TMM_ALERTFASTLOG            IPv4       6     19436         90         1323       155
TMM_ALERTUNIFIED2ALERT      IPv4       6     19436        108         1359       138
TMM_ALERTDEBUGLOG           IPv4       6     19436         90         1134       154
TMM_LOGHTTPLOG              IPv4       6     19436        414      5392089      7944
TMM_STREAMTCP               IPv4       6     19434        828      1299159     19438

The proto 256 is a counter for handling of pseudo/tunnel packets.

Example output of csv:

pcap_cnt,ipver,ipproto,total,TMM_DECODENFQ,TMM_VERDICTNFQ,TMM_RECEIVENFQ,TMM_RECEIVEPCAP,TMM_RECEIVEPCAPFILE,TMM_DECODEPCAP,TMM_DECODEPCAPFILE,TMM_RECEIVEPFRING,TMM_DECODEPFRING,TMM_DETECT,TMM_ALERTFASTLOG,TMM_ALERTFASTLOG4,TMM_ALERTFASTLOG6,TMM_ALERTUNIFIEDLOG,TMM_ALERTUNIFIEDALERT,TMM_ALERTUNIFIED2ALERT,TMM_ALERTPRELUDE,TMM_ALERTDEBUGLOG,TMM_ALERTSYSLOG,TMM_LOGDROPLOG,TMM_ALERTSYSLOG4,TMM_ALERTSYSLOG6,TMM_RESPONDREJECT,TMM_LOGHTTPLOG,TMM_LOGHTTPLOG4,TMM_LOGHTTPLOG6,TMM_PCAPLOG,TMM_STREAMTCP,TMM_DECODEIPFW,TMM_VERDICTIPFW,TMM_RECEIVEIPFW,TMM_RECEIVEERFFILE,TMM_DECODEERFFILE,TMM_RECEIVEERFDAG,TMM_DECODEERFDAG,threading
1,4,6,172008,0,0,0,0,0,0,47889,0,0,48582,1323,0,0,0,0,1359,0,1134,0,0,0,0,0,8028,0,0,0,49356,0,0,0,0,0,0,0,14337

First line of the file contains labels.

2 example gnuplot scripts added to plot the data.
14 years ago
Anoop Saldanha 35f3eafa5e byte extract added to the engine. Detection support added for packet payload, uri and dce detection engines 14 years ago
Eric Leblond 49adc264bc Don't print message after SCMalloc failure.
This patch generated via coccinelle is getting rid of logging
message after a SCMalloc failure. They were useless as SCMalloc
already displays a message.
15 years ago
Victor Julien b24ccf8c80 Clean up stream pmqs in the detect thread ctx. 15 years ago
Victor Julien 6ebe7b7cd3 Change the way the request body limit is enforced. 15 years ago
Anoop Saldanha 778ec0939c make client body buffer limit configurable. Also some minor changes 15 years ago
Victor Julien 07ec1ee10e Slightly cleanup detect-engine.sgh-mpm-context option parsing. 15 years ago
Anoop Saldanha c89507836b if sgh-mpm-context is not available in conf, alias the auto case inside the engine 15 years ago
Victor Julien 275bd3b7d7 Switch back to defaulting to full for detect-engine.sgh-mpm-context as it broke many tests. 15 years ago
Victor Julien 7e6f01765f Change default of detect-engine.sgh-mpm-context to auto. 15 years ago
Anoop Saldanha 59923316bc change the default recursion limit in the code to 3000, the value which we currently have in the conf file. Also change print modifier for printing timeval 15 years ago
Anoop Saldanha bc99328ec8 define a new conf paramter detect-engine:inspection-recursion-limit; Defines a recursion limit for content inspection code 15 years ago
Victor Julien 3bd7441ea5 Default to 'single' ctx for ac-gfbs as well. 15 years ago
Anoop Saldanha a2d04a94b5 selecting auto for detect-engine.sgh_mpm_context now uses single if the mpm is ac, full otherwise 15 years ago
Anoop Saldanha 0ef684705c support single mpm context distribution across sghs in staging. Also see to it that ac works fine with this setup 15 years ago
Anoop Saldanha b367c37ae6 suricata.yaml conf update to support single mpm context distribution over multiple sghs + code to parse this conf 15 years ago
Victor Julien 87f88867f4 Further improve B2gc. Add B2gm. Improve memory layout. 15 years ago
Anoop Saldanha 33f4beb0bc batching of packets support for cuda b2g mpm. Supported for both 32 and 64 bit platforms 15 years ago
Anoop Saldanha 9ecade76b9 in case of duplicate signatures used the one with the latest revision 15 years ago
Gurvinder Singh 8852b83fa7 flowbits, flowvars, pktvars, flow flags and app layer info added to alert-debug.log 15 years ago
Pablo Rincon eed0ef6e69 Adding tag keyword support 15 years ago
Anoop Saldanha bbb5bf5c51 allow counters clubbing for detect TM 15 years ago
Victor Julien 83b2c8abdb Improve stateful uri detection code. 15 years ago
Victor Julien a0c1209a44 Inspect the reassembled stream together with the packet payload in the same direction. 15 years ago
Victor Julien 2fd31a1a11 Remove dsize grouping from detection engine grouping reducing memory usage. Store sgh in flow to reduce lookups. Reduce locking in alert handling. Increase default grouping values as we use less memory. 15 years ago
Gurvinder Singh cda664a8c4 memroy leaks fixes in detection module, app layer and counters 15 years ago
William Metcalf 2eef905c07 GPL and Copyright header updates. 15 years ago
Victor Julien 70b32f7380 First stab at creating a stateful detection engine.
Stateful detection for app layer detection keywords, except uricontent. Stores it's partial results in the flow structure. Other modifications:

- Generalize transaction tracking, logging and inspection.
- Adapt http and dcerpc to use the new transaction handling.
- Stream engine now always notifies app layer of a stream eof.

This commit fixes bug #124.
15 years ago
Gerardo Iglesias Galvan 9f4fae5b1a Fix inconsistent use of dynamic memory allocation 15 years ago
Victor Julien 7a427ec7f4 Switch to pattern id based results checking in the mpm. Move app layer proto detection towards a more signature based approach. 15 years ago
William Metcalf ce01927515 Import of GPLv2 Header 050410 15 years ago
Anoop Saldanha 47037ef9ec fix for bug 115 15 years ago
Anoop Saldanha c54b91ed94 fix for bug 113 16 years ago
Victor Julien 50e41817a7 Share content id's between identical patterns. 16 years ago
Pablo Rincon 25a3a5c6d8 Adding mem wrapper to debug runtime alloc()/free() functions. Fixing some memory leaks. 16 years ago
Victor Julien 60685f8b3c Make unittests run more quiet. 16 years ago
Pablo Rincon 38dc7ffebc Adding settings for detect engine group config 16 years ago
Anoop Saldanha 8cf60d6645 Changed the way cuda dispatcher passes back results. Now each detection thread has it's own queue to which the dispatcher can pump packets back to the detect thread. Also, with cuda enabled and a non-cuda mpm being used, we won't create a dispatcher and instead call the b2g scan/search funtions directly instead of using the dispatcher. 16 years ago
Gurvinder Singh fea277b2aa memory leak fixes 16 years ago
Anoop Saldanha 011b74df63 Modify the classification config tests to use the buffer than a temp file and also fix an invalid free 16 years ago
Breno Silva 69eb869cc9 Threshold Rule 16 years ago
Victor Julien ecf86f9c23 Rename to Suricata. 16 years ago