Commit Graph

58 Commits (e43c4f3ea2b9d72a58df6e4c4ada9057bcf01101)

Author SHA1 Message Date
Victor Julien e43c4f3ea2 mpm: optimize calls
For all mpm wrapper functions, check minlen vs the input buffer to see
if we can bypass the mpm search.

Next to this, make all the function inline. Also constify the input and
do other minor cleanups.
9 years ago
Victor Julien 6bb2b001a3 mpm: cleanup: move mpm funcs into buffer specific files 9 years ago
Victor Julien 4f8e1f59a6 mpm: remove obsolete mpm algos
Remove: ac-gfbs, wumanber, b2g, b3g.
9 years ago
Ken Steele 8f1d75039a Enforce function coding standard
Functions should be defined as:

int foo(void)
{
}

Rather than:
int food(void) {
}

All functions where changed by a script to match this standard.
11 years ago
Victor Julien 5e1bc99e5b detect: cleanup
Remove unused alstate and app layer flags arguments from
DetectEngineInspectPacketPayload()
11 years ago
Victor Julien 79c924af8c Fix 2 compiler warnings
FreeBSD 10 32-bit with clang 3.3:

log-tlslog.c:172:14: error: format specifies type 'long' but the argument has type 'time_t' (aka 'int') [-Werror,-Wformat]
             p->ts.tv_sec,
             ^~~~~~~~~~~~
1 error generated.

detect-engine-payload.c:508:27: warning: format specifies type 'long' but the argument has type 'time_t' (aka 'int') [-Wformat]
    printf("%ld.%06ld\n", tv_diff.tv_sec, (long int)tv_diff.tv_usec);
            ~~~           ^~~~~~~~~~~~~~
            %d
1 warning generated.
11 years ago
Anoop Saldanha 3749fc98fd Modify handling of negated content.
The old behaviour of returning a failure if we found a pattern while
matching on negated content is now changed to continuing searching
for other combinations where we don't find the pattern for the
negated content.

Thanks to Will Metcalf for reporting this.
12 years ago
Ken Steele e05034f5dd New Multi-pattern matcher, ac-tile, optimized for Tile architecture.
Aho-Corasick mpm optimized for Tilera Tile-Gx architecture. Based on the
util-mpm-ac.c code base. The primary optimizations are:
1) Matching function used Tilera specific instructions.
2) Alphabet compression to reduce delta table size to increase cache
   utilization  and performance.

The basic observation is that not all 256 ASCII characters are used by
the set of multiple patterns in a group for which a DFA is
created. The first reason is that Suricata's pattern matching is
case-insensitive, so all uppercase characters are converted to
lowercase, leaving a hole of 26 characters in the
alphabet. Previously, this hole was simply left in the middle of the
alphabet and thus in the generated Next State (delta) tables.

A new, smaller, alphabet is created using a translation table of 256
bytes per mpm group. Previously, there was one global translation
table for converting upper case to lowercase.

Additional, unused characters are found by creating a histogram of all
the characters in all the patterns. Then all the characters with zero
counts are mapped to one character (0) in the new alphabet. Since
These characters appear in no pattern, they can all be mapped to a
single character and still result in the same matches being
found. Zero was chosen for the value in the new alphabet since this
"character" is more likely to appear in the input. The unused
character always results in the next state being state zero, but that
fact is not currently used by the code, since special casing takes
additional instructions.

The characters that do appear in some pattern are mapped to
consecutive characters in the new alphabet, starting at 1. This
results in a dense packing of next state values in the delta tables
and additionally can allow for a smaller number of columns in that
table, thus using less memory and better packing into the cache. The
size of the new alphabet is the number of used characters plus 1 for
the unused catch-all character.

The alphabet size is rounded up to the next larger power-of-2 so that
multiplication by the alphabet size can be done with a shift.  It
might be possible to use a multiply instruction, so that the exact
alphabet size could be used, which would further reduce the size of
the delta tables, increase cache density and not require the
specialized search functions. The multiply would likely add 1 cycle to
the inner search loop.

Since the multiply by alphabet-size is cleverly merged with a mask
instruction (in the SINDEX macro), specialized versions of the
SCACSearch function are generated for alphabet sizes 256, 128, 64, 32
and 16.  This is done by including the file util-mpm-ac-small.c
multiple times with a redefined SINDEX macro. A function pointer is
then stored in the mpm context for the search function. For alpha bit
sizes of 8 or smaller, the number of states usually small, so the DFA
is already very small, so there is little difference using the 16
state search function.

The SCACSearch function is also specialized by the size of the value
stored in the next state (delta) tables, either 16-bits or 32-bits.
This removes a conditional inside the Search function. That
conditional is only called once, but doesn't hurt to remove
it. 16-bits are used for up to 32K states, with the sign bit set for
states with matches.

Future optimization:

The state-has-match values is only needed per state, not per next
state, so checking the next-state sign bit could be replaced with
reading a different value, at the cost of an additional load, but
increasing the 16-bit next state span to 64K.

Since the order of the characters in the new alphabet doesn't matter,
the new alphabet could be sorted by the frequency of the characters in
the expected input stream for that multi-pattern matcher. This would
group more frequent characters into the same cache lines, thus
increasing the probability of reusing a cache-line.

All the next state values for each state live in their own set of
cache-lines. With power-of-two sizes alphabets, these don't overlap.
So either 32 or 16 character's next states are loaded in each cache
line load. If the alphabet size is not an exact power-of-2, then the
last cache-line is not completely full and up to 31*2 bytes of that
line could be wasted per state.

The next state table could be transposed, so that all the next states
for a specific character are stored sequentially, this could be better
if some characters, for example the unused character, are much more
frequent.
12 years ago
Anoop Saldanha bd6896bee1 Unit-tests exposing a bug in byte_test, byte_jump and byte_extract.
Bug emanates from all the keywords being unable to handle negative offsets
when the inspection pointer is at the end of the buffer.
12 years ago
Anoop Saldanha ab4b15c2e7 fix for #788.
Now depth is kept in mind when we inspect chunks in client/server body.
This takes care of FPs originating from inspecting subsequent chunks that
match with depth, but shouldn't.
12 years ago
Anoop Saldanha aa363a8144 unittest to display #784. 13 years ago
Victor Julien 0c84a7a2a9 Use _mm_free for memory allocated by _mm_alloc. Bug 703. Minor compiler warning fixes. 13 years ago
Victor Julien 472e061c6d build: more checking for includes 13 years ago
Anoop Saldanha 19e8f82f25 Unittest to display #bug 529. pcre anchor not respected 13 years ago
Anoop Saldanha a34f91358d tests to highlight that
- suricata treates sigs with offset/depth without any packet keywords as stream sigs
- as a consequence suricata will FN on such sigs

The tests introduced here will fail, displaying the issues.  The
next patch in the series would fix the said issues.
13 years ago
Anoop Saldanha 37f66e5f46 update handling negative offsets in byte_extract. Also improve validation in byte_extract to not extract values out of the buffer range 13 years ago
Anoop Saldanha 603d4a719a remove det_ctx->payload_offset and use det_ctx->buffer_offset. Update hscd and hsmd to use the new generic content inspection engine 14 years ago
Anoop Saldanha d1d5507679 remove all old content inspection engines and references to them. We have cleaned the entire content inspection phase and improved alert accuracy 14 years ago
Anoop Saldanha 35f1f7e8d9 unify payload detection engines + fix other bugs in pcre init 14 years ago
Anoop Saldanha 7433d92dd2 undo this commit -
commit eff08f93d8
Author: Anoop Saldanha <poonaatsoc@gmail.com>
Date:   Thu Nov 3 14:31:24 2011 +0530

    update failing unittest to reflect the mpm design update

Fixed a bug in the mpm code that would make all the changes in the commit just undone wrong.
14 years ago
Anoop Saldanha eff08f93d8 update failing unittest to reflect the mpm design update 14 years ago
Anoop Saldanha ed3b44b3b5 fix parsing content keywords. We are more strict now. All content keywords need to be enclosed in double quotes. Better validation for sid, priority and rev keywords 14 years ago
Eric Leblond a85dc9b0e2 Add support for replace keyword.
This patch adds support for the replace keyword. It is used with
content to change selected part of the payload. The major point
with this patch is that having a replace keyword made necessary
to avoid all stream level check because we need to access to the
could-be-modified packet payload.

One of the main difficulty is to handle complex signature. If there is
other content check, we must do the substitution when we're sure all
match are valid. The patch adds an attribute to the thread context
variable to be able to deal with recursivity of the match function.

Replace is only activated in IPS mode and apply only to raw match.
14 years ago
Anoop Saldanha 35f3eafa5e byte extract added to the engine. Detection support added for packet payload, uri and dce detection engines 14 years ago
Victor Julien e16a566a96 Account for distance when checking within. Bug #285. 14 years ago
Victor Julien 987ce57a02 Wrap a number of BUG_ON's in the detection engine in DEBUG ifdefs as the conditions they check for are not serious enough to abort the engine. 15 years ago
Anoop Saldanha 2321a4dd58 support isdataat negation. Also fix addiing isdataat to appropriate lists 15 years ago
Anoop Saldanha c227aeeacb remove support for skipping reinspecting fast pattern contents once again during packet payload inspection. Also make some changes to our detection engine 15 years ago
Anoop Saldanha bbd0c5056b store the content added for mpm inside Signature. also carry out an unconditional cleanup of packet pattern matcher pmq det_ctx->pmq 15 years ago
Anoop Saldanha 6df051321f fix fp when content is negated and also added to mpm 15 years ago
Anoop Saldanha eade60f0fd make some name changes. break PopulateMpm(). Set the avoid mpm double check flags 15 years ago
Anoop Saldanha 46b4806d8e use a single populatempm() function to add the right content for mpm 15 years ago
Anoop Saldanha e54358a9e1 replace all Signature->pmatch instances in the engine with Signature->sm_lists[DETECT_SM_LIST_PMATCH] 15 years ago
Anoop Saldanha 59923316bc change the default recursion limit in the code to 3000, the value which we currently have in the conf file. Also change print modifier for printing timeval 15 years ago
Anoop Saldanha 5d9a453e0d find an optimal value for detect-engine:inspection-recursion_limit + unittest 15 years ago
Anoop Saldanha bc99328ec8 define a new conf paramter detect-engine:inspection-recursion-limit; Defines a recursion limit for content inspection code 15 years ago
Victor Julien 2b187a2721 Remove a BUG_ON statement from the payload inspection code. 15 years ago
Anoop Saldanha a85fa6b792 support for fast_pattern only and fast_pattern:offset,length. Also support the new option for engine-analysis 15 years ago
Anoop Saldanha 60c770c434 make pcre respect discontinue_matching flag in content matching functions 15 years ago
Anoop Saldanha 3536ba7348 fix seg fault due to premature cleanup/double cleanup for byte(jump|test), isdataat, on seeing no previous relative keywords 15 years ago
Victor Julien 1071a53210 Fix unittests after ip_proto keyword change. 15 years ago
Anoop Saldanha b94eaec7c2 implement relative pcre matching in detect-engine-(payload|uri|dcepayload).c. Also fix within/distance handling of RELATIVE_NEXT flag for uricontent 15 years ago
Anoop Saldanha ae3148aded fix false positives for a negated content case 15 years ago
Anoop Saldanha fa373516c5 fixes the offset case for content matches + a case not handled by the prevous fix for multiple relative content matches. fix for payload.c dcepayload.c and uri.c 15 years ago
Anoop Saldanha 5fb6981e9e content handling changes in detect-engine-payload.c for multiple relative matches 15 years ago
Victor Julien a0c1209a44 Inspect the reassembled stream together with the packet payload in the same direction. 15 years ago
William Metcalf 2eef905c07 GPL and Copyright header updates. 15 years ago
William Metcalf ce01927515 Import of GPLv2 Header 050410 15 years ago
Gurvinder Singh 719fa5f5e1 fixed the incorrect depth update incase of offset is 0 (bug 134) 15 years ago
Victor Julien 09b48d2697 Fix payload and uri detection inline errors in gnu99 15 years ago