Commit Graph

36 Commits (644d4dc61b466e28319c53af177878b4175b5241)

Author SHA1 Message Date
Victor Julien bb0cd0e883 prefilter: rename PatternMatcherQueue datatype
In preparation of the introduction of more general purpose prefilter
engines, rename PatternMatcherQueue to PrefilterRuleStore. The new
engines will fill this structure a similar way to the current mpm
prefilters.
9 years ago
Victor Julien 4c0ab681f2 mpm: remove Cleanup API call
It's unused by all of the implementations.
9 years ago
Victor Julien 371113e21e ac-ks: don't allow use on big-endian 10 years ago
Victor Julien 3979cb0e57 ac-ks: fix integer handling issue 10 years ago
Jason Ish 796dd5223b tests: no longer necessary to provide successful return code
1 pass, 0 is fail.
10 years ago
Victor Julien 9b6e292a28 mpm: remove unused max pattern len field 10 years ago
Victor Julien 9b3d4f7e24 mpm: unify & localize mpm pattern (id) handling
So far, the patterns as passed to the mpm's would use global id's that
were shared among all buffers, directions. This would lead to a fairly
large pattern id space. As the mpm algo's use the pattern id's to
prevent duplicate matching through a pattern id based bitarray,
shrinking this space will optimize performance.

This patch implements this. It sets a flag before adding the pattern
to the mpm ctx, instructing the mpm to ignore the provided pid and
handle pids management itself. This leads to a shrinking of the
bitarray size.

This is made possible by the previous work that removes the pid logic
from the code.

Next to this, this patch moves the pattern setup stage to common util
functions. This avoids code duplication.

Update ac, ac-bs and ac-ks to use this.
10 years ago
Victor Julien fa885e1d85 mpm: remove pattern id logic 10 years ago
Victor Julien e48d745ed7 mpm: constify search func args 10 years ago
Victor Julien 14d9ce7b2e detect/mpm: remove unused max_id param from API 10 years ago
Victor Julien 262abbb49f mpm: fix ac-ks compilation on cygwin 10 years ago
Victor Julien 9c2e374a3d ac-ks: fix mem leaks 10 years ago
Victor Julien 887ddf1ed8 mpm: introduce ac-ks
Introduce 'ac-ks' or the Kenneth Steele AC implementation. It's
actually 'ac-tile' written by Ken for the Tilera platform. This
patch adds support for it on other architectures as well.

Enable ac-tile for other archs as 'ac-ks'.

Fix a bunch of OOB reads in the loops that triggered ASAN.
10 years ago
Ken Steele 736ac6a459 Use SigIntId as the type for storing signature IDs (Internal)
Previously using uint32_t, but SigIntId is currently uint16_t, so arrays
will take less memory.
11 years ago
Ken Steele d01d3324fc Increase max pattern ID allowed in MPM AC-tile to 28-bits 11 years ago
Ken Steele 900def5caf Create Specialized SCMemcmpNZ() when the length can't be zero. 11 years ago
Ken Steele 83ed01a279 Fix compiler warnings in ac-tile.
Signed vs unsigned comparisons.
11 years ago
Ken Steele 7a2095d851 In AC-Tile, convert from using pids for indexing to pattern index
Use an MPM specific pattern index, which is simply an index starting
at zero and incremented for each pattern added to the MPM, rather than
the externally provided Pattern ID (pid), since that can be much
larger than the number of patterns. The Pattern ID is shared across at
MPMs. For example, an MPM with one pattern with pid=8000 would result
in a max_pid of 8000, so the pid_pat_list would have 8000 entries.

The pid_pat_list[] is replaced by a array of pattern indexes. The PID is
moved to the SCACTilePatternList as a single value. The PatternList is
also indexed by the Pattern Index.

max_pat_id is no longer needed and mpm_ctx->pattern_cnt is used instead.

The local bitarray is then also indexed by pattern index instead of PID, making
it much smaller. The local bit array sets a bit for each pattern found
for this MPM. It is only kept during one MPM search (stack allocated).

One note, the local bit array is checked first and if the pattern has already
been found, it will stop checking, but count a match. This could result in
over counting matches of case-sensitve matches, since following case-insensitive
matches will also be counted. For example, finding "Foo" in "foo Foo foo" would
report finding "Foo" 2 times, mis-counting the third word as "Foo".
11 years ago
Ken Steele 104a903478 Dynamically resize pmq->rule_id_array
Rather than statically allocate 64K entries in every rule_id_array,
increase the size only when needed. Created a new function MpmAddSids()
to check the size before adding the new sids. If the array is not large
enough, it calls MpmAddSidsResize() that calls realloc and does error
checking. If the realloc fails, it prints an error and drops the new sids
on the floor, which seems better than exiting Suricata.

The size is increased to (current_size + new_count) * 2. This handles the
case where new_count > current_size, which would not be handled by simply
using current_size * 2. It should also be faster than simply reallocing to
current_size + new_count, which would then require another realloc for each
new addition.
11 years ago
Ken Steele d03f124445 Implement MPM opt for b2g, b3g, wumanber
Found problems in b2gm and b2gc, so those are removed.
11 years ago
Ken Steele edaefe5af2 Fix AC-tile for new pattern ID array. 11 years ago
Ken Steele 3f86c5a83f Fix memory leak in ac-tile
Incorrectly reallocing the goto table after it was freed by calling
SCACTileReallocState() when really only want to realloc the output table.
This was causing a large goto table to be allocated and never used or
freed.
11 years ago
Ken Steele b9e20ab4b8 Clean up memory leaks in ac-tile code
Free some memory at exit that was not getting freed.

Change pid_pat_list to store copy of case-strings in the same block
of memory as the array of pointers.
11 years ago
Ken Steele 033ad9e974 Reduce reallocation in AC Tile MPM creation.
Exponentially increase the memory allocated for new states when adding new
states, then at the end resize down to the actually final size so that no space is wasted.
11 years ago
Ken Steele 970f22c752 Move memcpy_lower() into new util-memcpy.h
Remove local copies from each MPM file and use include file instead.
Might be better to also add util-memcpy.c rather than inlining it each time,
to get smaller code, since only seems to be used at initialization.
12 years ago
Ken Steele 6b1517c0b8 Remove case_state usage
The case_state in MPMs was just to track when a pid could have no-case and
case-sensitive matches for the same PID. Now that can't happen after fixing
bug 1110, so remove the code and storage for case_state.
12 years ago
Ken Steele c41041a9c7 When assigning Pattern IDs pids, check Case flags
This fixes bug 1110. When assigning PIDs, use the NO_CASE flag when comparing
for duplicates. The state of the flag must be the same, but also use the same
type of comparisons when checking for duplicates.

Previously, "foo":CS would match with "foo":CI when it should not.
and "foo":CI would not match "FoO":CI when it should. Both of those
cases are fixed with this change.

This then allows simplifying the use of pid in MPMs because now if they
pids match, then so do the flags, so checking the flags is not required.
12 years ago
Ken Steele b7baa561c0 Cleanup in ac-tile MPM
Remove return from void functions.
Add some commments
Remove inline on functions where it doesn't make sense.
Rewrote if statement to be more clear.
12 years ago
Ken Steele 1f99096b30 Fix PmqSetup() argument removal in ac-tile MPM unit tests.
Needed to remove the second argument from all the calls, which was always 0
and was removed in other tests in a previous checkin.
12 years ago
Ken Steele ba4758d033 Port unittest from bug #970 for util-mpm-ac.c to util-mpm-ac-tile.c
Passes on ac-tile too.
12 years ago
Ken Steele 326d5d3e15 Add 8-bit states to ac-tile
When running with sgh-mpm-context: full, many more MPMs are created
(16K) and many are small. If they have less than 128 states, they only
need 1 byte for the next state instead of 2 bytes, cutting the size of
the next-state table in half. This reduces total memory usage.

Since that makes 3 different state sizes (1, 2 and 4 bytes), rather
than going from 2 copies of the code to create the MPM to 3, I
factored out the code that fills the next-state table into three
functions so that all the other code could be the same.

The search function is now parameterize for 8-bit and 16-bit state
sizes and alphabet sizes 8, 16, 32, 64, 128 and 256.
12 years ago
Eric Leblond 1f07d1521e Fix realloc error handling
This patch is fixing realloc error handling. In case of a realloc
failure, it free the initial memory and continue existing error
handling.

The patch has been obtained via the following semantic patch and
a bit oh hand editing:

@@
expression x, E;
identifier f;
@@

f(...)
{
+ void *ptmp;
<+...
- x = SCRealloc(x, E);
+ ptmp = SCRealloc(x, E);
... when != x
- if (x == NULL)
+ if (ptmp == NULL)
{
+ SCFree(x);
+ x = NULL;
...
- }
+ } else {
+     x = ptmp;
+ }
...+>
}

@@
expression x, E;
identifier f;
statement ES;
@@

f(...) {
+ void *ptmp;

<+...
- x = SCRealloc(x, E);
+ ptmp = SCRealloc(x, E);
... when != x
- if (x == NULL) ES
+ if (ptmp == NULL) {
+ SCFree(x);
+ x = NULL;
+ ES
+ } else {
+     x = ptmp;
+ }
...+>

}

@@
expression x, E;
identifier f;
@@

f(...)
{
+ void *ptmp;
<+...
- x = SCRealloc(x, E);
+ ptmp = SCRealloc(x, E);
... when != x
- if (unlikely(x == NULL))
+ if (unlikely(ptmp == NULL))
{
+ SCFree(x);
+ x = NULL;
...
- }
+ } else {
+     x = ptmp;
+ }
...+>
}

@@
expression x, E;
identifier f;
statement ES;
@@

f(...) {
+ void *ptmp;

<+...
- x = SCRealloc(x, E);
+ ptmp = SCRealloc(x, E);
... when != x
- if (unlikely(x == NULL)) ES
+ if (unlikely(ptmp == NULL)) {
+ SCFree(x);
+ x = NULL;
+ ES
+ } else {
+     x = ptmp;
+ }
...+>

}
12 years ago
Ken Steele 3870def601 Split AC-Tile MPM context into Search and Initialization structures.
Some of the fields in the SCACTileCtx struct are only used to create the MPM,
but are not needed to search the MPM. Create a new structure to contain just
the data needed by AC Search. After creating the MPM, copy the data into the
new structure and then free the memory only needed during initialization.

This reduces the size of the AC-Tile MPM context from 1360 bytes down to 296
bytes.
12 years ago
Anoop Saldanha a49cbf8a49 Code cleanup.
Use the MpmAddPattern[CS|CI] wrapper to add patterns to the mpm context.

Also use MpmInitCtx() to init the mpm context.
12 years ago
Eric Leblond 79fcf1378a Use unlikely in malloc failure test.
This patch is a result of applying the following coccinelle
transformation to suricata sources:

  @istested@
  identifier x;
  statement S1;
  identifier func =~ "(SCMalloc|SCStrdup|SCCalloc|SCMallocAligned|SCRealloc)";
  @@

  x = func(...)
  ... when != x
  - if (x == NULL) S1
  + if (unlikely(x == NULL)) S1
12 years ago
Ken Steele e05034f5dd New Multi-pattern matcher, ac-tile, optimized for Tile architecture.
Aho-Corasick mpm optimized for Tilera Tile-Gx architecture. Based on the
util-mpm-ac.c code base. The primary optimizations are:
1) Matching function used Tilera specific instructions.
2) Alphabet compression to reduce delta table size to increase cache
   utilization  and performance.

The basic observation is that not all 256 ASCII characters are used by
the set of multiple patterns in a group for which a DFA is
created. The first reason is that Suricata's pattern matching is
case-insensitive, so all uppercase characters are converted to
lowercase, leaving a hole of 26 characters in the
alphabet. Previously, this hole was simply left in the middle of the
alphabet and thus in the generated Next State (delta) tables.

A new, smaller, alphabet is created using a translation table of 256
bytes per mpm group. Previously, there was one global translation
table for converting upper case to lowercase.

Additional, unused characters are found by creating a histogram of all
the characters in all the patterns. Then all the characters with zero
counts are mapped to one character (0) in the new alphabet. Since
These characters appear in no pattern, they can all be mapped to a
single character and still result in the same matches being
found. Zero was chosen for the value in the new alphabet since this
"character" is more likely to appear in the input. The unused
character always results in the next state being state zero, but that
fact is not currently used by the code, since special casing takes
additional instructions.

The characters that do appear in some pattern are mapped to
consecutive characters in the new alphabet, starting at 1. This
results in a dense packing of next state values in the delta tables
and additionally can allow for a smaller number of columns in that
table, thus using less memory and better packing into the cache. The
size of the new alphabet is the number of used characters plus 1 for
the unused catch-all character.

The alphabet size is rounded up to the next larger power-of-2 so that
multiplication by the alphabet size can be done with a shift.  It
might be possible to use a multiply instruction, so that the exact
alphabet size could be used, which would further reduce the size of
the delta tables, increase cache density and not require the
specialized search functions. The multiply would likely add 1 cycle to
the inner search loop.

Since the multiply by alphabet-size is cleverly merged with a mask
instruction (in the SINDEX macro), specialized versions of the
SCACSearch function are generated for alphabet sizes 256, 128, 64, 32
and 16.  This is done by including the file util-mpm-ac-small.c
multiple times with a redefined SINDEX macro. A function pointer is
then stored in the mpm context for the search function. For alpha bit
sizes of 8 or smaller, the number of states usually small, so the DFA
is already very small, so there is little difference using the 16
state search function.

The SCACSearch function is also specialized by the size of the value
stored in the next state (delta) tables, either 16-bits or 32-bits.
This removes a conditional inside the Search function. That
conditional is only called once, but doesn't hurt to remove
it. 16-bits are used for up to 32K states, with the sign bit set for
states with matches.

Future optimization:

The state-has-match values is only needed per state, not per next
state, so checking the next-state sign bit could be replaced with
reading a different value, at the cost of an additional load, but
increasing the 16-bit next state span to 64K.

Since the order of the characters in the new alphabet doesn't matter,
the new alphabet could be sorted by the frequency of the characters in
the expected input stream for that multi-pattern matcher. This would
group more frequent characters into the same cache lines, thus
increasing the probability of reusing a cache-line.

All the next state values for each state live in their own set of
cache-lines. With power-of-two sizes alphabets, these don't overlap.
So either 32 or 16 character's next states are loaded in each cache
line load. If the alphabet size is not an exact power-of-2, then the
last cache-line is not completely full and up to 31*2 bytes of that
line could be wasted per state.

The next state table could be transposed, so that all the next states
for a specific character are stored sequentially, this could be better
if some characters, for example the unused character, are much more
frequent.
12 years ago