295 Commits (a2bc0080932e2971590b83011df109373f7aca66)

Author SHA1 Message Date
Victor Julien 9cab3ea2cd http_stat_code: mpm prefilter engine 9 years ago
Victor Julien 4d57b2fc63 http_stat_msg: mpm prefilter engine 9 years ago
Victor Julien 86d303e32b http_raw_host: mpm prefilter engine 9 years ago
Victor Julien 5218849213 http_host: mpm prefilter engine 9 years ago
Victor Julien 61c3748fc4 http_user_agent: mpm prefilter engine 9 years ago
Victor Julien a43a69305d http_cookie: mpm prefilter engine 9 years ago
Victor Julien 7a46364e42 http_raw_uri: mpm prefilter engine 9 years ago
Victor Julien 746a169127 dns_query: mpm prefilter engine 9 years ago
Victor Julien 9ff5703c49 packet/stream: mpm prefilter engine 9 years ago
Victor Julien 72f2a78b1f http_method: mpm prefilter engine 9 years ago
Victor Julien b62c4cc359 http_uri: mpm prefilter engine
Inspect partial request line as well.
9 years ago
Victor Julien 4c0ab681f2 mpm: remove Cleanup API call
It's unused by all of the implementations.
9 years ago
Mats Klepsland 4172c4c8ac tls: add (mpm) keyword tls_cert_subject
This keyword is a replacement for tls.subject.
9 years ago
Mats Klepsland 9b2717799c tls: add (mpm) keyword tls_cert_issuer
This keyword is a replacement for tls.issuerdn.
9 years ago
Victor Julien ec0217f52c detect: minor style fixes 9 years ago
Victor Julien b3bf7a5729 output: introduce config and perf output levels
Goal is to reduce info output
9 years ago
Victor Julien 371113e21e ac-ks: don't allow use on big-endian 9 years ago
Justin Viiret c9d0d6f698 mpm: add "auto" default for mpm-algo
Setting mpm-algo to "auto" will use "hs" if Suricata was built against
Hyperscan, and "ac" otherwise (or "ac-tile" on Tilera platforms).
9 years ago
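The "auto" selection described in c9d0d6f698 can be sketched roughly as below. This is a minimal illustration, not Suricata's code: `resolve_mpm_algo` and its two flag parameters are hypothetical stand-ins for build-time checks, and the relative priority of Hyperscan over Tilera is an assumption.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: map the configured mpm-algo value to a concrete
 * matcher name. built_with_hs and on_tilera stand in for build-time
 * checks; checking Hyperscan first is an assumption. */
static const char *resolve_mpm_algo(const char *configured,
                                    int built_with_hs, int on_tilera)
{
    assert(configured != NULL);

    if (strcmp(configured, "auto") != 0)
        return configured;          /* an explicit choice is kept as-is */
    if (built_with_hs)
        return "hs";                /* built against Hyperscan */
    return on_tilera ? "ac-tile" : "ac";
}
```

An explicit setting such as "ac-bs" passes through unchanged; only "auto" is resolved.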
Mats Klepsland a13df67864 detect: add (mpm) keyword for tls_sni
Match on server name indication (SNI) extension in TLS using tls_sni
keyword, e.g.:

alert tls any any -> any any (msg:"SNI test"; tls_sni;
        content:"example.com"; sid:12345;)
9 years ago
maxtors 9d3fd82849 Removed duplicate include statements. 9 years ago
Victor Julien d085362e61 detect: fix error handling in mpm setup
*** CID 1358124:  Null pointer dereferences  (REVERSE_INULL)
/src/detect-engine-mpm.c: 940 in MpmStoreSetup()
934                     PopulateMpmHelperAddPatternToPktCtx(ms->mpm_ctx,
935                             cd, s, 0, (cd->flags & DETECT_CONTENT_FAST_PATTERN_CHOP));
936                 }
937             }
938         }
939
>>>     CID 1358124:  Null pointer dereferences  (REVERSE_INULL)
>>>     Null-checking "ms->mpm_ctx" suggests that it may be null, but it has already been dereferenced on all paths leading to the check.
940         if (ms->mpm_ctx != NULL) {
941             if (ms->mpm_ctx->pattern_cnt == 0) {
942                 MpmFactoryReClaimMpmCtx(de_ctx, ms->mpm_ctx);
943                 ms->mpm_ctx = NULL;
944             } else {
945                 if (ms->sgh_mpm_context == MPM_CTX_FACTORY_UNIQUE_CONTEXT) {
9 years ago
Victor Julien 5b1d75f0bd detect: suppress output 9 years ago
Victor Julien ac2c206359 mpm: clean up builtin mpm setup, enable single/full 9 years ago
Victor Julien 6ef27c9f92 mpm: allow app buffer shared/unique
Allow setting of shared or unique setting per app buffer type:
e.g. detect.mpm.http_uri.shared=true
9 years ago
Victor Julien 79a96b2b90 mpm: refactor 'single' setup handling 9 years ago
Victor Julien 157ca89dd7 mpm: remove useless flag from factory 9 years ago
Victor Julien 87f3adbe4c detect/mpm: unify packet/stream mpm_ctx pointers
SGH's for tcp and udp are now always only per proto and per direction.
This means we can simply reuse the packet and stream mpm pointers.

The SGH's for the other protocols already used a directionless catch
all mpm pointer.
9 years ago
Victor Julien 6bb2b001a3 mpm: cleanup: move mpm funcs into buffer specific files 9 years ago
Victor Julien c880b79f45 detect: shrink sgh
Turn list of mpm_ctx pointers into a union so that we don't waste
space. The sgh's for tcp and udp are in one direction only, so the
ts and tc ones are now in the union.
9 years ago
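The space saving in c880b79f45 comes from overlaying pointers that are never needed at the same time. A minimal sketch, assuming two per-direction pointers; the struct and field names are hypothetical and `MpmCtxStub` is an opaque stand-in for the real context type:

```c
#include <assert.h>
#include <stddef.h>

typedef struct MpmCtxStub_ MpmCtxStub;   /* opaque stand-in type */

/* Before: one slot per direction, even though only one is ever used. */
struct SghSeparate {
    MpmCtxStub *mpm_ts;
    MpmCtxStub *mpm_tc;
};

/* After: a tcp/udp group is for one direction only, so the two
 * pointers can share storage. */
struct SghUnion {
    union {
        MpmCtxStub *ts;
        MpmCtxStub *tc;
    } mpm;
};

/* Hypothetical accessor: either member reads the same slot. */
static const MpmCtxStub *sgh_mpm(const struct SghUnion *sgh)
{
    assert(sgh != NULL);
    return sgh->mpm.ts;
}
```

Which member is meaningful has to be tracked elsewhere, for example by the direction the group was built for.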
Victor Julien c804102a9a detect: move app_mpms array to init data 9 years ago
Victor Julien 9b3d4f7e24 mpm: unify & localize mpm pattern (id) handling
So far, the patterns as passed to the mpm's would use global id's that
were shared among all buffers, directions. This would lead to a fairly
large pattern id space. As the mpm algo's use the pattern id's to
prevent duplicate matching through a pattern id based bitarray,
shrinking this space will optimize performance.

This patch implements this. It sets a flag before adding the pattern
to the mpm ctx, instructing the mpm to ignore the provided pid and
handle pids management itself. This leads to a shrinking of the
bitarray size.

This is made possible by the previous work that removes the pid logic
from the code.

Next to this, this patch moves the pattern setup stage to common util
functions. This avoids code duplication.

Update ac, ac-bs and ac-ks to use this.
9 years ago
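The effect of localizing pattern ids in 9b3d4f7e24 can be illustrated with a small bitarray that suppresses duplicate matches. This is a sketch under stated assumptions: `LocalPidSet`, its helpers and the 64-pattern cap are hypothetical, and pids are assumed to run from 0 to pattern_cnt - 1 within one mpm context.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_LOCAL_PATTERNS 64   /* illustrative cap, not a real limit */

typedef struct {
    uint32_t pattern_cnt;                       /* valid pids: 0..cnt-1 */
    uint8_t seen[(MAX_LOCAL_PATTERNS + 7) / 8]; /* one bit per local pid */
} LocalPidSet;

/* Bytes needed for a bitarray covering pattern_cnt ids. */
static uint32_t pidset_bytes(uint32_t pattern_cnt)
{
    return (pattern_cnt + 7) / 8;
}

static void pidset_reset(LocalPidSet *s, uint32_t pattern_cnt)
{
    assert(pattern_cnt <= MAX_LOCAL_PATTERNS);
    s->pattern_cnt = pattern_cnt;
    memset(s->seen, 0, sizeof(s->seen));
}

/* Returns 1 the first time a pid is reported, 0 for a duplicate. */
static int pidset_add(LocalPidSet *s, uint32_t pid)
{
    uint8_t bit = (uint8_t)(1u << (pid % 8));

    assert(pid < s->pattern_cnt);
    if (s->seen[pid / 8] & bit)
        return 0;
    s->seen[pid / 8] |= bit;
    return 1;
}
```

With ids local to one context, the bitarray is sized by that context's pattern count rather than by the global id space, which is where the shrinkage comes from.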
Victor Julien ba9d43cce5 mpm: improve negated mpm
The idea is: if mpm is negated, it's both on mpm and nonmpm sid lists
and we can kick it out in that case during the merge sort.

It only works for patterns that are 'independent'. This means that the
rule doesn't need to only match if the negated mpm pattern is limited
to the first 10 bytes for example.

Or more generally, a negated mpm pattern that has depth, offset,
distance or within settings can't be handled this way. These patterns
are not added to the mpm at all, but just to the non-mpm list. This
makes sense as they will *always* need manual inspection.

Similarly, a pattern that is 'chopped' always needs validation. This
is because in this case we only inspect a part of the final pattern.
9 years ago
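The merge step described in ba9d43cce5 might look roughly like the following. It is a sketch only: `NonMpmEntry`, `merge_candidates` and the `negated_mpm` field are hypothetical names, and both input lists are assumed to be sorted by ascending sid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t sid;
    int negated_mpm;   /* rule has an 'independent' negated mpm pattern */
} NonMpmEntry;

/* Merge the sids whose mpm pattern matched with the rules that always
 * need inspection. A sid found on both lists is emitted once, unless
 * its mpm pattern is negated: then the match rules it out and it is
 * dropped. Returns the number of sids written to out. */
static size_t merge_candidates(const uint32_t *mpm, size_t m_cnt,
                               const NonMpmEntry *non, size_t n_cnt,
                               uint32_t *out)
{
    size_t m = 0, n = 0, o = 0;

    assert(out != NULL);
    while (m < m_cnt && n < n_cnt) {
        if (mpm[m] < non[n].sid) {
            out[o++] = mpm[m++];
        } else if (mpm[m] > non[n].sid) {
            out[o++] = non[n++].sid;
        } else {
            if (!non[n].negated_mpm)
                out[o++] = mpm[m];
            m++;
            n++;
        }
    }
    while (m < m_cnt)
        out[o++] = mpm[m++];
    while (n < n_cnt)
        out[o++] = non[n++].sid;
    return o;
}
```

A negated rule whose pattern did not match appears only on the non-mpm list and so stays a candidate.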
Victor Julien 9e71ef4c3b detect: remove signature pattern id reference 9 years ago
Victor Julien c1ad08d11e detect: remove stream pmq array 9 years ago
Victor Julien 4e8e591715 detect mpm: mpm store cleanup
Move all rule modification to the fast_pattern assignment.
9 years ago
Victor Julien c87fcb29ff detect mpm: fast_pattern assignment cleanup 9 years ago
Victor Julien 7c94077892 detect mpm: remove unused mpm flags 9 years ago
Victor Julien a34be23002 detect: simplify negated mpm handling 9 years ago
Victor Julien e48d745ed7 mpm: constify search func args 9 years ago
Victor Julien 26517b8b61 detect: mpm store frees mpm_ctx' it owns 9 years ago
Victor Julien 102a82fc7b detect: use mpm store for app layer mpms
Rework app-layer mpm setup and registration to make this possible.
9 years ago
Victor Julien fac2cc0560 detect: mpm deduplication
Create hash for mpm's that we can reuse. Have packet/stream mpms
use this.
9 years ago
Victor Julien f0ba00e51d detect: remove old unused code 9 years ago
Victor Julien d82df4eb8b detect-mpm: make sgh setup proto aware
Allow multi-proto, multi-direction sgh's.
9 years ago
Victor Julien 66b3dba676 detect: remove dead code 9 years ago
Victor Julien b3dcdb10be detect mpm: remove dead code 9 years ago
Victor Julien 14d9ce7b2e detect/mpm: remove unused max_id param from API 9 years ago
Victor Julien 0d3f671b55 detect: constify mpm/detect funcs 9 years ago
Justin Viiret c37195c95f mpm: pass offset, depth args to add functions
MpmAddPatternCI and MpmAddPatternCS had arguments for offset and depth,
but these were not being passed in by the caller.
9 years ago
Victor Julien 9c5ee76455 tcp: fix unlikely NULL-ptr dereference
If a TCP packet could not get a flow (flow engine out of flows/memory)
and there were *only* TCP inspecting rules with the direction
explicitly set to 'to_server', a NULL pointer deref could happen.

PacketPatternSearchWithStreamCtx would fall through to the 'to_client'
case which was not initialized.
9 years ago
Victor Julien c0b16fa2bb detect: allow for more than 64k mpm rules 10 years ago
Victor Julien da7bad7c1b mpm: improve debug output 10 years ago
Victor Julien 977074930b mpm: use IPPROTO_TCP for readability 10 years ago
Victor Julien a559c41295 mpm: optimize & debug validate
Wrappers are called only if a mpm_ctx is available. So remove the test
for a null ctx and replace it by a debug validation BUG_ON.
10 years ago
Victor Julien 0dd3b73db2 mpm: assume we'll likely have a mpm_ctx 10 years ago
Victor Julien 7c336f4190 mpm: indent fix, no functional change 10 years ago
Victor Julien a00d83f1f5 mpm: change direction checking in mpm wrappers
Instead of having reachable assertions, use DEBUG_VALIDATE_BUG_ON
10 years ago
Victor Julien e755913b4b mpm: minor fixes and cleanups 10 years ago
Victor Julien 2c8e8c2516 dns: rename type so its purpose is more clear 10 years ago
Giuseppe Longo 84dc73d9de mpm: implement prefiltering for smtp 10 years ago
Giuseppe Longo 41a1a9f4af find and replace HSBDMATCH by FILEDATA
This commit does a find and replace of the following:

- DETECT_SM_LIST_HSBDMATCH by DETECT_SM_LIST_FILEDATA
  sed -i 's/DETECT_SM_LIST_HSBDMATCH/DETECT_SM_LIST_FILEDATA/g' src/*

- HSBD by FILEDATA:
  sed -i 's/HSBDMATCH/FILEDATA/g' src/*
10 years ago
Victor Julien e7882da178 detect: introduce 'minimal' detect engine
The minimal detect engine has only the minimal memory use and setup
time. It's to be used for 'delayed' detect where the first detection
engine is essentially empty.

The threads setup are also minimal.
10 years ago
Ken Steele 3f3481e4d2 Fix indentation 11 years ago
Ken Steele 8f1d75039a Enforce function coding standard
Functions should be defined as:

int foo(void)
{
}

Rather than:
int foo(void) {
}

All functions were changed by a script to match this standard.
11 years ago
Ken Steele 970f22c752 Move memcpy_lower() into new util-memcpy.h
Remove local copies from each MPM file and use include file instead.
Might be better to also add util-memcpy.c rather than inlining it each time,
to get smaller code, since it only seems to be used at initialization.
12 years ago
Ken Steele cd1c18d981 Store case-insensitive patterns as lowercase.
This is required because SCMemcmpLowercase() expects its first argument
to be already lowercase for the comparison. This is done by using
memcpy_tolower() for NO_CASE patterns.

This addresses code review comments from Victor.
12 years ago
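Why cd1c18d981 needs the stored pattern to be lowercase can be seen in a small sketch. The helpers below are hypothetical stand-ins, not the real memcpy_tolower() or SCMemcmpLowercase(); they only assume the same contract, namely that the first argument of the comparison is already lowercase.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Copy len bytes, lowercasing each one (stand-in for the commit's helper). */
static void copy_lowercase(uint8_t *dst, const uint8_t *src, size_t len)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    for (i = 0; i < len; i++)
        dst[i] = (uint8_t)tolower(src[i]);
}

/* Returns 0 on match. Only the input side is lowercased, so 'lower'
 * must already be lowercase for the comparison to be meaningful. */
static int cmp_against_lowercase(const uint8_t *lower,
                                 const uint8_t *input, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        if (lower[i] != (uint8_t)tolower(input[i]))
            return 1;
    }
    return 0;
}
```

If a NO_CASE pattern were stored as "FoO", it would fail to match even the input "foo", because only the input side is folded.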
Ken Steele c41041a9c7 When assigning Pattern IDs (pids), check Case flags
This fixes bug 1110. When assigning PIDs, use the NO_CASE flag when comparing
for duplicates. The state of the flag must be the same, but also use the same
type of comparisons when checking for duplicates.

Previously, "foo":CS would match with "foo":CI when it should not,
and "foo":CI would not match "FoO":CI when it should. Both of those
cases are fixed with this change.

This then allows simplifying the use of pid in MPMs because now if the
pids match, then so do the flags, so checking the flags is not required.
12 years ago
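The duplicate check described in c41041a9c7 can be sketched as a comparison that takes the case flag into account. `Pat`, `PAT_NOCASE` and `same_pid` are hypothetical names used for illustration only.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

#define PAT_NOCASE 0x01   /* hypothetical stand-in for the NO_CASE flag */

typedef struct {
    const uint8_t *data;
    size_t len;
    uint8_t flags;
} Pat;

/* Returns 1 if a and b should share a pattern id: same length, same
 * case flag, and equal bytes under that flag's comparison rule. */
static int same_pid(const Pat *a, const Pat *b)
{
    size_t i;

    assert(a != NULL && b != NULL);
    if (a->len != b->len)
        return 0;
    if ((a->flags & PAT_NOCASE) != (b->flags & PAT_NOCASE))
        return 0;
    for (i = 0; i < a->len; i++) {
        uint8_t x = a->data[i];
        uint8_t y = b->data[i];

        if (a->flags & PAT_NOCASE) {
            x = (uint8_t)tolower(x);
            y = (uint8_t)tolower(y);
        }
        if (x != y)
            return 0;
    }
    return 1;
}
```

Under this rule "foo":CS and "foo":CI get different ids, while "foo":CI and "FoO":CI share one, which matches the two cases named in the commit message.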
Victor Julien 0ec375d95a stream msg: remove structure 12 years ago
Anoop Saldanha a49cbf8a49 Code cleanup.
Use the MpmAddPattern[CS|CI] wrapper to add patterns to the mpm context.

Also use MpmInitCtx() to init the mpm context.
12 years ago
Eric Leblond 79fcf1378a Use unlikely in malloc failure test.
This patch is a result of applying the following coccinelle
transformation to suricata sources:

  @istested@
  identifier x;
  statement S1;
  identifier func =~ "(SCMalloc|SCStrdup|SCCalloc|SCMallocAligned|SCRealloc)";
  @@

  x = func(...)
  ... when != x
  - if (x == NULL) S1
  + if (unlikely(x == NULL)) S1
12 years ago
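For readers unfamiliar with the result of that transformation, here is a small self-contained sketch. The `unlikely` macro below is defined locally on top of the GCC/Clang builtin `__builtin_expect`, with a plain fallback for other compilers; `dup_buffer` is a hypothetical function, and plain malloc() stands in for the SC* allocation wrappers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#if defined(__GNUC__) || defined(__clang__)
#define unlikely(expr) __builtin_expect(!!(expr), 0)
#else
#define unlikely(expr) (expr)
#endif

/* Hypothetical example: copy len bytes into a new NUL-terminated buffer. */
static char *dup_buffer(const char *src, size_t len)
{
    char *copy;

    assert(src != NULL);
    copy = malloc(len + 1);
    if (unlikely(copy == NULL))   /* hint: allocation failure is rare */
        return NULL;
    memcpy(copy, src, len);
    copy[len] = '\0';
    return copy;
}
```

The hint does not change behavior; it only tells the compiler which branch to lay out as the fall-through path.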
Victor Julien af311aee4e Minor fix for detection engine setup error check
cppcheck said:
[detect-engine-mpm.c:2075] -> [detect-engine-mpm.c:2075]: (style) Same expression on both sides of '||'.
12 years ago
Anoop Saldanha cfa2cda42b fix for bug #973.
An alternative solution for bug #970.

For a chopped pattern that, as a whole, is a duplicate of another
pattern, we assign a unique content id.
12 years ago
Ken Steele e05034f5dd New Multi-pattern matcher, ac-tile, optimized for Tile architecture.
Aho-Corasick mpm optimized for Tilera Tile-Gx architecture. Based on the
util-mpm-ac.c code base. The primary optimizations are:
1) The matching function uses Tilera specific instructions.
2) Alphabet compression to reduce delta table size to increase cache
   utilization and performance.

The basic observation is that not all 256 ASCII characters are used by
the set of multiple patterns in a group for which a DFA is
created. The first reason is that Suricata's pattern matching is
case-insensitive, so all uppercase characters are converted to
lowercase, leaving a hole of 26 characters in the
alphabet. Previously, this hole was simply left in the middle of the
alphabet and thus in the generated Next State (delta) tables.

A new, smaller, alphabet is created using a translation table of 256
bytes per mpm group. Previously, there was one global translation
table for converting upper case to lowercase.

Additional, unused characters are found by creating a histogram of all
the characters in all the patterns. Then all the characters with zero
counts are mapped to one character (0) in the new alphabet. Since
these characters appear in no pattern, they can all be mapped to a
single character and still result in the same matches being
found. Zero was chosen for the value in the new alphabet since this
"character" is more likely to appear in the input. The unused
character always results in the next state being state zero, but that
fact is not currently used by the code, since special casing takes
additional instructions.

The characters that do appear in some pattern are mapped to
consecutive characters in the new alphabet, starting at 1. This
results in a dense packing of next state values in the delta tables
and additionally can allow for a smaller number of columns in that
table, thus using less memory and better packing into the cache. The
size of the new alphabet is the number of used characters plus 1 for
the unused catch-all character.
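
The alphabet construction described so far can be sketched as follows. This is an illustration under assumptions, not the ac-tile code: `build_alphabet` is a hypothetical name, patterns are taken as NUL-terminated strings, and case folding uses tolower() in place of a lookup table.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill translate[256] for one mpm group and return the alphabet size
 * (number of used characters plus 1 for the unused catch-all, code 0). */
static int build_alphabet(const char **patterns, size_t n,
                          uint8_t translate[256])
{
    uint32_t histogram[256] = { 0 };
    size_t p, i;
    int c, next = 1;

    assert(patterns != NULL && translate != NULL);

    /* Count lowercased characters over all patterns in the group. */
    for (p = 0; p < n; p++) {
        size_t len = strlen(patterns[p]);
        for (i = 0; i < len; i++)
            histogram[tolower((unsigned char)patterns[p][i])]++;
    }

    /* Unused characters keep code 0; used ones get 1, 2, 3, ... */
    memset(translate, 0, 256);
    for (c = 0; c < 256; c++) {
        if (histogram[c] > 0)
            translate[c] = (uint8_t)next++;
    }

    /* Uppercase input bytes share the code of their lowercase form. */
    for (c = 'A'; c <= 'Z'; c++)
        translate[c] = translate[tolower(c)];

    return next;
}
```

For the patterns "abc" and "Cab" only three characters are used, so the alphabet has four symbols and every other input byte maps to 0.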

The alphabet size is rounded up to the next larger power-of-2 so that
multiplication by the alphabet size can be done with a shift.  It
might be possible to use a multiply instruction, so that the exact
alphabet size could be used, which would further reduce the size of
the delta tables, increase cache density and not require the
specialized search functions. The multiply would likely add 1 cycle to
the inner search loop.
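
A minimal sketch of the power-of-2 rounding and the shift-based table index, assuming a row-major delta table with one row per state. The function names are hypothetical, and this does not reproduce the SINDEX macro.

```c
#include <assert.h>
#include <stdint.h>

/* Round an alphabet size (1..256) up to the next power of two. */
static uint32_t round_up_pow2(uint32_t v)
{
    uint32_t p = 1;

    assert(v >= 1 && v <= 256);
    while (p < v)
        p <<= 1;
    return p;
}

/* log2 of a power of two, i.e. the shift that replaces the multiply. */
static uint32_t shift_for(uint32_t pow2)
{
    uint32_t shift = 0;

    while ((1u << shift) < pow2)
        shift++;
    return shift;
}

/* Index of (state, ch) in the delta table. Because ch is smaller than
 * the power-of-two row width, shift-and-or equals state * width + ch. */
static uint32_t delta_index(uint32_t state, uint32_t ch, uint32_t shift)
{
    assert(ch < (1u << shift));
    return (state << shift) | ch;
}
```

An alphabet of 27 symbols is padded to 32 columns, so 5 of every 32 entries per state are unused. That is the memory cost the paragraph above weighs against an extra multiply.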

Since the multiply by alphabet-size is cleverly merged with a mask
instruction (in the SINDEX macro), specialized versions of the
SCACSearch function are generated for alphabet sizes 256, 128, 64, 32
and 16.  This is done by including the file util-mpm-ac-small.c
multiple times with a redefined SINDEX macro. A function pointer is
then stored in the mpm context for the search function. For alphabet
sizes of 8 or smaller, the number of states is usually small, so the DFA
is already very small, and there is little difference using the 16
state search function.

The SCACSearch function is also specialized by the size of the value
stored in the next state (delta) tables, either 16-bits or 32-bits.
This removes a conditional inside the Search function. That
conditional is only called once, but doesn't hurt to remove
it. 16-bits are used for up to 32K states, with the sign bit set for
states with matches.
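
The 16-bit encoding with a match flag in the sign bit can be sketched as below. The macro and function names are hypothetical; the only assumption taken from the text is that 15 bits hold the state id and the top bit marks a state with matches.

```c
#include <assert.h>
#include <stdint.h>

#define STATE_MATCH_BIT 0x8000u   /* top bit: state has matches */
#define STATE_ID_MASK   0x7FFFu   /* low 15 bits: up to 32K states */

/* Pack a state id and its has-match flag into one 16-bit table entry. */
static uint16_t encode_next_state(uint16_t state_id, int has_match)
{
    assert(state_id <= STATE_ID_MASK);
    return (uint16_t)(state_id | (has_match ? STATE_MATCH_BIT : 0u));
}

static int next_state_has_match(uint16_t entry)
{
    return (entry & STATE_MATCH_BIT) != 0;
}

static uint16_t next_state_id(uint16_t entry)
{
    return (uint16_t)(entry & STATE_ID_MASK);
}
```

The search loop can then test for a match on the value it has already loaded for the transition, without touching a separate per-state table.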

Future optimization:

The state-has-match value is only needed per state, not per next
state, so checking the next-state sign bit could be replaced with
reading a different value, at the cost of an additional load, but
increasing the 16-bit next state span to 64K.

Since the order of the characters in the new alphabet doesn't matter,
the new alphabet could be sorted by the frequency of the characters in
the expected input stream for that multi-pattern matcher. This would
group more frequent characters into the same cache lines, thus
increasing the probability of reusing a cache-line.

All the next state values for each state live in their own set of
cache-lines. With power-of-two sized alphabets, these don't overlap.
So either 32 or 16 characters' next states are loaded in each cache
line load. If the alphabet size is not an exact power-of-2, then the
last cache-line is not completely full and up to 31*2 bytes of that
line could be wasted per state.

The next state table could be transposed, so that all the next states
for a specific character are stored sequentially. This could be better
if some characters, for example the unused character, are much more
frequent.
12 years ago
Anoop Saldanha 48cf0585fb Suricata upgrade to libhtp 0.5.x.
Remove the support for now unsupported personalities from libhtp -
TOMCAT_6_0, APACHE and APACHE_2_2.  We instead use the APACHE_2
personality.
12 years ago
Victor Julien 538da26812 Fix sgh mpm flags assignment 12 years ago
Eric Leblond 150cd39c6e detect-engine: do a direct update of flag
There is no reason not to update the flag directly. So do it
to avoid crashing the test.
12 years ago
Anoop Saldanha cd7b4fac40 remove unused pattern id assignment functions. Goodbye 12 years ago
Victor Julien f353fb630c DNS: convert dns_query to sticky buffer 12 years ago
Victor Julien 7292998a58 Content: set up sticky buffers like file_data and dce_stub_data w/o flags, but with a list variable 12 years ago
Victor Julien b3b554c269 Coverity 1038959: DNS mpm might use uninitialized variable 12 years ago
Anoop Saldanha fba95e9125 Remove mpm ctxs in the wrong direction.
A lot of http mpm ctxs have now been removed as a result of this.
12 years ago
Anoop Saldanha 3c2ddf04c1 Update mpm init ctx to not accept the final cuda_rc_module argument.
It was a part of our older architecture and is no longer used.
12 years ago
Victor Julien 33818c0272 DNS: fix CUDA build 12 years ago
Victor Julien 6645620c03 Merge SIG_FLAG_MPM_HTTP and SIG_FLAG_MPM_DNS into SIG_FLAG_MPM_APPLAYER, do the same for the _NEG variant. 12 years ago
Victor Julien 43ba5a677e DNS: enable mpm/fast_pattern support for dns_query 12 years ago
Anoop Saldanha 602c91ed41 Minor cosmetic changes to the cuda code.
Moved a couple of functions to more cuda relevant files;
Re-structured some data types.
12 years ago
Anoop Saldanha 17c763f855 Version 1 of AC Cuda. 12 years ago
Anoop Saldanha b787da5643 Remove all cuda related code in the engine except for the cuda api wrappers 12 years ago
Victor Julien ce99a07582 After some discussion we decided that var declarations inside a for statement are not in line with our coding style. So removing a bunch. Decision was not unanimous ^^. 13 years ago
Anoop Saldanha a3212f6a0f Minor fixes against the last set of patches for #564, 565, 581 + fp automation.
Rename struct DetectFigureFPAndId_t_ to DetectFPAndItsId_ and move its
definition from inside the function where it's used to the global namespace,
as requested on #suricata.

Rename DetectEngineContentModifiedBufferSetup to DetectEngineContentModifierBufferSetup.

Also rename DetectFigureFPAndId() to DetectSetFastPatternAndItsId().

Updated DetectSetFastPatternAndItsId() to not exit on failure and return error.
13 years ago
Anoop Saldanha 6de8b1ed53 fix for #564.
Get rid of the hash table, and use a single-one_time_alloc'ed array for
pattern id assignment.
13 years ago
Anoop Saldanha 4c6efa2d40 Update content id assignment.
All fp id assignment now happens in one go.
Also noticing a slight perf increase, probably emanating from improved cache
perf.
Removed irrelevant unittests as well.
13 years ago
Anoop Saldanha 60be1751d5 Figure out sig fp during validation stage, instead of staging stage. 13 years ago
Anoop Saldanha 601836d831 Fast pattern setup now configurable in our code.
You can either enable/disable fp for a particular type + set priority.
13 years ago
Anoop Saldanha 0b5d277254 code cleanup for all content based keywords. 13 years ago
Anoop Saldanha 3511f91bba Add support for the new keyword - http_raw_host header.
The corresponding pcre modifier would be 'Z'.
13 years ago
Anoop Saldanha c4ce19a1be Add support for a new keyword to inspect http_host header.
The corresponding content keyword would now be - http_host.
The corresponding pcre modifier would be W.
13 years ago
Anoop Saldanha 7a7cd6999e feature #558.
Print FP info in rule analysis + other cleanup.
13 years ago
Anoop Saldanha bca1b7c52a change default mpm to ac. Also default sgh-mpm-context is full. 13 years ago
Victor Julien 75cddabd8a fast_pattern: don't consider http_method, http_stat_code and http_stat_msg when automatically giving preference to a HTTP pattern over a stream pattern. 13 years ago