Commit Graph

55 Commits (dacb860d24f221fd7ad5a3fc775ad35f556717e0)

Author SHA1 Message Date
Jason Ish 796dd5223b tests: no longer necessary to provide successful return code
1 pass, 0 is fail.
9 years ago
Ken Steele 923a77e952 Change Match() function to take const SigMatchCtx*
The Match functions don't need a pointer to the SigMatch object, just the
context pointer contained inside, so pass the Context to the Match function
rather than the SigMatch object. This allows for further optimization.

Change SigMatch->ctx to have type SigMatchCtx* rather than void* for better
type checking. This requires adding type casts when using or assigning it.

The SigMatch contex should not be changed by the Match() funciton, so pass it
as a const SigMatchCtx*.
11 years ago
Victor Julien 02529b13a8 rule parser: set flag for optionless keywords
If a keyword doesn't have an argument, it should set the SIGMATCH_NOOPT
flag so the parser knows.
11 years ago
Victor Julien 8dbf7a0d78 Update tests to use AppLayerParserThreadCtx ptr instead of void. Fix a few bugs uncovered by this. 12 years ago
Victor Julien fdefb65be4 app-layer: rename AppLayerThreadCtx funcs
AppLayerParserGetCtxThread -> AppLayerParserThreadCtxAlloc
AppLayerParserDestroyCtxThread -> AppLayerParserThreadCtxFree
12 years ago
Anoop Saldanha 429c6388f6 App layer API rewritten. The main files in question are:
app-layer.[ch], app-layer-detect-proto.[ch] and app-layer-parser.[ch].

Things addressed in this commit:
- Brings out a proper separation between protocol detection phase and the
  parser phase.
- The dns app layer now is registered such that we don't use "dnstcp" and
  "dnsudp" in the rules.  A user who previously wrote a rule like this -

  "alert dnstcp....." or
  "alert dnsudp....."

  would now have to use,

  alert dns (ipproto:tcp;) or
  alert udp (app-layer-protocol:dns;) or
  alert ip (ipproto:udp; app-layer-protocol:dns;)

  The same rules extend to other another such protocol, dcerpc.
- The app layer parser api now takes in the ipproto while registering
  callbacks.
- The app inspection/detection engine also takes an ipproto.
- All app layer parser functions now take direction as STREAM_TOSERVER or
  STREAM_TOCLIENT, as opposed to 0 or 1, which was taken by some of the
  functions.
- FlowInitialize() and FlowRecycle() now resets proto to 0.  This is
  needed by unittests, which would try to clean the flow, and that would
  call the api, AppLayerParserCleanupParserState(), which would try to
  clean the app state, but the app layer now needs an ipproto to figure
  out which api to internally call to clean the state, and if the ipproto
  is 0, it would return without trying to clean the state.
- A lot of unittests are now updated where if they are using a flow and
  they need to use the app layer, we would set a flow ipproto.
- The "app-layer" section in the yaml conf has also been updated as well.
12 years ago
Victor Julien 3e604b8703 pcre: parsing cleanup
Remove all flags indicating the buffer type. They were only used
at parse time.

Because of this the DetectPcreData_ structure could shrink to 32
bytes.
12 years ago
Victor Julien 958938bf01 Bug 640: add more tests to validate that issue is fixed 12 years ago
Anoop Saldanha 8ae92c7a5e Add unittest to test for http ambiguous host header.
Previously we would not check the port part of the host from the uri
hostname, while we did use the port part from the host header, leading
to FPs.
12 years ago
Ken Steele e05034f5dd New Multi-pattern matcher, ac-tile, optimized for Tile architecture.
Aho-Corasick mpm optimized for Tilera Tile-Gx architecture. Based on the
util-mpm-ac.c code base. The primary optimizations are:
1) Matching function used Tilera specific instructions.
2) Alphabet compression to reduce delta table size to increase cache
   utilization  and performance.

The basic observation is that not all 256 ASCII characters are used by
the set of multiple patterns in a group for which a DFA is
created. The first reason is that Suricata's pattern matching is
case-insensitive, so all uppercase characters are converted to
lowercase, leaving a hole of 26 characters in the
alphabet. Previously, this hole was simply left in the middle of the
alphabet and thus in the generated Next State (delta) tables.

A new, smaller, alphabet is created using a translation table of 256
bytes per mpm group. Previously, there was one global translation
table for converting upper case to lowercase.

Additional, unused characters are found by creating a histogram of all
the characters in all the patterns. Then all the characters with zero
counts are mapped to one character (0) in the new alphabet. Since
These characters appear in no pattern, they can all be mapped to a
single character and still result in the same matches being
found. Zero was chosen for the value in the new alphabet since this
"character" is more likely to appear in the input. The unused
character always results in the next state being state zero, but that
fact is not currently used by the code, since special casing takes
additional instructions.

The characters that do appear in some pattern are mapped to
consecutive characters in the new alphabet, starting at 1. This
results in a dense packing of next state values in the delta tables
and additionally can allow for a smaller number of columns in that
table, thus using less memory and better packing into the cache. The
size of the new alphabet is the number of used characters plus 1 for
the unused catch-all character.

The alphabet size is rounded up to the next larger power-of-2 so that
multiplication by the alphabet size can be done with a shift.  It
might be possible to use a multiply instruction, so that the exact
alphabet size could be used, which would further reduce the size of
the delta tables, increase cache density and not require the
specialized search functions. The multiply would likely add 1 cycle to
the inner search loop.

Since the multiply by alphabet-size is cleverly merged with a mask
instruction (in the SINDEX macro), specialized versions of the
SCACSearch function are generated for alphabet sizes 256, 128, 64, 32
and 16.  This is done by including the file util-mpm-ac-small.c
multiple times with a redefined SINDEX macro. A function pointer is
then stored in the mpm context for the search function. For alpha bit
sizes of 8 or smaller, the number of states usually small, so the DFA
is already very small, so there is little difference using the 16
state search function.

The SCACSearch function is also specialized by the size of the value
stored in the next state (delta) tables, either 16-bits or 32-bits.
This removes a conditional inside the Search function. That
conditional is only called once, but doesn't hurt to remove
it. 16-bits are used for up to 32K states, with the sign bit set for
states with matches.

Future optimization:

The state-has-match values is only needed per state, not per next
state, so checking the next-state sign bit could be replaced with
reading a different value, at the cost of an additional load, but
increasing the 16-bit next state span to 64K.

Since the order of the characters in the new alphabet doesn't matter,
the new alphabet could be sorted by the frequency of the characters in
the expected input stream for that multi-pattern matcher. This would
group more frequent characters into the same cache lines, thus
increasing the probability of reusing a cache-line.

All the next state values for each state live in their own set of
cache-lines. With power-of-two sizes alphabets, these don't overlap.
So either 32 or 16 character's next states are loaded in each cache
line load. If the alphabet size is not an exact power-of-2, then the
last cache-line is not completely full and up to 31*2 bytes of that
line could be wasted per state.

The next state table could be transposed, so that all the next states
for a specific character are stored sequentially, this could be better
if some characters, for example the unused character, are much more
frequent.
12 years ago
Eric Leblond cd3e32ce19 unittests: some functions needs a flow lock.
In debug validation mode, it is required to call application layer
parsing and other functions with a lock on flow. This patch updates
the code to do so.
12 years ago
Anoop Saldanha 4c6efa2d40 Update content id assignment.
All fp id assignment now happens in one go.
Also noticing a slight perf increase, probably emanating from improved cache
perf.
Removed irrelevant unittests as well.
13 years ago
Anoop Saldanha f8ae53ac02 Further customize content modifier buffer registration.
Allow modifier setups functions to have CustomCallbacks to enable their
internal conditions.
13 years ago
Anoop Saldanha a304a98d1d http_* setup unified. 13 years ago
Anoop Saldanha 0b5d277254 code cleanup for all content based keywords. 13 years ago
Anoop Saldanha a308d718ae Allow the use of relative without the presence of a related previous keyword. 13 years ago
Last G 8ae11f73b2 Added parentheses to fix Eclipse static code analysis
Fixed bug in action priority (REJECT_DST had lowest prio)
13 years ago
Eric Leblond 6842545331 Add documentation url in list-keyword output.
The output of the list-keyword is modified to include the url to
the keyword documentation when this is available. All documented
keywords should have their link set.

list-keyword can be used with an optional value:
 no option or short: display list of keywords
 csv: display a csv output on info an all keywords
 all: display a human readable output of keywords info
 $KWD: display the info about one keyword.
13 years ago
Victor Julien 472e061c6d build: more checking for includes 13 years ago
Anoop Saldanha 97308674ee All http_http_header modified patterns now are DETECT_CONTENT and not DETECT_AL_HTTP_HEADER 14 years ago
Anoop Saldanha dcb2afb02f Use sm_list to differentiate between different content types while retrieving pattern ids instead of sm_type 14 years ago
Anoop Saldanha 9a6aef459e modify all relevant app layer API calls to accomodate passing parser local storage argument 14 years ago
Victor Julien 262a7300d7 flow: shrink Flow datatype
Introduce a separate FlowAddress structure for holding the ipv4 or ipv6 address
that doesn't have the family in it like the Address structure. Instead, the
family is stored in the flow as a flag: FLOW_IPV4 and FLOW_IPV6.

Add macro's to check the family, copy the address, etc.

Update many unittests to reflect these changes. Introduce unittest helper
functions for creating and initializing a flow and freeing it again.

On 64 bit this shrinks the flow with 8 bytes.
14 years ago
Victor Julien 06904c9024 App Layer cleanup
Removal of per flow 'aldata' array. It contained a ptr for each ALPROTO. Instead now we have 2 ptrs in the flow: alparser and alstate.
Various cleanups and dead code removal from the app layer API.
Should safe 100+ bytes memory per flow on 64 bit.
Updated lots of unittests to reflect these changes.
14 years ago
Eric Leblond 60a99915c1 doc: create http support group
This patch create an httplayer group and adds related files to
it. It also fixes some typo in documentation string and format.
14 years ago
Anoop Saldanha ed3b44b3b5 fix parsing content keywords. We are more strict now. All content keywords need to be enclosed in double quotes. Better validation for sid, priority and rev keywords 14 years ago
Victor Julien 1d971b53a6 Update all unittests 15 years ago
Victor Julien 6131dec8a1 Fix a compiler warning due to a broken prototype declaration. 15 years ago
Anoop Saldanha 8bd6a38318 support relative pcre for http header. All pcre processing for http header moved to hhd engine 15 years ago
Anoop Saldanha 6648d1faf0 allow sigs for http uri of the form content:one; content:two; distance:0; http_[raw_]header; 15 years ago
Anoop Saldanha 7ec0382774 support fast pattern for http raw header. Also support relative modifiers for http raw header 15 years ago
Anoop Saldanha c61c68fd36 mpm and fast pattern support for http_header. Also support relative modifiers for http_header 15 years ago
Anoop Saldanha fc46f216ca detect-http-header.c cleanup before we start working on it 15 years ago
Anoop Saldanha 4c53a9d606 unifying content structure - http_header now uses DetectContentData 15 years ago
Anoop Saldanha a7353be20d replace all Signature->amatch instances in the engine with Signature->sm_lists[DETECT_SM_LIST_AMATCH] 15 years ago
Anoop Saldanha e54358a9e1 replace all Signature->pmatch instances in the engine with Signature->sm_lists[DETECT_SM_LIST_PMATCH] 15 years ago
Anoop Saldanha 82fd581b64 replace all sm lists (match, pmatch, dmatch, umatch, amatch, tmatch) with an array Signature->sm_lists[]. Replace all Signature->match instances in the engine with Signature->sm_lists[DETECT_SM_LIST_MATCH] 15 years ago
Victor Julien 001f91056e Add http_raw_header as an alias to the http_header keyword as that actually inspects the raw headers (see issue #243). Closes issue #242. 15 years ago
Anoop Saldanha 0c5b82d891 provide separate ids for content, uricontent, http_(client_body_data|cookie|header|method|uri), when they share the same pattern 15 years ago
Victor Julien fc248ca7a1 Many small performance updates. 15 years ago
Pablo Rincon f225bd1428 Adding modifiers /C /H and /M to pcre (http cookie, header and method) 15 years ago
Victor Julien 1071a53210 Fix unittests after ip_proto keyword change. 15 years ago
Pablo Rincon 169cb22dc6 Updating other http modifiers for sigs with fast_pattern option 15 years ago
William Metcalf 0e4235cc94 FLOW_DESTROY added to clean-up UT's that init flow 15 years ago
Victor Julien 2f29b8a724 Improve detection of app layer, making sure we only handle app layer on 'established' packets. Should really fix #166. 15 years ago
Pablo Rincon 8cc525c939 UDP support at AppLayer message handling 15 years ago
William Metcalf cc76aa4bc6 properly init flows inside of unit-tests caused lock-up when falling back to using mutex locks 15 years ago
Gurvinder Singh cda664a8c4 memroy leaks fixes in detection module, app layer and counters 15 years ago
Victor Julien 70b32f7380 First stab at creating a stateful detection engine.
Stateful detection for app layer detection keywords, except uricontent. Stores it's partial results in the flow structure. Other modifications:

- Generalize transaction tracking, logging and inspection.
- Adapt http and dcerpc to use the new transaction handling.
- Stream engine now always notifies app layer of a stream eof.

This commit fixes bug #124.
15 years ago
William Metcalf 8d66323f62 clang fixes for null derefrences 15 years ago