Commit Graph

36 Commits (2d25f12cda60ebaf8448f79ff94ee84c2969b329)

Author SHA1 Message Date
Ken Steele f9705377ae Remove pkt variable from Packet structure.
The uint8_t *pkt in the Packet structure always points to the memory
immediately following the Packet structure. It is better to simply
calculate that value every time than store the 8 byte pointer.
12 years ago
Ken Steele f16b339fc4 Rename checksums to level3_comp_csum and level4_comp_csum.
This will also sharing even more memory in the Packet_ structure.
12 years ago
Ken Steele c6a8d0ab6b Share Packet checksum values for TCP, UDP, IPv6. ICMPv4 and ICMPv6
Keep a separate checksum for IPV4, since a packet can have both an IPV4
checksum and a TCPV4 checksum, or IPV4 and UDPV4 checksum.

This will allow future sharing of more values.

Use PACKET_RESET_CHECKSUMS() in Unit Tests in place of setting the
individual checksum values.
12 years ago
Victor Julien 3470b07ea5 Fix several compile and runtime warnings found by clang 3.2 with the -fsanitize=address option. 12 years ago
Ken Steele e05034f5dd New Multi-pattern matcher, ac-tile, optimized for Tile architecture.
Aho-Corasick mpm optimized for Tilera Tile-Gx architecture. Based on the
util-mpm-ac.c code base. The primary optimizations are:
1) Matching function used Tilera specific instructions.
2) Alphabet compression to reduce delta table size to increase cache
   utilization  and performance.

The basic observation is that not all 256 ASCII characters are used by
the set of multiple patterns in a group for which a DFA is
created. The first reason is that Suricata's pattern matching is
case-insensitive, so all uppercase characters are converted to
lowercase, leaving a hole of 26 characters in the
alphabet. Previously, this hole was simply left in the middle of the
alphabet and thus in the generated Next State (delta) tables.

A new, smaller, alphabet is created using a translation table of 256
bytes per mpm group. Previously, there was one global translation
table for converting upper case to lowercase.

Additional, unused characters are found by creating a histogram of all
the characters in all the patterns. Then all the characters with zero
counts are mapped to one character (0) in the new alphabet. Since
These characters appear in no pattern, they can all be mapped to a
single character and still result in the same matches being
found. Zero was chosen for the value in the new alphabet since this
"character" is more likely to appear in the input. The unused
character always results in the next state being state zero, but that
fact is not currently used by the code, since special casing takes
additional instructions.

The characters that do appear in some pattern are mapped to
consecutive characters in the new alphabet, starting at 1. This
results in a dense packing of next state values in the delta tables
and additionally can allow for a smaller number of columns in that
table, thus using less memory and better packing into the cache. The
size of the new alphabet is the number of used characters plus 1 for
the unused catch-all character.

The alphabet size is rounded up to the next larger power-of-2 so that
multiplication by the alphabet size can be done with a shift.  It
might be possible to use a multiply instruction, so that the exact
alphabet size could be used, which would further reduce the size of
the delta tables, increase cache density and not require the
specialized search functions. The multiply would likely add 1 cycle to
the inner search loop.

Since the multiply by alphabet-size is cleverly merged with a mask
instruction (in the SINDEX macro), specialized versions of the
SCACSearch function are generated for alphabet sizes 256, 128, 64, 32
and 16.  This is done by including the file util-mpm-ac-small.c
multiple times with a redefined SINDEX macro. A function pointer is
then stored in the mpm context for the search function. For alpha bit
sizes of 8 or smaller, the number of states usually small, so the DFA
is already very small, so there is little difference using the 16
state search function.

The SCACSearch function is also specialized by the size of the value
stored in the next state (delta) tables, either 16-bits or 32-bits.
This removes a conditional inside the Search function. That
conditional is only called once, but doesn't hurt to remove
it. 16-bits are used for up to 32K states, with the sign bit set for
states with matches.

Future optimization:

The state-has-match values is only needed per state, not per next
state, so checking the next-state sign bit could be replaced with
reading a different value, at the cost of an additional load, but
increasing the 16-bit next state span to 64K.

Since the order of the characters in the new alphabet doesn't matter,
the new alphabet could be sorted by the frequency of the characters in
the expected input stream for that multi-pattern matcher. This would
group more frequent characters into the same cache lines, thus
increasing the probability of reusing a cache-line.

All the next state values for each state live in their own set of
cache-lines. With power-of-two sizes alphabets, these don't overlap.
So either 32 or 16 character's next states are loaded in each cache
line load. If the alphabet size is not an exact power-of-2, then the
last cache-line is not completely full and up to 31*2 bytes of that
line could be wasted per state.

The next state table could be transposed, so that all the next states
for a specific character are stored sequentially, this could be better
if some characters, for example the unused character, are much more
frequent.
12 years ago
Eric Leblond c5bd04f102 unittest: recycle packet before exit
To avoid an issue with flow validation, we need to recycle the packet
before cleaning the flow.
12 years ago
Eric Leblond 281d2f27f8 Fix compilation warning
A goto could lead to the use de_ctx without declaring it.
12 years ago
Anoop Saldanha 0febe5a410 fix for #760.
If udpv4 csum isn't calculated, udpv4-csum detection shouldn't run on the
csum.
13 years ago
Victor Julien a55ff64a1b Use GET_PKT_LEN and GET_PKT_DATA macro's 13 years ago
Anoop Saldanha a30a1e5950 fix for bug 675.
Fix icmpv6-csum to send the right length to calculate the csum.
13 years ago
Anoop Saldanha af92c2fa4b Unittest to show the issue we have with 674 - csum-icmpv6 sends
wrong length for csum calculation)
13 years ago
Eric Leblond e176be6fcc Use unlikely for error treatment.
When handling error case on SCMallog, SCCalloc or SCStrdup
we are in an unlikely case. This patch adds the unlikely()
expression to indicate this to gcc.

This patch has been obtained via coccinelle. The transformation
is the following:

@istested@
identifier x;
statement S1;
identifier func =~ "(SCMalloc|SCStrdup|SCCalloc)";
@@

x = func(...)
... when != x
- if (x == NULL) S1
+ if (unlikely(x == NULL)) S1
13 years ago
Anoop Saldanha 98a8234e0a csum function fixes. Improves alert accuracy. FPs on invalid-csums decoder rules fixed 13 years ago
Victor Julien cdba2f50d1 Various fixes and improvements based on feedback by Coverity analyzer. 14 years ago
Anoop Saldanha a4638fb0ad code cleanup - replace SigMatchAppendPacket with SigMatchAppendSMToList 14 years ago
Anoop Saldanha 2fa55a86fa Fix csum validation functions to not carry out csum calculation if respective headers are not present 14 years ago
Eric Leblond 8d635ddfc2 detect-csum: incomplete checksum is a valid checksum
This patch modify checksum match to not alert on packet with
incomplete checksum. They will be checksummed later and thus
can be considered as valid one.
14 years ago
Anoop Saldanha 420befb180 Changed my email address to anoopsaldanha at gmail dot com from my current one 14 years ago
Anoop Saldanha 54f8d56f48 Packet inspection keywords modified to not inspect pseudo packet 14 years ago
Victor Julien e1d4e16645 Simplify packet decoding macro's. 14 years ago
Victor Julien 6aa551c558 Small optimizations to IPV4 and TCP header parsing. 14 years ago
Victor Julien addab7b5ee Don't test the several packet detection checks against pseudo packets as the matches would not be meaningful anyway. Prevents a segv in the csum detection. 15 years ago
Eric Leblond a69bb94335 Checksum match: fix logic problem
This patch fixes a logic error in the checksum matches. In
case the protocol is not the one tested, the test must return
0 and not 1 (test matched).

Signed-off-by: Eric Leblond <eric@regit.org>
15 years ago
Victor Julien 014f62247a Another batch of clang fixes. Nothing really serious. Includes a couple of fixes for broken fixes from yesterday. 15 years ago
Victor Julien ac97bb7799 Fix a number of small clang issues. Clang doesn't know we exit on malloc errors during init. 15 years ago
Anoop Saldanha 82fd581b64 replace all sm lists (match, pmatch, dmatch, umatch, amatch, tmatch) with an array Signature->sm_lists[]. Replace all Signature->match instances in the engine with Signature->sm_lists[DETECT_SM_LIST_MATCH] 15 years ago
Gerardo Iglesias Galvan 9f4fae5b1a Fix inconsistent use of dynamic memory allocation 15 years ago
William Metcalf ce01927515 Import of GPLv2 Header 050410 15 years ago
Victor Julien 8b30226914 Detection keyword cleanup 16 years ago
Victor Julien b259e362cd Convert uricontent to use new scanning methods as well. Move http_method and http_cookie keywords out of pmatch list for now. 16 years ago
Pablo Rincon 25a3a5c6d8 Adding mem wrapper to debug runtime alloc()/free() functions. Fixing some memory leaks. 16 years ago
Gerardo Iglesias Galvan ba6d807a6e Improve information about errors on signature failure 16 years ago
Victor Julien ecf86f9c23 Rename to Suricata. 16 years ago
Gurvinder Singh a0f184866c http_cookie keywork support 16 years ago
Victor Julien 0d0ffb9963 Reorganize header inclusions. 16 years ago
Anoop Saldanha 22c0ec2bc5 Added support for the csum-<protocol> rules keyword to the detection engine. Keywords added are ipv4-csum, tcpv4-csum, tcpv6-csum, udpv4-csum, udpv6-csum, icmpv4-csum and icmpv6-csum 16 years ago