mirror of https://github.com/OISF/suricata
doc: fast-pattern
parent
a464573230
commit
33e96c5087
@ -0,0 +1,186 @@
|
||||
Suricata Fast Pattern Determination Explained
|
||||
=============================================
|
||||
|
||||
If the 'fast_pattern' keyword is explicitly set in a rule, Suricata
|
||||
will use that as the fast pattern match. The 'fast_pattern' keyword
|
||||
can only be set once per rule. If 'fast_pattern' is not set, Suricata
|
||||
automatically determines the content to use as the fast pattern match.
|
||||
|
||||
The following explains the logic Suricata uses to automatically
|
||||
determine the fast pattern match to use.
|
||||
|
||||
Be aware that if there are positive (i.e. non-negated) content
|
||||
matches, then negated content matches are ignored for fast pattern
|
||||
determination. Otherwise, negated content matches are considered.
|
||||
|
||||
Suricata 1.1.x - 1.4.x
|
||||
----------------------
|
||||
|
||||
#. The longest (in terms of character/byte length) content match is
|
||||
used as the fast pattern match.
|
||||
|
||||
#. If multiple content matches qualify for the longest length, the one
|
||||
with the highest character/byte diversity score ("Pattern
|
||||
Strength") is used as the fast pattern match. See :ref:`Appendix C
|
||||
<fast-pattern-explained-appendix-c>` for details on the algorithm
|
||||
used to determine Pattern Strength.
|
||||
|
||||
#. If multiple content matches qualify for the longest length and have
|
||||
the same highest Pattern Strength, the buffer that has the *lower
|
||||
"list_id"* is used as the fast pattern match. See :ref:`Appendix A
|
||||
<fast-pattern-explained-appendix-a>` for the list_id of each
|
||||
buffers/list.
|
||||
|
||||
#. If multiple content matches qualify for the longest length and have
|
||||
the same highest Pattern Strength, and have the same list_id
|
||||
(i.e. are looking in the same buffer), then the one that comes
|
||||
first (from left-to-right) in the rule is used as the fast pattern
|
||||
match.
|
||||
|
||||
It is worth noting that for content matches that have the same length
|
||||
and Pattern Strength, regular 'content' matches take precedence over
|
||||
matches that use the 'http_*' buffers.
|
||||
|
||||
Suricata 2.0.x
|
||||
--------------
|
||||
|
||||
#. Suricata first identifies all content matches that have the highest
|
||||
"priority" that are used in the signature. The priority is based
|
||||
off of the buffer being matched on and generally 'http_*' buffers
|
||||
have a higher priority (lower number is higher priority). See
|
||||
:ref:`Appendix B <fast-pattern-explained-appendix-b>` for details
|
||||
on which buffers have what priority.
|
||||
#. Within the content matches identified in step 1 (the highest
|
||||
priority content matches), the longest (in terms of character/byte
|
||||
length) content match is used as the fast pattern match.
|
||||
#. If multiple content matches have the same highest priority and
|
||||
qualify for the longest length, the one with the highest
|
||||
character/byte diversity score ("Pattern Strength") is used as the
|
||||
fast pattern match. See :ref:`Appendix C
|
||||
<fast-pattern-explained-appendix-c>` for details on the algorithm
|
||||
used to determine Pattern Strength.
|
||||
#. If multiple content matches have the same highest priority, qualify
|
||||
for the longest length, and the same highest Pattern Strength, the
|
||||
buffer ("list_id") that was *registered last* is used as the fast
|
||||
pattern match. See :ref:`Appendix B
|
||||
<fast-pattern-explained-appendix-b>` for the registration order of
|
||||
the different buffers/lists.
|
||||
#. If multiple content matches have the same highest priority, qualify
|
||||
for the longest length, the same highest Pattern Strength, and have
|
||||
the same list_id (i.e. are looking in the same buffer), then the
|
||||
one that comes first (from left-to-right) in the rule is used as
|
||||
the fast pattern match.
|
||||
|
||||
It is worth noting that for content matches that have the same
|
||||
priority, length, and Pattern Strength, 'http_stat_msg',
|
||||
'http_stat_code', and 'http_method' take precedence over regular
|
||||
'content' matches.
|
||||
|
||||
Appendices
|
||||
----------
|
||||
|
||||
.. _fast-pattern-explained-appendix-a:
|
||||
|
||||
Appendix A - Buffers, list_id values, and Registration Order for Suricata 1.3.4
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
This should be pretty much the same for Suricata 1.1.x - 1.4.x.
|
||||
|
||||
======= ============================== ======================== ==================
|
||||
list_id Content Modifier Keyword Buffer Name Registration Order
|
||||
======= ============================== ======================== ==================
|
||||
1 <none> (regular content match) DETECT_SM_LIST_PMATCH 1 (first)
|
||||
2 http_uri DETECT_SM_LIST_UMATCH 2
|
||||
6 http_client_body DETECT_SM_LIST_HCBDMATCH 3
|
||||
7 http_server_body DETECT_SM_LIST_HSBDMATCH 4
|
||||
8 http_header DETECT_SM_LIST_HHDMATCH 5
|
||||
9 http_raw_header DETECT_SM_LIST_HRHDMATCH 6
|
||||
10 http_method DETECT_SM_LIST_HMDMATCH 7
|
||||
11 http_cookie DETECT_SM_LIST_HCDMATCH 8
|
||||
12 http_raw_uri DETECT_SM_LIST_HRUDMATCH 9
|
||||
13 http_stat_msg DETECT_SM_LIST_HSMDMATCH 10
|
||||
14 http_stat_code DETECT_SM_LIST_HSCDMATCH 11
|
||||
15 http_user_agent DETECT_SM_LIST_HUADMATCH 12 (last)
|
||||
======= ============================== ======================== ==================
|
||||
|
||||
Note: registration order doesn't matter when it comes to determining the fast pattern match for Suricata 1.3.4 but list_id value does.
|
||||
|
||||
.. _fast-pattern-explained-appendix-b:
|
||||
|
||||
Appendix B - Buffers, list_id values, Priorities, and Registration Order for Suricata 2.0.7
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
This should be pretty much the same for Suricata 2.0.x.
|
||||
|
||||
========================================== ================== ============================== ============================= =======
|
||||
Priority (lower number is higher priority) Registration Order Content Modifier Keyword Buffer Name list_id
|
||||
========================================== ================== ============================== ============================= =======
|
||||
3 11 <none> (regular content match) DETECT_SM_LIST_PMATCH 1
|
||||
3 12 http_method DETECT_SM_LIST_HMDMATCH 12
|
||||
3 13 http_stat_code DETECT_SM_LIST_HSCDMATCH 9
|
||||
3 14 http_stat_msg DETECT_SM_LIST_HSMDMATCH 8
|
||||
2 1 (first) http_client_body DETECT_SM_LIST_HCBDMATCH 4
|
||||
2 2 http_server_body DETECT_SM_LIST_HSBDMATCH 5
|
||||
2 3 http_header DETECT_SM_LIST_HHDMATCH 6
|
||||
2 4 http_raw_header DETECT_SM_LIST_HRHDMATCH 7
|
||||
2 5 http_uri DETECT_SM_LIST_UMATCH 2
|
||||
2 6 http_raw_uri DETECT_SM_LIST_HRUDMATCH 3
|
||||
2 7 http_host DETECT_SM_LIST_HHHDMATCH 10
|
||||
2 8 http_raw_host DETECT_SM_LIST_HRHHDMATCH 11
|
||||
2 9 http_cookie DETECT_SM_LIST_HCDMATCH 13
|
||||
2 10 http_user_agent DETECT_SM_LIST_HUADMATCH 14
|
||||
2 15 (last) dns_query DETECT_SM_LIST_DNSQUERY_MATCH 20
|
||||
========================================== ================== ============================== ============================= =======
|
||||
|
||||
Note: list_id value doesn't matter when it comes to determining the
|
||||
fast pattern match for Suricata 2.0.7 but registration order does.
|
||||
|
||||
.. _fast-pattern-explained-appendix-c:
|
||||
|
||||
Appendix C - Pattern Strength Algorithm
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
From detect-engine-mpm.c. Basically the Pattern Strength "score"
|
||||
starts at zero and looks at each character/byte in the passed in byte
|
||||
array from left to right. If the character/byte has not been seen
|
||||
before in the array, it adds 3 to the score if it is an alpha
|
||||
character; else it adds 4 to the score if it is a printable character,
|
||||
0x00, 0x01, or 0xFF; else it adds 6 to the score. If the
|
||||
character/byte has been seen before it adds 1 to the score. The final
|
||||
score is returned.
|
||||
|
||||
.. code-block:: c
|
||||
|
||||
/** \brief Predict a strength value for patterns
|
||||
*
|
||||
* Patterns with high character diversity score higher.
|
||||
* Alpha chars score not so high
|
||||
* Other printable + a few common codes a little higher
|
||||
* Everything else highest.
|
||||
* Longer patterns score better than short patters.
|
||||
*
|
||||
* \param pat pattern
|
||||
* \param patlen length of the patternn
|
||||
*
|
||||
* \retval s pattern score
|
||||
*/
|
||||
uint32_t PatternStrength(uint8_t *pat, uint16_t patlen) {
|
||||
uint8_t a[256];
|
||||
memset(&a, 0 ,sizeof(a));
|
||||
uint32_t s = 0;
|
||||
uint16_t u = 0;
|
||||
for (u = 0; u < patlen; u++) {
|
||||
if (a[pat[u]] == 0) {
|
||||
if (isalpha(pat[u]))
|
||||
s += 3;
|
||||
else if (isprint(pat[u]) || pat[u] == 0x00 || pat[u] == 0x01 || pat[u] == 0xFF)
|
||||
s += 4;
|
||||
else
|
||||
s += 6;
|
||||
a[pat[u]] = 1;
|
||||
} else {
|
||||
s++;
|
||||
}
|
||||
}
|
||||
return s;
|
||||
}
|
@ -1,4 +1,63 @@
|
||||
Fast Pattern
|
||||
============
|
||||
|
||||
Just a place holder now to demontrate linking.
|
||||
.. toctree::
|
||||
|
||||
fast-pattern-explained
|
||||
|
||||
Only one content of a signature will be used in the Multi Pattern
|
||||
Matcher (MPM). If there are multiple contents, then Suricata uses the
|
||||
'strongest' content. This means a combination of length, how varied a
|
||||
content is, and what buffer it is looking in. Generally, the longer
|
||||
and more varied the better. For full details on how Suricata
|
||||
determines the fast pattern match, see :doc:`fast-pattern-explained`.
|
||||
|
||||
Sometimes a signature writer concludes he wants Suricata to use
|
||||
another content than it does by default.
|
||||
|
||||
For instance::
|
||||
|
||||
User-agent: Mozilla/5.0 Badness;
|
||||
|
||||
content:”User-Agent|3A|”;
|
||||
content:”Badness”; distance:0;
|
||||
|
||||
In this example you see the first content is longer and more varied
|
||||
than the second one, so you know Suricata will use this content for
|
||||
the MPM. Because 'User-Agent:' will be a match very often, and
|
||||
'Badness' appears less often in network traffic, you can make Suricata
|
||||
use the second content by using 'fast_pattern'.
|
||||
|
||||
::
|
||||
|
||||
content:”User-Agent|3A|”;
|
||||
content:”Badness”; distance:0; fast_pattern;
|
||||
|
||||
The keyword fast_pattern modifies the content previous to it.
|
||||
|
||||
.. image:: fast-pattern/fast_pattern.png
|
||||
|
||||
Fast-pattern can also be combined with all previous mentioned
|
||||
keywords, and all mentioned HTTP-modifiers.
|
||||
|
||||
fast_pattern:only
|
||||
-----------------
|
||||
|
||||
Sometimes a signature contains only one content. In that case it is
|
||||
not necessary Suricata will check it any further after a match has
|
||||
been found in MPM. If there is only one content, the whole signature
|
||||
matches. Suricata notices this automatically. In some signatures this
|
||||
is still indicated with 'fast_pattern:only;'. Although Suricata does
|
||||
not need fast_pattern:only, it does support it.
|
||||
|
||||
Fast_pattern: 'chop'
|
||||
--------------------
|
||||
|
||||
If you do not want the MPM to use the whole content, you can use
|
||||
fast_pattern 'chop'.
|
||||
|
||||
For example::
|
||||
|
||||
content: “aaaaaaaaabc”; fast_pattern:8,4;
|
||||
|
||||
This way, MPM uses only the last four characters.
|
||||
|
Binary file not shown.
After Width: | Height: | Size: 12 KiB |
Loading…
Reference in New Issue