Ticket: #2696
There are a lot of changes here, which are described below.
In general these changes are renaming constants to conform to the
libhtp-rs versions (which are generated by cbindgen); making all htp
types opaque and changing struct->member references to
htp_struct_member() function calls; and a handful of changes to offload
functionality onto libhtp-rs from suricata, such as URI normalization
and transaction cleanup.
Functions introduced to handle opaque htp_tx_t:
- tx->parsed_uri => htp_tx_parsed_uri(tx)
- tx->parsed_uri->path => htp_uri_path(htp_tx_parsed_uri(tx)
- tx->parsed_uri->hostname => htp_uri_hostname(htp_tx_parsed_uri(tx))
- htp_tx_get_user_data() => htp_tx_user_data(tx)
- htp_tx_is_http_2_upgrade(tx) convenience function introduced to detect response status 101
and “Upgrade: h2c" header.
Functions introduced to handle opaque htp_tx_data_t:
- d->len => htp_tx_data_len()
- d->data => htp_tx_data_data()
- htp_tx_data_tx(data) function to get the htp_tx_t from the htp_tx_data_t
- htp_tx_data_is_empty(data) convenience function introduced to test if the data is empty.
Other changes:
Build libhtp-rs as a crate inside rust. Update autoconf to no longer
use libhtp as an external dependency. Remove HAVE_HTP feature defines
since they are no longer needed.
Make function arguments and return values const where possible
htp_tx_destroy(tx) will now free an incomplete transaction
htp_time_t replaced with standard struct timeval
Callbacks from libhtp now provide the htp_connp_t and the htp_tx_data_t
as separate arguments. This means the connection parser is no longer
fetched from the transaction inside callbacks.
SCHTPGenerateNormalizedUri() functionality moved inside libhtp-rs, which
now provides normalized URI values.
The normalized URI is available with accessor function: htp_tx_normalized_uri()
Configuration settings added to control the behaviour of the URI normalization:
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
- htp_config_set_convert_lowercase()
- htp_config_set_double_decode_normalized_query()
- htp_config_set_double_decode_normalized_path()
- htp_config_set_backslash_convert_slashes()
- htp_config_set_bestfit_replacement_byte()
- htp_config_set_convert_lowercase()
- htp_config_set_nul_encoded_terminates()
- htp_config_set_nul_raw_terminates()
- htp_config_set_path_separators_compress()
- htp_config_set_path_separators_decode()
- htp_config_set_u_encoding_decode()
- htp_config_set_url_encoding_invalid_handling()
- htp_config_set_utf8_convert_bestfit()
- htp_config_set_normalized_uri_include_all()
- htp_config_set_plusspace_decode()
Constants related to configuring uri normalization:
- HTP_URL_DECODE_PRESERVE_PERCENT => HTP_URL_ENCODING_HANDLING_PRESERVE_PERCENT
- HTP_URL_DECODE_REMOVE_PERCENT => HTP_URL_ENCODING_HANDLING_REMOVE_PERCENT
- HTP_URL_DECODE_PROCESS_INVALID => HTP_URL_ENCODING_HANDLING_PROCESS_INVALID
htp_config_set_field_limits(soft_limit, hard_limit) changed to
htp_config_set_field_limit(limit) because libhtp didn't implement soft
limits.
libhtp logging API updated to provide HTP_LOG_CODE constants along with
the message. This eliminates the need to perform string matching on
message text to map log messages to HTTP_DECODER_EVENT values, and the
HTP_LOG_CODE values can be used directly. In support of this,
HTP_DECODER_EVENT values are mapped to their corresponding HTP_LOG_CODE
values.
New log events to describe additional anomalies:
HTP_LOG_CODE_REQUEST_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_RESPONSE_TOO_MANY_LZMA_LAYERS
HTP_LOG_CODE_PROTOCOL_CONTAINS_EXTRA_DATA
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_START
HTP_LOG_CODE_CONTENT_LENGTH_EXTRA_DATA_END
HTP_LOG_CODE_SWITCHING_PROTO_WITH_CONTENT_LENGTH
HTP_LOG_CODE_DEFORMED_EOL
HTP_LOG_CODE_PARSER_STATE_ERROR
HTP_LOG_CODE_MISSING_OUTBOUND_TRANSACTION_DATA
HTP_LOG_CODE_MISSING_INBOUND_TRANSACTION_DATA
HTP_LOG_CODE_ZERO_LENGTH_DATA_CHUNKS
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_NO_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_UNKNOWN_METHOD_INVALID_PROTOCOL
HTP_LOG_CODE_REQUEST_LINE_NO_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_PROTOCOL
HTP_LOG_CODE_RESPONSE_LINE_INVALID_RESPONSE_STATUS
HTP_LOG_CODE_RESPONSE_BODY_INTERNAL_ERROR
HTP_LOG_CODE_REQUEST_BODY_DATA_CALLBACK_ERROR
HTP_LOG_CODE_RESPONSE_INVALID_EMPTY_NAME
HTP_LOG_CODE_REQUEST_INVALID_EMPTY_NAME
HTP_LOG_CODE_RESPONSE_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_RESPONSE_HEADER_NAME_NOT_TOKEN
HTP_LOG_CODE_REQUEST_INVALID_LWS_AFTER_NAME
HTP_LOG_CODE_LZMA_DECOMPRESSION_DISABLED
HTP_LOG_CODE_CONNECTION_ALREADY_OPEN
HTP_LOG_CODE_COMPRESSION_BOMB_DOUBLE_LZMA
HTP_LOG_CODE_INVALID_CONTENT_ENCODING
HTP_LOG_CODE_INVALID_GAP
HTP_LOG_CODE_ERROR
The new htp_log API supports consuming log messages more easily than
walking a list and tracking the current offset. Internally, libhtp-rs
now provides log messages as a queue of htp_log_t, which means the
application can simply call htp_conn_next_log() to fetch the next log
message until the queue is empty. Once the application is done with a
log message, they can call htp_log_free() to dispose of it.
Functions supporting htp_log_t:
htp_conn_next_log(conn) - Get the next log message
htp_log_message(log) - To get the text of the message
htp_log_code(log) - To get the HTP_LOG_CODE value
htp_log_free(log) - To free the htp_log_t
Currently this script has two commands: "missing" and "having".
"missing" will show eve fields that do not map to any keywords.
"having" will sohw eve fields along with their keyword mappsings,
while also validating that those keywords really exist.
Related to tickets: #6463, #4772
Ticket: 5053
The names are now dynamically registered at runtime.
The AppProto alproto enum identifiers are still static for now.
This is the final step before app-layer plugins.
Use quoted include style for Lua includes ("lua.h" instead of <lua.h>)
as this could result in system includes being picked up instead of the
includes from our vendor directory.
Especially fix setup-app-layer script to not forget this part
This allows, for simple loggers, to have a unique definition
of the actual logging function with the jsonbuilder.
This way, alerts, files, and app-layer event can share the code
to output the same data.
Ticket: #3827
Although we prefer that formatting changes (e.g. the ones made by
running clang) go in a different commit, our script error message was
still suggesting `rewrite-branch` as an option. Removed that and added
that the changes made by the script should go into a separate commit.
Allow pull requests (and merge requests) to be specified by using a
branch name like "pr/111" or "mr/222". This allows CI to use this
script as well, instead of multiple variations of the same thing.
Additonally allow the destination directory to be overridden with the
DESTDIR environment variable.
After the changes in the script in 05e16820de, the file
app-layer-protos.c was to be modified properly iff it was left unformatted.
However, the file was also formatted as a part of the same commit making
the lines split which broke the output of the script. Fix that by
looking for another pattern and changing the lines following that.
Remove the app-layer-PROTO stub for Rust based parsers. It is no longer
needed as Rust parsers now contain the registration function in Rust.
Ticket: 4939
To better fit with our current CI processes, use git to clone the
suricata-update and libhtp dependencies. The requirements.txt file has
been modified to take a repo URL and a `-b` command line option for tag
or branch.
For the master branch we will use the libhtp 0.5.x branch and the
suricata-update master branch.
Also allows for repo and branch names to be overrided with environment
variables:
- SU_REPO
- SU_BRANCH
- LIBHTP_REPO
- LIBHTP_BRANCH
Currently, it seems easier to upload the diagram images to git than to
try to make the image generation script work with out of the tree builds
and other corner cases.
This means, however, that one must activelly remember to update msc
diagram files, run the script and re-add new png files, if those ever
need to be updated. To raise awareness to that, a watermark was added
to the diagram images.
Also removed configuration steps that added mscgen as dependency
(locally and for workflow builds and readthedocs).
There is a hack to know the type of an integer
and do an explicit cast in the python script
generating the C file
Also extends some bounds check against negative values
Add a bundle.sh script to bundle the requirements of libhtp
and suricata-update. This uses a Python like requirements.txt
file to specify the URL to download for libhtp and suricata-update.
adjust lines for patching /src/Makefile.am, as current generated
Makefile wasn't building Suricata.
Add suggestion to run "./configure" before running "make".
Add --logger and --parser options to examples.