hs: prune stale MPM cache files

The Hyperscan MPM can cache compiled contexts to files.
The number of cache files grows as rulesets change, which bloats
the system. Prune stale cache files based on their file
modification timestamp.

Part of this work introduces a new model for MPM cache stats:
they are split out of the cache save function so that all
cache-related stats (including the newly added pruning) are
aggregated in one place.

Ticket: 7893
(cherry picked from commit 15c83be61a)

hs: suppress TOCTOU stat use

To explain the TOCTOU issue in more detail, consider a case where
Suricata starts to prune while somebody externally also starts
erasing cache files.
Right after Suricata checks the file age with the stat function,
somebody may delete or update the file in question.

Suricata's aging decision then no longer reflects the actual state of the file.
This commit additionally treats an ENOENT failure of the unlink operation
as a success. The code can still delete a file that was considered stale but
has been updated since the check.

In deployments that follow the documentation this should not happen anyway, as
one cache folder should only be used by a single Suricata instance (and within
a Suricata instance only one thread handles cache eviction).
Additionally, the `stat` call is immediately followed by the `unlink` call,
making this scenario extra unlikely.

An additional comment in the code explains the problems of using fstat and
potential issues on Windows.

Ticket: 8244
(cherry picked from commit 0fe0390a2f)
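
The stat-then-unlink pattern described in this message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the commit: `prune_if_stale` is a hypothetical helper name, and it uses plain POSIX `stat` and `unlink` rather than Suricata's wrappers.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper (name is an assumption, not from the commit).
 * Returns true if the file was judged stale and is gone afterwards. */
static bool prune_if_stale(const char *path, time_t cutoff)
{
    struct stat st;

    /* Time of check: read the modification time. */
    if (stat(path, &st) != 0 || !S_ISREG(st.st_mode))
        return false;
    if (st.st_mtime >= cutoff)
        return false; /* fresh enough, keep it */

    /* Time of use: the file may have been removed or touched since the
     * stat call. A file that is already gone counts as pruned. */
    if (unlink(path) == 0 || errno == ENOENT)
        return true;
    return false;
}
```

The window between the two calls stays open by design here; the sketch only shows why an ENOENT result can be folded into the success path.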

hs/cache: cleaner and more detailed output

Reduce logging level of a minor informational message.

Split tracking of pruning by age and by version and log those
separately, where the logging only appears if something has been
removed.

Ticket: 8323
(cherry picked from commit 569ba3d26f)

hs: remove redundant file handle in HSLoadCache

HSLoadCache opened the cache file but never used the resulting handle
for reading. The actual read was done by HSReadStream which opened
the same file independently.

Removed the unused fopen/fclose pair and flattened the control flow.

Ticket: 8326
(cherry picked from commit d754b28717)

hs: use binary mode for cache file I/O

HSSaveCache wrote serialized Hyperscan databases using text mode ("w")
while HSReadStream already read them with binary mode ("rb").
Match the file write mode to the binary format and simplify the
write-size check.

Ticket: 8326
(cherry picked from commit 0cdc77b707)
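
As a rough illustration of why the modes should match, the sketch below writes and reads an opaque byte buffer in binary mode and compares the written size directly against the expected size. The helper names `save_blob` and `load_blob` are hypothetical and only stand in for the save and read paths; they are not the functions touched by the commit.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write `size` bytes in binary mode.
 * Returns 0 on success, -1 on failure (a partial file is removed). */
static int save_blob(const char *path, const char *data, size_t size)
{
    FILE *out = fopen(path, "wb");
    if (out == NULL)
        return -1;

    size_t written = fwrite(data, 1, size, out);
    int close_rc = fclose(out);
    if (written != size || close_rc != 0) {
        remove(path); /* do not leave a possibly corrupted file behind */
        return -1;
    }
    return 0;
}

/* Hypothetical helper: read up to `cap` bytes in binary mode.
 * Returns the number of bytes read, or -1 if the file cannot be opened. */
static long load_blob(const char *path, char *buf, size_t cap)
{
    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;

    size_t n = fread(buf, 1, cap, in);
    fclose(in);
    return (long)n;
}
```

On platforms that distinguish text and binary streams, text mode may translate line-ending bytes, so a serialized database written with "w" is not guaranteed to read back byte for byte with "rb".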

hs: warn about the same cache directory

Sharing one cache directory is especially risky for setups that run
multiple instances simultaneously, as they might hit read/write races.

(cherry picked from commit 56c1552c3e)

hs: validate cached database against current HS installation

After deserializing a cached Hyperscan database, verify that its
version, CPU features, and mode match the current Hyperscan
installation by comparing hs_database_info output against a
reference database. Reject loading incompatible caches.

Ticket: 8326
(cherry picked from commit 2e7b12dda4)

hs: include HS platform info in cache file hash

Hash Hyperscan installation info (version, CPU features, mode)
into the cache filename. A Hyperscan upgrade or platform change
would now produce a different filename, so stale caches from an
older installation are never opened.

Ticket: 8326
(cherry picked from commit d640719413)
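
The effect of hashing platform info into the filename can be sketched as below. This is only an illustration under loud assumptions: the commit uses SHA-256 over the real pattern data, whereas this sketch swaps in a 64-bit FNV-1a hash over two plain strings to stay self-contained, and `cache_filename` and its parameters are hypothetical. Only the `_v2.hs` suffix is taken from the diff.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* FNV-1a, used here as a stand-in for the SHA-256 hasher (assumption). */
static uint64_t fnv1a_update(uint64_t h, const void *data, size_t len)
{
    const unsigned char *p = data;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL; /* FNV 64-bit prime */
    }
    return h;
}

/* Hypothetical helper: derive a cache filename from the platform info
 * string and the pattern data. Returns 0 on success, -1 if `out` is
 * too small. */
static int cache_filename(
        const char *platform_info, const char *patterns, char *out, size_t out_size)
{
    uint64_t h = 14695981039346656037ULL; /* FNV 64-bit offset basis */

    /* Platform info goes in first, so any change to it changes the name. */
    h = fnv1a_update(h, platform_info, strlen(platform_info));
    h = fnv1a_update(h, patterns, strlen(patterns));

    int n = snprintf(out, out_size, "%016llx_v2.hs", (unsigned long long)h);
    return (n > 0 && (size_t)n < out_size) ? 0 : -1;
}
```

With the same pattern data, two different platform strings yield two different filenames, so a cache written under the old installation is simply never looked up and is left for pruning to remove.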

hs: address coverity warning in a reference string

Move the locking mechanism outside of the getter function and hold the
lock until the reference string is no longer in use.

** CID 1682023:       Concurrent data access violations  (MISSING_LOCK)
/src/util-mpm-hs-cache.c: 139           in HSGetReferenceDbInfo()

(cherry picked from commit 6ec9e5c957)
pull/15003/head
Lukas Sismis 8 months ago committed by Victor Julien
parent 4604266685
commit e0f2cdf7c3

@ -83,6 +83,8 @@ if it is present on the system in case of the "auto" setting.
If the current suricata installation does not have hyperscan
support, refer to :ref:`installation`
.. _hyperscan-cache-configuration:
Hyperscan caching
~~~~~~~~~~~~~~~~~
@ -104,6 +106,28 @@ To enable this function, in `suricata.yaml` configure:
sgh-mpm-caching-path: /var/lib/suricata/cache/hs
To avoid cache files growing indefinitely, Suricata supports pruning of old
cache files. Suricata removes cache files older than the specified age
on startup/rule reloads, where age is determined by delta of the file
modification time and the current time.
Cache files that are actively being used will have their modification time
updated when loaded, so they won't be deleted.
In `suricata.yaml` configure:
::
detect:
sgh-mpm-caching-max-age: 7d
The setting accepts a combination of time units (s,m,h,d,w,y),
e.g. `1w 3d 12h` for 1 week, 3 days and 12 hours. Setting the value to `0`
disables pruning.
**Note**:
You might need to create and adjust permissions to the default caching folder
path, especially if you are running Suricata as a non-root user.
**Note**:
If you're running multiple Suricata instances, use separate cache folders
for each one to avoid read/write conflicts when they run at the same time.
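
To make the `sgh-mpm-caching-max-age` value format concrete, here is a small sketch of how such a string could be turned into seconds. It is a hypothetical parser written for illustration, not the implementation of `SCConfGetTime`; treating `y` as 365 days and a bare number as seconds are assumptions, and overflow handling is omitted.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical parser (NOT Suricata's SCConfGetTime).
 * Accepts space-separated terms such as "1w 3d 12h".
 * Returns 0 on success and stores the total in seconds, -1 on error. */
static int parse_max_age(const char *s, uint64_t *out)
{
    uint64_t total = 0;
    int terms = 0;

    while (*s != '\0') {
        while (isspace((unsigned char)*s))
            s++;
        if (*s == '\0')
            break;
        if (!isdigit((unsigned char)*s))
            return -1;

        char *end;
        uint64_t n = strtoull(s, &end, 10);
        uint64_t mult;
        switch (*end) {
            case 's': mult = 1; end++; break;
            case 'm': mult = 60; end++; break;
            case 'h': mult = 3600; end++; break;
            case 'd': mult = 86400; end++; break;
            case 'w': mult = 604800; end++; break;
            case 'y': mult = 31536000; end++; break; /* assumed: 365 days */
            case '\0':
            case ' ': mult = 1; break; /* assumed: bare number is seconds */
            default: return -1;
        }
        total += n * mult;
        terms++;
        s = end;
    }
    if (terms == 0)
        return -1;
    *out = total;
    return 0;
}
```

Under these assumptions `"1w 3d 12h"` comes out as 604800 + 259200 + 43200 = 907200 seconds, `"7d"` as 604800 seconds, and `"0"` as 0, which the documentation above describes as disabling pruning.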

@ -46,6 +46,10 @@ Other Changes
from unbounded to 2048. Configuration options, ``max-tx``,
``max-points``, and ``max-objects`` have been added for users who
may need to change these defaults.
- Hyperscan caching (`detect.sgh-mpm-caching`), when enabled, prunes
cache files that have not been used in the last 7 days by default.
See :ref:`Hyperscan caching configuration
<hyperscan-cache-configuration>` for more information.
Upgrading to 8.0.2
------------------

@ -502,10 +502,6 @@ skip_regular_rules:
ret = 0;
if (mpm_table[de_ctx->mpm_matcher].CacheRuleset != NULL) {
mpm_table[de_ctx->mpm_matcher].CacheRuleset(de_ctx->mpm_cfg);
}
end:
gettimeofday(&de_ctx->last_reload, NULL);
if (SCRunmodeGet() == RUNMODE_ENGINE_ANALYSIS) {

@ -2489,6 +2489,49 @@ const char *DetectEngineMpmCachingGetPath(void)
return SGH_CACHE_DIR;
}
void DetectEngineMpmCacheService(uint32_t op_flags)
{
DetectEngineCtx *de_ctx = DetectEngineGetCurrent();
if (!de_ctx) {
return;
}
if (!de_ctx->mpm_cfg || !de_ctx->mpm_cfg->cache_dir_path) {
goto error;
}
if (mpm_table[de_ctx->mpm_matcher].CacheStatsInit != NULL) {
de_ctx->mpm_cfg->cache_stats = mpm_table[de_ctx->mpm_matcher].CacheStatsInit();
if (de_ctx->mpm_cfg->cache_stats == NULL) {
goto error;
}
}
if (op_flags & DETECT_ENGINE_MPM_CACHE_OP_SAVE) {
if (mpm_table[de_ctx->mpm_matcher].CacheRuleset != NULL) {
mpm_table[de_ctx->mpm_matcher].CacheRuleset(de_ctx->mpm_cfg);
}
}
if (op_flags & DETECT_ENGINE_MPM_CACHE_OP_PRUNE) {
if (mpm_table[de_ctx->mpm_matcher].CachePrune != NULL) {
mpm_table[de_ctx->mpm_matcher].CachePrune(de_ctx->mpm_cfg);
}
}
if (mpm_table[de_ctx->mpm_matcher].CacheStatsPrint != NULL) {
mpm_table[de_ctx->mpm_matcher].CacheStatsPrint(de_ctx->mpm_cfg->cache_stats);
}
if (mpm_table[de_ctx->mpm_matcher].CacheStatsDeinit != NULL) {
mpm_table[de_ctx->mpm_matcher].CacheStatsDeinit(de_ctx->mpm_cfg->cache_stats);
de_ctx->mpm_cfg->cache_stats = NULL;
}
error:
DetectEngineDeReference(&de_ctx);
}
static DetectEngineCtx *DetectEngineCtxInitReal(
enum DetectEngineType type, const char *prefix, uint32_t tenant_id)
{
@ -2511,10 +2554,18 @@ static DetectEngineCtx *DetectEngineCtxInitReal(
if (de_ctx->mpm_cfg == NULL) {
goto error;
}
}
if (DetectEngineMpmCachingEnabled() && mpm_table[de_ctx->mpm_matcher].ConfigCacheDirSet) {
mpm_table[de_ctx->mpm_matcher].ConfigCacheDirSet(
de_ctx->mpm_cfg, DetectEngineMpmCachingGetPath());
if (DetectEngineMpmCachingEnabled() && mpm_table[de_ctx->mpm_matcher].ConfigCacheDirSet) {
mpm_table[de_ctx->mpm_matcher].ConfigCacheDirSet(
de_ctx->mpm_cfg, DetectEngineMpmCachingGetPath());
if (mpm_table[de_ctx->mpm_matcher].CachePrune) {
if (SCConfGetTime("detect.sgh-mpm-caching-max-age",
&de_ctx->mpm_cfg->cache_max_age_seconds) != 1) {
de_ctx->mpm_cfg->cache_max_age_seconds = 7ULL * 24ULL * 60ULL * 60ULL;
}
}
}
}
if (type == DETECT_ENGINE_TYPE_DD_STUB || type == DETECT_ENGINE_TYPE_MT_STUB) {
@ -4877,6 +4928,8 @@ int DetectEngineReload(const SCInstance *suri)
SCLogDebug("old_de_ctx should have been freed");
DetectEngineMpmCacheService(DETECT_ENGINE_MPM_CACHE_OP_SAVE | DETECT_ENGINE_MPM_CACHE_OP_PRUNE);
SCLogNotice("rule reload complete");
#ifdef HAVE_MALLOC_TRIM

@ -89,6 +89,7 @@ TmEcode DetectEngineThreadCtxInit(ThreadVars *, void *, void **);
TmEcode DetectEngineThreadCtxDeinit(ThreadVars *, void *);
bool DetectEngineMpmCachingEnabled(void);
const char *DetectEngineMpmCachingGetPath(void);
void DetectEngineMpmCacheService(uint32_t op_flags);
/* faster as a macro than a inline function on my box -- VJ */
#define DetectEngineGetMaxSigId(de_ctx) ((de_ctx)->signum)
void DetectEngineResetMaxSigId(DetectEngineCtx *);

@ -1734,6 +1734,9 @@ extern SigTableElmt *sigmatch_table;
/** Remember to add the options in SignatureIsIPOnly() at detect.c otherwise it wont be part of a signature group */
#define DETECT_ENGINE_MPM_CACHE_OP_PRUNE BIT_U32(0)
#define DETECT_ENGINE_MPM_CACHE_OP_SAVE BIT_U32(1)
/* detection api */
TmEcode Detect(ThreadVars *tv, Packet *p, void *data);
uint8_t DetectPreFlow(ThreadVars *tv, DetectEngineThreadCtx *det_ctx, Packet *p);

@ -967,6 +967,8 @@ TmEcode UnixSocketRegisterTenantHandler(json_t *cmd, json_t* answer, void *data)
return TM_ECODE_FAILED;
}
DetectEngineMpmCacheService(DETECT_ENGINE_MPM_CACHE_OP_SAVE);
json_object_set_new(answer, "message", json_string("handler added"));
return TM_ECODE_OK;
}
@ -1054,6 +1056,8 @@ TmEcode UnixSocketUnregisterTenantHandler(json_t *cmd, json_t* answer, void *dat
return TM_ECODE_FAILED;
}
DetectEngineMpmCacheService(DETECT_ENGINE_MPM_CACHE_OP_PRUNE);
json_object_set_new(answer, "message", json_string("handler removed"));
return TM_ECODE_OK;
}
@ -1126,6 +1130,8 @@ TmEcode UnixSocketRegisterTenant(json_t *cmd, json_t* answer, void *data)
return TM_ECODE_FAILED;
}
DetectEngineMpmCacheService(DETECT_ENGINE_MPM_CACHE_OP_SAVE);
json_object_set_new(answer, "message", json_string("adding tenant succeeded"));
return TM_ECODE_OK;
}
@ -1193,6 +1199,8 @@ TmEcode UnixSocketReloadTenant(json_t *cmd, json_t* answer, void *data)
return TM_ECODE_FAILED;
}
DetectEngineMpmCacheService(DETECT_ENGINE_MPM_CACHE_OP_SAVE | DETECT_ENGINE_MPM_CACHE_OP_PRUNE);
json_object_set_new(answer, "message", json_string("reloading tenant succeeded"));
return TM_ECODE_OK;
}
@ -1226,6 +1234,7 @@ TmEcode UnixSocketReloadTenants(json_t *cmd, json_t *answer, void *data)
return TM_ECODE_FAILED;
}
DetectEngineMpmCacheService(DETECT_ENGINE_MPM_CACHE_OP_SAVE | DETECT_ENGINE_MPM_CACHE_OP_PRUNE);
SCLogNotice("reload-tenants complete");
json_object_set_new(answer, "message", json_string("reloading tenants succeeded"));
@ -1284,6 +1293,8 @@ TmEcode UnixSocketUnregisterTenant(json_t *cmd, json_t* answer, void *data)
return TM_ECODE_FAILED;
}
DetectEngineMpmCacheService(DETECT_ENGINE_MPM_CACHE_OP_PRUNE);
/* walk free list, freeing the removed de_ctx */
DetectEnginePruneFreeList();

@ -2677,6 +2677,8 @@ void PostConfLoadedDetectSetup(SCInstance *suri)
gettimeofday(&de_ctx->last_reload, NULL);
DetectEngineAddToMaster(de_ctx);
DetectEngineBumpVersion();
DetectEngineMpmCacheService(
DETECT_ENGINE_MPM_CACHE_OP_SAVE | DETECT_ENGINE_MPM_CACHE_OP_PRUNE);
}
}

@ -37,21 +37,25 @@
#include "rust.h"
#include <hs.h>
static const char *HSCacheConstructFPath(const char *folder_path, const char *hs_db_hash)
{
static char hash_file_path[PATH_MAX];
#define HS_CACHE_FILE_VERSION "2"
#define HS_CACHE_FILE_SUFFIX "_v" HS_CACHE_FILE_VERSION ".hs"
static SCMutex g_hs_ref_info_mutex = SCMUTEX_INITIALIZER;
static char *g_hs_ref_info = NULL;
char hash_file_path_suffix[] = "_v1.hs";
static int16_t HSCacheConstructFPath(
const char *dir_path, const char *db_hash, char *out_path, uint16_t out_path_size)
{
char filename[NAME_MAX];
uint64_t r = snprintf(filename, sizeof(filename), "%s%s", hs_db_hash, hash_file_path_suffix);
if (r != (uint64_t)(strlen(hs_db_hash) + strlen(hash_file_path_suffix)))
return NULL;
uint64_t r = snprintf(filename, sizeof(filename), "%s" HS_CACHE_FILE_SUFFIX, db_hash);
if (r != (uint64_t)(strlen(db_hash) + strlen(HS_CACHE_FILE_SUFFIX)))
return -1;
r = PathMerge(hash_file_path, sizeof(hash_file_path), folder_path, filename);
r = PathMerge(out_path, out_path_size, dir_path, filename);
if (r)
return NULL;
return -1;
return hash_file_path;
return 0;
}
static char *HSReadStream(const char *file_path, size_t *buffer_sz)
@ -119,49 +123,111 @@ static void SCHSCachePatternHash(const SCHSPattern *p, SCSha256 *sha256)
SCSha256Update(sha256, (const uint8_t *)p->sids, p->sids_size * sizeof(SigIntId));
}
/**
* \brief Get the hs_database_info string for a reference BLOCK-mode database
* compiled with the current Hyperscan installation.
*
* Compiled lazily on the first call and cached. Thread-safe.
*
* coverity[missing_lock] -- g_hs_ref_info is only set once and never modified after. If a thread
* sees a non-NULL value it is guaranteed to be valid. If multiple threads call this at the same
* time, the first one that gets the lock will set g_hs_ref_info and the others will just read it
* after they get the lock.
*
* \retval Pointer to info string, or NULL on failure. Do not free the ptr.
*/
static const char *HSGetReferenceDbInfo(void)
{
if (g_hs_ref_info != NULL) {
return g_hs_ref_info;
}
hs_database_t *ref_db = NULL;
hs_compile_error_t *compile_err = NULL;
hs_error_t err = hs_compile("Suricata suricatta is the scientific name for the meerkat",
HS_FLAG_SINGLEMATCH, HS_MODE_BLOCK, NULL, &ref_db, &compile_err);
if (err == HS_SUCCESS && ref_db != NULL) {
if (hs_database_info(ref_db, &g_hs_ref_info) != HS_SUCCESS) {
if (g_hs_ref_info)
SCFree(g_hs_ref_info);
g_hs_ref_info = NULL;
}
hs_free_database(ref_db);
}
if (compile_err != NULL) {
SCLogInfo("Failed to compile reference Hyperscan database: %s", compile_err->message);
hs_free_compile_error(compile_err);
}
return g_hs_ref_info;
}
int HSLoadCache(hs_database_t **hs_db, const char *hs_db_hash, const char *dirpath)
{
const char *hash_file_static = HSCacheConstructFPath(dirpath, hs_db_hash);
if (hash_file_static == NULL)
char hash_file_static[PATH_MAX];
int ret = (int)HSCacheConstructFPath(
dirpath, hs_db_hash, hash_file_static, sizeof(hash_file_static));
if (ret != 0)
return -1;
SCLogDebug("Loading the cached HS DB from %s", hash_file_static);
if (!SCPathExists(hash_file_static))
return -1;
FILE *db_cache = fopen(hash_file_static, "r");
char *db_info = NULL;
char *buffer = NULL;
int ret = 0;
if (db_cache) {
size_t buffer_size;
buffer = HSReadStream(hash_file_static, &buffer_size);
if (!buffer) {
SCLogWarning("Hyperscan cached DB file %s cannot be read", hash_file_static);
size_t buffer_size;
buffer = HSReadStream(hash_file_static, &buffer_size);
if (!buffer) {
SCLogWarning("Hyperscan cached DB file %s cannot be read", hash_file_static);
return -1;
}
hs_error_t error = hs_deserialize_database(buffer, buffer_size, hs_db);
if (error != HS_SUCCESS) {
SCLogWarning("Failed to deserialize Hyperscan database of %s: %s", hash_file_static,
HSErrorToStr(error));
ret = -1;
goto freeup;
}
// Verify the loaded database is compatible with the current Hyperscan
// If both the loaded DB and the reference DB fail to load, consider the cache.
SCMutexLock(&g_hs_ref_info_mutex);
const char *ref_info = HSGetReferenceDbInfo();
if (ref_info != NULL) {
if (hs_database_info(*hs_db, &db_info) != HS_SUCCESS || db_info == NULL) {
SCLogDebug("Failed to query info for loaded Hyperscan database %s: %s",
hash_file_static, HSErrorToStr(error));
ret = -1;
SCMutexUnlock(&g_hs_ref_info_mutex);
goto freeup;
}
hs_error_t error = hs_deserialize_database(buffer, buffer_size, hs_db);
if (error != HS_SUCCESS) {
SCLogWarning("Failed to deserialize Hyperscan database of %s: %s", hash_file_static,
HSErrorToStr(error));
if (strcmp(db_info, ref_info) != 0) {
SCLogDebug("Loaded Hyperscan database %s is incompatible with the current "
"Hyperscan installation and will be ignored",
hash_file_static);
ret = -1;
SCMutexUnlock(&g_hs_ref_info_mutex);
goto freeup;
}
ret = 0;
/* Touch file to update modification time so active caches are retained. */
if (SCTouchFile(hash_file_static) != 0) {
SCLogDebug("Failed to update mtime for %s", hash_file_static);
}
goto freeup;
}
SCMutexUnlock(&g_hs_ref_info_mutex);
ret = 0;
/* Touch file to update modification time so active caches are retained. */
if (SCTouchFile(hash_file_static) != 0) {
SCLogDebug("Failed to update mtime for %s", hash_file_static);
}
freeup:
if (db_cache)
fclose(db_cache);
if (buffer)
SCFree(buffer);
if (ret != 0 && *hs_db != NULL) {
hs_free_database(*hs_db);
*hs_db = NULL;
}
if (db_info)
SCFree(db_info);
SCFree(buffer);
return ret;
}
@ -170,15 +236,20 @@ static int HSSaveCache(hs_database_t *hs_db, const char *hs_db_hash, const char
static bool notified = false;
char *db_stream = NULL;
size_t db_size;
int ret = -1;
int ret;
hs_error_t err = hs_serialize_database(hs_db, &db_stream, &db_size);
if (err != HS_SUCCESS) {
SCLogWarning("Failed to serialize Hyperscan database: %s", HSErrorToStr(err));
ret = -1;
goto cleanup;
}
const char *hash_file_static = HSCacheConstructFPath(dstpath, hs_db_hash);
char hash_file_static[PATH_MAX];
ret = (int)HSCacheConstructFPath(
dstpath, hs_db_hash, hash_file_static, sizeof(hash_file_static));
if (ret != 0)
goto cleanup;
SCLogDebug("Caching the compiled HS at %s", hash_file_static);
if (SCPathExists(hash_file_static)) {
// potentially signs that it might not work as expected as we got into
@ -190,7 +261,7 @@ static int HSSaveCache(hs_database_t *hs_db, const char *hs_db_hash, const char
hash_file_static);
}
FILE *db_cache_out = fopen(hash_file_static, "w");
FILE *db_cache_out = fopen(hash_file_static, "wb");
if (!db_cache_out) {
if (!notified) {
SCLogWarning("Failed to create Hyperscan cache file, make sure the folder exist and is "
@ -198,17 +269,15 @@ static int HSSaveCache(hs_database_t *hs_db, const char *hs_db_hash, const char
hash_file_static);
notified = true;
}
ret = -1;
goto cleanup;
}
size_t r = fwrite(db_stream, sizeof(db_stream[0]), db_size, db_cache_out);
if (r > 0 && (size_t)r != db_size) {
if (r != db_size) {
SCLogWarning("Failed to write to file: %s", hash_file_static);
if (r != db_size) {
// possibly a corrupted DB cache was created
r = remove(hash_file_static);
if (r != 0) {
SCLogWarning("Failed to remove corrupted cache file: %s", hash_file_static);
}
// possibly a corrupted DB cache was created
if (remove(hash_file_static) != 0) {
SCLogWarning("Failed to remove corrupted cache file: %s", hash_file_static);
}
}
ret = fclose(db_cache_out);
@ -217,7 +286,6 @@ static int HSSaveCache(hs_database_t *hs_db, const char *hs_db_hash, const char
goto cleanup;
}
ret = 0;
cleanup:
if (db_stream)
SCFree(db_stream);
@ -231,6 +299,14 @@ int HSHashDb(const PatternDatabase *pd, char *hash, size_t hash_len)
SCLogDebug("sha256 hashing failed");
return -1;
}
SCMutexLock(&g_hs_ref_info_mutex);
const char *ref_info = HSGetReferenceDbInfo();
if (ref_info != NULL) {
SCSha256Update(hasher, (const uint8_t *)ref_info, strlen(ref_info));
}
SCMutexUnlock(&g_hs_ref_info_mutex);
SCSha256Update(hasher, (const uint8_t *)&pd->pattern_cnt, sizeof(pd->pattern_cnt));
for (uint32_t i = 0; i < pd->pattern_cnt; i++) {
SCHSCachePatternHash(pd->parray[i], hasher);
@ -270,4 +346,217 @@ void HSSaveCacheIterator(void *data, void *aux)
}
}
void HSCacheFilenameUsedIterator(void *data, void *aux)
{
PatternDatabase *pd = (PatternDatabase *)data;
struct HsInUseCacheFilesIteratorData *iter_data = (struct HsInUseCacheFilesIteratorData *)aux;
if (pd->no_cache || !pd->cached)
return;
char hs_db_hash[SC_SHA256_LEN * 2 + 1]; // * 2 for hex +1 for nul terminator
if (HSHashDb(pd, hs_db_hash, ARRAY_SIZE(hs_db_hash)) != 0) {
return;
}
char *fpath = SCCalloc(PATH_MAX, sizeof(char));
if (fpath == NULL) {
SCLogWarning("Failed to allocate memory for cache file path");
return;
}
if (HSCacheConstructFPath(iter_data->cache_path, hs_db_hash, fpath, PATH_MAX)) {
SCFree(fpath);
return;
}
int r = HashTableAdd(iter_data->tbl, (void *)fpath, (uint16_t)strlen(fpath));
if (r < 0) {
SCLogWarning("Failed to add used cache file path %s to hash table", fpath);
SCFree(fpath);
}
}
/**
* \brief Check if HS cache file is stale by age.
*
* \param mtime File modification time.
* \param cutoff Time cutoff (files older than this will be removed).
*
* \retval true if file should be pruned, false otherwise.
*/
static bool HSPruneFileByAge(time_t mtime, time_t cutoff)
{
return mtime < cutoff;
}
/**
* \brief Check if HS cache file is version-compatible.
*
* \param filename Cache file name.
*
* \retval true if file should be pruned, false otherwise.
*/
static bool HSPruneFileByVersion(const char *filename)
{
if (strlen(filename) < strlen(HS_CACHE_FILE_SUFFIX)) {
return true;
}
const char *underscore = strrchr(filename, '_');
if (underscore == NULL || strcmp(underscore, HS_CACHE_FILE_SUFFIX) != 0) {
return true;
}
return false;
}
int SCHSCachePruneEvaluate(MpmConfig *mpm_conf, HashTable *inuse_caches)
{
if (mpm_conf == NULL || mpm_conf->cache_dir_path == NULL)
return -1;
if (mpm_conf->cache_max_age_seconds == 0)
return 0; // disabled
const time_t now = time(NULL);
if (now == (time_t)-1) {
return -1;
} else if (mpm_conf->cache_max_age_seconds >= (uint64_t)now) {
return 0;
}
DIR *dir = opendir(mpm_conf->cache_dir_path);
if (dir == NULL) {
return -1;
}
struct dirent *ent;
char path[PATH_MAX];
uint32_t considered = 0, removed_by_age = 0, removed_by_version = 0;
const time_t cutoff = now - (time_t)mpm_conf->cache_max_age_seconds;
while ((ent = readdir(dir)) != NULL) {
const char *name = ent->d_name;
size_t namelen = strlen(name);
if (namelen < 3 || strcmp(name + namelen - 3, ".hs") != 0)
continue;
if (PathMerge(path, ARRAY_SIZE(path), mpm_conf->cache_dir_path, name) != 0)
continue;
struct stat st;
/* TOCTOU: race window between stat and unlink is acceptable here.
* On Linux somebody can still modify (use the cache file) between the
* fstat and unlink, on Windows (HS not supported there but still relevant)
* TOCTOU happens when closing the file descriptor and unlinking the file.
* Cache mechanism is best-effort and e.g. not pruning or pruning an extra
* cache file is not problematic.
* Stat is used here to ease file handling as fstat doesn't bring any benefit */
/* coverity[toctou] */
if (SCStatFn(path, &st) != 0 || !S_ISREG(st.st_mode))
continue;
considered++;
const bool prune_by_age = HSPruneFileByAge(st.st_mtime, cutoff);
const bool prune_by_version = HSPruneFileByVersion(name);
if (!prune_by_age && !prune_by_version)
continue;
void *cache_inuse = HashTableLookup(inuse_caches, path, (uint16_t)strlen(path));
if (cache_inuse != NULL)
continue; // in use
/* coverity[toctou] */
int ret = unlink(path);
if (ret == 0 || (ret == -1 && errno == ENOENT)) {
if (prune_by_version)
removed_by_version++;
else if (prune_by_age)
removed_by_age++;
SCLogDebug("File %s removed because of %s%s%s", path, prune_by_age ? "age" : "",
prune_by_age && prune_by_version ? " and " : "",
prune_by_version ? "incompatible version" : "");
} else {
SCLogWarning("Failed to prune \"%s\": %s", path, strerror(errno));
}
}
closedir(dir);
PatternDatabaseCache *pd_cache_stats = mpm_conf->cache_stats;
if (pd_cache_stats) {
pd_cache_stats->hs_dbs_cache_pruned_by_age_cnt = removed_by_age;
pd_cache_stats->hs_dbs_cache_pruned_by_version_cnt = removed_by_version;
pd_cache_stats->hs_dbs_cache_pruned_considered_cnt = considered;
pd_cache_stats->hs_dbs_cache_pruned_cutoff = cutoff;
pd_cache_stats->cache_max_age_seconds = mpm_conf->cache_max_age_seconds;
}
return 0;
}
void *SCHSCacheStatsInit(void)
{
PatternDatabaseCache *pd_cache_stats = SCCalloc(1, sizeof(PatternDatabaseCache));
if (pd_cache_stats == NULL) {
SCLogError("Failed to allocate memory for Hyperscan cache stats");
return NULL;
}
return pd_cache_stats;
}
void SCHSCacheStatsPrint(void *data)
{
if (data == NULL) {
return;
}
PatternDatabaseCache *pd_cache_stats = (PatternDatabaseCache *)data;
char time_str[64];
struct tm tm_s;
struct tm *tm_info = SCLocalTime(pd_cache_stats->hs_dbs_cache_pruned_cutoff, &tm_s);
if (tm_info != NULL) {
strftime(time_str, ARRAY_SIZE(time_str), "%Y-%m-%d %H:%M:%S", tm_info);
} else {
snprintf(time_str, ARRAY_SIZE(time_str), "%" PRIu64 " seconds",
pd_cache_stats->cache_max_age_seconds);
}
if (pd_cache_stats->hs_cacheable_dbs_cnt) {
SCLogPerf("rule group caching - loaded: %u newly cached: %u total cacheable: %u",
pd_cache_stats->hs_dbs_cache_loaded_cnt, pd_cache_stats->hs_dbs_cache_saved_cnt,
pd_cache_stats->hs_cacheable_dbs_cnt);
}
if (pd_cache_stats->hs_dbs_cache_pruned_considered_cnt) {
if (pd_cache_stats->hs_dbs_cache_pruned_by_version_cnt) {
SCLogInfo("rule group cache pruning removed %u/%u of HS caches due to "
"version-incompatibility (not v%s)",
pd_cache_stats->hs_dbs_cache_pruned_by_version_cnt,
pd_cache_stats->hs_dbs_cache_pruned_considered_cnt, HS_CACHE_FILE_VERSION);
}
if (pd_cache_stats->hs_dbs_cache_pruned_by_age_cnt) {
SCLogInfo("rule group cache pruning removed %u/%u of HS caches due to "
"age (older than %s)",
pd_cache_stats->hs_dbs_cache_pruned_by_age_cnt,
pd_cache_stats->hs_dbs_cache_pruned_considered_cnt, time_str);
}
}
}
void SCHSCacheStatsDeinit(void *data)
{
if (data == NULL) {
return;
}
PatternDatabaseCache *pd_cache_stats = (PatternDatabaseCache *)data;
SCFree(pd_cache_stats);
}
void SCHSCacheDeinit(void)
{
SCMutexLock(&g_hs_ref_info_mutex);
if (g_hs_ref_info != NULL) {
SCFree(g_hs_ref_info);
g_hs_ref_info = NULL;
}
SCMutexUnlock(&g_hs_ref_info_mutex);
}
#endif /* BUILD_HYPERSCAN */

@ -35,9 +35,25 @@ struct HsIteratorData {
const char *cache_path;
};
/**
* \brief Data structure to store in-use cache files.
* Used in cache pruning to avoid deleting files that are still in use.
*/
struct HsInUseCacheFilesIteratorData {
HashTable *tbl; // stores file paths of in-use cache files
const char *cache_path;
};
int HSLoadCache(hs_database_t **hs_db, const char *hs_db_hash, const char *dirpath);
int HSHashDb(const PatternDatabase *pd, char *hash, size_t hash_len);
void HSSaveCacheIterator(void *data, void *aux);
void HSCacheFilenameUsedIterator(void *data, void *aux);
int SCHSCachePruneEvaluate(MpmConfig *mpm_conf, HashTable *inuse_caches);
void *SCHSCacheStatsInit(void);
void SCHSCacheStatsPrint(void *data);
void SCHSCacheStatsDeinit(void *data);
void SCHSCacheDeinit(void);
#endif /* BUILD_HYPERSCAN */
#endif /* SURICATA_UTIL_MPM_HS_CACHE__H */

@ -93,6 +93,11 @@ typedef struct PatternDatabaseCache_ {
uint32_t hs_cacheable_dbs_cnt;
uint32_t hs_dbs_cache_loaded_cnt;
uint32_t hs_dbs_cache_saved_cnt;
uint32_t hs_dbs_cache_pruned_by_age_cnt;
uint32_t hs_dbs_cache_pruned_by_version_cnt;
uint32_t hs_dbs_cache_pruned_considered_cnt;
time_t hs_dbs_cache_pruned_cutoff;
uint64_t cache_max_age_seconds;
} PatternDatabaseCache;
const char *HSErrorToStr(hs_error_t error_code);

@ -835,18 +835,53 @@ static int SCHSCacheRuleset(MpmConfig *mpm_conf)
mpm_conf->cache_dir_path);
return -1;
}
PatternDatabaseCache pd_stats = { 0 };
struct HsIteratorData iter_data = { .pd_stats = &pd_stats,
PatternDatabaseCache *pd_stats = mpm_conf->cache_stats;
struct HsIteratorData iter_data = { .pd_stats = pd_stats,
.cache_path = mpm_conf->cache_dir_path };
SCMutexLock(&g_db_table_mutex);
HashTableIterate(g_db_table, HSSaveCacheIterator, &iter_data);
SCMutexUnlock(&g_db_table_mutex);
SCLogNotice("Rule group caching - loaded: %u newly cached: %u total cacheable: %u",
pd_stats.hs_dbs_cache_loaded_cnt, pd_stats.hs_dbs_cache_saved_cnt,
pd_stats.hs_cacheable_dbs_cnt);
return 0;
}
static uint32_t FilenameTableHash(HashTable *ht, void *data, uint16_t len)
{
const char *fname = data;
uint32_t hash = hashlittle_safe(data, strlen(fname), 0);
hash %= ht->array_size;
return hash;
}
static void FilenameTableFree(void *data)
{
SCFree(data);
}
static int SCHSCachePrune(MpmConfig *mpm_conf)
{
if (!mpm_conf || !mpm_conf->cache_dir_path) {
return -1;
}
SCLogDebug("Pruning the Hyperscan cache folder %s", mpm_conf->cache_dir_path);
// we need to initialize hash map of in-use cache files
HashTable *inuse_caches =
HashTableInit(INIT_DB_HASH_SIZE, FilenameTableHash, NULL, FilenameTableFree);
if (inuse_caches == NULL) {
return -1;
}
struct HsInUseCacheFilesIteratorData iter_data = { .tbl = inuse_caches,
.cache_path = mpm_conf->cache_dir_path };
SCMutexLock(&g_db_table_mutex);
HashTableIterate(g_db_table, HSCacheFilenameUsedIterator, &iter_data);
SCMutexUnlock(&g_db_table_mutex);
int r = SCHSCachePruneEvaluate(mpm_conf, inuse_caches);
HashTableFree(inuse_caches);
return r;
}
/**
* \brief Init the mpm thread context.
*
@ -1178,7 +1213,11 @@ void MpmHSRegister(void)
mpm_table[MPM_HS].AddPattern = SCHSAddPatternCS;
mpm_table[MPM_HS].AddPatternNocase = SCHSAddPatternCI;
mpm_table[MPM_HS].Prepare = SCHSPreparePatterns;
mpm_table[MPM_HS].CacheStatsInit = SCHSCacheStatsInit;
mpm_table[MPM_HS].CacheStatsPrint = SCHSCacheStatsPrint;
mpm_table[MPM_HS].CacheStatsDeinit = SCHSCacheStatsDeinit;
mpm_table[MPM_HS].CacheRuleset = SCHSCacheRuleset;
mpm_table[MPM_HS].CachePrune = SCHSCachePrune;
mpm_table[MPM_HS].Search = SCHSSearch;
mpm_table[MPM_HS].PrintCtx = SCHSPrintInfo;
mpm_table[MPM_HS].PrintThreadCtx = SCHSPrintSearchStats;
@ -1212,6 +1251,8 @@ void MpmHSGlobalCleanup(void)
g_db_table = NULL;
}
SCMutexUnlock(&g_db_table_mutex);
SCHSCacheDeinit();
}
/*************************************Unittests********************************/

@ -88,6 +88,8 @@ typedef struct MpmPattern_ {
typedef struct MpmConfig_ {
const char *cache_dir_path;
uint64_t cache_max_age_seconds; /* 0 means disabled/no pruning policy */
void *cache_stats;
} MpmConfig;
typedef struct MpmCtx_ {
@ -173,7 +175,11 @@ typedef struct MpmTableElmt_ {
int (*AddPatternNocase)(struct MpmCtx_ *, const uint8_t *, uint16_t, uint16_t, uint16_t,
uint32_t, SigIntId, uint8_t);
int (*Prepare)(MpmConfig *, struct MpmCtx_ *);
void *(*CacheStatsInit)(void);
void (*CacheStatsPrint)(void *data);
void (*CacheStatsDeinit)(void *data);
int (*CacheRuleset)(MpmConfig *);
int (*CachePrune)(MpmConfig *);
/** \retval cnt number of patterns that matches: once per pattern max. */
uint32_t (*Search)(const struct MpmCtx_ *, struct MpmThreadCtx_ *, PrefilterRuleStore *, const uint8_t *, uint32_t);
void (*PrintCtx)(struct MpmCtx_ *);

@ -1803,6 +1803,10 @@ detect:
# Cache files are created in the standard library directory.
sgh-mpm-caching: yes
sgh-mpm-caching-path: @e_sghcachedir@
# Maximum age for cached MPM databases before they are pruned.
# Accepts a combination of time units (s,m,h,d,w,y).
# Omit to use the default, 0 to disable.
# sgh-mpm-caching-max-age: 7d
# inspection-recursion-limit: 3000
# maximum number of times a tx will get logged for rules without app-layer keywords
# stream-tx-log-limit: 4
