From: Vsevolod Stakhov
Date: Sat, 23 May 2026 10:14:42 +0000 (+0100)
Subject: [Fix] neural: digest stability under disable_symbols_input
X-Git-Tag: 4.1.0~28
X-Git-Url: http://git.ipfire.org/gitweb/index.cgi?a=commitdiff_plain;h=1323fc5fbfeb74754e2f625f51f37462c0fbdba3;p=thirdparty%2Frspamd.git

[Fix] neural: digest stability under disable_symbols_input

The profile digest forms part of the Redis key holding the trained ANN
(rn____). process_settings_elt computed it as
lua_util.table_digest(selt.symbols) unconditionally.

With disable_symbols_input=true the symbol catalogue does not feed the
model -- only providers + fusion + max_inputs determine the input-vector
schema (see is_profile_compatible) -- so hashing the unrelated symbol
list rotated the digest whenever any rspamd symbol was added/removed
elsewhere (a new RBL, a multimap rule, an SA-style rule loaded via
multimap's regexp_rules).

The trained ANN was orphaned in Redis under the old key and inference
silently dropped to zero hits until a new sample set retrained from
scratch (weeks under realistic class imbalance). Manual recovery via
`redis-cli COPY` of the old key to the new digest was the only fix.

Now: when has_providers + disable_symbols_input, the digest is
providers_config_digest(rule.providers, rule). Other modes keep the
existing symbol-based digest.

Migration: any deployment already running disable_symbols_input=true
with a trained ANN will see its digest rotate once on first start after
this lands. Either let the model retrain, or use the same
`redis-cli COPY rn____ rn____` recipe one final time -- after this fix
the digest is stable across unrelated rspamd config changes.
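
The selection rule described above can be sketched in isolation. This is a
minimal illustration, not the plugin code: toy_digest is a hypothetical
stand-in for lua_util.table_digest / providers_config_digest, providers are
modelled as plain strings, and the real helper presumably also folds in the
fusion and max_inputs settings, which this sketch ignores.

```lua
-- Hypothetical stand-in for a real digest: joins the sorted entries.
-- It only needs to change whenever the list contents change.
local function toy_digest(list)
  local copy = {}
  for i, v in ipairs(list) do
    copy[i] = tostring(v)
  end
  table.sort(copy)
  return table.concat(copy, ',')
end

-- Mirrors the rule from the commit message: hash the providers when
-- symbols do not feed the model, otherwise hash the symbol list.
local function profile_digest(rule, symbols)
  local has_providers = rule.providers ~= nil and #rule.providers > 0
  if has_providers and rule.disable_symbols_input then
    return toy_digest(rule.providers), 'providers'
  end
  return toy_digest(symbols), 'symbols'
end

-- Assumed example rule; the provider name is made up for illustration.
local rule = { providers = { 'example_provider' }, disable_symbols_input = true }

-- Adding an unrelated symbol no longer changes the digest in this mode.
local before = profile_digest(rule, { 'DKIM_ALLOW', 'SPF_ALLOW' })
local after = profile_digest(rule, { 'DKIM_ALLOW', 'SPF_ALLOW', 'NEW_RBL' })
assert(before == after)
```

With an empty providers list, or with disable_symbols_input unset, the same
call falls back to the symbol-based digest and the two results differ.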
---

diff --git a/lualib/plugins/neural.lua b/lualib/plugins/neural.lua
index 358ad080a0..c2362a0a42 100644
--- a/lualib/plugins/neural.lua
+++ b/lualib/plugins/neural.lua
@@ -1396,12 +1396,33 @@ local function process_rules_settings()
 
     table.sort(selt.symbols)
 
-    selt.digest = lua_util.table_digest(selt.symbols)
+    -- Profile digest -- forms part of the Redis key holding the trained ANN
+    -- (rn____). It MUST be stable across config
+    -- changes that don't alter the model's input-vector schema; otherwise
+    -- the trained ANN is abandoned and inference silently degrades until a
+    -- new sample set retrains it (weeks under realistic class imbalance).
+    --
+    -- With disable_symbols_input + providers, symbols never enter the input
+    -- vector (see is_profile_compatible above); the architecture is fully
+    -- determined by providers + fusion + max_inputs config. Hashing the
+    -- unrelated symbol catalogue here used to rotate the digest whenever
+    -- any rspamd symbol was added/removed elsewhere (a new RBL, multimap
+    -- rule, etc.), and operators had to manually COPY the Redis key over
+    -- to the new digest to recover.
+    local has_providers = rule.providers and #rule.providers > 0
+    local digest_source
+    if has_providers and rule.disable_symbols_input then
+      selt.digest = providers_config_digest(rule.providers, rule)
+      digest_source = 'providers'
+    else
+      selt.digest = lua_util.table_digest(selt.symbols)
+      digest_source = 'symbols'
+    end
     selt.prefix = redis_ann_prefix(rule, selt.name)
 
     rspamd_logger.messagex(rspamd_config,
-      'use NN prefix for rule %s; settings id "%s"; symbols digest: "%s"',
-      selt.prefix, selt.name, selt.digest)
+      'use NN prefix for rule %s; settings id "%s"; %s digest: "%s"',
+      selt.prefix, selt.name, digest_source, selt.digest)
 
     lua_redis.register_prefix(selt.prefix, N,
       string.format('NN prefix for rule "%s"; settings id "%s"',