git.ipfire.org Git - thirdparty/rspamd.git/commitdiff
[Test] Add functional tests for HTML fuzzy hashing
author Vsevolod Stakhov <vsevolod@rspamd.com>
Sat, 4 Oct 2025 18:41:27 +0000 (19:41 +0100)
committer Vsevolod Stakhov <vsevolod@rspamd.com>
Sat, 4 Oct 2025 18:41:27 +0000 (19:41 +0100)
Add Robot Framework tests for HTML fuzzy matching:
- html_template_1.eml: legitimate newsletter template
- html_template_1_variation.eml: same structure, different text
- html_phishing.eml: same structure, phishing CTA domains
- html-fuzzy.robot: test suite with add/check/phishing scenarios

Tests verify:
- HTML fuzzy hash generation and matching
- Template variation detection (same structure, different content)
- Phishing detection (same structure, different CTA domains)
- Integration with fuzzy storage backend
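
To illustrate what "same structure, different text" means for the fixtures in this commit, here is a minimal Python sketch that compares messages by shingles over their tag sequence. It is not rspamd's implementation: the names `TagCollector`, `tag_shingles` and `jaccard`, the shingle width of 3, and the shortened HTML samples are all assumptions made for the example.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects opening tag names in document order; text and attributes are ignored."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <img/> via handle_startendtag.
        self.tags.append(tag)


def tag_shingles(html, width=3):
    """Return the set of `width`-long windows over the tag sequence (hypothetical helper)."""
    collector = TagCollector()
    collector.feed(html)
    tags = collector.tags
    return {tuple(tags[i:i + width]) for i in range(len(tags) - width + 1)}


def jaccard(a, b):
    """Jaccard similarity of two shingle sets; 1.0 means the sets are identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Shortened stand-ins for the fixtures, not the actual .eml bodies.
template = '<div><a href="https://example.com"><img src="logo.png"/></a><h1>News</h1><p>Top stories</p></div>'
variation = '<div><a href="https://example.com"><img src="logo.png"/></a><h1>News</h1><p>Other text</p></div>'
unrelated = '<table><tr><td>Hello</td></tr></table>'

same_structure = jaccard(tag_shingles(template), tag_shingles(variation))
different_structure = jaccard(tag_shingles(template), tag_shingles(unrelated))
print(same_structure, different_structure)
```

Because the sketch looks only at tag names, a message with the same markup but different link targets would also score as identical. That is why the phishing scenario needs the CTA domains to be taken into account separately, which this example does not attempt to model.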

test/functional/cases/120_fuzzy/html-fuzzy.robot [new file with mode: 0644]
test/functional/messages/html_phishing.eml [new file with mode: 0644]
test/functional/messages/html_template_1.eml [new file with mode: 0644]
test/functional/messages/html_template_1_variation.eml [new file with mode: 0644]

diff --git a/test/functional/cases/120_fuzzy/html-fuzzy.robot b/test/functional/cases/120_fuzzy/html-fuzzy.robot
new file mode 100644
index 0000000..3cdc1c4
--- /dev/null
@@ -0,0 +1,49 @@
+*** Settings ***
+Suite Setup     HTML Fuzzy Setup
+Suite Teardown  Rspamd Redis Teardown
+Resource        lib.robot
+
+*** Variables ***
+${HTML_TEMPLATE_1}         ${RSPAMD_TESTDIR}/messages/html_template_1.eml
+${HTML_TEMPLATE_1_VAR}     ${RSPAMD_TESTDIR}/messages/html_template_1_variation.eml
+${HTML_PHISHING}           ${RSPAMD_TESTDIR}/messages/html_phishing.eml
+${FLAG_HTML_WHITE}         100
+${FLAG_HTML_SPAM}          101
+
+*** Keywords ***
+HTML Fuzzy Setup
+  Set Suite Variable  ${RSPAMD_FUZZY_ALGORITHM}  mumhash
+  Set Suite Variable  ${RSPAMD_FUZZY_SERVER_MODE}  servers
+  Set Suite Variable  ${SETTINGS_FUZZY_CHECK}  servers = "${RSPAMD_LOCAL_ADDR}:${RSPAMD_PORT_FUZZY}"; html_shingles = true; min_html_tags = 5;
+  Rspamd Redis Setup
+
+HTML Fuzzy Add Whitelist
+  [Documentation]  Learn legitimate HTML template
+  ${result} =  Run Rspamc  -h  ${RSPAMD_LOCAL_ADDR}:${RSPAMD_PORT_CONTROLLER}  -w  10  -f  ${FLAG_HTML_WHITE}  fuzzy_add  ${HTML_TEMPLATE_1}
+  Check Rspamc  ${result}
+  Sync Fuzzy Storage
+
+HTML Fuzzy Check Variation
+  [Documentation]  Check variation of same template (different text, same structure)
+  Scan File  ${HTML_TEMPLATE_1_VAR}
+  Expect Symbol  R_TEST_FUZZY_DENIED
+  ${symbols} =  Get Rspamd Symbols
+  Log  Fuzzy symbols: ${symbols}
+
+HTML Fuzzy Check Phishing
+  [Documentation]  Check phishing email (same structure, different CTA domains)
+  Scan File  ${HTML_PHISHING}
+  # Should match structure but CTA differs
+  # Depending on CTA weight, might have lower score or specific handling
+  ${symbols} =  Get Rspamd Symbols
+  Log  Phishing check symbols: ${symbols}
+
+*** Test Cases ***
+HTML Fuzzy Add Whitelist Test
+  HTML Fuzzy Add Whitelist
+
+HTML Fuzzy Variation Match Test
+  HTML Fuzzy Check Variation
+
+HTML Fuzzy Phishing Detection Test
+  HTML Fuzzy Check Phishing
diff --git a/test/functional/messages/html_phishing.eml b/test/functional/messages/html_phishing.eml
new file mode 100644
index 0000000..e328c21
--- /dev/null
@@ -0,0 +1,28 @@
+From: notification@example.com
+To: user@test.com
+Subject: Your weekly newsletter  
+Content-Type: text/html; charset=utf-8
+
+<!DOCTYPE html>
+<html>
+<head>
+  <title>Newsletter</title>
+</head>
+<body>
+  <div class="header">
+    <a href="https://phishing-site.evil"><img src="https://cdn.example.com/logo.png" alt="Logo"/></a>
+  </div>
+  <div class="content">
+    <h1>Weekly Newsletter</h1>
+    <p>URGENT: Verify your account now!</p>
+    <div class="article">
+      <h2>Security Alert</h2>
+      <p>Your account has been compromised click here immediately</p>
+      <a class="button" href="https://phishing-site.evil/steal-credentials">Verify Now</a>
+    </div>
+  </div>
+  <div class="footer">
+    <p>Unsubscribe: <a href="https://phishing-site.evil/fake">click here</a></p>
+  </div>
+</body>
+</html>
diff --git a/test/functional/messages/html_template_1.eml b/test/functional/messages/html_template_1.eml
new file mode 100644
index 0000000..4956a9c
--- /dev/null
@@ -0,0 +1,28 @@
+From: notification@example.com
+To: user@test.com
+Subject: Your weekly newsletter
+Content-Type: text/html; charset=utf-8
+
+<!DOCTYPE html>
+<html>
+<head>
+  <title>Newsletter</title>
+</head>
+<body>
+  <div class="header">
+    <a href="https://example.com"><img src="https://cdn.example.com/logo.png" alt="Logo"/></a>
+  </div>
+  <div class="content">
+    <h1>Weekly Newsletter</h1>
+    <p>Here are your top stories this week</p>
+    <div class="article">
+      <h2>Article Title</h2>
+      <p>Article content goes here with some text</p>
+      <a class="button" href="https://example.com/article">Read More</a>
+    </div>
+  </div>
+  <div class="footer">
+    <p>Unsubscribe: <a href="https://example.com/unsubscribe">click here</a></p>
+  </div>
+</body>
+</html>
diff --git a/test/functional/messages/html_template_1_variation.eml b/test/functional/messages/html_template_1_variation.eml
new file mode 100644
index 0000000..21e109f
--- /dev/null
@@ -0,0 +1,28 @@
+From: notification@example.com
+To: user@test.com
+Subject: Your weekly newsletter
+Content-Type: text/html; charset=utf-8
+
+<!DOCTYPE html>
+<html>
+<head>
+  <title>Newsletter</title>
+</head>
+<body>
+  <div class="header">
+    <a href="https://example.com"><img src="https://cdn.example.com/logo.png" alt="Logo"/></a>
+  </div>
+  <div class="content">
+    <h1>Weekly Newsletter</h1>
+    <p>Different stories for different week</p>
+    <div class="article">
+      <h2>Different Article Title</h2>
+      <p>Completely different article content with other text here</p>
+      <a class="button" href="https://example.com/another-article">Read More</a>
+    </div>
+  </div>
+  <div class="footer">
+    <p>Unsubscribe: <a href="https://example.com/unsubscribe">click here</a></p>
+  </div>
+</body>
+</html>