SpamBlock Pixel Documentation
This guide walks you through installing the SpamBlock pixel, understanding the scoring pipeline, and tuning the available configuration options.
Quick start
- Add the script. Place the pixel on every page that contains a form you want protected.
- Add `data-block-spam` to the forms you want enforced. If no forms have the attribute, SpamBlock will protect all forms on the page.
- Submit the form. SpamBlock intercepts the submission, runs 13+ detection signals, and only forwards the request when the score is below the configured threshold.
```html
<script
  src="https://api.spamblock.io/sdk/pixel/v1.js" defer
></script>
<form data-block-spam>
  ... your existing fields ...
</form>
```
Detection signals
Core signals (always enabled):
- Honeypot field detection (+90 points, instant block)
- Disposable email domains (+40 points)
- IP reputation checks (+100 if denied)
- Tor exit nodes (+35 points)
Advanced signals (enabled by default):
- Language detection & mismatch analysis (+30 max)
- Entropy & repetition detection (+30 max)
- Header analysis (User-Agent, Referer, ASN) (+25 max)
- Intent classification (spam keywords) (+15 per keyword)
- Timing & behavioral analysis (+20 max)
- Script detection & homoglyph analysis (+30 max)
- Profanity detection (+30 per word)
Default threshold is 60. Submissions with a score ≥ threshold are blocked. Core signals are always on; the advanced signals that have a feature toggle can be switched off individually via configuration.
Configuration reference
Set configuration via data-* attributes directly on the script tag. All attributes are optional with sensible defaults.
Core Configuration
| Attribute | Default | Description |
|---|---|---|
| `data-max-score` | `60` | Block threshold. Requests with a score ≥ this value are blocked. |
| `data-debug` | `false` | Logs detailed scoring output to the browser console; useful during testing. |
Feature Toggles
| Attribute | Default | Description |
|---|---|---|
| `data-assess-profanity` | `true` | Enable/disable profanity detection (+30 per word). |
| `data-assess-language` | `true` | Enable/disable language detection (page vs text mismatch, +30 max). |
| `data-language-weight` | `10` | Points added for a language mismatch. |
| `data-expected-languages` | `""` | Comma-separated language codes (e.g., "en,de") for multilingual sites. |
| `data-assess-entropy` | `true` | Enable/disable entropy analysis (randomness detection, +30 max). |
| `data-assess-email-address` | `false` | Send full email addresses (not just domains) for username entropy analysis. Shows a console warning for DPA transparency. |
| `data-assess-timing` | `true` | Enable/disable timing and behavioral analysis (fast submission, interaction tracking, +20 max). |
| `data-assess-scripts` | `true` | Enable/disable script detection and homoglyph analysis (Unicode analysis, +30 max). |
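To make the interaction between `data-expected-languages` and `data-language-weight` concrete, here is a minimal sketch of how a mismatch penalty could be computed. The function names, the reduction of tags like `en-US` to `en`, and the choice to add no penalty when nothing was detected are assumptions for illustration, not SpamBlock's documented behavior.

```javascript
// Reduce a tag such as "en-US" to its primary subtag "en" (assumed normalization).
function primaryLanguage(tag) {
  return String(tag || "").trim().toLowerCase().split("-")[0];
}

// Hypothetical helper: returns the points to add for a language mismatch.
// detected:          language code detected in the submitted text
// pageLang:          language declared by the page (e.g. <html lang="en">)
// expectedLanguages: value of data-expected-languages, e.g. "en,de"
// weight:            value of data-language-weight (default 10)
function languagePenalty(detected, pageLang, expectedLanguages = "", weight = 10) {
  const accepted = new Set(
    expectedLanguages
      .split(",")
      .map(primaryLanguage)
      .filter((code) => code.length > 0)
  );
  const page = primaryLanguage(pageLang);
  if (page) accepted.add(page);

  const found = primaryLanguage(detected);
  // Assumption: with nothing detected or nothing to compare against, add no penalty.
  if (!found || accepted.size === 0) return 0;
  return accepted.has(found) ? 0 : weight;
}
```

With these assumptions, German text on an English page scores 0 once `"en,de"` is configured, while Russian text on the same page receives the configured weight.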
Geo Configuration
| Attribute | Default | Description |
|---|---|---|
| `data-block-geo` | `""` | Comma-separated ISO country codes (e.g., "RU,CN") to block instantly (+100, hard block). |
| `data-allow-geo` | `""` | Comma-separated ISO country codes (e.g., "US,CA,GB") to allowlist. Takes precedence over `data-block-geo` (+100 if not in allowlist, hard block). |
```html
<script
  src="https://api.spamblock.io/sdk/pixel/v1.js"
  data-max-score="60"
  data-assess-profanity="true"
  data-assess-language="true"
  data-assess-entropy="true"
  data-assess-email-address="false"
  data-assess-timing="true"
  data-assess-scripts="true"
  data-expected-languages="en,de"
  data-language-weight="10"
  data-debug="false"
  data-block-geo=""
  data-allow-geo=""
  defer
></script>
<form data-block-spam>
  ... your existing fields ...
</form>
```
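The precedence between `data-allow-geo` and `data-block-geo` is easy to misread, so the following sketch shows one plausible interpretation: when an allowlist is set it alone decides the outcome, and the blocklist is consulted only otherwise. The function names and the treatment of an empty list as "rule not configured" are assumptions; the actual evaluation happens inside SpamBlock.

```javascript
// Parse a comma-separated attribute value such as "US,CA,GB" into a Set.
function parseCodes(value) {
  return new Set(
    (value || "")
      .split(",")
      .map((code) => code.trim().toUpperCase())
      .filter((code) => code.length > 0)
  );
}

// Hypothetical helper: returns the geo penalty (0 or 100) for a country code.
// Assumption: an empty list means the rule is not configured.
function geoPenalty(country, blockGeo, allowGeo) {
  const code = (country || "").trim().toUpperCase();
  const allow = parseCodes(allowGeo);
  const block = parseCodes(blockGeo);

  // The allowlist takes precedence: when set, only listed countries pass.
  if (allow.size > 0) {
    return allow.has(code) ? 0 : 100;
  }
  return block.has(code) ? 100 : 0;
}
```

Under this reading, a country that appears in both lists is allowed, because the allowlist is checked first.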
What happens on submit?
- SpamBlock intercepts: The script listens for form submission events and prevents the default behavior.
- Data collection: Collects form data (email domain, text fields), timing metrics (time to submit, per-field dwell times), interaction patterns (keyboard/mouse events), and page metadata (language, headers).
- API request: Sends the collected data to `/v1/check` for scoring.
- Scoring: The Worker evaluates 13+ signals including language detection, entropy analysis, behavioral tracking, script detection, and more. Each signal contributes points to a total score.
- Decision: The Worker responds with `{ allow: true/false, score: 0-100, reasons: [...], latencyMs: 123 }`.
- Action: If `allow: true`, the form submits normally. If `allow: false`, the submission is blocked and a custom event is emitted for UI handling.
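The last step mentions a custom event for UI handling. This guide does not name that event, so the sketch below takes the event name as a parameter instead of guessing it. It also assumes the event is dispatched on the form element and that `event.detail` carries the decision object shown above; check both against SpamBlock's own reference before relying on them.

```javascript
// Turn a decision object of the documented shape into a short UI message.
function formatBlockMessage(decision) {
  const reasons =
    Array.isArray(decision.reasons) && decision.reasons.length > 0
      ? decision.reasons.join(", ")
      : "unspecified";
  return `Submission blocked (score ${decision.score}; reasons: ${reasons}).`;
}

// Hypothetical wiring: eventName must be the name SpamBlock actually emits.
// Assumption: event.detail holds { allow, score, reasons, latencyMs }.
function watchForm(form, eventName, showMessage) {
  form.addEventListener(eventName, (event) => {
    showMessage(formatBlockMessage(event.detail));
  });
}
```

In a page, `form` would be the element carrying `data-block-spam` and `showMessage` a function that renders the text next to the submit button.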
Scoring Categories
SpamBlock uses category caps to prevent single signals from dominating:
- Content-based: +30 max (profanity, spam keywords)
- Language & Script: +30 max (language detection, script analysis)
- Entropy & Structure: +30 max (entropy, repetition)
- Headers & IP: +25 max (User-Agent, ASN, reputation)
- Timing & Behavioral: +20 max (timing, interactions)
- Geo: +100 (hard block if violated)
- Honeypot: +90 (highest priority after geo)
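As an illustration of how category caps limit a single group of signals, here is a minimal sketch that sums points per category, applies the caps listed above, and compares the result with the threshold. The category keys, leaving hard-block categories uncapped, and clamping the total to 100 are assumptions made for this example; the real aggregation lives in the scoring Worker.

```javascript
// Caps from the list above; the key names are hypothetical.
const CATEGORY_CAPS = {
  content: 30,
  languageScript: 30,
  entropy: 30,
  headersIp: 25,
  timing: 20,
};

// signals: array of { category, points }.
// Categories without a cap (e.g. "geo", "honeypot") count in full.
function totalScore(signals, caps = CATEGORY_CAPS) {
  const perCategory = {};
  for (const { category, points } of signals) {
    perCategory[category] = (perCategory[category] || 0) + points;
  }
  let total = 0;
  for (const [category, points] of Object.entries(perCategory)) {
    const cap = caps[category];
    total += cap === undefined ? points : Math.min(points, cap);
  }
  // Assumption: the reported score is clamped to the documented 0-100 range.
  return Math.min(total, 100);
}

// A submission is blocked when its score is at or above the threshold.
function isBlocked(score, maxScore = 60) {
  return score >= maxScore;
}
```

For example, two profane words (+30 each) and one spam keyword (+15) raise the content category to 75 uncapped, but only 30 points count toward the total, so content alone cannot reach the default threshold of 60.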
Server-side integration
No backend changes are required. SpamBlock replays the submission when its score is below the threshold. To inspect results, enable debug logging or forward the console output into your monitoring pipeline.
Downstream Filtering & Marker Fields
When a form submission is allowed (or fails open), SpamBlock automatically injects hidden marker fields into the form before submission. These markers act as authenticity indicators for downstream processing in email systems, CRMs, Zapier/Make automations, and other form handlers.
Marker Fields
The following hidden fields are injected into forms before submission:
| Field Name | Description | Example |
|---|---|---|
| `_sb_v` | Marker schema version | `1` |
| `_sb_allow` | Final allow/block decision | `true` |
| `_sb_score` | Final numeric score (0-100) | `37` |
| `_sb_reasons` | Comma-separated reason codes | `hp_filled,profanity` |
| `_sb_ts` | UTC timestamp of evaluation (ISO 8601) | `2025-02-14T10:03:11Z` |
Marker Behavior
- Markers are only injected when the form is actually submitted: If SpamBlock blocks the submission, no markers are added (the form doesn't submit).
- Fail-open cases: If SpamBlock encounters an error (network failure, API error, etc.), markers are still injected with `_sb_allow="true"`, `_sb_score="0"`, and `_sb_reasons` including `error_fail_open` plus any issues discovered during processing.
- Missing markers: If a form submission arrives without SpamBlock marker fields, it indicates the form was submitted before the pixel could evaluate it (e.g., bot bypass, script blocker, no JavaScript). This should be treated as higher risk.
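The marker rules above can be summarized in a small sketch that derives the five field values from a decision. The function name, passing `null` to represent a failed check, and the trimming of milliseconds from the timestamp are assumptions for illustration; the pixel's real injection code is not shown in this guide.

```javascript
// Hypothetical helper: returns the marker fields to inject, or null when the
// submission is blocked (blocked forms never submit, so no markers are added).
// decision: { allow, score, reasons } from the check, or null if the check failed.
function buildMarkers(decision, now = new Date()) {
  const failedOpen = decision === null;
  if (!failedOpen && !decision.allow) return null;

  const reasons = failedOpen ? ["error_fail_open"] : decision.reasons;
  return {
    _sb_v: "1",
    _sb_allow: "true",
    _sb_score: String(failedOpen ? 0 : decision.score),
    _sb_reasons: reasons.join(","),
    // ISO 8601 in UTC without milliseconds, matching the example in the table.
    _sb_ts: now.toISOString().replace(/\.\d{3}Z$/, "Z"),
  };
}
```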
Server-Side Validation
On your server, check for marker presence:
- Marker present with `_sb_allow="true"`: Submission was evaluated and allowed by SpamBlock.
- Marker present with `_sb_allow="true"` and `error_fail_open` in reasons: Submission was allowed due to fail-open (an error occurred, but the form was allowed to proceed).
- Marker missing: Form was submitted without SpamBlock evaluation (bot bypass, script blocker, etc.). Treat as suspicious (+30 risk score recommended).
Note: The absence of markers is a strong indicator of bot/bypass activity, as legitimate users with JavaScript enabled will always have markers injected before submission.
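A server-side handler could apply the three cases above roughly as follows. This is a minimal sketch that assumes the submitted fields have already been parsed into a plain object of strings; the status labels and the handling of an unexpected `_sb_allow` value are hypothetical, and only the +30 penalty for missing markers comes from the recommendation above.

```javascript
// Hypothetical helper: classify a parsed form submission by its SpamBlock markers.
// fields: plain object of submitted field names to string values.
function classifySubmission(fields) {
  const hasMarkers = fields._sb_v !== undefined && fields._sb_allow !== undefined;
  if (!hasMarkers) {
    // No evaluation happened: apply the recommended +30 risk penalty.
    return { status: "unverified", riskPenalty: 30 };
  }

  const reasons = String(fields._sb_reasons || "")
    .split(",")
    .map((reason) => reason.trim())
    .filter((reason) => reason.length > 0);

  if (fields._sb_allow === "true" && reasons.includes("error_fail_open")) {
    return { status: "fail_open", riskPenalty: 0 };
  }
  if (fields._sb_allow === "true") {
    return { status: "allowed", riskPenalty: 0 };
  }
  // Assumption: any other marker value is treated like a missing marker.
  return { status: "unexpected", riskPenalty: 30 };
}
```

Keeping `fail_open` as its own status lets downstream tooling (a CRM rule or an automation step, for instance) route those submissions to manual review if desired.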
Detection Signals Explained
Content & Structure Signals
- Language Detection: Detects mismatches between page language, browser language, and detected text language. Useful for catching Russian spam on English sites.
- Entropy Analysis: Detects random strings (high entropy) and repetitive junk (low entropy). Flags suspicious patterns like `x8q2m6k9p4r7t1` or `!!!!!!`.
- Header Analysis: Flags suspicious User-Agents (curl, python-requests), missing Referers, and hosting ASNs (AWS, DigitalOcean, etc.).
- Intent Classification: Matches text against known spam/scam keywords (viagra, free money, etc.).
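To show what "high" and "low" entropy mean for the two example strings, here is a sketch of character-level Shannon entropy. It only illustrates the measure; SpamBlock's actual entropy metric and thresholds are not documented here, so none are assumed.

```javascript
// Shannon entropy in bits per character, computed over Unicode code points.
function shannonEntropy(text) {
  const chars = Array.from(text);
  if (chars.length === 0) return 0;

  const counts = new Map();
  for (const ch of chars) {
    counts.set(ch, (counts.get(ch) || 0) + 1);
  }

  let entropy = 0;
  for (const count of counts.values()) {
    const p = count / chars.length;
    entropy -= p * Math.log2(p);
  }
  return entropy;
}
```

`shannonEntropy("!!!!!!")` is 0 because a single character repeats, while `shannonEntropy("x8q2m6k9p4r7t1")` reaches log2(14), about 3.81 bits, because all 14 characters are distinct. Ordinary prose can also land near 4 bits per character, so a practical detector would likely combine this measure with field length and repetition checks.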
Behavioral Signals
- Timing Analysis: Detects fast submissions (<1.2s) and rapid field filling (<100ms per field), indicating bot automation.
- Interaction Tracking: Flags missing keyboard/mouse events when form took >500ms, indicating script-based filling.
- Focus Patterns: Detects suspicious focus patterns (all fields filled without focus events in <1s).
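The thresholds quoted above can be expressed as a small check. This sketch uses the documented limits (1.2 s, 100 ms per field, 500 ms), but the metric field names, the flag names, and the choice to flag when any single field is filled too quickly are assumptions made for the example.

```javascript
// Hypothetical helper: derive behavioral flags from collected metrics.
// metrics: {
//   timeToSubmitMs: number,   // page load to submit
//   fieldDwellMs: number[],   // time spent in each field
//   keyEvents: number,        // keyboard events observed
//   mouseEvents: number       // mouse events observed
// }
function timingFlags(metrics) {
  const { timeToSubmitMs, fieldDwellMs, keyEvents, mouseEvents } = metrics;
  const flags = [];

  // Fast submission: under 1.2 seconds.
  if (timeToSubmitMs < 1200) flags.push("fast_submit");

  // Rapid field filling: assumption that one field under 100 ms is enough.
  if (fieldDwellMs.some((ms) => ms < 100)) flags.push("rapid_field_fill");

  // No keyboard or mouse activity although the form took more than 500 ms.
  if (timeToSubmitMs > 500 && keyEvents + mouseEvents === 0) {
    flags.push("no_interaction");
  }
  return flags;
}
```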
Advanced Unicode Analysis
- Script Detection: Detects unexpected Unicode scripts (e.g., Cyrillic on English sites) and mixed-script confusables (e.g., `frеe саѕh` with Cyrillic characters).
- Homoglyph Detection: Flags homoglyph substitutions in short fields (e.g., `Jоhn` with Cyrillic 'о' instead of Latin 'o').
- RTL Control Characters: Detects bidirectional control characters used for obfuscation.
- Emoji Density: Flags high emoji/symbol density (>10%) in fields where uncommon.
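Mixed-script and bidirectional-control checks of the kind described above can be approximated with Unicode property escapes, which JavaScript regular expressions support with the `u` flag. The function names are hypothetical and the sketch covers only the Latin/Cyrillic pair; it is not SpamBlock's detection code.

```javascript
const LATIN = /\p{Script=Latin}/u;
const CYRILLIC = /\p{Script=Cyrillic}/u;

// True when a single word contains both Latin and Cyrillic letters,
// e.g. "J\u043Ehn", where U+043E is the Cyrillic small letter o.
function hasMixedLatinCyrillic(word) {
  return LATIN.test(word) && CYRILLIC.test(word);
}

// Return the whitespace-separated words of a text that mix the two scripts.
function mixedScriptWords(text) {
  return text
    .split(/\s+/)
    .filter((word) => word.length > 0 && hasMixedLatinCyrillic(word));
}

// Bidirectional control characters: ALM, LRM, RLM, embeddings/overrides, isolates.
const BIDI_CONTROLS = /[\u061C\u200E\u200F\u202A-\u202E\u2066-\u2069]/;

function hasBidiControls(text) {
  return BIDI_CONTROLS.test(text);
}
```

A word written entirely in Cyrillic is not flagged by `hasMixedLatinCyrillic`; catching that case on an English site would need the separate expected-language check.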
Tips & Best Practices
- Testing: Enable `data-debug="true"` and try disposable email domains, spam phrases, or fast submissions.
- Multilingual Sites: Set `data-expected-languages="en,de"` to help language and script detection.
- Tuning: Adjust `data-max-score` based on your false positive rate. Start with 60, increase it if you see too many false positives, and decrease it if spam is getting through.
- Analytics: Capture the log output to monitor performance and tune thresholds.
- Multiple forms: Add `data-block-spam` only where needed if you have forms that should remain untouched.
- Performance: Disable timing/behavioral analysis (`data-assess-timing="false"`) if you need to reduce payload size.