
What is ModSecurity Rule Match?

A ModSecurity rule match occurs when an inbound HTTP request matches one or more rules in the open-source ModSecurity Web Application Firewall, which adds to that request's anomaly score. For scrapers, this usually happens because of malformed headers, missing Accept-Language headers, or default Python/Go HTTP client signatures. ModSecurity is the baseline defense layer for millions of Apache and Nginx servers, and failing it means your pipeline is neglecting basic HTTP hygiene.

WAF · OWASP CRS · HTTP Headers · Anomaly Scoring · 403 Forbidden

// 02 — definitions

The baseline bouncer.

How the internet's most widely deployed open-source WAF scores your requests, and why default HTTP clients fail instantly.


TL;DR

ModSecurity evaluates every request against a ruleset (typically the OWASP Core Rule Set). Each violation, such as a missing Accept header or an anomalous User-Agent, adds to an anomaly score. If the score reaches the threshold (usually 5), the request is blocked with a 403. The checks are deterministic and stateless, and they are avoidable with proper header management.

01Definition & structure

A ModSecurity rule match happens when an HTTP request matches one of the rules defined in the WAF's configuration. Most rules are regular expressions, but some are string lists or simple checks such as "this header is absent". ModSecurity operates as a module within Apache, Nginx, or IIS. It inspects the request URI, headers, and body before the application server processes them. If a request looks like a known vulnerability scanner, looks like a default Python script, or lacks standard browser headers, it triggers a match.

02How anomaly scoring works

Modern ModSecurity setups use "Anomaly Scoring Mode." Instead of blocking a request on the first minor infraction, each triggered rule adds points to a cumulative score according to its severity. A header rule scored at Warning severity adds 3 points. A User-Agent on the scanner list is scored Critical and adds 5 points. If the total score reaches the inbound anomaly threshold (typically set to 5), the WAF drops the connection or returns a 403 Forbidden.
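The accumulation can be sketched in a few lines of Python. This is an illustrative model only, not ModSecurity's implementation: the severity weights and the threshold of 5 are the defaults quoted in this entry, and `anomaly_score` and `is_blocked` are hypothetical helper names. The block condition is "score at or above the threshold", which agrees with the audit-log example in this entry (score 5, threshold 5, blocked).

```python
# Illustrative model of CRS-style anomaly scoring (not ModSecurity's own code).
SEVERITY_POINTS = {"CRITICAL": 5, "ERROR": 4, "WARNING": 3, "NOTICE": 2}
INBOUND_THRESHOLD = 5  # default threshold cited in this entry


def anomaly_score(matched_severities):
    """Sum the points for every rule that matched the request."""
    return sum(SEVERITY_POINTS[s] for s in matched_severities)


def is_blocked(matched_severities, threshold=INBOUND_THRESHOLD):
    """A request is blocked once its score reaches the threshold."""
    return anomaly_score(matched_severities) >= threshold


print(is_blocked(["NOTICE"]))             # False: 2 < 5
print(is_blocked(["WARNING", "NOTICE"]))  # True: 3 + 2 = 5
print(is_blocked(["CRITICAL"]))           # True: 5
```

The middle case is the one that catches scrapers off guard: two individually minor matches are enough to reach the default threshold.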

03The OWASP Core Rule Set (CRS)

ModSecurity is just the engine; the OWASP CRS provides the actual rules. For scraping engineers, the most relevant sections are the Protocol Enforcement rules (which mandate strict HTTP RFC compliance) and the Scanner Detection rules (which maintain a massive blacklist of default HTTP client User-Agents like curl, wget, python-requests, and scrapy).

04How DataFlirt handles it

We treat ModSecurity bypass as a solved problem of strict HTTP hygiene. Our edge routers don't just spoof User-Agents; they enforce complete header dictionary matching. If a request is routed through a Chrome 124 profile, the exact Accept, Accept-Encoding, Accept-Language, and Sec-Ch-Ua headers are injected in the exact order Chrome sends them. This guarantees an anomaly score of zero.
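A minimal sketch of what "header dictionary matching" could look like, assuming a profile is stored as an ordered list of header names and default values. Everything here is hypothetical: `CHROME_124_PROFILE`, `build_headers`, and the header values and ordering are placeholders for illustration, not a verified capture of Chrome 124 traffic and not DataFlirt's router code.

```python
# Hypothetical browser profile. Names, order, and values are illustrative
# placeholders, not a verified capture of real Chrome 124 requests.
CHROME_124_PROFILE = [
    ("Host", None),  # filled in per request
    ("Sec-Ch-Ua", '"Chromium";v="124", "Google Chrome";v="124", "Not-A.Brand";v="99"'),
    ("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                   "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"),
    ("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"),
    ("Accept-Encoding", "gzip, deflate, br"),
    ("Accept-Language", "en-US,en;q=0.9"),
]


def build_headers(profile, host, overrides=None):
    """Return (name, value) pairs in profile order.

    Overrides may change a value but may not add headers the profile
    does not contain, so the header set and order stay fixed.
    """
    lowered = {k.lower(): v for k, v in (overrides or {}).items()}
    known = {name.lower() for name, _ in profile}
    unknown = sorted(k for k in lowered if k not in known)
    if unknown:
        raise ValueError(f"headers not in profile: {unknown}")

    headers = []
    for name, default in profile:
        value = host if name == "Host" else default
        headers.append((name, lowered.get(name.lower(), value)))
    return headers


for name, value in build_headers(CHROME_124_PROFILE, "example.com"):
    print(f"{name}: {value}")
```

Returning a list of pairs rather than a dict keeps the ordering explicit. Whether that order survives on the wire depends on the HTTP client, which may reorder or inject headers of its own.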

05The silent drop configuration

While a 403 Forbidden is the standard response to a ModSecurity block, some administrators configure the WAF to silently drop the connection (using the drop action) to frustrate automated tools. ModSecurity only sees a request after the TLS handshake has completed and the request has been sent, so a drop shows up as a Connection Reset by Peer or Socket Timeout error right after the request goes out. If your scraper sees these errors on requests that a browser completes normally, a ModSecurity drop rule is a likely cause.
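One way to keep the two block styles apart in scraper logs is to classify each attempt by status code or exception type. The sketch below uses only the standard library. `classify_outcome` and its labels are hypothetical names, and resets and timeouts have many causes besides a WAF, so the result is a hint for triage rather than a diagnosis.

```python
import socket


def classify_outcome(status=None, error=None):
    """Label one request attempt for triage.

    status: HTTP status code if a response arrived, else None.
    error:  exception raised by the client, else None.
    """
    # A reset or timeout with no HTTP response is consistent with a silent drop.
    if isinstance(error, (ConnectionResetError, socket.timeout, TimeoutError)):
        return "possible-silent-drop"
    if error is not None:
        return "other-error"
    # A 403 may come from the WAF or from the application itself.
    if status == 403:
        return "blocked-403"
    return "ok"


print(classify_outcome(status=403))                   # blocked-403
print(classify_outcome(error=ConnectionResetError())) # possible-silent-drop
```

Comparing the rate of "possible-silent-drop" between a default client and a browser-like header set against the same endpoint is a quick way to tell a header-triggered drop from ordinary network flakiness.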

// 03 — the scoring model

How anomaly scoring works.

ModSecurity doesn't usually block on a single minor infraction. It accumulates anomaly scores across the request lifecycle. DataFlirt's request normalizer ensures our outbound score is always zero.

Total Anomaly Score = S_total = Σ Rule_severity

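Critical = 5, Error = 4, Warning = 3, Notice = 2. Default block threshold is 5. Source: OWASP CRS Anomaly Scoring

Missing Header Penalty = S += 2 (Notice) or 3 (Warning)

In the stock CRS, a missing Accept header (rule 920300) or User-Agent header (rule 920320) is scored at Notice severity, and a missing Host header (rule 920280) at Warning. None of these reaches the default threshold alone, but each stacks with other matches. Source: OWASP CRS rules 920280 / 920300 / 920320

DataFlirt Header Compliance = H_valid = 1.0

100% of requests match the exact header dictionary of the advertised browser. Source: DataFlirt Edge Router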

// 04 — the waf log

A Python requests script getting caught.

What the server's modsec_audit.log sees when a naive scraper hits an endpoint without spoofing its headers.

ModSecurity v3 · OWASP CRS 3.3 · Anomaly Mode

edge.dataflirt.io — live

CAPTURED

// inbound request
method: "GET /api/products HTTP/1.1"
user-agent: "python-requests/2.28.1"
accept: "*/*"

// rule evaluation phase 1: headers
rule_913100: MATCH // Found User-Agent associated with security scanner/bot
score_increment: +5
rule_920300: NO MATCH // Request Missing an Accept Header (Accept is present: "*/*")

// anomaly scoring
inbound_anomaly_score: 5
inbound_anomaly_threshold: 5

// action
intervention: 403 Forbidden
log_message: "Inbound Anomaly Score Exceeded (Total Score: 5)"

// 05 — failure modes

What triggers the ruleset.

The most common ModSecurity rules triggered by scraping pipelines. Unlike behavioral ML classifiers, these are static pattern matches that penalize lazy HTTP client configuration.

WAF type: Signature-based

Default threshold: Score ≥ 5

Updated: 2026-05-19

Bad Bot User-Agent

Rule 913100 · python-requests, curl, scrapy in UA string

Missing Accept Header

Rule 920300 · Standard browsers always send Accept

Missing Accept-Language

Typically a site-specific custom rule · Often omitted by headless scripts

Empty Host Header

Rule 920290 · HTTP/1.1 violation

Protocol Violation

Rule 920100 · Malformed HTTP methods or URIs
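Because these triggers are static, a scraper can check for them before a request leaves the machine. The sketch below is a hypothetical pre-flight linter: `lint_headers` and `BOT_UA_MARKERS` are made-up names, and the marker strings are only the client signatures mentioned in this entry, not the contents of the CRS data files.

```python
# Client signatures named in this entry; NOT the actual CRS data files.
BOT_UA_MARKERS = ("python-requests", "curl", "wget", "scrapy", "headlesschrome")


def lint_headers(headers):
    """Return a list of findings for a dict of outbound request headers."""
    h = {k.lower(): v for k, v in headers.items()}
    findings = []

    ua = h.get("user-agent", "").strip()
    if not ua:
        findings.append("missing or empty User-Agent")
    elif any(marker in ua.lower() for marker in BOT_UA_MARKERS):
        findings.append("User-Agent contains a known client signature")

    if "accept" not in h:
        findings.append("missing Accept")
    if "accept-language" not in h:
        findings.append("missing Accept-Language")
    # Most clients add Host themselves, so only flag an explicit empty value.
    if "host" in h and not h["host"].strip():
        findings.append("empty Host")
    return findings


# The request from the audit-log example in this entry.
print(lint_headers({"User-Agent": "python-requests/2.28.1", "Accept": "*/*"}))
# ['User-Agent contains a known client signature', 'missing Accept-Language']
```

An empty result does not prove a request will pass a given ruleset. It only shows that the static triggers listed above are absent.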

// 06 — our stack

Perfect HTTP hygiene, because signature WAFs are deterministic.

ModSecurity doesn't care how fast you move your mouse or whether your canvas fingerprint is unique. It cares about RFC compliance and known-bad strings. DataFlirt bypasses ModSecurity entirely by ensuring every request leaving our edge perfectly mirrors the header dictionary, ordering, and pseudo-header structure of the specific browser version we are emulating. If we claim to be Chrome 124, our Accept-Language and Sec-Ch-Ua headers match Chrome 124 exactly. The anomaly score stays at zero.

Request Normalization Pipeline

Pre-flight header validation on a DataFlirt worker before hitting a ModSecurity-protected target.

target.waf ModSecurity / OWASP CRS

ua.spoof Chrome 124.0.0.0

header.accept text/html,application/xhtml+xml...

header.order chrome-standard

tls.ja4 t13d1516h2_8daaf6152771

anomaly.score.est 0


// 07 — FAQ

Common questions.

Common questions about ModSecurity, OWASP CRS, anomaly scoring, and how to configure scrapers to avoid triggering static WAF rules.


What is the difference between ModSecurity and Cloudflare Bot Management?

ModSecurity is a traditional, signature-based Web Application Firewall. It looks at the static properties of a single request (headers, payload, URI) and matches them against regex rules. Cloudflare Bot Management uses machine learning, behavioral analysis, and active JS challenges to detect automation. ModSecurity is much easier to bypass if you control your HTTP headers.

Why does my scraper work locally but get a 403 ModSecurity block in production?

Often, it's because your production environment routes through a proxy that alters headers, or your local script is hitting a dev endpoint without the WAF enabled. Ensure your production HTTP client explicitly sets User-Agent, Accept, Accept-Language, and Accept-Encoding to match a real browser.
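A quick way to confirm a header-altering proxy is to capture the headers as the server receives them in each environment and compare the two sets. The sketch below assumes you already have both captures as dicts (for example from an echo endpoint you control). `diff_headers` is a hypothetical helper, not part of any library.

```python
def diff_headers(local, production):
    """Compare two captured header dicts, ignoring header-name case."""
    loc = {k.lower(): v for k, v in local.items()}
    prod = {k.lower(): v for k, v in production.items()}
    return {
        "missing_in_production": sorted(loc.keys() - prod.keys()),
        "added_in_production": sorted(prod.keys() - loc.keys()),
        "changed": sorted(k for k in loc.keys() & prod.keys() if loc[k] != prod[k]),
    }


local = {"User-Agent": "Mozilla/5.0", "Accept": "text/html", "Accept-Language": "en-US"}
production = {"user-agent": "Mozilla/5.0", "Accept": "*/*", "Via": "1.1 proxy"}
print(diff_headers(local, production))
# {'missing_in_production': ['accept-language'], 'added_in_production': ['via'], 'changed': ['accept']}
```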

What is the OWASP Core Rule Set (CRS)?

It's the default set of rules used by most ModSecurity installations. It provides generic protection against common vulnerabilities (SQLi, XSS) and bad bots. For scrapers, the "Protocol Enforcement" and "Scanner Detection" rule groups are the ones that trigger blocks.

How does DataFlirt handle targets with aggressive ModSecurity rules?

We use strict header normalization. Every request is mapped to a known-good browser profile. We don't just spoof the User-Agent; we spoof the exact combination and ordering of all headers that the specific browser version would send. This keeps the anomaly score at zero.

Can ModSecurity detect headless browsers like Puppeteer?

Not directly. ModSecurity inspects HTTP requests on the server side, after TLS has been terminated. It cannot execute JavaScript in the client to check navigator.webdriver. However, if your headless browser sends a User-Agent containing "HeadlessChrome", a User-Agent rule that lists that string will flag it and the request can be blocked.

Is it illegal to bypass a ModSecurity WAF?

Bypassing a WAF by sending well-formed, RFC-compliant HTTP requests is generally just standard web interaction. However, using exploits (like SQL injection) to bypass rules is illegal. DataFlirt only accesses public data using standard, compliant HTTP requests. Always review the target's Terms of Service.



Related glossary terms

OWASP CRS Block

HTTP Headers

Web Application Firewall (WAF)

User-Agent String