
What is a Fake 200 OK Response?

A fake 200 OK response is a deceptive anti-bot countermeasure in which a server identifies a scraper but returns a standard HTTP 200 success code instead of a 403 Forbidden or a CAPTCHA. The response body is intentionally altered: fields may be missing, prices may be randomized, or the client may be silently redirected to a honeypot. For data pipelines, this is the most dangerous failure mode because it bypasses network-layer error monitoring and silently poisons the downstream dataset.

Anti-bot · Data Poisoning · Silent Failure · WAF · Validation
// 02 — definitions

The silent pipeline killer.

Why modern anti-bot vendors prefer to feed you garbage data rather than block your IP outright.


TL;DR

A fake 200 OK response (or "tarpit") is designed to waste your compute and pollute your data warehouse. Vendors like DataDome and Akamai use this tactic for medium-confidence bot scores. Because the HTTP status is 200, naive scrapers parse the poisoned HTML or JSON and write it to production, completely unaware they've been flagged.

01 Definition & structure
A fake 200 OK response occurs when a web server or Web Application Firewall (WAF) identifies an incoming request as a bot, but intentionally returns an HTTP 200 status code instead of a standard block (like a 403 or 429). The body of the response is manipulated. It might be a stripped-down version of the page, a page with randomized data, or a completely different template masquerading as the target content. The goal is to deceive the scraper into believing the request succeeded.
02 How it works in practice
When a scraper relies purely on HTTP status codes for error handling, a fake 200 bypasses all retry logic. The scraper passes the HTML or JSON to the extraction layer. If the extraction logic uses loose CSS selectors (e.g., grabbing any text inside a .price class), it will extract the poisoned data. This corrupted record is then written to the database, silently degrading the quality of the entire dataset without triggering a single pager alert.
03 The cost of silent failures
Fake 200s are devastating because they shift the cost of failure from the network layer to the business layer. A 403 block costs you a fraction of a cent in proxy bandwidth. A fake 200 costs you proxy bandwidth, parsing compute, storage costs, and potentially millions of dollars if that poisoned data (like a fake competitor price) is used to drive automated repricing algorithms in production.
04 How DataFlirt handles it
We assume the network is lying. Our pipelines employ strict schema validation and historical anomaly detection. If a field is missing, if the data type coerces incorrectly, or if a numeric value deviates wildly from its historical moving average, the record is quarantined. We monitor payload byte sizes and DOM node counts; sudden drops trigger automatic session invalidation and proxy rotation, ensuring poisoned data never reaches the client.
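The two checks described above, a schema contract and a comparison against a historical moving average, can be sketched roughly as follows. This is a minimal illustration, not DataFlirt's implementation: the `PRODUCT_SCHEMA` contract, the `validate_record` helper, and the 50% deviation tolerance are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical schema contract: field name -> required Python type.
PRODUCT_SCHEMA = {"sku": str, "price": float, "stock": str}


def validate_record(record, price_history, max_deviation=0.5):
    """Return a list of reasons to quarantine the record (empty list = accept)."""
    reasons = []
    for field, expected_type in PRODUCT_SCHEMA.items():
        if field not in record:
            reasons.append(f"missing field: {field}")
        elif not isinstance(record[field], expected_type):
            reasons.append(f"wrong type for {field}")

    # Compare the price against the mean of recent trusted observations.
    price = record.get("price")
    if isinstance(price, float) and price_history:
        baseline = mean(price_history)
        if baseline > 0 and abs(price - baseline) / baseline > max_deviation:
            reasons.append("price deviates from moving average")
    return reasons


# A 200 OK body that parses cleanly but carries an implausible price.
suspect = {"sku": "sku-9921", "price": 14.99, "stock": "In Stock"}
print(validate_record(suspect, price_history=[89.99, 89.99, 88.50]))
```

A record that returns a non-empty list would be routed to quarantine rather than written to the delivery bucket.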
05 Did you know?
Some advanced fake 200 implementations will serve perfectly accurate data for the first 50 requests of a session, and then slowly introduce a 5% variance in pricing data over the next 500 requests. This "boiling frog" approach is designed to defeat basic spot-checks and ensure the poisoned data is deeply integrated into the victim's systems before it is noticed.
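One practical consequence of slow drift is that a short moving average tends to follow the poisoned values and never fires. A possible mitigation, sketched below under assumptions, is to also compare against a fixed reference value obtained from a source you trust (for example a manual check or an independent session). The function names, the 3% tolerance, and the synthetic 0.5%-per-request drift are illustrative only.

```python
from statistics import mean


def first_flag_moving_average(prices, window=3, tolerance=0.03):
    """Index of the first price that deviates from the mean of the
    previous `window` prices by more than `tolerance`, else None."""
    for i in range(window, len(prices)):
        baseline = mean(prices[i - window:i])
        if abs(prices[i] - baseline) / baseline > tolerance:
            return i
    return None


def first_flag_fixed_reference(prices, reference, tolerance=0.03):
    """Index of the first price that deviates from a fixed, trusted
    reference by more than `tolerance`, else None."""
    for i, price in enumerate(prices):
        if abs(price - reference) / reference > tolerance:
            return i
    return None


# Synthetic series: the price creeps up by 0.5% on every request.
drifting = [100.0 * 1.005 ** i for i in range(20)]

print(first_flag_moving_average(drifting))          # the rolling baseline drifts along
print(first_flag_fixed_reference(drifting, 100.0))  # the fixed reference does not
```

In this synthetic series the rolling check never triggers, while the fixed-reference check fires once cumulative drift passes the tolerance.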
// 03 — detection math

How to catch a fake 200.

You cannot rely on HTTP status codes. DataFlirt's extraction layer uses statistical anomaly detection and schema strictness to identify poisoned payloads before they hit the delivery bucket.

Payload Size Variance: $\Delta S = \dfrac{|S_{\text{current}} - \mu_S|}{\sigma_S}$
Z-score of response bytes. $\Delta S > 3$ often indicates a stripped DOM. (DataFlirt anomaly detection)

Field Completeness Drop: $C_{\text{drop}} = \text{fields}_{\text{expected}} - \text{fields}_{\text{extracted}}$
Fake 200s usually omit expensive-to-render dynamic fields. (Schema validation layer)

DataFlirt Poison Confidence: $P = w_1 \cdot \Delta S + w_2 \cdot C_{\text{drop}} + w_3 \cdot \text{honeypot}$
$P > 0.85$ triggers automatic session rotation and quarantine. (Internal SLO)
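The three signals above can be combined along these lines. The page does not state the weights or how the terms are scaled, so this sketch makes several assumptions: C_drop is read as the count of expected fields that were not extracted, the z-score is clipped at 3 and each term is normalized to the range 0 to 1, and the weights (0.4, 0.4, 0.2) sum to 1 so that a 0.85 threshold is meaningful. All function names are hypothetical.

```python
from statistics import mean, pstdev


def payload_z_score(size, history):
    """|S_current - mu_S| / sigma_S over a history of response sizes in bytes."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return 0.0 if size == mu else float("inf")
    return abs(size - mu) / sigma


def completeness_drop(expected_fields, record):
    """Number of expected fields that are absent or empty in the record."""
    return sum(1 for f in expected_fields if record.get(f) in (None, ""))


def poison_confidence(z, drop, n_expected, honeypot_hit,
                      weights=(0.4, 0.4, 0.2), z_cap=3.0):
    """Weighted sum of normalized signals; weights and scaling are assumptions."""
    w1, w2, w3 = weights
    size_term = min(z / z_cap, 1.0)
    drop_term = drop / n_expected if n_expected else 0.0
    honeypot_term = 1.0 if honeypot_hit else 0.0
    return w1 * size_term + w2 * drop_term + w3 * honeypot_term


expected = ["sku", "price", "stock", "reviews", "seller"]
record = {"sku": "sku-9921"}  # four of five expected fields are missing
z = payload_z_score(14_202, [89_000, 90_000, 89_500, 89_100, 89_400])
p = poison_confidence(z, completeness_drop(expected, record), len(expected), True)
print(round(p, 2), "QUARANTINE" if p > 0.85 else "ACCEPT")
```

Normalizing each term keeps any single signal from dominating the score; a production system would likely tune the weights per target.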
// 04 — the wire trace

A poisoned payload in transit.

A scraper hits an e-commerce endpoint. The WAF flags the JA3 fingerprint but returns a 200 OK with subtly altered pricing data to ruin the competitor's intelligence.

HTTP/2 · Akamai BMP · JSON API
// inbound request
GET /api/v1/products/sku-9921
user-agent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64)..."
ja3_hash: "771,4865-4866-4867... (Python requests)"

// WAF evaluation (internal)
bot_score: 0.88 // high confidence bot
action: "SERVE_POISONED_200"

// response
HTTP/2 200 OK
content-type: "application/json"
body.price: "$14.99" // actual price is $89.99
body.stock: "In Stock"
body.honeypot_token: "x8f92a"

// naive scraper outcome
pipeline.status: SUCCESS
database.write: POISONED DATA COMMITTED
// 05 — poisoning tactics

How the data gets altered.

When a WAF decides to serve a fake 200, it uses several techniques to degrade the data quality without triggering network-level alerts. The tactics below are ranked by frequency across DataFlirt's monitored targets.

Targets monitored: 1,200+ enterprise domains
Fake 200 rate: 4.2% of flagged traffic
Updated: 2026-05-19
01 Price randomization (numeric alteration): alters numeric values by ±20%.
02 Stripped dynamic content (DOM reduction): omits XHR-loaded reviews or inventory.
03 Infinite pagination loops (crawler trap): the next_page token never resolves to null.
04 Honeypot injection (tracking): inserts invisible tracking links.
05 Stale cache serving (data freshness): returns 30-day-old HTML.
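The infinite pagination trap in the list above can be bounded on the crawler side. The sketch below is one possible guard, assuming a hypothetical `fetch_page(token)` callable that returns a list of hashable item IDs plus the next token (or `None` on the last page); the page budget and the 10% minimum share of new items are illustrative thresholds.

```python
def crawl_pages(fetch_page, start_token, max_pages=1000, min_new_ratio=0.1):
    """Follow next_page tokens until the last page, a repeated token,
    a page of mostly already-seen items, or the page budget runs out.

    Returns (collected_item_ids, stop_reason).
    """
    seen_tokens = set()
    seen_items = set()
    collected = []
    token = start_token
    for _ in range(max_pages):
        if token in seen_tokens:
            return collected, "repeated token"
        seen_tokens.add(token)

        items, next_token = fetch_page(token)
        new_items = [item for item in items if item not in seen_items]
        # A page that yields almost nothing new suggests a trap.
        if items and len(new_items) / len(items) < min_new_ratio:
            return collected, "duplicate page"

        seen_items.update(new_items)
        collected.extend(new_items)
        if next_token is None:
            return collected, "finished"
        token = next_token
    return collected, "page budget exhausted"


# Simulated trap: every page returns the same items and a fresh token.
def trap(token):
    return ["item-1", "item-2"], token + 1


print(crawl_pages(trap, start_token=0))
```

A stop reason other than "finished" is a signal to invalidate the session rather than simply retry the same page.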
// 06 — DataFlirt's defense

Trust the schema, never the status code.

At DataFlirt, we treat HTTP 200s with extreme skepticism. A successful network request is only the first gate. Our extraction layer runs continuous anomaly detection on payload size, DOM structure, and field variance. If a target suddenly drops its average response size by 40%, or if a price field deviates beyond historical volatility bounds, the pipeline halts the write, flags the session as poisoned, and rotates the proxy and fingerprint before retrying.

Poison detection pipeline

Live validation of a 200 OK response from a protected travel aggregator.

Check                  Value                Note
http.status            200 OK
payload.bytes          14,202               μ = 89,400
schema.completeness    0.42                 missing fields
price.variance         +450%                anomaly
decision               QUARANTINE
action                 rotate_fingerprint
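The halt, quarantine, rotate, and retry flow described in this section could be structured roughly as below. This is a sketch under assumptions, not DataFlirt's pipeline: `fetch`, `validate`, and `rotate_identity` are hypothetical callables supplied by the caller, and the three-attempt limit is arbitrary.

```python
def fetch_validated(fetch, validate, rotate_identity, max_attempts=3):
    """Treat a 200 as a success only if the payload also passes validation.

    fetch()            -> (http_status, record)
    validate(record)   -> list of problems (empty list means clean)
    rotate_identity()  -> swaps proxy and fingerprint before the next attempt
    Returns (record_or_None, quarantined), where quarantined is a list of
    (record, problems) pairs that were rejected along the way.
    """
    quarantined = []
    for _ in range(max_attempts):
        status, record = fetch()
        if status == 200:
            problems = validate(record)
            if not problems:
                return record, quarantined
            # Network success, extraction failure: hold the record back.
            quarantined.append((record, problems))
        rotate_identity()
    return None, quarantined


# Simulated session: the first 200 is poisoned, the second is clean.
responses = iter([(200, {"price": 14.99}), (200, {"price": 89.99})])
record, held = fetch_validated(
    fetch=lambda: next(responses),
    validate=lambda r: [] if r["price"] > 50 else ["price anomaly"],
    rotate_identity=lambda: None,
)
print(record, held)
```

Keeping the validator as an injected callable lets the same gate be reused with whatever schema and anomaly checks a given target needs.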


// 07 — FAQ

Common questions.

Common questions about silent tarpits, data poisoning, and how to build resilient extraction pipelines.

Why do sites use fake 200s instead of just blocking me?
It wastes your resources, pollutes your data, and makes debugging significantly harder. A 403 Forbidden is an immediate, clear signal to your infrastructure to rotate proxies or update fingerprints. A fake 200 keeps you burning money on a useless IP while silently corrupting your downstream analytics.
How can I tell if I'm getting a fake 200 OK?
You must monitor payload sizes, track schema completeness, and implement statistical bounds on numeric fields. If your scraper suddenly extracts zero reviews for every product, or if the HTML payload drops from 120KB to 15KB, you are likely in a tarpit. Network metrics alone will not save you.
Does Playwright or Puppeteer protect against this?
No. Headless browsers only render what the server sends. If the server sends a perfectly valid HTML page with fake prices, Playwright will happily render and extract the fake prices. Browser automation solves rendering challenges, not data poisoning.
How does DataFlirt prevent poisoned data from reaching my warehouse?
We decouple network success from extraction success. Every record passes through a strict schema contract and historical anomaly detection. Poisoned payloads fail validation and are quarantined, never written to your delivery bucket. The pipeline automatically rotates the session identity and retries.
Are fake 200s legal for sites to use?
Generally, yes. Sites typically have no obligation to serve accurate data to automated clients, and serving decoy data is a common defensive measure. It is up to the data consumer to verify the integrity of the extracted payload.
What is an infinite pagination loop?
A specific type of fake 200 where the server continuously returns a valid next_page token, but the items on the page are either duplicates or randomly generated garbage. It traps naive crawlers in an endless, expensive loop, burning proxy bandwidth without yielding new data.