Decoded, web scraping.

The complete scraping dictionary — anti-bot, proxies, pipelines, AI agents & legal risk. Written for engineers.

// full glossary index

Web Scraping Core Concepts

HTTP and Network Layer

Anti-Bot and Bypass Techniques

Core: CAPTCHA, reCAPTCHA v2, reCAPTCHA v3, hCaptcha, Cloudflare Bot Management, Cloudflare Challenge Page, Turnstile CAPTCHA, AI CAPTCHA Solver, Bot Detection, Browser Fingerprinting, Mouse Movement Simulation, Datacenter IP Detection, Bot Score, Anti-Detect Browser, Browser Profile Spoofing, User-Agent Rotation, Header Rotation, Stealth Mode Browsing, Headless Browser Detection, WebDriver Flag Removal, Playwright Stealth, Puppeteer Extra Stealth, Scraping Browser, Real Browser Rendering, JavaScript Challenge, Managed Challenge, Fake 200 OK Response

Key: FunCaptcha, Image CAPTCHA, Canvas Fingerprinting, WebGL Fingerprinting, Keystroke Dynamics, Behavioral Biometrics, Honeypot Fields, Honeypot Links, ASN Blocking, Residential IP Detection, Risk Score, Cookie Rotation, Navigator Object Spoofing, Browser Integrity Check, Soft Block, Tarpit Response, Poisoned Data Response, Decoy Content Injection, Dynamic Class Name Obfuscation, Browser Token Validation, Obfuscated JavaScript Execution Required, Interaction Gate

Ref: Audio CAPTCHA, CAPTCHA Farm, Font Fingerprinting, Audio Context Fingerprinting, Screen Resolution Fingerprinting, Timezone Detection, Plugin Enumeration, Hardware Concurrency Detection, Device Memory Detection, CSS-Based Text Hiding, Client Puzzle Challenge, Canary Token in HTML, Language Detection

Proxies and IP Management

Browsers and Rendering

Scraping Infrastructure and DevOps

Scraper Maintenance and Change Detection

Scraping Errors and Failure Codes

Core: HTTP 403 Forbidden, HTTP 429 Too Many Requests, HTTP 503 Service Unavailable, HTTP 520 Unknown Error (Cloudflare), Empty Response Body, Proxy Connection Error, Proxy Timeout, Selector Not Found Error, StaleElementReferenceException, NoSuchElementException, TimeoutException (WebDriver), WebDriverException

Key: HTTP 407 Proxy Authentication Required, HTTP 502 Bad Gateway, HTTP 504 Gateway Timeout, HTTP 521 Web Server Down (Cloudflare), HTTP 522 Connection Timed Out (Cloudflare), HTTP 523 Origin Unreachable (Cloudflare), HTTP 524 A Timeout Occurred (Cloudflare), HTTP 401 Unauthorized, HTTP 400 Bad Request, HTTP 408 Request Timeout, HTTP 500 Internal Server Error, Partial Response Error, Truncated HTML Response, Response Encoding Error, Malformed JSON Response, Encoding Mismatch, Unexpected Redirect Chain, Redirect Loop Error, Max Redirects Exceeded, DNS Resolution Failure, Connection Refused Error, Connection Reset by Peer, TCP Connection Timeout, Socket Timeout, Read Timeout, Proxy Authentication Failure, Memory Leak in Scraper, Scraper Out-of-Memory Error, Infinite Pagination Loop, ElementClickInterceptedException, SessionNotCreatedException, NavigationTimedOut (Browser), ERR_EMPTY_RESPONSE, ERR_CONNECTION_CLOSED, ERR_NAME_NOT_RESOLVED, ERR_TOO_MANY_REDIRECTS, Page Crash (Headless Browser), Browser Context Destroyed Error, Scraper Deadlock

Ref: HTTP 525 SSL Handshake Failed (Cloudflare), HTTP 526 Invalid SSL Certificate (Cloudflare), HTTP 530 Origin DNS Error (Cloudflare), HTTP 451 Unavailable For Legal Reasons, Malformed XML Response, Content-Length Mismatch, DNS NXDOMAIN Error, DNS Timeout, Write Timeout, SSL Certificate Expired Error, SSL Hostname Mismatch, SOCKS5 Negotiation Failure, ERR_CERT_AUTHORITY_INVALID, HTTP 410 Gone, HTTP 413 Payload Too Large

Anti-Scraping Platforms and Errors

Core: Cloudflare Ray ID, Cloudflare Error 1006 (Access Denied), Cloudflare Error 1010 (Bad Browser Signature), Cloudflare Error 1015 (Rate Limited), Cloudflare Error 1020 (Access Denied by Firewall Rule), Cloudflare Turnstile Failure, Cloudflare Firewall Rule Match, Cloudflare Bot Score Threshold, Cloudflare Super Bot Fight Mode, Cloudflare Under Attack Mode, Akamai Bot Manager, Akamai Sensor Data Block, PerimeterX / HUMAN Challenge, DataDome 403 Block, Imperva Block Page, Kasada Kami Challenge, Web Application Firewall (WAF), Access Denied Page, Bot Detected Page, Interactive Challenge, Bad Bot Signature Match, Request Fingerprint Block, Header Anomaly Detection

Key: Cloudflare Waiting Room, Cloudflare Workers Block, Cloudflare Verified Bot Allowlist, Cloudflare Error 1002 (Restricted), Akamai Pragma Header Detection, Akamai Reference Number Error, PerimeterX Cookie Validation, DataDome Score Threshold, DataDome Device Check API, Imperva UTM Cookie Check, Imperva Reese84 Cookie, Shape Security Token Rotation, F5 Distributed Cloud Bot Block, Radware Bot Manager Block, Kasada TLS Fingerprint Check, Arkose Labs FunCaptcha Trigger, GeeTest CAPTCHA Block, AWS WAF Block Response, Google Cloud Armor Block, Nginx Rate Limit Response, Fail2Ban IP Block, ModSecurity Rule Match, Geo-Block Error, Country-Level Block, Missing Accept Header Block, Suspicious Referrer Block, Cookie Validation Failure, CSRF Token Mismatch, Anti-Automation Token Expiry, Session Invalidation, Signed Cookie Validation, HMAC Token Validation Failure

Ref: Cloudflare Error 1000 (DNS Resolution Error), Cloudflare Error 1016 (Origin DNS Error), Azure Front Door WAF Block, Fastly WAF Block, Apache mod_evasive Block, OWASP CRS Block, Cloudflare Error 1018 (Compute Zone Unavailable)

Scraping Performance

Authentication and Access Patterns

Site Structure and Content Patterns

Data Output Delivery and Formatting

Data Cleaning and Transformation

Data Engineering and Pipelines

Databases and Storage

Data Governance and Compliance

AI and LLM in Scraping

Business Intelligence and Data-Led Growth

SERP and Search Engine Scraping

Mobile and App Scraping

Scraping Business and Commercial Models

Developer Tools and Libraries

$ dataflirt pipeline --new --target=your-site READY

Know the terms.
Own the data.

20-minute scoping call. Pilot dataset within the week. Production within two. We scope, build, and operate the extraction pipeline.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h