
What are HTTP Headers?

HTTP headers are the metadata key-value pairs sent alongside every web request and response. They govern content negotiation, caching, authentication, and client identity. For scraping pipelines, headers are the primary surface area for bot detection: get your Accept-Language, User-Agent, or Sec-Ch-Ua order wrong, and you give the target server strong evidence that you aren't a real browser, which often results in an immediate block.

Network Layer · Bot Detection · HTTP/2 · Content Negotiation · Identity Spoofing
// 02 — definitions

The metadata
battleground.

Headers are how clients and servers negotiate state. They are also the easiest way to accidentally announce your scraper to a WAF.


TL;DR

HTTP headers transmit essential context like user agent, accepted encodings, and authorization tokens. Modern anti-bot systems like Cloudflare and Akamai BMP don't just look at the values: they analyze the exact order, casing, and HTTP/2 pseudo-header framing. A single missing Sec-Fetch header can be enough to trigger a silent shadow-ban.

01 Definition & structure
HTTP headers are the core metadata mechanism of the web. They are key-value pairs sent before the actual payload in both requests and responses. They handle everything from content negotiation (Accept-Encoding) and caching (Cache-Control) to authentication (Authorization) and client identity (User-Agent, Sec-Ch-Ua). In HTTP/1.1, they are sent as plain text with case-insensitive names; in HTTP/2, they are binary-encoded with HPACK compression, and header names must be lowercase.
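To make the casing difference concrete, here is a minimal sketch that converts an HTTP/1.1-style header list into the lowercase form HTTP/2 expects. The helper names `to_h2_headers` and `is_valid_h2_name` and the sample header values are illustrative assumptions, not part of any library.

```python
def to_h2_headers(headers):
    """Lowercase header names for HTTP/2; values and order are left untouched."""
    return [(name.lower(), value) for name, value in headers]


def is_valid_h2_name(name):
    """HTTP/2 field names must not contain uppercase characters."""
    return name == name.lower()


# HTTP/1.1 names are case-insensitive, so mixed case is legal here.
h1_headers = [
    ("User-Agent", "example-client/1.0"),  # hypothetical value
    ("Accept-Encoding", "gzip, deflate, br"),
    ("Cache-Control", "no-cache"),
]

h2_headers = to_h2_headers(h1_headers)
for name, value in h2_headers:
    print(f"{name}: {value}")
```

A list of tuples is used instead of a dict so that the original header order survives the conversion.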
02 How it works in practice
When your scraper makes a request, the target's edge network (like Cloudflare or Fastly) intercepts the headers before routing to the origin server. The WAF runs a coherence check: does the User-Agent match the Sec-Ch-Ua hints? Does the order of the HTTP/2 pseudo-headers (:method, :authority) match how a real Chrome browser sends them? If the headers look synthetic or mismatched, the request is dropped with a 403 Forbidden or a CAPTCHA challenge.
03 Header ordering and HTTP/2
In HTTP/1.1, the order of differently named headers carries no protocol meaning, so it was rarely scrutinized. In HTTP/2, it is a major fingerprinting vector. Real browsers emit headers and pseudo-headers in a fixed, deterministic order. A standard Python requests or Go net/http client will typically sort them or send them in the order they were appended in code. WAFs can use this discrepancy to identify non-browser traffic without even looking at the header values.
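An order check of this kind can be sketched in a few lines. The example below compares the relative order of observed header names against a reference list. `REFERENCE_ORDER` is a hypothetical order chosen for illustration, not a verified browser capture, and `order_matches` is an assumed helper name.

```python
# Hypothetical reference order for illustration only; not a verified Chrome capture.
REFERENCE_ORDER = [
    "sec-ch-ua",
    "sec-ch-ua-mobile",
    "sec-ch-ua-platform",
    "upgrade-insecure-requests",
    "user-agent",
    "accept",
    "sec-fetch-site",
    "sec-fetch-mode",
    "accept-encoding",
    "accept-language",
]


def order_matches(observed, reference=REFERENCE_ORDER):
    """True if headers shared with the reference appear in the same relative order."""
    rank = {name: i for i, name in enumerate(reference)}
    positions = [rank[name] for name in observed if name in rank]
    return positions == sorted(positions)


browser_like = ["sec-ch-ua", "user-agent", "accept", "accept-encoding", "accept-language"]
alphabetized = sorted(browser_like)  # what a client that sorts its headers would send

print(order_matches(browser_like))   # True
print(order_matches(alphabetized))   # False
```

Note that the check never reads a header value: the names and their positions are enough to separate the two clients.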
04 How DataFlirt handles it
We don't rely on static header dictionaries. Our infrastructure uses dynamic identity profiles extracted from real residential traffic. When a DataFlirt worker makes a request, it binds a specific TLS fingerprint to the exact HTTP/2 header order, Client Hints, and Accept headers of a real browser. This keeps every layer of the request consistent with a single real browser, allowing our pipelines to pass strict WAFs without triggering challenges.
05 The Accept-Encoding trap
A common mistake is spoofing a modern Chrome User-Agent but leaving the Accept-Encoding header as gzip, deflate. Over HTTPS, modern Chrome advertises support for br (Brotli) and, in recent versions, zstd. A client that claims to be Chrome 124 but does not accept Brotli is a combination that does not occur in real browser traffic, and anti-bot systems will flag it immediately.
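The trap can be expressed as a small consistency check. This is a sketch under stated assumptions: `ZSTD_MIN_CHROME = 123` is an assumed cutoff for when Chrome began advertising zstd, and the function names and the shortened User-Agent string are illustrative.

```python
import re

# Assumption: Chrome started advertising zstd around version 123.
ZSTD_MIN_CHROME = 123


def chrome_major(user_agent):
    """Extract the Chrome major version from a User-Agent string, or None."""
    match = re.search(r"Chrome/(\d+)", user_agent)
    return int(match.group(1)) if match else None


def accept_encoding_issues(user_agent, accept_encoding):
    """List encodings a claimed Chrome version should offer but does not."""
    major = chrome_major(user_agent)
    if major is None:
        return []  # not claiming to be Chrome, nothing to compare against
    offered = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    }
    issues = []
    if "br" not in offered:
        issues.append("missing br")
    if major >= ZSTD_MIN_CHROME and "zstd" not in offered:
        issues.append("missing zstd")
    return issues


ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/124.0.0.0"
print(accept_encoding_issues(ua, "gzip, deflate"))            # ['missing br', 'missing zstd']
print(accept_encoding_issues(ua, "gzip, deflate, br, zstd"))  # []
```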
// 03 — the math

How strict is
header validation?

Anti-bot vendors don't just check if a header exists; they check if the entire bundle is mathematically coherent for the browser you claim to be. DataFlirt models this as a coherence score.

Header Entropy: $H(h) = -\sum_i p(v_i) \log_2 p(v_i)$
Measures the uniqueness of a header bundle. High entropy = highly identifiable. Source: Information Theory
Detection Probability: $P_{\text{block}} = \text{WAF}_{\text{strictness}} \times (1 - C)$
A mismatched User-Agent and Sec-Ch-Ua version lowers C; at C = 0 the block probability equals the WAF strictness, which is 1 for the strictest WAFs. Source: DataFlirt Edge Models
DataFlirt Coherence Score: $C = \text{Valid\_Pairs} / \text{Total\_Headers\_Sent}$
Must be exactly 1.0 for production pipelines to bypass modern WAFs. Source: Internal SLO
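As a worked example, the three formulas can be evaluated directly. The sketch below assumes that the v_i are the values of one header observed across a sample of requests, and that WAF strictness is a number between 0 and 1. The sample values and function names are invented for illustration.

```python
import math
from collections import Counter


def header_entropy(values):
    """Shannon entropy in bits of the observed values for one header."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def coherence_score(valid_pairs, total_headers_sent):
    """C = Valid_Pairs / Total_Headers_Sent."""
    return valid_pairs / total_headers_sent


def block_probability(waf_strictness, coherence):
    """P_block = WAF strictness * (1 - C)."""
    return waf_strictness * (1 - coherence)


# Two values seen equally often carry exactly one bit of entropy.
sample = ["en-US,en;q=0.9", "en-US,en;q=0.9", "de-DE,de;q=0.9", "de-DE,de;q=0.9"]
print(header_entropy(sample))        # 1.0

c = coherence_score(3, 4)            # 3 of 4 headers are consistent
print(c)                             # 0.75
print(block_probability(0.8, c))     # 0.2
print(block_probability(0.8, 1.0))   # 0.0
```

With C = 1.0 the block probability is zero regardless of strictness, which is why the score is treated as a hard requirement rather than a target to approach.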
// 04 — what the server sees

A header mismatch
caught at the edge.

A naive Python scraper attempts to spoof a Chrome 124 User-Agent, but forgets to update the corresponding Client Hints. The WAF catches the discrepancy instantly.

HTTP/2 · Cloudflare WAF · Client Hints
edge.dataflirt.io — live
CAPTURED
// outbound request
:method: "GET"
:authority: "target.com"

// headers
user-agent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/124.0.0.0"
accept: "text/html,application/xhtml+xml..."
accept-language: "en-US,en;q=0.9"
sec-ch-ua: "\"Chromium\";v=\"122\", \"Not(A:Brand\";v=\"24\"" // mismatch ⚠

// WAF inspection
waf.check: "header_coherence"
waf.ua_version: 124
waf.sec_ch_ua_version: 122
waf.result: "version_mismatch" // FLAG

// response
status: 403 Forbidden // blocked at edge
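The check in this capture can be approximated with a short sketch. It parses the Chrome major version from the User-Agent and the Chromium brand version from sec-ch-ua, then compares them. The function names are assumptions, the regex-based parsing is a simplification of the structured-header format, and a real WAF's logic is not public.

```python
import re


def ua_chrome_major(user_agent):
    """Chrome major version from the User-Agent, or None if absent."""
    match = re.search(r"Chrome/(\d+)", user_agent)
    return int(match.group(1)) if match else None


def sec_ch_ua_versions(sec_ch_ua):
    """Parse brand/version pairs such as "Chromium";v="122" into a dict."""
    pairs = re.findall(r'"([^"]+)";\s*v="(\d+)"', sec_ch_ua)
    return {brand: int(version) for brand, version in pairs}


def check_version_coherence(user_agent, sec_ch_ua):
    """Return 'ok', 'version_mismatch', or 'unknown' if either side is missing."""
    ua_version = ua_chrome_major(user_agent)
    hint_version = sec_ch_ua_versions(sec_ch_ua).get("Chromium")
    if ua_version is None or hint_version is None:
        return "unknown"
    return "ok" if ua_version == hint_version else "version_mismatch"


# Values taken from the capture above.
user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/124.0.0.0"
sec_ch_ua = '"Chromium";v="122", "Not(A:Brand";v="24"'

print(check_version_coherence(user_agent, sec_ch_ua))  # version_mismatch
```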
// 05 — failure modes

Where header spoofing
falls apart.

Ranked by share of block events across unmanaged scraping traffic. Getting the User-Agent right is table stakes; the real complexity lies in the HTTP/2 framing and secondary metadata.

Pipelines monitored: 300+ active · WAF blocks: 30d trailing · Updated: 2026-05-19
01 HTTP/2 pseudo-header order: Go/Python default order vs real Chrome
02 Sec-Ch-Ua / UA mismatch: version drift between legacy and modern headers
03 Missing Sec-Fetch-*: absence of fetch metadata flags non-browser clients
04 Incorrect Accept-Encoding: claiming Chrome but missing brotli/zstd support
05 Capitalization anomalies: HTTP/1.1 casing errors (HTTP/2 requires lowercase)
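Two of these failure modes, missing fetch metadata and casing anomalies, are simple to test for before a request leaves your pipeline. The sketch below is a pre-flight check under assumptions: it treats sec-fetch-site, sec-fetch-mode, and sec-fetch-dest as the set a browser-like request should carry, and the function names are illustrative.

```python
# Assumption: a browser-like request carries at least these three fetch metadata headers.
REQUIRED_FETCH_METADATA = ("sec-fetch-site", "sec-fetch-mode", "sec-fetch-dest")


def missing_fetch_metadata(header_names):
    """Return the required Sec-Fetch-* headers that are absent."""
    present = {name.lower() for name in header_names}
    return [name for name in REQUIRED_FETCH_METADATA if name not in present]


def uppercase_names(header_names):
    """Return header names that would be invalid in HTTP/2 because of casing."""
    return [name for name in header_names if name != name.lower()]


outbound = ["User-Agent", "accept", "accept-language", "sec-ch-ua", "sec-fetch-site"]

print(missing_fetch_metadata(outbound))  # ['sec-fetch-mode', 'sec-fetch-dest']
print(uppercase_names(outbound))         # ['User-Agent']
```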
// 06 — our stack

Perfect coherence,

down to the HTTP/2 frame.

Spoofing a User-Agent is trivial. Spoofing the entire header stack to perfectly match the advertised browser's exact network behavior is hard. DataFlirt doesn't hardcode header dictionaries. We extract live header profiles from real residential traffic, binding the exact HTTP/2 pseudo-header order, Sec-Fetch metadata, and TLS JA3 fingerprint into a single, immutable identity token. If the profile says Chrome 124 on macOS, the headers match Chrome 124 on macOS flawlessly.

header-profile.json

A live snapshot of a bound header profile used in our edge routing.

profile.id: mac-chrome-124-stable
h2.pseudo_order: :method, :authority, :scheme, :path (verified)
sec_ch_ua.platform: "macOS"
accept_encoding: gzip, deflate, br, zstd (brotli enabled)
header.casing: h2-lowercase-enforced (strict)
tls.ja3_bound: 771,4865...23 (matched)
coherence.score: 1.00 (production ready)
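One way to model a bound profile like this in code is a frozen dataclass, so that no field can drift after the profile is created. This is a hypothetical sketch, not DataFlirt's implementation: the `HeaderProfile` class and its field names are assumptions that mirror the snapshot, and the JA3 string is kept truncated exactly as shown above.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class HeaderProfile:
    """Hypothetical immutable bundle of header and TLS attributes."""
    profile_id: str
    h2_pseudo_order: tuple
    sec_ch_ua_platform: str
    accept_encoding: str
    ja3: str

    def pseudo_headers(self, method, authority, scheme, path):
        """Emit pseudo-headers in the order this profile prescribes."""
        values = {
            ":method": method,
            ":authority": authority,
            ":scheme": scheme,
            ":path": path,
        }
        return [(name, values[name]) for name in self.h2_pseudo_order]


profile = HeaderProfile(
    profile_id="mac-chrome-124-stable",
    h2_pseudo_order=(":method", ":authority", ":scheme", ":path"),
    sec_ch_ua_platform='"macOS"',
    accept_encoding="gzip, deflate, br, zstd",
    ja3="771,4865...23",  # truncated in the source snapshot; left as-is
)

print(profile.pseudo_headers("GET", "target.com", "https", "/"))

try:
    profile.accept_encoding = "gzip, deflate"
except FrozenInstanceError:
    print("profile is immutable")
```

Freezing the dataclass means a change to one attribute requires building a whole new profile, which matches the idea of rotating identities as a unit rather than editing individual headers.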


// 07 — FAQ

Common
questions.

About header rotation, Client Hints, HTTP/2 quirks, and how DataFlirt maintains perfect header coherence at scale.

Can't I just copy headers from my own browser's network tab?
Yes, for a weekend project. But static headers rot. Browsers update every few weeks, and WAFs track the distribution of User-Agent versions in the wild. If you are still sending Chrome 118 headers when most real users have moved on to Chrome 124, your traffic stands out as an anomaly and gets blocked.
Is modifying HTTP headers legal?
Spoofing headers to bypass access controls or impersonate privileged users can touch the boundaries of the US Computer Fraud and Abuse Act (CFAA) or the UK Computer Misuse Act (CMA). However, standard User-Agent rotation for accessing public, unauthenticated data is common industry practice and generally lawful. We never spoof authenticated session tokens or authorization headers.
How does DataFlirt handle header rotation?
We rotate entire identity profiles, not just individual headers. Rotating a User-Agent while keeping the same IP and TLS JA3 fingerprint is an instant flag for modern WAFs. Our edge proxy binds a specific residential IP to a specific TLS fingerprint and its exact corresponding header stack, rotating them as a single cohesive unit.
How do you manage header updates across millions of requests?
Automated profile extraction. We ingest new browser profiles daily from our residential proxy network, mapping the exact header order and Client Hints for new browser releases. Old profiles are deprecated from the fleet before WAFs flag them as stale.
What are Client Hints (Sec-Ch-Ua)?
Client Hints are the structured successor to the User-Agent string. They break browser metadata into separate headers (e.g., sec-ch-ua, sec-ch-ua-mobile, sec-ch-ua-platform). These three are low-entropy hints that Chrome sends by default; higher-entropy hints, such as the full version list, are sent only when the server requests them via Accept-CH. They are strictly validated by Cloudflare and DataDome. If your UA says Windows but your Client Hints say Linux, you are blocked immediately.
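The Windows-versus-Linux case can be sketched as a platform consistency check. The mapping from User-Agent substrings to platform names below is a simplified assumption covering only a few common desktop and mobile tokens, and the function names are illustrative.

```python
def ua_platform(user_agent):
    """Infer the platform a User-Agent claims, using a few common tokens."""
    if "Windows NT" in user_agent:
        return "Windows"
    if "Macintosh" in user_agent:
        return "macOS"
    if "Android" in user_agent:  # checked before Linux: Android UAs also contain "Linux"
        return "Android"
    if "Linux" in user_agent or "X11" in user_agent:
        return "Linux"
    return None


def platform_coherent(user_agent, sec_ch_ua_platform):
    """True if sec-ch-ua-platform agrees with the platform implied by the UA."""
    hinted = sec_ch_ua_platform.strip().strip('"')
    expected = ua_platform(user_agent)
    return expected is not None and expected == hinted


windows_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/124.0.0.0"

print(platform_coherent(windows_ua, '"Windows"'))  # True
print(platform_coherent(windows_ua, '"Linux"'))    # False
```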
Why does HTTP/2 header order matter?
HTTP/2 compresses headers with HPACK, and each browser sends its headers and pseudo-headers in a specific, deterministic order. Standard HTTP clients in Go or Python use a different default order. WAFs can inspect the HTTP/2 frame order before they even evaluate the header values, identifying your scraper at the protocol level.

hello@dataflirt.com  ·  Bengaluru  ·  IST  ·  typical reply < 4h