
What is a User-Agent String?

The User-Agent string is the HTTP request header that identifies the client software, operating system, and device type to the server. For scraping pipelines, it is the most basic and most easily spoofed identity signal, yet getting it wrong, or failing to align it with deeper TLS and JavaScript fingerprints, is a leading cause of immediate blocks at the edge. It is the first lie your scraper tells.

HTTP Headers · Identity · Spoofing · Network Layer · Anti-bot
// 02 — definitions

The first handshake.

The string that announces who you are, what OS you run, and what engine renders your DOM—and why servers rarely believe it anymore.


TL;DR

The User-Agent string is a legacy HTTP header meant for content negotiation, now primarily used as the first filter in bot detection. While trivial to spoof, modern anti-bot systems cross-reference the User-Agent against TLS fingerprints (JA3/JA4) and JavaScript engine quirks. If your UA says Chrome on Windows but your TLS handshake says Python on Linux, you are blocked.

01 Definition & structure
The User-Agent String is an HTTP header sent by the client to identify its software, version, and host operating system. A typical modern string looks like: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36. While originally designed for content negotiation (serving different HTML to different browsers), it is now the foundational layer of bot detection.
02 The anatomy of a modern UA
Due to decades of browser wars, the string is a mess of legacy tokens. Mozilla/5.0 is a historical artifact. The useful data is buried in the middle: the OS (Windows NT 10.0), the rendering engine (AppleWebKit/537.36), and the browser version (Chrome/124.0.0.0). Because it is so messy, Chromium has frozen most of the string and is moving the detail into structured Sec-CH-UA Client Hints.
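The anatomy described above can be pulled apart with a small parser. This is a minimal sketch, assuming the common "product/version (comment)" layout; the function name `parse_user_agent` and the returned field names are illustrative, not part of any library.

```python
import re

# Matches "product/version" tokens, each optionally followed by a "(comment)".
TOKEN_RE = re.compile(r"([^\s/()]+)/([^\s()]+)(?:\s*\(([^)]*)\))?")

UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)

def parse_user_agent(ua: str) -> dict:
    """Split a UA string into product tokens and the fields discussed above."""
    products = {}
    comments = []
    for name, version, comment in TOKEN_RE.findall(ua):
        products[name] = version
        if comment:
            comments.append(comment)
    # By convention, the first parenthesised comment carries the platform.
    platform = comments[0] if comments else ""
    return {
        "os": platform.split(";")[0].strip(),
        "engine_version": products.get("AppleWebKit"),
        "chrome_version": products.get("Chrome"),
        "products": products,
    }

parsed = parse_user_agent(UA)
print(parsed["os"], parsed["engine_version"], parsed["chrome_version"])
```

Real-world strings deviate from this layout often enough that a production parser would need many more cases; the sketch only covers the Chrome-style example used on this page.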
03 The alignment problem
Spoofing a User-Agent is as simple as passing a dictionary to your HTTP client. But if you tell the server you are Chrome on Windows, the server expects your TLS handshake to match Chrome's cipher suite, your HTTP/2 frames to match Chrome's multiplexing behavior, and your TCP window size to match Windows. If any of these misalign, the WAF flags the request as a spoofed bot.
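To show how little the spoof itself involves, here is a minimal sketch using only Python's standard library. The URL is a placeholder and the request is built but never sent; the helper name `build_spoofed_request` is illustrative.

```python
import urllib.request

# UA string taken from the definition section of this page.
CHROME_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)

def build_spoofed_request(url: str) -> urllib.request.Request:
    """Build (but do not send) a request that claims to be Chrome on Windows."""
    headers = {"User-Agent": CHROME_UA}
    return urllib.request.Request(url, headers=headers)

req = build_spoofed_request("https://example.com/")
# urllib stores header names capitalised, so the key is "User-agent".
print(req.get_header("User-agent"))
```

Only the header changes. If this request were sent, the TLS handshake, HTTP version, and TCP behaviour would still be Python's and the host OS's, which is exactly the misalignment described above.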
04 How DataFlirt handles it
We do not treat the User-Agent as an isolated string. We use comprehensive device profiles. When a pipeline worker spins up, it is assigned a profile that dictates the User-Agent, the Client Hints, the TLS JA4 signature, and the headless browser's navigator properties. The string is guaranteed to match the metal, resulting in zero coherence-based blocks.
05 Did you know: Client Hints
Chromium-based browsers have frozen most of the User-Agent string. The OS version is capped (e.g., Windows 11 still reports as Windows NT 10.0), and the minor browser version is zeroed out (Chrome/124.0.0.0). The real data now travels in Client Hints: Chrome sends the low-entropy Sec-CH-UA, Sec-CH-UA-Mobile, and Sec-CH-UA-Platform headers by default, and servers request the detailed ones through Accept-CH. If your scraper spoofs a modern Chrome UA but fails to provide the corresponding Client Hints, it is easy to recognize as a bot.
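One way to keep the hints aligned with the string is to derive them from it. The sketch below is an assumption-laden illustration: the helper `client_hints_for` is hypothetical, and the placeholder brand ("Not-A.Brand", version "99") and the brand order are assumptions, since real Chrome varies both between releases.

```python
import re

def client_hints_for(ua: str,
                     grease_brand: str = "Not-A.Brand",
                     grease_version: str = "99") -> dict:
    """Derive low-entropy Client Hints headers that agree with a Chrome UA."""
    match = re.search(r"Chrome/(\d+)", ua)
    if match is None:
        return {}  # not a Chrome-style UA, so send no Chromium hints
    major = match.group(1)

    # Android UAs also contain "Linux", so test for Android first.
    if "Windows" in ua:
        platform = "Windows"
    elif "Android" in ua:
        platform = "Android"
    elif "Mac OS X" in ua:
        platform = "macOS"
    elif "Linux" in ua:
        platform = "Linux"
    else:
        platform = "Unknown"

    brands = [("Chromium", major), ("Google Chrome", major),
              (grease_brand, grease_version)]
    return {
        "Sec-CH-UA": ", ".join(f'"{b}";v="{v}"' for b, v in brands),
        "Sec-CH-UA-Mobile": "?1" if "Mobile" in ua else "?0",
        "Sec-CH-UA-Platform": f'"{platform}"',
    }

hints = client_hints_for(
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)
print(hints)
```

Deriving the hints from the UA avoids the two drifting apart, but it does not make the rest of the stack match; it only removes one of the cheaper tells.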
// 03 — the alignment matrix

How servers validate your identity.

A User-Agent is never evaluated in isolation. Anti-bot classifiers calculate a coherence score by comparing the declared UA against network and runtime realities.

Coherence Score: C = UA ∩ TLS_Fingerprint ∩ JS_Runtime
Must equal 1.0. Any mismatch across the stack is a fatal flag. (WAF Evaluation Logic)

Entropy Contribution: H(UA) ≈ 3.5 bits
Low entropy alone, but carries a massive penalty for anomalies. (Browser Fingerprinting Models)

DataFlirt Rotation Rate: R = Target_Strictness × Session_Age
Rotated only when the underlying device profile rotates. (DataFlirt Fleet Scheduler)
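For readers unfamiliar with the entropy figure, H(UA) can be read as the Shannon entropy of the User-Agent distribution a site observes. The sketch below shows the calculation on a made-up sample; the sample data and function name are hypothetical and are not the source of the 3.5-bit figure.

```python
import math
from collections import Counter

def shannon_entropy_bits(observations: list) -> float:
    """Shannon entropy, in bits, of the empirical distribution of values."""
    counts = Counter(observations)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical traffic sample: eight distinct UAs, each seen equally often.
sample = [f"ua-{i}" for i in range(8)] * 100
print(shannon_entropy_bits(sample))  # 3.0 bits for a uniform 8-way split
```

A value of a few bits means the header alone barely narrows down who a visitor is, which is why classifiers care far more about whether it agrees with everything else.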
// 04 — the coherence check

When the User-Agent doesn't match the wire.

A trace from a WAF evaluating an inbound request where a Python requests script attempts to spoof a Chrome User-Agent.

WAF ruleset · TLS inspection · Header order
edge.dataflirt.io — live
CAPTURED
// inbound request headers
user-agent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36..."
accept-encoding: "gzip, deflate"
connection: "keep-alive"

// network layer extraction
tls.ja4: "t12d1008h2_8daaf6152771_b0da82dd1658" // Python urllib3 signature
http2.pseudo_order: missing // HTTP/1.1 used, Chrome uses H2

// coherence evaluation
check.ua_os: "Windows"
check.tcp_window: 65535 // Linux default, mismatch
check.tls_browser: false

// routing decision
score.bot_probability: 0.99
action: BLOCK (403 Forbidden)
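The evaluation in the trace can be approximated in a few lines. This is a simplified sketch, not how any particular WAF works: the `RequestSignals` fields, the all-or-nothing scoring, and the rule that a Chrome UA must arrive over HTTP/2 are assumptions that mirror the trace above.

```python
from dataclasses import dataclass

@dataclass
class RequestSignals:
    user_agent: str
    tls_matches_browser: bool  # check.tls_browser in the trace
    http_version: str          # "1.1" or "2"
    tcp_os_guess: str          # OS inferred from TCP/IP behaviour

def declared_os(user_agent: str) -> str:
    """Very rough OS extraction from the UA (check.ua_os in the trace)."""
    if "Windows NT" in user_agent:
        return "Windows"
    if "Android" in user_agent:
        return "Android"
    if "Mac OS X" in user_agent:
        return "macOS"
    if "Linux" in user_agent:
        return "Linux"
    return "Unknown"

def coherence_score(sig: RequestSignals) -> float:
    """Return 1.0 only if every layer agrees with the declared UA."""
    claims_chrome = "Chrome/" in sig.user_agent
    checks = [
        sig.tls_matches_browser,
        sig.http_version == "2" or not claims_chrome,
        sig.tcp_os_guess == declared_os(sig.user_agent),
    ]
    return 1.0 if all(checks) else 0.0

def routing_decision(sig: RequestSignals) -> str:
    return "ALLOW" if coherence_score(sig) == 1.0 else "BLOCK"

# The spoofed request from the trace (UA written out in full for the example).
spoofed = RequestSignals(
    user_agent=("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"),
    tls_matches_browser=False,
    http_version="1.1",
    tcp_os_guess="Linux",
)
print(routing_decision(spoofed))  # BLOCK
```

Production classifiers weigh signals probabilistically rather than with a hard AND, so treat the binary score as a teaching device.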
// 05 — detection vectors

How fake User-Agents get caught.

Spoofing the string is easy. Faking the entire stack that the string implies is hard. Here is how WAFs catch lazy User-Agent rotation.

EVALUATED REQUESTS: 1.2B/day
LEADING CAUSE: TLS Mismatch
UPDATED: 2026-05-19
01 TLS Fingerprint Mismatch
fatal flag · JA3/JA4 does not match the declared browser version

02 Header Order Anomalies
high risk · Browsers send headers in a strict, predictable order

03 Missing Client Hints
medium risk · Sec-CH-UA headers absent on modern Chromium UAs

04 TCP/IP Stack OS Mismatch
medium risk · TTL and Window Size contradict the declared OS

05 JavaScript Feature Support
runtime · DOM probes reveal missing APIs for the declared UA
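As an illustration of the TCP/IP vector, the sketch below guesses the sender's initial TTL from the value seen at the server and compares it with the OS the UA declares. It relies on the commonly cited defaults (64 for Linux, macOS, and Android, 128 for Windows); those defaults are configurable, and the function names are illustrative.

```python
COMMON_INITIAL_TTLS = (64, 128, 255)
TTL_OS_FAMILY = {64: "unix-like", 128: "windows", 255: "other"}

def estimate_initial_ttl(observed_ttl: int) -> int:
    """Round the observed TTL up to the nearest common initial value."""
    if not 1 <= observed_ttl <= 255:
        raise ValueError("TTL must be between 1 and 255")
    return next(c for c in COMMON_INITIAL_TTLS if observed_ttl <= c)

def ua_os_family(user_agent: str) -> str:
    if "Windows NT" in user_agent:
        return "windows"
    if any(tok in user_agent for tok in ("Linux", "Android", "Mac OS X")):
        return "unix-like"
    return "unknown"

def ttl_contradicts_ua(observed_ttl: int, user_agent: str) -> bool:
    """True when the TTL-implied OS family disagrees with the declared one."""
    declared = ua_os_family(user_agent)
    if declared == "unknown":
        return False  # nothing to compare against
    return TTL_OS_FAMILY[estimate_initial_ttl(observed_ttl)] != declared

WINDOWS_UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36")
print(ttl_contradicts_ua(52, WINDOWS_UA))   # TTL 52 implies an initial 64
print(ttl_contradicts_ua(116, WINDOWS_UA))  # TTL 116 implies an initial 128
```

Proxies and NAT devices terminate or rewrite connections, so this signal describes the last machine that spoke TCP to the server, not necessarily the original client.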
// 06 — our stack

Bind the string to the metal: never rotate the header without rotating the hardware profile.

Naive scrapers rotate User-Agents on every request while keeping the same IP and TLS stack. This is a massive red flag. DataFlirt uses device profiles. When we assign a Windows Chrome User-Agent to a session, the underlying proxy routes through a Windows TCP stack, the TLS handshake matches Chrome 124, and the headless browser exposes Windows fonts. The User-Agent is just the label on a fully coherent identity.

Device Profile Binding

A fully coherent session profile generated by DataFlirt.

profile.id: win_chr_124_09a
header.user_agent: Mozilla/5.0 (Windows NT 10.0...
header.sec_ch_ua: "Chromium";v="124"
network.tls_ja4: t13d1516h2_8daaf6152771
network.tcp_os: Windows NT
runtime.navigator: Win32
coherence_score: 1.00
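A profile like the one above can be modelled as an immutable record with a self-check. This is a hypothetical sketch, not DataFlirt's implementation: the `DeviceProfile` class and its `is_coherent` rules are assumptions, the JA4 value is copied from the listing as an opaque string, and the UA is written out in full because the listing truncates it.

```python
import re
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class DeviceProfile:
    profile_id: str
    user_agent: str
    sec_ch_ua: str
    tls_ja4: str
    tcp_os: str
    navigator_platform: str

    def is_coherent(self) -> bool:
        """Check that browser version and OS agree across the profile fields."""
        ua_major = re.search(r"Chrome/(\d+)", self.user_agent)
        hint_major = re.search(r'"Chromium";v="(\d+)"', self.sec_ch_ua)
        if not ua_major or not hint_major:
            return False
        versions_match = ua_major.group(1) == hint_major.group(1)
        # Every layer must give the same answer to "is this Windows?".
        windows_votes = {
            "Windows NT" in self.user_agent,
            self.tcp_os.startswith("Windows"),
            self.navigator_platform == "Win32",
        }
        return versions_match and len(windows_votes) == 1

profile = DeviceProfile(
    profile_id="win_chr_124_09a",
    user_agent=("Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"),
    sec_ch_ua='"Chromium";v="124"',
    tls_ja4="t13d1516h2_8daaf6152771",
    tcp_os="Windows NT",
    navigator_platform="Win32",
)
print(profile.is_coherent())

# Swapping only the header breaks the binding.
drifted = replace(profile, user_agent=profile.user_agent.replace("124", "120"))
print(drifted.is_coherent())
```

Freezing the dataclass makes it awkward to change one field in isolation, which is the point: a new identity means a new profile, not an edited header.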


// 07 — FAQ

Common questions.

About User-Agent spoofing, Client Hints, coherence checks, and how DataFlirt manages identity at scale.

Can I just use a list of 10,000 random User-Agents?
No. Rotating UAs without rotating the underlying TLS fingerprint and IP address actually makes you stand out more. A single IP claiming to be 50 different operating systems in a minute is a guaranteed block. Identity must be stable per session.
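One simple way to keep identity stable per session is to derive the profile from the session ID instead of picking at random on every request. The sketch below assumes a small pool of profile IDs; only the first ID appears on this page, the others are made up for illustration.

```python
import hashlib

# First ID is from the profile listing on this page; the rest are hypothetical.
PROFILE_IDS = ["win_chr_124_09a", "win_chr_124_0b3", "mac_chr_124_01f"]

def profile_for_session(session_id: str, profile_ids=PROFILE_IDS) -> str:
    """Map a session ID to one profile deterministically."""
    digest = hashlib.sha256(session_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(profile_ids)
    return profile_ids[index]

first = profile_for_session("session-42")
second = profile_for_session("session-42")
print(first, first == second)
```

Every request in the same session gets the same profile, so the UA, TLS stack, and IP change together or not at all.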
What are User-Agent Client Hints (Sec-CH-UA)?
They are a modern set of HTTP headers, introduced in Chromium, that are meant to eventually replace the User-Agent string. They break the UA down into structured fields (browser brand, version, platform). If your scraper sends a Chrome UA but omits the Client Hints, a WAF such as Cloudflare can flag it on the first request.
Should I use a mobile User-Agent to scrape?
Only if you are routing through a mobile proxy (4G/5G) and targeting a mobile-specific endpoint. Sending a mobile UA from an AWS datacenter IP is an obvious anomaly that WAFs catch instantly.
How does DataFlirt manage User-Agent updates?
We sync our device profiles with global browser market share daily. When Chrome 125 drops, our fleet phases it in over a week, matching the natural adoption curve. We never use outdated or obscure UAs that draw attention.
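A phased rollout of this kind can be sketched as a weighted choice whose weight grows with time since release. The linear seven-day ramp below is an assumption made for illustration; real adoption curves are not linear, and the function names are hypothetical.

```python
import random

def new_version_share(days_since_release: float, ramp_days: float = 7.0) -> float:
    """Fraction of sessions that should present the new version (linear ramp)."""
    if days_since_release <= 0:
        return 0.0
    return min(1.0, days_since_release / ramp_days)

def pick_major_version(days_since_release: float, rng: random.Random,
                       old: str = "124", new: str = "125") -> str:
    """Choose a browser major version for a new session."""
    share = new_version_share(days_since_release)
    return new if rng.random() < share else old

rng = random.Random(0)  # seeded only to make the example repeatable
print(new_version_share(3.5))
print(pick_major_version(3.5, rng))
```

The choice would be made once per session and stored with the rest of the device profile, not re-rolled per request.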
Is it legal to spoof a User-Agent?
Yes. The User-Agent is a self-reported string, and there is no legal requirement to identify your software accurately. However, spoofing it to bypass access controls can be a factor in ToS violation disputes.
Why do all User-Agents start with Mozilla/5.0?
Historical baggage. In the 1990s, web servers only sent advanced HTML to Mozilla (Netscape). To get the good HTML, Internet Explorer spoofed Mozilla. Later, Konqueror's KHTML engine claimed to be "like Gecko", Safari's WebKit kept the KHTML token, and Chrome added Safari's tokens on top. It is a 30-year-old chain of lies.