
What is Browser Context?

A browser context is an isolated, ephemeral session environment within a single browser instance. It encapsulates cookies, local storage, cache, and permissions independently from other contexts. For scraping pipelines, it is the mechanism that lets a single heavy Chromium process run fifty distinct, unlinked scraping sessions concurrently without cross-contamination. Mismanage your contexts, and session state leaks across proxy boundaries, which lets anti-bot systems link the sessions and ban them together.

Playwright · Puppeteer · Session Isolation · Concurrency · Chromium
// 02 — definitions

Isolate the state.

The architectural primitive that makes high-concurrency browser automation economically viable without sacrificing session integrity.


TL;DR

A browser context acts like an incognito window on steroids. Instead of launching a new Chromium instance (roughly 150MB at idle) for every request, frameworks like Playwright and Puppeteer use contexts to spawn lightweight, isolated environments in milliseconds. At 15–25MB per context, this cuts memory overhead by roughly 80 to 90% at 50 sessions while guaranteeing that cookies and cache from session A never bleed into session B.

01 Definition & structure
A browser context is an isolated session environment that lives inside a parent browser process. It maintains its own independent state, including:
  • Cookies — completely segregated from other contexts
  • Local/Session Storage — no cross-talk between sessions
  • Cache — network responses are cached per-context
  • Permissions — geolocation, notifications, and camera access
Because contexts share the underlying browser executable and OS-level resources, they can be created and destroyed in milliseconds, making them the standard unit of concurrency in modern headless scraping.
02 How it works in practice
Instead of calling browser.launch() for every URL in your queue, you launch the browser once. For each URL, you call browser.newContext() with a unique proxy and user-agent, open a page, extract the data, and call context.close(). This pattern pays the browser's base memory cost once and avoids the CPU spikes associated with spinning up new Chromium processes, which can let a single machine process thousands of pages per minute, depending on page weight.
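The launch-once, context-per-URL loop can be sketched as follows. This is a minimal illustration, not a production worker: the `BrowserLike`, `ContextLike`, and `PageLike` interfaces are simplified stand-ins for the subset of Playwright's API used here (`newContext`, `newPage`, `goto`, `title`, `close`), and `ScrapeJob`, `scrapeOne`, and `scrapeAll` are hypothetical names. Reading the page title stands in for real extraction logic.

```typescript
// Simplified stand-ins for the Playwright calls this sketch relies on.
interface PageLike {
  goto(url: string): Promise<unknown>;
  title(): Promise<string>;
}
interface ContextLike {
  newPage(): Promise<PageLike>;
  close(): Promise<void>;
}
interface ContextOptions {
  proxy?: { server: string };
  userAgent?: string;
}
interface BrowserLike {
  newContext(options?: ContextOptions): Promise<ContextLike>;
}

// Hypothetical job shape: one URL plus the identity it is fetched with.
interface ScrapeJob {
  url: string;
  proxyServer: string;
  userAgent: string;
}

// One job = one fresh context, torn down whether the job succeeds or fails.
async function scrapeOne(browser: BrowserLike, job: ScrapeJob): Promise<string> {
  const context = await browser.newContext({
    proxy: { server: job.proxyServer },
    userAgent: job.userAgent,
  });
  try {
    const page = await context.newPage();
    await page.goto(job.url);
    return await page.title(); // placeholder for real extraction
  } finally {
    await context.close(); // releases the context's cookies, cache, and memory
  }
}

// The browser is launched once by the caller and shared across all jobs.
// Note: this starts every job at once; a real pool would cap concurrency.
async function scrapeAll(browser: BrowserLike, jobs: ScrapeJob[]): Promise<string[]> {
  return Promise.all(jobs.map((job) => scrapeOne(browser, job)));
}
```

With Playwright installed, the `browser` argument would come from a single `chromium.launch()` call made at worker startup. `Promise.all` rejects the whole batch on the first failure, so a pipeline that wants per-job results would likely prefer `Promise.allSettled`.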
03 Memory economics
The primary driver for using contexts is memory efficiency. A bare Chromium instance requires roughly 150MB of RAM just to idle. If you launch 50 separate browsers, you consume 7.5GB of RAM before loading a single webpage. With contexts, the base 150MB is paid once, and each additional context adds only 15–25MB of overhead. Fifty contexts in one browser therefore consume roughly 0.9–1.4GB (150MB + 50 × 15–25MB), which stays under 1.5GB and fundamentally alters the unit economics of a scraping fleet.
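The arithmetic behind these figures is easy to check. The sketch below uses only the numbers quoted in this section (150MB base, 15–25MB per context, 50 sessions); the function names are hypothetical, and page weight is deliberately left out.

```typescript
const BROWSER_BASE_MB = 150; // idle cost of one Chromium instance, per the text

// One full browser per session: the base cost is paid every time.
function separateBrowsersMb(sessions: number): number {
  return sessions * BROWSER_BASE_MB;
}

// One shared browser: the base cost is paid once, plus per-context overhead.
function sharedBrowserMb(sessions: number, perContextMb: number): number {
  return BROWSER_BASE_MB + sessions * perContextMb;
}

// Percentage of memory saved by sharing, rounded to the nearest whole percent.
function savingsPercent(sessions: number, perContextMb: number): number {
  const saved = 1 - sharedBrowserMb(sessions, perContextMb) / separateBrowsersMb(sessions);
  return Math.round(saved * 100);
}

const separate = separateBrowsersMb(50);   // 7500 MB
const sharedLow = sharedBrowserMb(50, 15); // 900 MB
const sharedHigh = sharedBrowserMb(50, 25); // 1400 MB
```

At 50 sessions the saving works out to 81% with 25MB contexts and 88% with 15MB contexts, before any page is loaded.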
04 How DataFlirt handles it
We treat browser processes as long-lived infrastructure and contexts as ephemeral workers. Our fleet orchestrator dynamically scales the number of active contexts per browser based on real-time memory pressure and main-thread CPU contention. If a target site is particularly JS-heavy, we automatically reduce the context density on that node to prevent the shared main thread from blocking, ensuring consistent extraction latency across the pipeline.
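To make the idea of load-based density concrete, here is one way such a feedback rule might look. This is not DataFlirt's orchestrator: the `NodeLoad` and `DensityPolicy` shapes, the thresholds, and the one-context-at-a-time adjustment are all assumptions chosen for illustration.

```typescript
// Hypothetical load sample for one worker node, both values in the range 0..1.
interface NodeLoad {
  memoryUsedFraction: number;
  mainThreadCpuFraction: number;
}

// Hypothetical policy: bounds on density plus high and low water marks.
interface DensityPolicy {
  minContexts: number;
  maxContexts: number;
  highWater: number; // shed a context at or above this pressure
  lowWater: number;  // add a context at or below this pressure
}

// Returns the target number of active contexts for the next scheduling tick.
function nextContextTarget(current: number, load: NodeLoad, policy: DensityPolicy): number {
  // Whichever resource is under more pressure drives the decision.
  const pressure = Math.max(load.memoryUsedFraction, load.mainThreadCpuFraction);
  if (pressure >= policy.highWater) {
    return Math.max(policy.minContexts, current - 1);
  }
  if (pressure <= policy.lowWater) {
    return Math.min(policy.maxContexts, current + 1);
  }
  return current; // inside the band: hold steady to avoid oscillation
}

const examplePolicy: DensityPolicy = { minContexts: 1, maxContexts: 50, highWater: 0.8, lowWater: 0.5 };
```

The gap between the two water marks is the design choice that matters: without it, a node sitting near a single threshold would add and drop contexts on every tick.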
05 The proxy leakage trap
The most common mistake engineers make when migrating to contexts is configuring the proxy at the browser level rather than the context level. If you launch the browser with Proxy A, all 50 contexts will route through Proxy A, instantly burning the IP due to rate limits. In Playwright, proxy settings must be explicitly passed into newContext() to ensure each isolated session actually has an isolated network path.
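One way to guard against this is to build the per-context options up front and refuse to start if there are fewer distinct proxies than contexts. The sketch below is an assumption-laden illustration: `contextOptionsFor` is a hypothetical helper, the `ProxySettings` shape mirrors the `proxy` option accepted by Playwright's `newContext()`, and the port numbers in the usage example are invented.

```typescript
// Mirrors the shape of Playwright's per-context proxy option.
interface ProxySettings {
  server: string;
  username?: string;
  password?: string;
}
interface ContextOptions {
  proxy: ProxySettings;
}

// Returns one options object per context, each bound to a different proxy.
// Throws rather than silently letting two contexts share an exit IP.
function contextOptionsFor(proxyPool: ProxySettings[], contextCount: number): ContextOptions[] {
  // Drop duplicate servers, keeping the first occurrence of each.
  const unique = proxyPool.filter(
    (p, i) => proxyPool.findIndex((q) => q.server === p.server) === i,
  );
  if (unique.length < contextCount) {
    throw new Error(`need ${contextCount} distinct proxies, got ${unique.length}`);
  }
  return unique.slice(0, contextCount).map((proxy) => ({ proxy }));
}

// Usage: the addresses are documentation-range IPs; the ports are assumed.
const pool: ProxySettings[] = [
  { server: "http://192.0.2.14:8080" },
  { server: "http://198.51.100.7:8080" },
  { server: "http://203.0.113.42:8080" },
];
const perContext = contextOptionsFor(pool, 3);
```

Each element of `perContext` would then be passed to its own `browser.newContext(...)` call, with no proxy set at launch.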
// 03 — the economics of isolation

How contexts scale hardware.

Contexts change the fundamental unit economics of headless scraping. DataFlirt's fleet scheduler relies on these ratios to pack maximum concurrency onto bare-metal nodes without triggering OOM kills.

Memory footprint: $M_{\text{total}} = \text{Browser}_{\text{base}} + (\text{Contexts} \times \text{Context}_{\text{overhead}})$
Base browser is ~150MB. Each context adds only 15–25MB plus page weight. Source: Chromium process model
Creation latency: $T_{\text{context}} = \text{IPC}_{\text{overhead}} + \text{Target}_{\text{init}}$
Typically 10–15ms, compared to 800ms+ for a full browser launch. Source: Playwright CDP metrics
DataFlirt packing density: $\text{Max}_{\text{contexts}} = \text{RAM}_{\text{avail}} / (1.2 \times \text{Spike}_{\text{max}})$
The 1.2 safety factor prevents cascading process deaths during heavy DOM rendering. Source: Internal fleet SLO
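As a worked example of the packing-density ratio, the sketch below plugs in illustrative numbers. It reads the spike term as the peak memory of a single context, and it rounds down because a node cannot run a fraction of a context; both readings are assumptions, as are the function name and the sample inputs.

```typescript
// Max contexts = available RAM / (safety factor × per-context peak memory).
// Rounding down is an assumption: partial contexts cannot be scheduled.
function maxContexts(ramAvailMb: number, spikeMaxMb: number, safetyFactor: number = 1.2): number {
  if (spikeMaxMb <= 0 || safetyFactor <= 0) {
    throw new Error("spikeMaxMb and safetyFactor must be positive");
  }
  return Math.floor(ramAvailMb / (safetyFactor * spikeMaxMb));
}

// Illustrative inputs: 4000 MB free, contexts peaking at 150 MB on a heavy page.
const heavyPages = maxContexts(4000, 150); // 4000 / 180 = 22.2, so 22 contexts

// Same node, lighter pages peaking at 40 MB.
const lightPages = maxContexts(4000, 40);  // 4000 / 48 = 83.3, so 83 contexts
```

The two results show why the ratio is driven by peak page weight rather than by the idle cost of a context: the same node fits almost four times as many sessions when the target pages are light.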
// 04 — process telemetry

Spawning 50 sessions in one process.

A trace from a DataFlirt worker node initializing a Playwright instance and rapidly spinning up isolated contexts for a concurrent retail scrape.

Playwright v1.42 · CDP · Chromium 124
edge.dataflirt.io — live
CAPTURED
// init base browser
browser.launch: 214ms pid: 49102
memory.baseline: 142MB

// spawn context pool
context_01.init: 12ms proxy: "192.0.2.14"
context_02.init: 14ms proxy: "198.51.100.7"
context_03.init: 11ms proxy: "203.0.113.42"
...
context_50.init: 15ms proxy: "192.0.2.88"

// verify isolation
context_01.cookies: 0
context_02.cookies: 0
cross_context_leakage: false

// execution state
worker.status: ready
memory.total: 890MB // ~15MB overhead per context
pipeline.concurrency: 50
// 05 — failure modes

Where contexts break down.

While contexts isolate state, they share the underlying browser process. This shared architecture introduces specific failure modes that don't exist in single-instance setups.

PIPELINES MONITORED · 410+ active
WORKER NODES · Bare-metal
UPDATED · 2026-05-19
01 Main thread blocking
CPU bottleneck · Heavy JS in one context stalls all others

02 OOM cascading crashes
Memory limit · One heavy page takes down the whole browser

03 Shared IP leaks
Config error · Forgetting to bind proxy per context

04 Context destroyed errors
Target closed · Navigation interrupted unexpectedly

05 Shared TLS state
Network layer · Some low-level stacks bypass context isolation
// 06 — our architecture

One process, fifty pristine identities.

At DataFlirt, we never launch a browser per request. Our fleet architecture relies on long-lived Chromium processes that act as host environments. When a scrape job arrives, we inject a fresh context, bind it to a specific residential proxy, execute the extraction, and tear the context down. The host browser remains running, amortizing the heavy startup cost across thousands of requests while guaranteeing zero state leakage between jobs.

Worker Node 04 — Context Pool

Live telemetry of a shared browser process managing concurrent contexts.

host.pid 88412
host.uptime 4h 12m
contexts.active 42 (nominal)
contexts.queued 8
memory.usage 1.4 GB / 4.0 GB (safe)
cpu.main_thread 68% (high)
isolation.status verified
crashes.1h 0


// 07 — FAQ

Common questions.

Common questions about context isolation, memory management, proxy binding, and how DataFlirt scales headless browsers.

What's the difference between a browser context and a new tab?
Tabs in the same context share cookies, local storage, and cache. Contexts do not. If you log into a site in Tab 1, Tab 2 will be logged in too. A new context, by contrast, starts completely blank. Contexts are isolated environments; tabs are just different views within the same environment.
How many contexts can I run in one browser process?
It depends entirely on your hardware and the target site's weight. A blank context takes ~15MB of RAM, but rendering a heavy React SPA can spike it to 150MB+. On modern hardware, CPU main-thread contention usually bottlenecks before RAM. 30 to 50 contexts per process is a safe upper bound for typical scraping workloads.
Do contexts share the same proxy IP?
By default, yes, if the proxy is set at the browser launch level. In modern frameworks like Playwright, you must explicitly define the proxy configuration at the context creation step. This ensures each context routes its traffic through a different IP, preventing cross-session IP leakage.
Can anti-bot systems detect that I'm using multiple contexts?
Not directly. An anti-bot script running inside Context A cannot see Context B. However, if multiple contexts share the same IP, or if the underlying browser's hardware concurrency fingerprint is identical across 50 simultaneous requests to the same target, the aggregate behavior will be flagged as bot traffic.
How does DataFlirt handle a context crash?
If a single context fails (e.g., due to a navigation timeout), we catch the exception, destroy the context, and retry the job. If the underlying browser process crashes (usually an out-of-memory kill), all attached contexts die instantly. Our orchestrator detects the PID death, spins up a new browser, and redistributes the lost jobs.
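The catch, destroy, retry cycle for a single context can be sketched as below. This is an illustration under assumptions, not DataFlirt's orchestrator code: `runWithRetry` is a hypothetical helper, the interfaces are simplified stand-ins for Playwright's `Browser` and `BrowserContext`, and the default of three attempts is arbitrary.

```typescript
// Simplified stand-ins for the Playwright objects involved.
interface ContextLike {
  close(): Promise<void>;
}
interface BrowserLike {
  newContext(): Promise<ContextLike>;
}

// Runs `task` in a fresh context, retrying in a new context on failure.
async function runWithRetry<T>(
  browser: BrowserLike,
  task: (context: ContextLike) => Promise<T>,
  maxAttempts: number = 3,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // If the browser process itself has died, this call rejects and the
    // error propagates to the caller, which is where a relaunch would happen.
    const context = await browser.newContext();
    try {
      return await task(context);
    } catch (error) {
      lastError = error; // e.g. a navigation timeout; try again with clean state
    } finally {
      await context.close(); // never reuse a context that saw a failure
    }
  }
  throw lastError ?? new Error("no attempts were made");
}
```

Creating the context inside the loop is the point of the pattern: a retry starts from a blank session instead of inheriting whatever half-loaded state caused the first failure.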
Should I reuse contexts or destroy them after a scrape?
For stateless scraping, destroy the context immediately after extraction to free up memory. For authenticated scraping, you can serialize the context state (cookies and local storage) to disk, destroy the context, and rehydrate it later to resume the session without logging in again.
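In Playwright, the serialize and rehydrate steps map onto `context.storageState()` and the `storageState` option of `browser.newContext()`. The sketch below shows the round trip with simplified stand-in interfaces so it runs without Playwright installed; `suspendSession` and `resumeSession` are hypothetical names, and the state is kept as a JSON string rather than written to disk.

```typescript
// Simplified versions of the shapes Playwright uses for saved state.
interface Cookie {
  name: string;
  value: string;
  domain: string;
  path: string;
}
interface OriginState {
  origin: string;
  localStorage: { name: string; value: string }[];
}
interface StorageState {
  cookies: Cookie[];
  origins: OriginState[];
}
interface ContextLike {
  storageState(): Promise<StorageState>;
  close(): Promise<void>;
}
interface BrowserLike {
  newContext(options?: { storageState?: StorageState }): Promise<ContextLike>;
}

// Capture cookies and local storage, then free the context's memory.
async function suspendSession(context: ContextLike): Promise<string> {
  const state = await context.storageState();
  await context.close();
  return JSON.stringify(state); // the caller decides where to persist this
}

// Start a new context that already holds the saved cookies and local storage.
async function resumeSession(browser: BrowserLike, serialized: string): Promise<ContextLike> {
  const state = JSON.parse(serialized) as StorageState;
  return browser.newContext({ storageState: state });
}
```

Playwright's own `storageState()` also accepts a `path` option that writes the JSON file directly, which may be simpler than handling the string yourself. Session storage is not part of the saved state, so sites that keep auth tokens there need separate handling.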