
What is Vertical Scaling?

Vertical scaling (scaling up) in a scraping context means adding more CPU, RAM, or network bandwidth to a single worker node rather than adding more nodes to a cluster. While horizontal scaling is the default for distributed crawling, vertical scaling is often the brute-force answer to memory-hungry headless browser workloads. It is the difference between running 100 Playwright instances on one massive 64-core machine versus spreading them across 32 smaller instances.

Infrastructure · Resource Provisioning · Headless Browsers · OOM Limits
// 02 — definitions

Bigger boxes,
more threads.

When to throw more hardware at a single machine, and when the operating system limits make vertical scaling a dead end.


TL;DR

Vertical scaling increases the capacity of a single server. In scraping, it is almost exclusively used to handle the massive RAM requirements of concurrent headless browsers (Puppeteer/Playwright). However, it has a hard ceiling: eventually, you hit OS file descriptor limits, single-thread event loop bottlenecks, or the cloud provider's maximum instance size.

01 Definition & structure

Vertical scaling is the process of increasing the compute power (CPU, RAM, Disk I/O, Network bandwidth) of an existing server to handle a larger workload. In web scraping, this usually means upgrading a cloud instance (e.g., moving from an AWS t3.medium to an m5.4xlarge) to support higher concurrency without changing the underlying code architecture.

It is the simplest form of scaling because it requires zero distributed systems logic. The scraper still runs as a single application, reading from a single local queue, and writing to a single local disk or database connection.

02 When to scale vertically vs horizontally

Scale vertically when your bottleneck is memory per process. Headless browsers like Puppeteer and Playwright are notoriously RAM-heavy. You cannot split the memory required to render a single complex React application across two different servers. You must have enough RAM on the executing node to hold the entire DOM, the JavaScript heap, and the GPU buffers.

Scale horizontally when your bottleneck is network I/O, IP reputation, or sheer request volume. Against rate-limited targets, ten small servers with ten different IP addresses will usually out-scrape one massive server with a single IP address, because rate limits are typically enforced per IP.

03 The memory wall

The most common failure mode in vertical scaling is the Out-Of-Memory (OOM) crash. Scraping engineers often raise the concurrency limit in their code to match their new 32-core CPU, forgetting that 100 concurrent Chrome tabs at roughly 250MB each will consume about 25GB of RAM. The CPU might sit at 15% utilization while the kernel's OOM killer terminates browser processes, causing silent pipeline failures and corrupted data extracts.
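One way to avoid sizing concurrency by core count is to gate each new browser context on available memory. The sketch below is a minimal illustration rather than a production guard: it parses the `MemAvailable` field from Linux's `/proc/meminfo` and admits a new context only if the assumed per-tab footprint still leaves a safety reserve. The function names, the 250MB per-tab figure, and the 2GB reserve are illustrative assumptions.

```python
import re


def mem_available_mb(meminfo_text: str) -> int:
    """Return MemAvailable in MB, given the text of /proc/meminfo."""
    match = re.search(r"^MemAvailable:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        # MemAvailable only exists on Linux 3.14 and newer.
        raise ValueError("MemAvailable field not found")
    return int(match.group(1)) // 1024


def can_admit_context(available_mb: int,
                      per_tab_mb: int = 250,     # assumed footprint per tab
                      reserve_mb: int = 2048) -> bool:  # assumed safety reserve
    """True if one more tab still leaves the reserve untouched."""
    return available_mb - per_tab_mb >= reserve_mb


def read_meminfo(path: str = "/proc/meminfo") -> str:
    """Read the live memory counters (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

A worker loop would call `can_admit_context(mem_available_mb(read_meminfo()))` before opening each new context and back off when it returns `False`, instead of trusting a fixed concurrency number.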

04 How DataFlirt handles it

We treat vertical scaling as a workload-specific necessity, not a default strategy. Our orchestration layer profiles the memory footprint of a target during the initial pilot phase. If a target requires heavy DOM manipulation, we pin that pipeline to our render-heavy node groups: bare-metal servers with 128GB+ of RAM. We monitor the memory-per-worker ratio in real time; if a target site deploys a bloated update that increases RAM usage, our nodes automatically scale vertically before the OOM killer can interrupt the data feed.

05 Did you know?

Vertical scaling has diminishing returns due to OS-level constraints. Even if you provision a server with 1TB of RAM and 128 cores, a single scraping process will eventually hit the ulimit for open file descriptors (which includes network sockets). By default, many Linux distributions set the soft limit to 1,024 per process. If you don't explicitly raise it (for example with ulimit -n, /etc/security/limits.conf, or LimitNOFILE in a systemd unit), your massive server will start throwing EMFILE: too many open files errors while 99% of its hardware sits idle.
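A process can also inspect and raise its own soft limit at startup. The following is a minimal sketch using Python's standard `resource` module (Unix only); the helper names and the 65,536 target are assumptions chosen for illustration, and the soft limit can never exceed the hard limit set by the administrator.

```python
import resource


def pick_soft_limit(soft: int, hard: int, target: int,
                    infinity: int = resource.RLIM_INFINITY) -> int:
    """Choose a new soft limit: as close to target as the hard limit allows."""
    if soft == infinity:
        return soft  # already unlimited, nothing to raise
    ceiling = target if hard == infinity else min(target, hard)
    return max(soft, ceiling)  # never lower an existing limit


def raise_nofile_limit(target: int = 65536) -> tuple:
    """Raise this process's open-file soft limit and return (soft, hard)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    new_soft = pick_soft_limit(soft, hard, target)
    if new_soft != soft:
        try:
            resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
        except (ValueError, OSError):
            pass  # some platforms reject the value; keep the old limit
    return resource.getrlimit(resource.RLIMIT_NOFILE)
```

Calling `raise_nofile_limit()` before launching browsers or opening sockets only affects the current process and its children; a system-wide change still needs the OS-level configuration.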

// 03 — the resource model

How many browsers
fit on one node?

Vertical scaling for rendering nodes is entirely memory-bound. DataFlirt's orchestrator uses these calculations to bin-pack browser contexts onto bare-metal nodes before triggering an up-scale event.

Max Headless Concurrency: C = (Node_RAM - OS_Overhead) / RAM_per_Browser
Assuming 250MB per Chrome tab. CPU usually bottlenecks after RAM. (Source: DataFlirt infrastructure sizing)

Event Loop Block Rate: B = Active_Threads / CPU_Cores
If B > 1.5 on Node.js/Python, context switching destroys throughput. (Source: OS process scheduling limits)

Vertical Cost Efficiency: E = Throughput_Gain / Instance_Cost_Delta
Cloud instance pricing does not always scale linearly. 2x the RAM can cost more than 2x the price. (Source: FinOps standard model)
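The three ratios above can be sketched as plain functions. This is a direct transcription of the formulas, not DataFlirt's orchestrator code; the function names, the choice of GB and MB as units, and the 4GB OS overhead in the usage note are assumptions.

```python
import math


def max_headless_concurrency(node_ram_gb: float,
                             os_overhead_gb: float,
                             ram_per_browser_mb: float = 250) -> int:
    """C = (Node_RAM - OS_Overhead) / RAM_per_Browser, rounded down."""
    usable_mb = (node_ram_gb - os_overhead_gb) * 1024
    return max(0, math.floor(usable_mb / ram_per_browser_mb))


def event_loop_block_rate(active_threads: int, cpu_cores: int) -> float:
    """B = Active_Threads / CPU_Cores."""
    return active_threads / cpu_cores


def vertical_cost_efficiency(throughput_gain: float,
                             instance_cost_delta: float) -> float:
    """E = Throughput_Gain / Instance_Cost_Delta."""
    return throughput_gain / instance_cost_delta
```

For example, `max_headless_concurrency(64, 4)` gives 245 tabs for a 64GB node with an assumed 4GB of OS overhead, and `event_loop_block_rate(48, 32)` gives 1.5, which sits exactly on the threshold quoted above.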
// 04 — resource exhaustion

Hitting the memory wall
on a rendering node.

A live trace of a worker node running out of memory due to heavy SPA rendering, followed by an automated vertical scale-up event to a larger instance class.

Playwright · OOM Killer · AWS EC2
edge.dataflirt.io — live
CAPTURED
// node-042 telemetry (t3.xlarge - 16GB RAM)
cpu.load_avg: 3.14, 2.89, 2.45
mem.used: 15.8GB / 16.0GB
process.playwright: 42 active contexts

// OOM killer invoked
syslog: Out of memory: Killed process 1492 (chrome)
worker.status: CRITICAL - dropping jobs

// vertical scale event triggered
orchestrator.action: "provision_replacement"
instance.target: "c6i.8xlarge" // 32 vCPU, 64GB RAM
state.migration: complete

// node-043 telemetry (post-scale)
mem.used: 16.2GB / 64.0GB
worker.status: STABLE
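OOM kills like the one in this trace are easy to miss because the worker itself never raises an error. A minimal sketch of scanning log text for them follows; the pattern matches the "Killed process <pid> (<name>)" wording shown above, the helper name is hypothetical, and the exact kernel message format can differ between kernel versions.

```python
import re

# Matches e.g. "Out of memory: Killed process 1492 (chrome)".
OOM_KILL = re.compile(r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\)")


def find_oom_kills(log_text: str) -> list:
    """Return (pid, process_name) for every OOM kill found in the log text."""
    return [(int(m["pid"]), m["name"]) for m in OOM_KILL.finditer(log_text)]
```

Feeding the output of the kernel log into `find_oom_kills` lets a monitor flag a node as memory-starved even when CPU load looks healthy.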
// 05 — the bottlenecks

What limits a
single machine.

You cannot scale vertically forever. These are the primary constraints that force scraping pipelines to eventually adopt horizontal distribution.

MAX BROWSER TABS: ~200 per node
RAM PER TAB: 150–400 MB
UPDATED: 2026-05-19
01 RAM exhaustion (Headless)
hard limit · Chrome processes consume memory linearly with concurrency

02 CPU context switching
throughput killer · Too many threads fighting for too few cores degrades performance

03 Ephemeral port exhaustion
network limit · OS runs out of ephemeral TCP ports for outbound requests (at most ~65k per source IP and destination, fewer with the default Linux range)

04 File descriptor limits
OS limit · ulimit -n caps the number of open sockets and files

05 Language runtime limits
GIL / Event Loop · A single Python (GIL) or Node.js (single-threaded event loop) process cannot use 64 cores natively
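The ephemeral port budget in item 03 can be checked on a Linux node by reading the kernel's configured range. The sketch below assumes the standard `/proc/sys/net/ipv4/ip_local_port_range` file, which holds two whitespace-separated integers; the function names are illustrative.

```python
def ephemeral_port_count(port_range_text: str) -> int:
    """Number of ports in a 'low high' range string, inclusive of both ends."""
    low, high = (int(part) for part in port_range_text.split())
    return high - low + 1


def read_port_range(path: str = "/proc/sys/net/ipv4/ip_local_port_range") -> str:
    """Read the configured ephemeral port range (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

With the common Linux default of `32768 60999`, `ephemeral_port_count` returns 28232, well under the theoretical ~65k ceiling, so widening the range is one of the first tunings on a high-concurrency node.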
// 06 — our architecture

Bigger boxes,

for heavier DOMs.

DataFlirt uses vertical scaling selectively. For pure HTTP/API scraping, we scale horizontally across thousands of lightweight micro-containers. But for complex JavaScript rendering pipelines that require full DOM execution, WebGL fingerprinting, and visual diffing, we route tasks to vertically scaled, memory-optimized bare-metal nodes. You cannot split a single Chrome tab's memory footprint across two servers. When the DOM is heavy, the box must be big.

node-provisioning.yaml

Dynamic instance selection based on target rendering requirements.

target.type: SPA · React · Heavy DOM
engine.required: Playwright · Chromium
concurrency.target: 100 parallel sessions
ram.estimated: 32 GB minimum
node.selected: r6g.2xlarge
network.interface: Up to 10 Gbps
provisioning.status: active


// 07 — FAQ

Common
questions.

Common questions about hardware provisioning, memory limits, and when to scale up versus when to scale out.

Why scale vertically instead of horizontally?
Vertical scaling is simpler to manage. You don't have to deal with distributed state, message queues, or network latency between worker nodes. If you have a script that works on your laptop but needs to run 10x faster, moving it to a 32-core cloud instance is a one-line infrastructure change. Horizontal scaling requires re-architecting the application.
How much RAM does a headless browser actually need?
A blank Chrome tab takes about 50MB. A typical e-commerce product page takes 150–300MB. A heavy Single Page Application (SPA) with memory leaks can easily spike to 1GB+ per tab. If you want to run 50 concurrent browsers safely, you need at least 16GB of RAM dedicated purely to the browser processes, plus OS overhead.
Does vertical scaling help with IP blocking?
No. It actually makes it worse if you aren't careful. A massive 128-core server still only has one primary egress IP address. If you send 5,000 requests per second from that single IP, you will be blocked instantly. Vertical scaling must always be paired with a robust proxy pool to distribute the network identity.
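As a rough illustration of distributing network identity from one large node, the sketch below hands out proxies in round-robin order so that concurrent sessions do not all share one egress address. The class name and the example proxy URLs are hypothetical, and a real pool would also need health checks and per-target rate limits.

```python
from itertools import cycle


class ProxyRotator:
    """Hand out proxy URLs in round-robin order."""

    def __init__(self, proxies: list):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._cycle = cycle(proxies)

    def next_proxy(self) -> str:
        """Return the next proxy URL, wrapping around at the end of the list."""
        return next(self._cycle)
```

Each new browser context or HTTP session would call `next_proxy()` once and keep that proxy for its lifetime, so a single session presents a stable identity while the node as a whole spreads its traffic.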
What is the 'blast radius' problem with vertical scaling?
If you put all your scraping concurrency on one massive $2,000/month server and that server experiences a kernel panic or a network partition, your entire pipeline halts. Horizontal scaling limits the blast radius — if one small node dies, the orchestrator simply routes the queue to the remaining 99 nodes.
How does Node.js handle massive vertical scaling?
Poorly, by default. Node.js runs your JavaScript on a single thread, and a CPython process is constrained by the GIL. If you deploy a standard Node scraper on a 64-core machine, its event loop will max out one core at 100% while most of the others sit idle (spawned browser processes run separately, but the orchestration code stays single-threaded). To use the extra cores, you must explicitly use the cluster module or worker threads in Node, or multiprocessing in Python, typically spawning one process per core.
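On the Python side, the process-per-core pattern can be sketched with the standard `multiprocessing` module. This is a minimal outline under stated assumptions: `scrape_shard` is a hypothetical placeholder for real scraping work, and the URL list is split into one interleaved shard per core.

```python
import multiprocessing as mp
import os


def shard(items: list, n_shards: int) -> list:
    """Split items into n_shards interleaved, roughly equal lists."""
    return [items[i::n_shards] for i in range(n_shards)]


def scrape_shard(urls: list) -> int:
    """Placeholder worker: a real one would fetch and parse each URL."""
    return len(urls)


def run_on_all_cores(urls: list) -> list:
    """Run one worker process per CPU core, each on its own shard."""
    n_workers = os.cpu_count() or 1
    with mp.Pool(processes=n_workers) as pool:
        return pool.map(scrape_shard, shard(urls, n_workers))
```

Call `run_on_all_cores(urls)` from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely on platforms that spawn rather than fork.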
How does DataFlirt decide when to scale vertically?
We profile the target during the scoping phase. If the target requires full browser rendering and the DOM consumes >250MB per page, we assign the job to our memory-optimized vertical node pools. If the target is a clean JSON API, we assign it to our horizontally scaled serverless fleet. We match the infrastructure shape to the target's technical constraints.
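The routing rule described in this answer can be written as a small decision function. This is an illustrative sketch, not DataFlirt's actual scheduler: the `TargetProfile` type and the pool labels are hypothetical, and sending lightly rendered targets (at or under 250MB) to the horizontal pool is an assumption, since the text does not say where they go.

```python
from dataclasses import dataclass


@dataclass
class TargetProfile:
    needs_browser_rendering: bool  # does the target require a real browser?
    dom_mb_per_page: float         # measured memory per rendered page, in MB


def choose_pool(profile: TargetProfile, dom_threshold_mb: float = 250.0) -> str:
    """Pick a node pool from the target's measured rendering footprint."""
    if profile.needs_browser_rendering and profile.dom_mb_per_page > dom_threshold_mb:
        return "vertical-memory-optimized"
    # Assumption: everything else, including clean JSON APIs, scales out.
    return "horizontal"
```

For instance, `choose_pool(TargetProfile(True, 400))` selects the memory-optimized pool, while `choose_pool(TargetProfile(False, 0))` selects the horizontal one.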