
What is Requests Per Second (RPS)?

Requests Per Second (RPS) is the fundamental throughput metric of a scraping pipeline, measuring how many HTTP requests are dispatched and acknowledged within a one-second window. It dictates pipeline duration, proxy pool exhaustion rates, and the probability of triggering target rate limits. For data engineers, tuning RPS is a balancing act between data freshness and pipeline survival — push too hard, and you burn your IP reputation; go too slow, and your data is stale before it hits S3.

Throughput · Concurrency · Rate Limiting · Proxy Exhaustion · Pipeline Tuning
// 02 — definitions

Speed vs. survival.

The mechanics of pipeline throughput, and why raw speed is often the enemy of reliable data extraction.


TL;DR

RPS measures the velocity of your fetch layer. While it's tempting to maximize concurrency to hit high RPS, production pipelines are usually bottlenecked by target server capacity, anti-bot rate limits, and proxy pool diversity. A stable 50 RPS pipeline that runs 24/7 completes about 4.3 million requests a day; a 500 RPS pipeline that gets permanently IP-banned after ten minutes completes 300,000 and then stops.

01 Definition & structure

Requests Per Second (RPS) is the standard metric for measuring the throughput of a web scraper or API client. It represents the total number of HTTP requests successfully dispatched and acknowledged within a one-second window.

RPS is a function of two variables: concurrency (how many parallel workers or threads you have running) and latency (how long the target server takes to respond). If you have 10 concurrent workers and the target responds in 500ms, your RPS is 20. If the target slows down to 2000ms, your RPS drops to 5, even though your concurrency hasn't changed.
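The relationship above can be sketched as a one-line calculation. This is a minimal illustration, not a measurement tool: the function name `estimate_rps` is hypothetical, and it assumes every worker is always busy with exactly one request.

```python
def estimate_rps(concurrency: int, latency_seconds: float) -> float:
    """Steady-state throughput when each worker holds one request at a time."""
    if latency_seconds <= 0:
        raise ValueError("latency_seconds must be positive")
    return concurrency / latency_seconds


fast = estimate_rps(10, 0.5)  # 10 workers, 500 ms responses
slow = estimate_rps(10, 2.0)  # same 10 workers, 2000 ms responses
print(fast, slow)
```

Concurrency is identical in both calls; only the target's latency changes the result.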

02 How it works in practice

In a production scraping environment, RPS is rarely a flat line. It fluctuates based on network conditions, proxy health, and target server load. A pipeline is typically configured with a target RPS and a maximum concurrency.

The orchestration layer dispatches workers to hit the target RPS. If the target server starts rate-limiting (returning HTTP 429s) or tarpitting (intentionally delaying responses), the effective RPS will drop. Sophisticated pipelines monitor this drop and automatically throttle their concurrency to avoid burning through their proxy pool or getting permanently banned.
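One common way to hold a fetch layer at a target RPS is a token bucket. The sketch below is an assumption about how such pacing could look, not a description of any particular orchestration layer; the class name `TokenBucket`, the `burst` parameter, and the injected clock are all illustrative.

```python
import time


class TokenBucket:
    """Caps dispatch rate at `rate` requests per second, allowing a small burst."""

    def __init__(self, rate: float, burst: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = burst
        self.tokens = float(burst)
        self.clock = clock
        self.last = clock()

    def try_acquire(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Deterministic demo with a fake clock: one simulated second in 1 ms steps.
fake_now = [0.0]
bucket = TokenBucket(rate=50, burst=5, clock=lambda: fake_now[0])
dispatched = 0
for step in range(1000):
    fake_now[0] = step / 1000
    if bucket.try_acquire():
        dispatched += 1
print(dispatched)  # close to the 50 RPS target plus the initial burst
```

A throttling pipeline would lower `rate` when 429s or rising latency appear, rather than letting workers keep firing at the original target.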

03 The proxy exhaustion problem

High RPS requires a massive proxy pool. If a target allows 1 request per minute per IP, and you want to scrape at 100 RPS, you need 6,000 unique IPs actively rotating at any given moment. If your pool only has 2,000 IPs, you will exhaust your clean IPs in 20 seconds.

Once exhausted, the scraper is forced to reuse IPs before their cooldown period ends, leading to immediate blocks, CAPTCHAs, and a collapse in data yield. This is why scaling RPS is fundamentally a proxy management challenge, not a compute challenge.
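The arithmetic in this section can be checked with a short sketch. The function names are hypothetical, and the model assumes each IP is used once and then rests for the full cooldown.

```python
import math


def required_pool_size(target_rps: float, cooldown_seconds: float) -> int:
    """IPs needed so that no IP is reused before its cooldown ends."""
    return math.ceil(target_rps * cooldown_seconds)


def seconds_until_exhaustion(pool_size: int, target_rps: float) -> float:
    """How long a fresh pool lasts if every request takes a new IP."""
    return pool_size / target_rps


needed = required_pool_size(target_rps=100, cooldown_seconds=60)
lasts = seconds_until_exhaustion(pool_size=2000, target_rps=100)
print(needed, lasts)
```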

04 How DataFlirt handles it

We treat RPS as a dynamic ceiling, not a static configuration. Our orchestration engine uses adaptive backpressure: we start below the requested RPS and ramp up while monitoring the target's latency and error rates. If we detect a 5% increase in response times, we halt the ramp. If we see a single HTTP 429, we immediately back off by 20%.

By prioritizing "safe RPS" over "maximum RPS", our pipelines maintain 99.9% uptime and preserve the health of our residential proxy pools, ensuring consistent data delivery without triggering anti-bot defenses.
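The ramp, halt, and back-off rules described above can be sketched as a single decision function. This is an illustration under stated assumptions, not DataFlirt's engine: the names `Window` and `next_rps`, the `ramp_step` size, and the choice to measure the 5% latency increase against a fixed baseline are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Window:
    avg_latency: float  # seconds, averaged over one observation window
    saw_429: bool       # True if any HTTP 429 was seen in the window


def next_rps(current_rps: float, requested_rps: float,
             baseline_latency: float, window: Window,
             ramp_step: float = 5.0) -> float:
    """Pick the RPS ceiling for the next observation window."""
    if window.saw_429:
        return current_rps * 0.8  # a single 429: back off by 20%
    if window.avg_latency >= baseline_latency * 1.05:
        return current_rps        # latency up 5% or more: halt the ramp
    return min(requested_rps, current_rps + ramp_step)  # otherwise keep ramping


ramped = next_rps(50, 100, 0.8, Window(avg_latency=0.80, saw_429=False))
halted = next_rps(50, 100, 0.8, Window(avg_latency=0.90, saw_429=False))
backed_off = next_rps(50, 100, 0.8, Window(avg_latency=0.80, saw_429=True))
print(ramped, halted, backed_off)
```

Checking the 429 condition first means a rate-limit response always wins over a healthy latency reading.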

05 Did you know?

Many modern anti-bot systems (like Cloudflare and Akamai) don't just measure RPS per IP — they measure RPS across a specific TLS fingerprint or browser profile. If you distribute 1,000 RPS across 1,000 different IPs, but all requests share the exact same JA3 hash and User-Agent, the bot manager will cluster the traffic and block the entire distributed fleet simultaneously.
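The clustering effect can be illustrated by counting the same traffic under two different keys. This sketch only shows why the grouping key matters; it is not how any specific bot manager is implemented, and the JA3 value, User-Agent string, and IP addresses are placeholders.

```python
from collections import Counter


def rps_by_key(requests, key_fields, window_seconds):
    """Request rate per distinct combination of the given fields."""
    counts = Counter(tuple(r[f] for f in key_fields) for r in requests)
    return {key: n / window_seconds for key, n in counts.items()}


# 1,000 requests in one second: every IP is different,
# but the TLS fingerprint and User-Agent are shared.
requests = [
    {
        "ip": f"10.0.{i // 256}.{i % 256}",
        "ja3": "PLACEHOLDER_JA3_HASH",
        "user_agent": "ExampleClient/1.0",
    }
    for i in range(1000)
]

per_ip = rps_by_key(requests, ("ip",), 1.0)
per_fingerprint = rps_by_key(requests, ("ja3", "user_agent"), 1.0)
print(max(per_ip.values()), max(per_fingerprint.values()))
```

Keyed by IP, each client looks like 1 RPS; keyed by fingerprint, the whole fleet collapses into a single 1,000 RPS client.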

// 03 — the math

How to calculate your RPS limits.

RPS isn't just a dial you turn up. It is mathematically bound by your network latency, your proxy pool size, and your target's tolerance. DataFlirt uses these models to auto-tune pipeline concurrency.

Concurrency requirement (Little's Law): $C = \mathrm{RPS} \times \mathrm{Latency}_{\mathrm{avg}}$
To sustain 100 RPS with a 2-second average response time, you need about 200 requests in flight on average. (Source: queueing theory)

Proxy pool exhaustion limit: $\mathrm{RPS}_{\max} = \mathrm{Pool\_Size} / \mathrm{Cooldown\_Seconds}$
A 10,000 IP pool with a 5-minute (300s) cooldown per IP caps your safe throughput at ~33 RPS. (Source: DataFlirt infrastructure planning)

Effective data yield: $\mathrm{RPS}_{\mathrm{eff}} = \mathrm{RPS}_{\mathrm{raw}} \times (1 - \mathrm{Block\_Rate})$
High raw RPS with a 40% block rate yields less actual data than a throttled, compliant crawl. (Source: pipeline SLOs)
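The effective-yield model, raw RPS multiplied by one minus the block rate, can be sketched as follows. The 40% block rate comes from the text above; the 100 RPS, 75 RPS, and 2% figures are hypothetical values chosen only to make the comparison concrete.

```python
def effective_rps(raw_rps: float, block_rate: float) -> float:
    """Requests per second that actually return usable data."""
    if not 0.0 <= block_rate <= 1.0:
        raise ValueError("block_rate must be between 0 and 1")
    return raw_rps * (1.0 - block_rate)


aggressive = effective_rps(raw_rps=100, block_rate=0.40)  # fast but heavily blocked
throttled = effective_rps(raw_rps=75, block_rate=0.02)    # slower, rarely blocked
print(aggressive, throttled)
```

Under these assumed numbers the slower crawl delivers more usable responses per second, before counting the proxy cost of the blocked requests.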
// 04 — throughput tuning

Finding the soft-block ceiling.

A live trace of a DataFlirt auto-tuner probing a target's rate limit. We ramp concurrency until we hit a 429 or a CAPTCHA, then back off to a safe sustained RPS.

auto-tune · HTTP 429 · exponential backoff
edge.dataflirt.io — live
CAPTURED
// phase 1: ramp up
target: "api.retail-target.com/v2/catalog"
concurrency: 10 rps: 12.4 status: 200 OK
concurrency: 25 rps: 31.8 status: 200 OK
concurrency: 50 rps: 64.2 status: 200 OK

// phase 2: threshold detection
concurrency: 75 rps: 92.1 status: 200 OK
concurrency: 100 rps: 118.4 status: 429 Too Many Requests
warn: rate limit triggered at ~115 RPS

// phase 3: backoff and stabilize
action: applying exponential backoff (30s)
action: rotating proxy pool segment
concurrency: 60 rps: 75.0 status: 200 OK
slo_check: 75 RPS > required 40 RPS
pipeline.status: locked at 75 RPS
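The trace mentions exponential backoff with a 30-second delay. A minimal sketch of how such a schedule could be generated is shown below; the doubling factor, the 300-second cap, and the function name `backoff_delays` are assumptions, since the trace only shows the first delay.

```python
def backoff_delays(base_seconds: float, factor: float,
                   cap_seconds: float, attempts: int) -> list[float]:
    """Delay before each retry: base, base*factor, base*factor^2, ... up to a cap."""
    return [min(cap_seconds, base_seconds * factor ** n) for n in range(attempts)]


delays = backoff_delays(base_seconds=30.0, factor=2.0, cap_seconds=300.0, attempts=5)
print(delays)
```

Production schedulers commonly add random jitter to each delay so that many workers do not retry at the same instant; it is left out here to keep the output deterministic.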
// 05 — throughput bottlenecks

What actually limits your RPS.

Ranked by frequency of occurrence across DataFlirt's high-volume pipelines. Raw compute is rarely the bottleneck; network constraints and target defenses dictate the ceiling.

Pipelines analyzed: 850+ active
Window: 90d trailing
Updated: 2026-05-19
01 Target anti-bot rate limits: 89% of pipelines · WAFs enforcing strict IP or session quotas
02 Proxy pool size & cooldown: 72% of pipelines · Running out of clean IPs before cooldown expires
03 Target server capacity: 54% of pipelines · Target infrastructure degrading under load
04 Network I/O & bandwidth: 38% of pipelines · Saturated egress links on scraping worker nodes
05 CPU/Memory per worker: 21% of pipelines · DOM parsing and JS rendering overhead
// 06 — our architecture

Throttle intelligently, scale horizontally.

DataFlirt doesn't guess at RPS limits. Our orchestration layer uses an adaptive backpressure algorithm that continuously monitors target latency, 429 rates, and proxy health. If a target's response time degrades, we automatically reduce RPS to prevent cascading timeouts and IP bans. For massive datasets, we scale horizontally across thousands of IPs, keeping the per-IP RPS near zero while the aggregate pipeline RPS hits the thousands.

Pipeline throughput telemetry

Live metrics from a global e-commerce pricing pipeline.

pipeline.id ecom-global-pricing
aggregate.rps 1,240 req/s
per_ip.rps 0.04 req/s
target.latency 840ms
block.rate 0.02% (nominal)
active.workers 450 nodes
status optimal throughput
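The aggregate and per-IP figures in the telemetry above are related by simple division. The sketch below works out what they imply, assuming load is spread evenly across IPs; the telemetry does not state an IP count, so the result is a derived estimate, and the function name is hypothetical.

```python
def implied_ip_count(aggregate_rps: float, per_ip_rps: float) -> int:
    """IPs needed to carry the aggregate load at the given per-IP rate."""
    return round(aggregate_rps / per_ip_rps)


ips = implied_ip_count(aggregate_rps=1240, per_ip_rps=0.04)
seconds_between_requests = 1 / 0.04  # average gap between requests on one IP
print(ips, seconds_between_requests)
```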


// 07 — FAQ

Common questions.

Common questions about scaling throughput, managing concurrency, and avoiding rate limits.

What is the difference between concurrency and RPS?
Concurrency is the number of active, in-flight connections at any given moment. RPS is the number of completed requests per second. They are linked by latency: if your concurrency is 100 and each request takes 2 seconds to complete, your RPS is 50. You cannot increase RPS without either increasing concurrency or decreasing latency.
How do I calculate the RPS I need for a project?
Divide your total URL count by your required time window in seconds. If you need to scrape 1,000,000 product pages every 24 hours, you need 1,000,000 / 86,400 = ~11.6 RPS. Always pad this by 20-30% (roughly 14-15 RPS here) to account for retries, network hiccups, and proxy rotations.
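The sizing calculation in this answer can be sketched as a small helper. The function name `required_rps` and the `padding` parameter are illustrative; the padding values are the 20-30% range suggested above.

```python
def required_rps(url_count: int, window_seconds: int, padding: float = 0.0) -> float:
    """Sustained RPS needed to fetch url_count URLs within the window."""
    return url_count / window_seconds * (1.0 + padding)


day = 24 * 60 * 60  # 86,400 seconds
base = required_rps(1_000_000, day)
low = required_rps(1_000_000, day, padding=0.20)
high = required_rps(1_000_000, day, padding=0.30)
print(round(base, 2), round(low, 2), round(high, 2))
```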
Is it illegal to scrape at a very high RPS?
High RPS itself isn't strictly illegal, but if your request volume degrades the target server's performance, it can cross the line into a Denial of Service (DoS) attack or a "trespass to chattels" claim. Courts look at whether the scraping caused actual damage to the host's infrastructure. We strongly advise capping RPS well below the target's capacity.
Why does my RPS start high and then drop over time?
This is the classic signature of proxy exhaustion or target tarpitting. As your IPs get flagged, the target starts issuing CAPTCHAs or dropping connections. Your workers spend more time waiting for timeouts or executing retries, which spikes your average latency. Because RPS = Concurrency / Latency, your throughput plummets.
How does DataFlirt handle sudden rate limits?
We use adaptive backpressure. When our edge nodes detect an uptick in 429s or latency spikes, the scheduler immediately pauses the affected workers, rotates the IP segment, and resumes at a 20% lower concurrency ceiling. We find the new safe limit dynamically without manual intervention.
Can I achieve 10,000+ RPS on a single target?
Yes, but it requires massive horizontal scaling, a premium residential proxy pool to distribute the load, and — crucially — a target infrastructure capable of handling it. We routinely run enterprise catalog syncs at these speeds, but only against CDNs (like Cloudflare or Fastly) that can absorb the traffic without impacting the target's origin servers.