
What is Request Queue Depth?

Request queue depth is the number of pending URLs waiting to be fetched by a scraper's worker pool at any given moment. It acts as the primary backpressure metric in a distributed scraping pipeline. When queue depth grows unbounded, your pipeline is discovering new links faster than it can fetch them, leading to stale data, massive memory consumption, and eventual out-of-memory crashes. Managing it requires dynamic concurrency scaling and aggressive deduplication before URLs ever hit the queue.

Concurrency · Backpressure · Distributed Crawling · Throughput · OOM Risk
// 02 — definitions

The pipeline's waiting room.

The operational metric that dictates whether your crawler is keeping pace with discovery or slowly drowning in its own backlog.


TL;DR

Request queue depth measures pending fetch tasks. A stable queue depth means your worker pool is perfectly sized for your discovery rate. A continuously rising depth indicates a bottleneck — usually proxy latency, target rate limits, or insufficient worker concurrency — that will eventually crash the pipeline if left unmanaged.

01 Definition & structure

Request queue depth is the total count of URLs that have been discovered, deduplicated, and scheduled for fetching, but have not yet been assigned to an available worker thread. In a standard crawler architecture, "discovery workers" parse HTML to find new links and push them to the queue, while "fetch workers" pop links from the queue to make HTTP requests.

The queue depth is the buffer between these two asynchronous processes. It absorbs micro-bursts in discovery (like parsing a sitemap with 50,000 links) and smooths out the workload for the fetchers.
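
The split between discovery and fetch workers can be sketched with Python's `asyncio.Queue`. This is a minimal illustration, not DataFlirt's implementation: the function names, the `example.com` URLs, and the simulated fetch delay are all assumptions made for the example.

```python
import asyncio

async def discover(queue: asyncio.Queue, urls: list[str]) -> None:
    """Discovery worker: push newly found URLs onto the queue."""
    for url in urls:
        await queue.put(url)

async def fetch_worker(queue: asyncio.Queue, fetched: list[str]) -> None:
    """Fetch worker: pop URLs and 'fetch' them (simulated with a short sleep)."""
    while True:
        url = await queue.get()
        await asyncio.sleep(0.001)  # stand-in for the real HTTP request
        fetched.append(url)
        queue.task_done()

async def run_crawl(n_urls: int = 200, n_workers: int = 4) -> tuple[int, int, int]:
    queue: asyncio.Queue = asyncio.Queue()
    fetched: list[str] = []
    workers = [asyncio.create_task(fetch_worker(queue, fetched))
               for _ in range(n_workers)]

    # A sitemap-style burst: many URLs are discovered at once.
    await discover(queue, [f"https://example.com/p/{i}" for i in range(n_urls)])
    peak_depth = queue.qsize()   # queue depth right after the burst

    await queue.join()           # wait for the fetch workers to drain the backlog
    final_depth = queue.qsize()

    for worker in workers:
        worker.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return peak_depth, final_depth, len(fetched)

if __name__ == "__main__":
    print(asyncio.run(run_crawl()))  # (peak depth, final depth, URLs fetched)
```

`queue.qsize()` is the queue depth in this sketch: it spikes after the burst and returns to zero once the workers catch up.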

02 How it works in practice

In a healthy pipeline, queue depth looks like a sawtooth wave. A sitemap is parsed, the queue spikes to 10,000, and over the next few minutes, the fetch workers drain it back to zero. The problem arises when the baseline of that wave constantly shifts upward. If you discover 100 URLs per second but can only fetch 80 per second due to proxy latency, your queue depth grows by 20 every second. Within an hour, you have 72,000 pending requests, and your data freshness SLA is destroyed.
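
The arithmetic above is easy to check. The sketch below projects backlog size from the two rates and, assuming a FIFO queue drained at a constant fetch rate, estimates how long the newest URL would wait; both helper names are hypothetical.

```python
def projected_depth(start_depth: float, discovery_rate: float,
                    fetch_rate: float, seconds: float) -> float:
    """Queue depth after `seconds`, given constant rates in URLs per second."""
    growth_per_second = discovery_rate - fetch_rate
    return max(0.0, start_depth + growth_per_second * seconds)

def fifo_wait_seconds(depth: float, fetch_rate: float) -> float:
    """Time the newest URL waits in a FIFO queue drained at `fetch_rate`."""
    return depth / fetch_rate

backlog = projected_depth(0, discovery_rate=100, fetch_rate=80, seconds=3600)
print(backlog)                         # 72000.0 pending requests after one hour
print(fifo_wait_seconds(backlog, 80))  # 900.0 seconds, i.e. 15 minutes of staleness
```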

03 The OOM death spiral

Many developers start with in-memory queues (like Python's queue.Queue). If the queue depth grows unbounded, the memory footprint of the crawler expands until the operating system terminates the process with an Out Of Memory (OOM) error. The pipeline crashes, the in-memory queue is wiped, and the crawl must restart from scratch. This is why production systems enforce strict maximum queue sizes and use external brokers like Redis.
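
A strict maximum queue size is a one-argument change with the standard library. The sketch below uses `queue.Queue(maxsize=...)` and a hypothetical `enqueue_with_limit` helper; the limit of 3 is only there to keep the example small.

```python
import queue

def enqueue_with_limit(q: queue.Queue, url: str) -> bool:
    """Try to enqueue without blocking; report whether the URL was accepted."""
    try:
        q.put_nowait(url)
        return True
    except queue.Full:
        # The caller decides what happens next: pause discovery,
        # spill to an external store, or retry later.
        return False

pending: queue.Queue = queue.Queue(maxsize=3)  # hard cap on queue depth
results = [enqueue_with_limit(pending, f"https://example.com/{i}") for i in range(5)]
print(results)          # [True, True, True, False, False]
print(pending.qsize())  # 3
```

A rejected `put` is the signal to slow discovery down. Silently dropping the rejected URLs would trade a crash for missing data, so a real pipeline needs a deliberate policy for them.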

04 How DataFlirt handles it

We treat queue depth as a primary auto-scaling trigger. Our orchestrator monitors the derivative of the queue depth (ΔQ). If ΔQ is positive for more than 60 seconds, we automatically provision additional worker pods to increase the fetch rate. If the target site is rate-limiting us (meaning we cannot safely add more workers), we apply backpressure—we temporarily suspend the discovery workers, stopping the influx of new URLs until the fetch workers can clear the backlog.
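
The decision described above can be sketched as a small function over depth samples. This is an illustration under stated assumptions, not the orchestrator's actual code: `QueueSample`, `growing_for`, `decide`, and the action names are hypothetical, and samples are assumed to arrive in chronological order.

```python
from dataclasses import dataclass

@dataclass
class QueueSample:
    timestamp: float  # seconds since the start of the crawl
    depth: int        # queue depth observed at that time

def growing_for(samples: list[QueueSample], window_s: float = 60.0) -> bool:
    """True if depth has risen on every sample for more than `window_s` seconds."""
    if len(samples) < 2:
        return False
    i = len(samples) - 1
    # Walk backwards for as long as each sample is deeper than the one before it.
    while i > 0 and samples[i].depth > samples[i - 1].depth:
        i -= 1
    return samples[-1].timestamp - samples[i].timestamp > window_s

def decide(samples: list[QueueSample], rate_limited: bool) -> str:
    """Choose between holding, scaling up, and pausing discovery."""
    if not growing_for(samples):
        return "hold"
    # More workers only help when the target can absorb more traffic.
    return "pause_discovery" if rate_limited else "scale_up"

rising = [QueueSample(t, 1000 + 350 * t) for t in range(0, 90, 15)]
print(decide(rising, rate_limited=False))  # scale_up
print(decide(rising, rate_limited=True))   # pause_discovery
```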

05 Did you know: LIFO vs FIFO

The way you pop items off the queue drastically changes your crawler's behavior. A FIFO (First-In, First-Out) queue creates a breadth-first crawl—it will scrape every category page on a site before it scrapes a single product page. A LIFO (Last-In, First-Out) queue creates a depth-first crawl—it will dive straight down to the products of the first category it finds. Most advanced pipelines abandon both in favor of Priority Queues, where URLs are scored and sorted by business value before fetching.
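
The effect of pop order shows up even on a toy site. In the sketch below, the `SITE` link graph and the `score` function are invented for illustration; the only point is how the three queue disciplines reorder the same crawl.

```python
from collections import deque
import heapq

# Hypothetical toy site: a root page, two category pages, four product pages.
SITE = {
    "/": ["/cat-a", "/cat-b"],
    "/cat-a": ["/a/1", "/a/2"],
    "/cat-b": ["/b/1", "/b/2"],
}

def crawl_order(fifo: bool) -> list[str]:
    """Visit order when the frontier is popped FIFO (True) or LIFO (False)."""
    frontier, seen, order = deque(["/"]), {"/"}, []
    while frontier:
        url = frontier.popleft() if fifo else frontier.pop()
        order.append(url)
        for link in SITE.get(url, []):
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return order

def score(url: str) -> int:
    """Hypothetical business value: products beat categories beat the root."""
    if url == "/":
        return 1
    return 5 if url.startswith("/cat-") else 10

def priority_crawl_order() -> list[str]:
    """Visit order when the highest-scoring URL is always fetched next."""
    heap, seen, order = [(-score("/"), "/")], {"/"}, []
    while heap:
        _, url = heapq.heappop(heap)  # heapq is a min-heap, so scores are negated
        order.append(url)
        for link in SITE.get(url, []):
            if link not in seen:
                seen.add(link)
                heapq.heappush(heap, (-score(link), link))
    return order

print(crawl_order(fifo=True))   # ['/', '/cat-a', '/cat-b', '/a/1', '/a/2', '/b/1', '/b/2']
print(crawl_order(fifo=False))  # ['/', '/cat-b', '/b/2', '/b/1', '/cat-a', '/a/2', '/a/1']
print(priority_crawl_order())   # ['/', '/cat-a', '/a/1', '/a/2', '/cat-b', '/b/1', '/b/2']
```

FIFO finishes both category pages before any product; LIFO exhausts one category's products before it touches the other category; the priority version reaches products as soon as they are known while still breaking ties deterministically.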

// 03 — queue dynamics

Is the queue stable?

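Little's Law ties average queue depth to arrival rate and wait time, but stability itself comes down to flow balance: depth holds steady only when the fetch rate keeps up with the discovery rate. DataFlirt's orchestrator uses the quantities below to automatically scale worker pods before a queue breaches its critical threshold.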

Queue Growth Rate: $\Delta Q = \text{Rate}_{\text{discovery}} - \text{Rate}_{\text{fetch}}$
If $\Delta Q > 0$ for sustained periods, the pipeline requires intervention. (Queueing Theory)

Time to Drain: $T_{\text{drain}} = Q_{\text{depth}} / (\text{Rate}_{\text{fetch}} \times \text{Concurrency})$
Estimated time to clear the backlog assuming zero new discoveries. (DataFlirt Pipeline Telemetry)

DataFlirt Auto-Scale Trigger: $\text{Workers}_{\text{new}} = \lceil Q_{\text{depth}} / T_{\text{drain}}^{\text{target}} \rceil$
Scales up pods when drain time exceeds the SLA for data freshness. (Internal Orchestrator Logic)
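
The three quantities can be computed directly. The sketch below makes two assumptions that the formulas leave implicit: the fetch rate in the drain-time formula is read as a per-worker rate (URLs per second per worker), and the worker count is obtained by solving the drain-time formula for concurrency at the target drain time. The function names and the sample numbers are hypothetical.

```python
import math

def queue_growth_rate(rate_discovery: float, rate_fetch_total: float) -> float:
    """Delta Q in URLs per second; positive means the backlog is growing."""
    return rate_discovery - rate_fetch_total

def time_to_drain(depth: int, per_worker_rate: float, concurrency: int) -> float:
    """Seconds to clear the backlog, assuming zero new discoveries."""
    return depth / (per_worker_rate * concurrency)

def workers_for_target(depth: int, per_worker_rate: float,
                       target_drain_s: float) -> int:
    """Smallest worker count that drains `depth` within `target_drain_s`."""
    return math.ceil(depth / (per_worker_rate * target_drain_s))

print(queue_growth_rate(1200, 850))                                        # 350
print(time_to_drain(30_000, per_worker_rate=20, concurrency=50))           # 30.0
print(workers_for_target(30_000, per_worker_rate=20, target_drain_s=20))   # 75
```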
// 04 — orchestrator trace

Queue pressure and auto-scaling.

A live trace from a Redis-backed task queue during a high-volume e-commerce category crawl. The orchestrator detects rising depth and provisions more workers to stabilize the flow.

Redis Queue · Auto-scaler · Worker Pods
edge.dataflirt.io — live
CAPTURED
// t=0s : baseline
queue.depth: 14,200
workers.active: 40
rate.discovery: 1,200 req/s
rate.fetch: 850 req/s
queue.delta: +350 req/s // warning: growing

// t=45s : threshold breached
queue.depth: 29,950
alert: queue_depth_critical
action: provision_worker_pods

// t=60s : scaling event
pods.requested: +20
pods.status: provisioning...
pods.status: ready

// t=120s : recovery
workers.active: 60
rate.fetch: 1,350 req/s
queue.delta: -150 req/s // draining
queue.status: recovering
// 05 — bottleneck sources

Why the queue backs up.

The most common reasons a request queue grows unbounded, ranked by frequency across DataFlirt's distributed crawling infrastructure.

PIPELINES ANALYZED: 850+ active
METRIC: Queue Spikes
UPDATED: 2026-05-19
01 Proxy latency spikes (network layer): Slow residential IPs reduce effective fetch rate
02 Target rate limiting (429s) (anti-bot): Forced retries push URLs back into the queue
03 Insufficient worker concurrency (compute): Discovery outpaces available fetch threads
04 Infinite pagination loops (logic error): Broken extractors feed garbage URLs to the queue
05 Heavy JS rendering overhead (browser): Playwright contexts locking up worker threads
// 06 — DataFlirt's architecture

Decoupled queues, priority routing, and backpressure.

We don't use a single monolithic queue. DataFlirt implements priority-tiered, domain-sharded queues backed by a Redis cluster. This ensures that a slow target domain doesn't block workers from processing fast domains. When a specific queue depth spikes due to target rate limits, our orchestrator applies backpressure to the discovery workers for that domain, pausing URL extraction until the fetch workers can safely drain the backlog.
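
Domain sharding with per-domain backpressure can be sketched in a few lines. This is an in-memory illustration only: the `ShardedFrontier` class and its method names are hypothetical, it leaves out the priority tiers, and a production version would sit on an external broker such as Redis rather than on Python deques.

```python
from collections import defaultdict, deque
from typing import Optional
from urllib.parse import urlsplit

class ShardedFrontier:
    """One FIFO queue per domain, each with its own depth limit."""

    def __init__(self, max_depth_per_domain: int) -> None:
        self.max_depth = max_depth_per_domain
        self.shards = defaultdict(deque)  # domain -> pending URLs
        self.paused = set()               # domains under backpressure

    def push(self, url: str) -> bool:
        """Enqueue a URL; refuse it and pause the domain if its shard is full."""
        domain = urlsplit(url).hostname or ""
        shard = self.shards[domain]
        if len(shard) >= self.max_depth:
            self.paused.add(domain)  # discovery for this domain should stop
            return False
        shard.append(url)
        return True

    def pop(self, domain: str) -> Optional[str]:
        """Dequeue the next URL for a domain and lift backpressure if possible."""
        shard = self.shards[domain]
        url = shard.popleft() if shard else None
        if len(shard) < self.max_depth:
            self.paused.discard(domain)
        return url

frontier = ShardedFrontier(max_depth_per_domain=2)
for i in range(3):
    frontier.push(f"https://slow.example/p/{i}")      # third push is refused
print(frontier.paused)                                # {'slow.example'}
print(frontier.push("https://fast.example/p/0"))      # True: other shards are unaffected
```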

queue.metrics.live

Real-time telemetry from a domain-sharded queue on a retail pipeline.

queue.id: q-retail-in-04
routing.strategy: domain-sharded, priority-fifo
depth.current: 4,102 (stable)
depth.peak_1h: 18,450
rate.discovery: 450 req/s
rate.fetch: 455 req/s (draining)
backpressure.state: inactive


// 07 — FAQ

Common questions.

About queue management, backpressure, scaling limits, and how DataFlirt prevents memory exhaustion in massive crawls.

What is considered a 'healthy' queue depth?
There is no single number. A healthy queue depth is one that is stable or draining. If you have 100,000 URLs in the queue but your fetch rate exceeds your discovery rate, the queue is healthy. If you have 5,000 URLs but the depth is growing by 100 per second, your pipeline is failing. Stability is the metric, not absolute size.
Why not just add more workers when the queue grows?
Because the bottleneck might not be compute. If the queue is growing because the target site is issuing HTTP 429 (Too Many Requests) or your proxy pool is exhausted, adding more workers will actually make the problem worse by triggering harder bans. You must diagnose the bottleneck before scaling concurrency.
How does queue depth relate to memory leaks?
In-memory queues (like Python's asyncio.Queue) store URL strings and metadata in RAM. If discovery outpaces fetching, the queue grows unbounded until the process consumes all available memory and the OS kills it (OOM). Production pipelines must use disk-backed or external queues (like Redis or RabbitMQ) and implement hard depth limits.
How does DataFlirt handle infinite pagination loops filling the queue?
We use strict URL deduplication via Bloom filters before a URL ever enters the queue. Additionally, our orchestrator tracks depth-per-path. If a crawler reaches page 5,000 of a category that typically has 50 pages, the orchestrator flags it as a logic anomaly, pauses discovery for that path, and alerts an engineer.
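
Both checks from this answer can be sketched with the standard library. The Bloom filter below is a minimal teaching version, not DataFlirt's: the bit-array size, the number of hashes, the double-hashing scheme over SHA-256, and the 10x tolerance in `is_pagination_anomaly` are all assumptions chosen for the example.

```python
import hashlib

class BloomFilter:
    """Probabilistic set: no false negatives, a small chance of false positives."""

    def __init__(self, size_bits: int = 1 << 20, num_hashes: int = 5) -> None:
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item: str) -> list[int]:
        # Double hashing: derive every position from two halves of one digest.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.size for i in range(self.num_hashes)]

    def add(self, item: str) -> bool:
        """Record the item; return True if it was (probably) seen before."""
        seen = True
        for pos in self._positions(item):
            byte, bit = divmod(pos, 8)
            if not self.bits[byte] & (1 << bit):
                seen = False
                self.bits[byte] |= 1 << bit
        return seen

def enqueue_if_new(bloom: BloomFilter, pending: list[str], url: str) -> bool:
    """Append the URL to the queue only if the filter has not seen it."""
    if bloom.add(url):
        return False
    pending.append(url)
    return True

def is_pagination_anomaly(page: int, typical_pages: int, tolerance: float = 10.0) -> bool:
    """Flag a page number far beyond what this path normally has."""
    return page > typical_pages * tolerance

bloom, pending = BloomFilter(), []
print(enqueue_if_new(bloom, pending, "https://example.com/c?page=1"))  # True
print(enqueue_if_new(bloom, pending, "https://example.com/c?page=1"))  # False
print(is_pagination_anomaly(page=5000, typical_pages=50))              # True
```

The false-positive side of the trade-off means a small fraction of genuinely new URLs can be skipped, so the filter size has to be chosen against the expected number of URLs.
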
Is it legal to scale up workers aggressively to drain a queue?
Scaling must always respect the target's infrastructure and ToS. Ignoring a Crawl-delay directive in robots.txt or overwhelming a server to clear your backlog can cross the line into Denial of Service (DoS) territory. DataFlirt's auto-scaler is hard-capped by the target's maximum safe concurrency threshold, regardless of our internal queue depth.
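
One way to turn a Crawl-delay directive into a hard cap is Python's `urllib.robotparser`. The sketch below is an assumption about how such a cap could be derived, not a description of DataFlirt's auto-scaler; the `max_safe_rate` helper, the `examplebot` user agent, and the sample robots.txt are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for the example.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 2
Disallow: /private/
"""

def max_safe_rate(robots_txt: str, user_agent: str, default_rate: float) -> float:
    """Requests per second allowed for this target, capped by Crawl-delay if set."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    delay = parser.crawl_delay(user_agent)  # None when no directive applies
    if delay:
        return min(default_rate, 1.0 / float(delay))
    return default_rate

print(max_safe_rate(ROBOTS_TXT, "examplebot", default_rate=10.0))       # 0.5
print(max_safe_rate("User-agent: *\n", "examplebot", default_rate=10.0))  # 10.0
```

Whatever the queue depth says, the fetch rate for that target stays at or below the returned value.
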
Should I use FIFO or LIFO for my scraping queue?
FIFO (First-In, First-Out) results in a breadth-first crawl, which is standard for discovering categories before products. LIFO (Last-In, First-Out) results in a depth-first crawl, which reaches product pages faster but can leave broad sections of the site undiscovered if the crawl is interrupted. Production systems usually use Priority Queues, scoring URLs based on business value.