
What is Non-Blocking I/O?

Non-Blocking I/O is an asynchronous execution model where a single thread initiates network requests and immediately moves on to other tasks rather than waiting idly for the server to respond. In scraping infrastructure, it is the fundamental difference between a script that maxes out at 50 concurrent requests per CPU core and a worker that comfortably sustains 5,000. If your pipeline scales by adding more servers instead of optimizing the event loop, you are paying for idle CPU cycles.

Async · Event Loop · Concurrency · Network I/O · Throughput
// 02 — definitions

Stop waiting
for the wire.

The architectural shift from thread-per-request to event-driven concurrency, and why it is mandatory for high-throughput data extraction.


TL;DR

Non-blocking I/O allows a single process to handle thousands of concurrent network connections by delegating the wait time to the operating system. Instead of blocking the thread while a target server takes 800ms to return HTML, the worker processes other callbacks. It is the core mechanism behind Node.js, Python's asyncio, and the runtime network poller that sits underneath Go's goroutines.

01 Definition & structure

Non-blocking I/O is a system call behavior where operations that would normally suspend the executing thread (like reading from a network socket) return immediately. If data is not yet available, the OS returns an error code (like EAGAIN) instead of putting the thread to sleep.
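
The behavior described above can be observed with a few lines of Python. This is a minimal sketch, assuming a Unix-like system: a local `socket.socketpair()` stands in for a scraper and a remote server, and `try_read` is a hypothetical helper name. Python surfaces the kernel's EAGAIN / EWOULDBLOCK result as `BlockingIOError`.

```python
import socket


def try_read(sock: socket.socket, size: int = 4096):
    """Attempt one non-blocking read; return bytes, or None if nothing is ready."""
    try:
        return sock.recv(size)
    except BlockingIOError:
        # The kernel answered immediately (EAGAIN / EWOULDBLOCK)
        # instead of suspending the calling thread.
        return None


# A connected local socket pair stands in for a scraper and a remote server.
client, server = socket.socketpair()
client.setblocking(False)

first = try_read(client)          # nothing sent yet: returns None right away
server.sendall(b"HTTP/1.1 200 OK\r\n")
second = try_read(client)         # data is now buffered: returns the bytes

client.close()
server.close()
```

Polling `try_read` in a loop would waste CPU, which is why real event loops pair non-blocking sockets with a readiness notification mechanism rather than retrying blindly.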

In scraping, this is paired with an event notification system (epoll on Linux, kqueue on macOS and the BSDs). The scraper registers thousands of sockets with the OS and asks, "Tell me when any of these have data." The single thread only wakes up to process sockets that are actually ready to be read.
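
A small illustration of that "tell me when any of these have data" pattern, using Python's standard `selectors` module, which wraps epoll or kqueue depending on the platform. The three socket pairs and the `req_N` labels are assumptions made for the demo, not part of any real scraper.

```python
import selectors
import socket

# DefaultSelector picks the best mechanism available (epoll on Linux, kqueue on macOS/BSD).
sel = selectors.DefaultSelector()

# Three local socket pairs stand in for three in-flight requests.
pairs = [socket.socketpair() for _ in range(3)]
for index, (client, _server) in enumerate(pairs):
    client.setblocking(False)
    sel.register(client, selectors.EVENT_READ, data=f"req_{index}")

# Only the "server" side of req_1 responds.
pairs[1][1].sendall(b"payload")

# Ask the OS which registered sockets are readable; wait at most one second.
ready = [(key.data, key.fileobj.recv(1024)) for key, _events in sel.select(timeout=1.0)]

for client, server in pairs:
    sel.unregister(client)
    client.close()
    server.close()
sel.close()
```

Only `req_1` comes back from `select`; the two silent sockets cost nothing beyond their registration.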

02 How it works in practice

When you use libraries like aiohttp or httpx's AsyncClient, the underlying engine opens a TCP socket in non-blocking mode. It sends the HTTP GET request and yields control back to the event loop. The event loop picks up the next task (perhaps sending another request or parsing a completed one).

Milliseconds later, the OS receives the HTTP response packets. It flags the file descriptor as readable. On the next tick of the event loop, the engine sees the flag, reads the buffer, and resumes the specific coroutine waiting for that data. The thread never sits blocked on one slow response; it only idles when no socket at all has work for it.
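
The effect of yielding to the event loop can be sketched without touching the network. In the example below, `asyncio.sleep` stands in for the time spent waiting on a socket, and `fake_fetch`, `crawl`, and the example.com URLs are hypothetical names chosen for illustration.

```python
import asyncio
import time


async def fake_fetch(url: str, latency: float) -> str:
    # asyncio.sleep stands in for waiting on the socket; it yields to the loop.
    await asyncio.sleep(latency)
    return f"{url} done"


async def crawl(urls, latency: float = 0.2):
    # All coroutines are in flight at once on a single thread.
    return await asyncio.gather(*(fake_fetch(u, latency) for u in urls))


urls = [f"https://example.com/page/{n}" for n in range(100)]
started = time.perf_counter()
results = asyncio.run(crawl(urls))
elapsed = time.perf_counter() - started
```

One hundred simulated requests of 200ms each finish in roughly the time of a single request, because the waits overlap instead of queuing behind one another. Run sequentially, the same work would take about 20 seconds.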

03 The memory footprint advantage

A standard OS thread reserves a dedicated stack (often 1MB to 8MB, depending on platform defaults). If you want 10,000 concurrent requests using a thread-per-request model, that is gigabytes of memory set aside just for thread stacks, plus the heavy CPU overhead of context switching between them.

A non-blocking coroutine is just a state machine allocated on the heap. It takes a few kilobytes. You can fit hundreds of thousands of them in a gigabyte of RAM. This is why non-blocking architectures are the only viable path for high-scale URL discovery and availability monitoring.
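
The arithmetic behind those two paragraphs, as a quick back-of-the-envelope sketch. The 4 KiB per-coroutine figure is an assumption standing in for "a few kilobytes"; real sizes vary by runtime and by how much state each coroutine captures.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def stack_reservation_bytes(threads: int, stack_mib: int) -> int:
    """Memory reserved for stacks under a thread-per-request model."""
    return threads * stack_mib * MIB


def coroutines_per_gib(coroutine_kib: int) -> int:
    """How many heap-allocated coroutines of a given size fit in 1 GiB."""
    return GIB // (coroutine_kib * KIB)


low_gib = stack_reservation_bytes(10_000, 1) / GIB    # 1 MiB stacks
high_gib = stack_reservation_bytes(10_000, 8) / GIB   # 8 MiB stacks
fit = coroutines_per_gib(4)                           # assumed 4 KiB per coroutine
```

With these inputs, 10,000 threads reserve between roughly 9.8 GiB and 78 GiB of stack, while a single GiB holds 262,144 coroutines of 4 KiB each.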

04 How DataFlirt handles it

We do not use Python for our edge fetchers. While asyncio is great, the Global Interpreter Lock (GIL) and garbage collection pauses introduce unpredictable latency at scale. Our fetch layer is written in Rust using the Tokio runtime.

This allows a single DataFlirt edge node to maintain tens of thousands of concurrent TLS connections to target servers and proxies, utilizing 100% of the network pipe while keeping CPU usage under 20%. The heavy lifting of parsing is shipped over a message queue to dedicated worker pools.

05 The CPU-bound trap

The most common failure mode in async scraping is accidentally blocking the event loop. If you fetch 1,000 pages concurrently, and they all return at the same time, your script will try to parse 1,000 HTML documents sequentially on the same thread that services the sockets.

If parsing one document takes 10ms, the 1,000th document will wait 10 seconds to be processed. During those 10 seconds, the event loop is blocked. Network buffers fill up, keep-alive connections drop, and the OS starts dropping packets. Always offload heavy parsing to a thread pool or a separate process.
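
One way to follow that advice in Python is `loop.run_in_executor`. The sketch below is illustrative only: `LinkCounter`, `count_links`, and `parse_all` are hypothetical names, and the standard library's `html.parser` stands in for whatever parser a real pipeline uses.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser


class LinkCounter(HTMLParser):
    """Counts <a> tags; a stand-in for real extraction logic."""

    def __init__(self):
        super().__init__()
        self.links = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links += 1


def count_links(html: str) -> int:
    """Synchronous, CPU-bound parse; should not run directly on the event loop."""
    parser = LinkCounter()
    parser.feed(html)
    return parser.links


async def parse_all(pages, executor):
    loop = asyncio.get_running_loop()
    # Each parse runs in the executor, so the loop stays free for socket callbacks.
    jobs = [loop.run_in_executor(executor, count_links, page) for page in pages]
    return await asyncio.gather(*jobs)


pages = ['<a href="/a">a</a><a href="/b">b</a>'] * 50
with ThreadPoolExecutor(max_workers=4) as pool:
    counts = asyncio.run(parse_all(pages, pool))
```

In CPython a thread pool keeps the loop responsive but does not add parallelism for pure-Python parsing because of the GIL. Passing a `concurrent.futures.ProcessPoolExecutor` to the same `run_in_executor` call is the usual way to spread heavy parsing across cores.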

// 03 — concurrency math

How much can
one core do?

The theoretical limit of concurrent requests is bound by memory and file descriptors, not CPU speed. DataFlirt uses these models to pack workers densely without triggering event loop lag.

Little's Law for concurrency: $L = \lambda \times W$
Concurrency ($L$) equals arrival rate ($\lambda$) times average wait time ($W$). Source: queueing theory.
Memory per connection: $M_{\text{total}} = C \times (M_{\text{socket}} + M_{\text{buffer}})$
10,000 sockets at 16KB each is only ~160MB of RAM. Source: OS network stack.
Event loop lag: $T_{\text{lag}} = T_{\text{actual}} - T_{\text{expected}}$
Lag > 50ms means CPU-bound work is blocking the I/O thread. Source: DataFlirt Node Health SLO.
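
The three models above translate directly into code. This is a rough sketch rather than DataFlirt's tooling: the function names are hypothetical, the 4 KiB / 12 KiB split between socket and buffer is an assumed breakdown of the 16KB figure, and `blocking_parse` deliberately misbehaves to make the lag visible.

```python
import asyncio
import time


def required_concurrency(arrival_rate: float, wait_seconds: float) -> float:
    """Little's Law: L = lambda * W."""
    return arrival_rate * wait_seconds


def connection_memory_bytes(connections: int, socket_bytes: int, buffer_bytes: int) -> int:
    """M_total = C * (M_socket + M_buffer)."""
    return connections * (socket_bytes + buffer_bytes)


async def measure_lag(interval: float = 0.01) -> float:
    """T_lag = T_actual - T_expected for a single timer tick."""
    loop = asyncio.get_running_loop()
    expected = loop.time() + interval
    await asyncio.sleep(interval)
    return loop.time() - expected


async def blocking_parse(seconds: float) -> None:
    time.sleep(seconds)  # deliberately wrong: a synchronous call that stalls the loop


async def demo() -> float:
    # The timer is scheduled first, then the blocking task hijacks the thread.
    lag, _ = await asyncio.gather(measure_lag(), blocking_parse(0.2))
    return lag


in_flight = required_concurrency(4_100, 0.8)              # 4,100 req/s at 800ms each
memory = connection_memory_bytes(10_000, 4 * 1024, 12 * 1024)
lag = asyncio.run(demo())
```

At 4,100 requests per second and an 800ms average wait, about 3,280 requests are in flight at any moment. The 10,000-socket case comes to 163,840,000 bytes, in line with the ~160MB figure. The measured lag lands well above the 50ms threshold because the 200ms `time.sleep` holds the thread while the timer is due.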
// 04 — event loop trace

Multiplexing 4,000 sockets
on a single thread.

A live trace of an asynchronous fetcher dispatching requests and handling OS-level epoll events without blocking execution.

epoll · asyncio · file descriptors
edge.dataflirt.io — live
CAPTURED
// worker initialization
event_loop: "epoll"
max_file_descriptors: 65535

// dispatching batch
req_01: dispatched -> "https://target.com/page/1"
req_02: dispatched -> "https://target.com/page/2"
thread_status: "idle / polling OS"

// 42ms later
epoll_wait: [fd_14 ready]
req_02: headers_received

// 110ms later
epoll_wait: [fd_13 ready, fd_14 ready]
req_02: body_complete (142 KB) -> 200 OK
req_01: timeout_exceeded -> ERR_TIMEOUT

// metrics
active_connections: 4,192
cpu_utilization: 14% // thread is mostly waiting
// 05 — bottleneck shifts

Where the limits
move to.

Once you eliminate thread-blocking, the bottlenecks in a scraping pipeline shift from CPU scheduling to OS-level network constraints and memory management.

MAX CONCURRENCY: 50k per node
TARGET OS: Linux (epoll)
UPDATED: 2026-05-19
01

Ephemeral port exhaustion

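TCP limits · Running out of local ports before sockets close. Port numbers top out at 65,535 per source IP and destination, and Linux's default ephemeral range (32768 to 60999) allows only about 28,000.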
02

File descriptor limits

ulimit -n · OS refusing to open new sockets due to process limits.
03

Event loop blocking

CPU-bound · Parsing large HTML blocks the thread, delaying I/O callbacks.
04

Memory fragmentation

Buffers · Allocating and freeing thousands of small read buffers.
05

DNS resolution bottlenecks

UDP limits · Standard gethostbyname and getaddrinfo calls are synchronous; high-concurrency fetchers need an async resolver.
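
For the file descriptor limit in the list above, a worker can check its own budget at startup instead of discovering the ceiling mid-crawl. The following is a sketch for Unix-like systems only (Python's `resource` module is not available on Windows); `fd_budget` and the `reserve` of 64 descriptors for logs, pipes, and similar overhead are assumptions.

```python
import resource


def fd_budget(planned_sockets: int, reserve: int = 64) -> dict:
    """Compare a planned socket count against this process's RLIMIT_NOFILE."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    needed = planned_sockets + reserve
    unlimited = soft == resource.RLIM_INFINITY
    return {
        "soft": soft,
        "hard": hard,
        "needed": needed,
        "fits": unlimited or needed <= soft,
    }


report = fd_budget(planned_sockets=12_405)
```

If `fits` is false, the soft limit can usually be raised up to the hard limit with `resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))` or with `ulimit -n` before the process starts.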
// 06 — edge architecture

Never block the loop,

isolate the CPU-bound work.

DataFlirt separates the I/O layer from the extraction layer. Our fetchers are pure non-blocking state machines written in Rust, capable of holding 50,000 open sockets per node. When a payload arrives, it is immediately handed off to a separate thread pool for HTML parsing and JSON decoding. Mixing I/O and CPU work in the same async loop is the number one cause of degraded scraper throughput.
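
The separation described above can be sketched in Python, with the caveat that DataFlirt's fetch layer is written in Rust and this is only an analogy of the shape, not their implementation. Every name here (`fetcher`, `parser_worker`, `pipeline`, the queue size of 100) is hypothetical, and `asyncio.to_thread` requires Python 3.9 or later.

```python
import asyncio


def parse(payload: bytes) -> int:
    """Stand-in for CPU-bound extraction (HTML parsing, JSON decoding)."""
    return len(payload.split())


async def fetcher(name: str, queue: asyncio.Queue, count: int) -> None:
    for n in range(count):
        await asyncio.sleep(0.001)                           # simulated network wait
        await queue.put(f"{name} page {n} body".encode())    # waits if the queue is full


async def parser_worker(queue: asyncio.Queue, results: list) -> None:
    while True:
        payload = await queue.get()
        # Run the parse off the loop thread so fetchers keep making progress.
        results.append(await asyncio.to_thread(parse, payload))
        queue.task_done()


async def pipeline() -> list:
    queue = asyncio.Queue(maxsize=100)   # bounded: a slow parser throttles the fetchers
    results: list = []
    workers = [asyncio.create_task(parser_worker(queue, results)) for _ in range(4)]
    await asyncio.gather(*(fetcher(f"f{i}", queue, 25) for i in range(4)))
    await queue.join()                   # wait until every payload has been parsed
    for worker in workers:
        worker.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results


parsed = asyncio.run(pipeline())
```

The bounded queue is the design choice worth noting: when parsing falls behind, `queue.put` makes the fetchers wait, so backpressure shows up as slower dispatch rather than as unbounded memory growth.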

worker-node-04.metrics

Live telemetry from a non-blocking fetcher node during a high-volume catalog crawl.

worker.id fetch-rust-04
architecture tokio / epoll
active_sockets 12,405 (stable)
event_loop_lag 1.2ms (healthy)
fd_usage 12,480 / 65535
cpu_bound_queue offloaded to parser pool
throughput 4,100 req/s


// 07 — FAQ

Common
questions.

Common questions about asynchronous scraping, event loops, and scaling network I/O.

What is the difference between non-blocking I/O and multithreading?
Thread-per-request multithreading assigns one OS thread to each request. If the request takes 2 seconds, the thread sleeps for 2 seconds while still holding its stack memory and adding context-switching overhead. Non-blocking I/O uses a single thread to fire the request, registers interest in the socket with the OS, and immediately moves to the next task. It handles thousands of requests with a fraction of the memory.
Why does my async Python scraper still max out the CPU?
You are likely doing CPU-bound work on the event loop. Fetching the HTML is non-blocking, but parsing it with BeautifulSoup or lxml is synchronous. If parsing takes 50ms, the event loop stalls for 50ms, and no other network callbacks can fire. Offload parsing to a separate process pool.
How many concurrent requests can one non-blocking worker handle?
It depends on the OS limits and memory, not the CPU. A modern Linux server can easily hold 50,000 open sockets on a single core using epoll, provided you have tuned ulimit -n and the ephemeral port range. DataFlirt caps workers at 15,000 to leave headroom for proxy negotiation.
What happens when a proxy is slow in a non-blocking model?
Nothing breaks. The socket remains open and the OS tracks it, but the worker thread ignores it until data arrives. Slow proxies are devastating to thread-per-request models because they tie up the thread pool, but in a non-blocking architecture, a 10-second proxy response time costs almost zero resources.
How does DataFlirt monitor event loop health?
We track event loop lag — the difference between when a timer is scheduled to fire and when it actually executes. If lag exceeds 50ms, it means a synchronous operation has hijacked the thread. The node automatically stops accepting new URLs from the queue until the lag recovers.
Does headless browsing use non-blocking I/O?
The communication between your script and the browser (via CDP) is non-blocking, but the browser itself is heavily multi-threaded and CPU-bound. You cannot run 5,000 Playwright instances on a single core just because your Python script uses asyncio. Headless browsers scale by CPU and RAM, not by event loop efficiency.
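
Several of the answers above come back to capping how many requests a worker keeps in flight. In asyncio, one common way to enforce such a cap is a semaphore. The sketch below uses a simulated network wait and a cap of 15 purely for illustration; `bounded_fetch`, `crawl`, and the `stats` dictionary are hypothetical.

```python
import asyncio


async def bounded_fetch(url: str, limit: asyncio.Semaphore, stats: dict) -> str:
    async with limit:                        # waits here while the worker is at its cap
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)            # simulated network wait
        stats["active"] -= 1
    return url


async def crawl(urls, max_in_flight: int):
    limit = asyncio.Semaphore(max_in_flight)
    stats = {"active": 0, "peak": 0}
    done = await asyncio.gather(*(bounded_fetch(u, limit, stats) for u in urls))
    return done, stats["peak"]


urls = [f"https://example.com/item/{n}" for n in range(200)]
done, peak = asyncio.run(crawl(urls, max_in_flight=15))
```

All 200 coroutines exist at once, but no more than `max_in_flight` of them hold a connection slot at any moment, which keeps socket and port usage predictable.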