What is Load Balancer?
A load balancer is the traffic cop of a distributed scraping architecture, sitting between your job scheduler and the fleet of worker nodes. It distributes outbound fetch requests and inbound data processing tasks across available compute resources to prevent any single node from bottlenecking. In high-throughput pipelines, an intelligent load balancer doesn't just route round-robin — it routes based on target domain rate limits, proxy pool health, and worker memory pressure.