What is TCP Connection Timeout?

A TCP connection timeout occurs when a client attempts to establish a network connection with a server, but the server fails to respond to the initial SYN packet within the allotted time window. In scraping pipelines, this rarely means the target is actually down. It usually indicates a silent network-layer drop: often an IP ban, a misconfigured proxy gateway, or a firewall silently discarding packets from your ASN.

Network Layer · Scraping Errors · Proxy Health · SYN Drop · Timeouts
// 02 — definitions

The silent drop.

Why your scraper is hanging on the first packet, and how to distinguish a dead proxy from a hostile target.


TL;DR

A TCP connection timeout happens during the 3-way handshake before any HTTP data is sent. If the server's firewall drops your SYN packet instead of rejecting it, your client waits until the OS-level timeout (often 30–120s) expires. In scraping, this is the classic signature of an IP block or a dead proxy node.

01 Definition & structure
A TCP connection timeout is a failure at the transport layer. Before HTTP can happen, TCP requires a 3-way handshake: the client sends a SYN, the server replies with a SYN-ACK, and the client confirms with an ACK. If the client sends the SYN but receives nothing back, it will wait for a predefined period before throwing an ETIMEDOUT error. No HTTP request was ever sent.
02 How it works in practice
When a scraper hits a connection timeout, the thread executing the request is blocked. If you are running 100 concurrent workers and 50 of them hit dead proxies that take 60 seconds to time out, your effective throughput halves instantly. Managing timeouts is less about fixing the network and more about protecting your worker pool from starvation.
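The starvation effect can be put into rough numbers. The sketch below is a simplified steady-state model, not a measurement: it assumes every attempt either completes in a fixed healthy latency or waits out the full connect timeout, and the 1-second healthy latency is an illustrative assumption that does not come from the text above.

```python
def successful_requests_per_second(workers, dead_fraction,
                                   healthy_latency_s, connect_timeout_s):
    """Steady-state successes per second for a pool of blocking workers.

    Assumes each attempt lands on a dead proxy with probability
    `dead_fraction` and then blocks for the full connect timeout.
    """
    # Average wall-clock time one worker spends on a single attempt.
    mean_attempt_s = (dead_fraction * connect_timeout_s
                      + (1 - dead_fraction) * healthy_latency_s)
    attempts_per_second = workers / mean_attempt_s
    # Only attempts that reached a healthy proxy produce a result.
    return attempts_per_second * (1 - dead_fraction)


# 100 workers, half the proxies dead, 1 s healthy latency (assumed).
slow = successful_requests_per_second(100, 0.5, 1.0, 60.0)  # 60 s timeout
fast = successful_requests_per_second(100, 0.5, 1.0, 1.5)   # 1.5 s timeout
print(f"60 s timeout: {slow:.2f} req/s, 1.5 s timeout: {fast:.2f} req/s")
```

Under these assumptions, long timeouts hurt far more than the dead proxies themselves, because each failed attempt occupies a worker for the whole timeout window.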
03 Silent drops vs. active rejections
Firewalls have two ways to block an IP: reject or drop. A rejection sends a TCP RST (reset) packet, which immediately closes the connection and throws an ECONNREFUSED error. A drop simply discards the packet, forcing the client to wait for a timeout. WAFs prefer silent drops for scrapers because they tie up the scraper's resources.
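On the client side, the two outcomes surface as different exceptions. The following is a minimal sketch using only Python's standard library; the function names `classify_connect_failure` and `probe` and the labels they return are hypothetical, chosen for illustration.

```python
import errno
import socket


def classify_connect_failure(exc):
    """Map a connect-time exception to 'rejected', 'dropped', or 'other'."""
    # A TCP RST during the handshake surfaces as ECONNREFUSED.
    if isinstance(exc, ConnectionRefusedError):
        return "rejected"
    # No reply at all: either our own timeout fired or the OS gave up.
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "dropped"
    if isinstance(exc, OSError) and exc.errno == errno.ETIMEDOUT:
        return "dropped"
    return "other"


def probe(host, port, timeout_s=1.5):
    """Attempt a bare TCP handshake and report how it ended."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return "connected"
    except OSError as exc:
        return classify_connect_failure(exc)
```

A "dropped" result only says that nothing came back within the window; it cannot by itself tell a firewall drop from a dead proxy or a routing failure.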
04 How DataFlirt handles it
We don't let the OS dictate our network pacing. Our fetch layer overrides default socket behaviors, enforcing a strict 1.5-second ceiling on the TCP handshake. If a proxy node is dead or blackholed, we detect it within that window, kill the socket, and re-route the request through a healthy IP. This keeps our worker utilization near 100% even when proxy pools degrade.
05 Did you know?
The default TCP SYN retransmission behavior in Linux uses an exponential backoff algorithm. It typically sends the first SYN, waits 1 second, sends another, waits 2 seconds, sends another, waits 4 seconds, and so on. If you don't explicitly set a connect_timeout in your HTTP client, your scraper might hang for over two minutes on a single dead IP.
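The backoff arithmetic can be worked through in a few lines. This sketch assumes a 1-second initial retransmission timeout that doubles on every retry and a retry count of 6 (the usual default of `net.ipv4.tcp_syn_retries` on current Linux kernels); it ignores jitter and any kernel-side caps, and the function name is hypothetical.

```python
def syn_retransmit_schedule(syn_retries=6, initial_rto_s=1.0):
    """Return (retransmit_times, give_up_time) in seconds since the first SYN."""
    retransmit_times = []
    elapsed = 0.0
    rto = initial_rto_s
    for _ in range(syn_retries):
        elapsed += rto          # wait one timeout, then retransmit
        retransmit_times.append(elapsed)
        rto *= 2                # exponential backoff
    # After the last retransmission the kernel waits one more timeout.
    give_up_at = elapsed + rto
    return retransmit_times, give_up_at


times, give_up = syn_retransmit_schedule()
print(times)    # [1.0, 3.0, 7.0, 15.0, 31.0, 63.0]
print(give_up)  # 127.0
```

With these defaults the connect call fails after roughly 127 seconds, which is where the "over two minutes" figure comes from.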
// 03 — the math

How long should you wait?

Default OS timeouts are disastrous for scraping throughput. DataFlirt's network stack aggressively bounds connection attempts to fail fast and rotate IPs before worker threads starve.

Effective timeout = $T_{\text{eff}} = \min(T_{\text{client}}, T_{\text{os}}, T_{\text{proxy}})$
The binding constraint is the shortest timeout in the network chain. (Network Engineering 101)
SYN retries = $R_{\text{syn}} = 6$
Linux defaults to 6 SYN retries (net.ipv4.tcp_syn_retries), retransmitting at roughly 1s, 3s, 7s, 15s, 31s, and 63s before giving up at about 127s. (Kernel TCP stack)
DataFlirt fast-fail = $T_{\text{conn}} = \text{RTT}_{p99} + 1500\,\text{ms}$
We kill unacknowledged connections aggressively to free up the worker. (DataFlirt internal SLO)
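The formulas above can be evaluated directly. The sketch below is illustrative only: the helper names are hypothetical, the percentile uses a simple nearest-rank method (other definitions give slightly different values), and the RTT samples are made-up numbers rather than DataFlirt telemetry.

```python
def effective_timeout(t_client_s, t_os_s, t_proxy_s):
    """The shortest timeout in the chain is the one that fires first."""
    return min(t_client_s, t_os_s, t_proxy_s)


def p99(samples):
    """Nearest-rank 99th percentile; expects a non-empty list."""
    ordered = sorted(samples)
    rank = (99 * len(ordered) + 99) // 100  # integer ceil(0.99 * n)
    return ordered[rank - 1]


def fast_fail_timeout(rtt_samples_s, margin_s=1.5):
    """Connect timeout = observed p99 handshake RTT plus a fixed margin."""
    return p99(rtt_samples_s) + margin_s


# Client set to 10 s, OS would wait ~127 s, proxy gateway gives up at 30 s.
print(effective_timeout(10.0, 127.0, 30.0))

# Hypothetical RTT samples from 1 ms to 100 ms.
samples = [i / 1000 for i in range(1, 101)]
print(round(fast_fail_timeout(samples), 3))
```

Tying the timeout to a measured percentile keeps it from firing on routes that are merely slow, while the fixed margin bounds how long a dead route can hold a worker.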
// 04 — packet trace

A handshake that never completes.

A trace of a worker thread attempting to connect through a residential proxy node that has been silently blackholed by the target's firewall.

TCP SYN · port 443 · timeout
// init connection
worker: "thread-042"
target: "104.18.22.41:443"
proxy: "res-node-8812"

// 3-way handshake attempt
[0.000s] -> SYN sent
[1.005s] -> SYN retransmit (1)
[3.012s] -> SYN retransmit (2)
[7.025s] -> SYN retransmit (3)

// timeout reached
[10.000s] error: ETIMEDOUT
state: connection failed

// recovery
action: proxy rotated
retry: queued
// 05 — root causes

Why the SYN gets dropped.

Ranked by frequency across DataFlirt's proxy telemetry. When a connection times out, it's rarely a target outage. It's almost always an infrastructure or security layer silently discarding your packets.

Sample size: 1.2B requests
Window: 30d trailing
Updated: 2026-05-19
01. Target firewall silent drop (IP ban): WAF drops packets from known bad ASNs
02. Dead proxy exit node (infra failure): Residential peer went offline abruptly
03. Proxy gateway overload (bottleneck): Provider cannot allocate outbound sockets
04. Target listen queue full (overload): Server kernel dropping SYNs under load
05. ISP routing blackhole (routing): BGP route failure to the target
// 06 — connection management

Fail fast, rotate faster.

A default HTTP client can wait over two minutes for a dead connection to time out. In a high-concurrency scraping pipeline, this leads to thread starvation: your workers aren't scraping, they're just waiting for packets that will never arrive. DataFlirt's edge nodes implement an aggressive 1.5-second connection timeout. If the proxy or target doesn't answer the SYN within that window, we kill the socket, mark the IP as degraded, and retry on a fresh route before a default client would have sent its second retransmission.

Socket lifecycle config

DataFlirt worker connection parameters for high-throughput pipelines.

connect_timeout 1500ms
read_timeout 8000ms
tcp_keepalive true
max_retries 3 attempts
proxy_rotation on_timeout
thread_state active
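As a rough illustration of how parameters like these might be applied, here is a minimal sketch using Python's standard `socket` module. It is not DataFlirt's implementation: `open_socket` and `connect_with_rotation` are hypothetical names, "3 attempts" is read as three tries in total, and the sketch connects straight to each (host, port) pair without the CONNECT tunnelling a real proxy hop would need.

```python
import socket

CONNECT_TIMEOUT_S = 1.5   # connect_timeout 1500ms
READ_TIMEOUT_S = 8.0      # read_timeout 8000ms
MAX_ATTEMPTS = 3          # max_retries 3 attempts


def open_socket(host, port):
    """Open a TCP connection with a bounded handshake and read timeout."""
    sock = socket.create_connection((host, port), timeout=CONNECT_TIMEOUT_S)
    sock.settimeout(READ_TIMEOUT_S)  # applies to later recv/send calls
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)  # tcp_keepalive
    return sock


def connect_with_rotation(endpoints, connect=open_socket,
                          max_attempts=MAX_ATTEMPTS):
    """Try endpoints in order, rotating on timeout.

    Returns (connection_or_None, endpoints_that_timed_out).
    """
    degraded = []
    for host, port in endpoints[:max_attempts]:
        try:
            return connect(host, port), degraded
        except (socket.timeout, TimeoutError):
            # proxy_rotation on_timeout: flag this endpoint, try the next.
            degraded.append((host, port))
    return None, degraded
```

The `connect` parameter is injectable so the rotation logic can be exercised without touching the network; a caller would typically feed the returned `degraded` list back into its proxy health tracking.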

// 07 — FAQ

Common questions.

About connection timeouts, thread starvation, proxy health, and how DataFlirt keeps pipelines moving when the network drops packets.

What is the difference between a connection timeout and a read timeout?
A connection timeout happens during the initial TCP 3-way handshake — your client sent a SYN but never got a SYN-ACK. A read timeout happens after the connection is established — you sent the HTTP GET request, but the server is taking too long to send the response body back.
Why does the server drop the packet instead of rejecting it?
Security posture. A TCP RST (reset) packet confirms to the scanner or scraper that a host is alive at that address and actively refusing the connection. By silently dropping the packet (a "blackhole"), the firewall reveals nothing and forces the client to wait for a timeout, drastically slowing down automated scanning and scraping tools.
How do I fix a persistent connection timeout?
Rotate your proxy. If the timeout only happens on specific target domains, your current IP or ASN is likely blackholed by their WAF. If it happens across all domains, your proxy gateway is likely down or overloaded.
How does DataFlirt prevent thread starvation from timeouts?
We decouple our application timeouts from the OS kernel defaults. Instead of waiting 60+ seconds for a dead proxy to respond, our workers enforce a strict 1.5-second connection timeout. If the handshake isn't complete, the worker abandons the socket, flags the proxy node, and grabs a new IP.
Is it legal to aggressively retry timed-out connections?
Yes, but it's operationally reckless if the target is actually experiencing an outage. We implement circuit breakers: if a target times out across multiple distinct proxy ASNs simultaneously, we assume the target is down and pause the pipeline, rather than hammering an offline server.
Can a high concurrency rate cause connection timeouts?
Yes. If you send requests faster than the target server can process them, its TCP listen backlog queue will fill up. Once the queue is full, the server's kernel will start silently dropping new incoming SYN packets, resulting in connection timeouts on your end. This is why rate limiting is critical.
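The circuit-breaker idea mentioned in the FAQ (pause a target when timeouts arrive from several distinct proxy ASNs at once) could be sketched as follows. This is an assumption-laden illustration, not DataFlirt's code: the class name, the threshold of 3 ASNs, the 30-second window, and the 300-second cooldown are all hypothetical values.

```python
from collections import deque


class TargetCircuitBreaker:
    """Pause a target once timeouts are seen from enough distinct ASNs."""

    def __init__(self, asn_threshold=3, window_s=30.0, cooldown_s=300.0):
        self.asn_threshold = asn_threshold
        self.window_s = window_s
        self.cooldown_s = cooldown_s
        self._timeouts = deque()            # (timestamp, asn) pairs
        self._open_until = float("-inf")    # breaker starts closed

    def record_timeout(self, asn, now):
        """Register a connect timeout seen through a proxy in `asn`."""
        self._timeouts.append((now, asn))
        # Forget timeouts that fell out of the sliding window.
        while self._timeouts and self._timeouts[0][0] < now - self.window_s:
            self._timeouts.popleft()
        distinct_asns = {seen_asn for _, seen_asn in self._timeouts}
        if len(distinct_asns) >= self.asn_threshold:
            # Many unrelated networks failing suggests the target is down.
            self._open_until = now + self.cooldown_s

    def allow_request(self, now):
        """True when the pipeline may send to this target."""
        return now >= self._open_until
```

Timestamps are passed in explicitly so the logic stays deterministic; in a running worker they would usually come from `time.monotonic()`. Repeated timeouts from a single ASN never trip the breaker, which matches the case where only that network has been blackholed.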