What is a POST Request?

A POST request is an HTTP method used to submit data to a specified resource, typically to change server state or to run a query too complex to express in a URL. In scraping, POST requests are the main way to reach data that is not addressable by URL alone: they are used to submit search forms, authenticate sessions, query GraphQL endpoints, and retrieve JSON payloads from undocumented APIs. Unlike GET requests, POSTs carry a body payload. That makes them harder to cache, but the body can hold far more structured parameters than a query string, and those parameters often expose backend filters that the visible UI does not.

HTTP Method · Payloads · API Scraping · Stateful · GraphQL
// 02 — definitions

Beyond the URL bar.

When the data you need isn't addressable via a simple link, you have to ask the server for it directly by submitting a structured payload.

TL;DR

A POST request sends data to the server in the request body rather than the URL. It's the backbone of modern API scraping, GraphQL queries, and form submissions. Because POSTs are non-idempotent and rarely cached at the edge, they often hit the origin server directly, making them subject to stricter rate limits and WAF payload inspections.

01 Definition & structure
A POST request is an HTTP method designed to send data to the server. Unlike a GET request, which carries its parameters in the URL's query string, a POST request carries its data in the request body. This payload can be formatted as JSON, XML, URL-encoded or multipart form data, or raw binary, and the Content-Type header tells the server which one to expect. Because the data stays out of the URL (and so out of browser history and most access logs), POST is used for sensitive operations, large payloads, and complex queries such as GraphQL.
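To make the body formats concrete, here is a minimal sketch that builds the same POST twice, once with a JSON body and once with a URL-encoded form body, using only Python's standard library. The endpoint URL and field names are assumptions made up for illustration, and nothing is sent over the network.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and fields, used only for illustration.
URL = "https://example.com/api/search"
fields = {"category": "industrial_motors", "in_stock": "true"}

def json_post(url: str, data: dict) -> Request:
    """Build (but do not send) a POST whose body is a JSON document."""
    body = json.dumps(data).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})

def form_post(url: str, data: dict) -> Request:
    """Build (but do not send) a POST whose body is URL-encoded form data."""
    body = urlencode(data).encode("ascii")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/x-www-form-urlencoded"})

for req in (json_post(URL, fields), form_post(URL, fields)):
    # The URL is identical; only the body and its declared type differ.
    print(req.get_method(), req.full_url, req.get_header("Content-type"), req.data)
```

The point of the comparison is that the server sees the same URL in both cases. Everything that distinguishes the two requests lives in the body and in the Content-Type header that describes it.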
02 How it works in practice
In web scraping, POST requests are highly prized. Modern single-page applications (SPAs) rarely load data via the initial HTML. Instead, the frontend JavaScript constructs a JSON payload containing your search filters, pagination cursors, and session data, and POSTs it to an undocumented backend API. By intercepting and replicating this POST request, scrapers can bypass HTML parsing entirely and receive clean, structured JSON directly from the backend API.
03 The caching penalty
The HTTP specification defines POST as neither safe nor idempotent: a server may change state on every request, and repeating a request may repeat its effect. A POST response may only be cached when it carries explicit freshness information, and in practice Content Delivery Networks (CDNs) do not cache POSTs by default. Nearly every POST request you send will therefore bypass the edge and hit the target's origin server. This makes POST scraping slower (higher Time To First Byte) and much more likely to trigger IP bans if you exceed the origin's rate limits, which tend to be strict.
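The rule of thumb can be written down as a small check. This is a deliberately simplified sketch, not a full implementation of HTTP caching: it ignores status codes, Vary, and provider-specific CDN rules, and it only tests that a Content-Location header is present rather than that it matches the request URI. The function name is an assumption for illustration.

```python
def edge_cacheable(method: str, response_headers: dict) -> bool:
    """Rough guess at whether a shared cache may store this response.

    Simplified on purpose: ignores Vary, status codes, and
    CDN-specific rules, which differ between providers.
    """
    h = {k.lower(): v for k, v in response_headers.items()}
    directives = {d.strip().split("=")[0].lower()
                  for d in h.get("cache-control", "").split(",") if d.strip()}
    if directives & {"no-store", "private"}:
        return False
    explicit_freshness = bool(directives & {"max-age", "s-maxage"}) or "expires" in h
    if method.upper() in ("GET", "HEAD"):
        return True  # cacheable by default, even without explicit freshness
    if method.upper() == "POST":
        # POST responses need explicit freshness plus a Content-Location.
        return explicit_freshness and "content-location" in h
    return False

print(edge_cacheable("GET", {}))
print(edge_cacheable("POST", {}))
print(edge_cacheable("POST", {"Cache-Control": "max-age=60", "Content-Location": "/x"}))
```

A plain POST response with no caching headers fails the check, which is the common case on scraped APIs and the reason the request ends up at the origin.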
04 How DataFlirt handles it
We treat POST endpoints as first-class data sources. Our pipeline engineers reverse-engineer the required payload schemas, mapping out which fields are static, which are dynamic (like timestamps or nonces), and which control pagination. We then deploy stateful fetchers that acquire any required CSRF tokens through a preliminary GET request (a pre-flight in our terminology, not to be confused with a CORS preflight) before executing POSTs at high concurrency, throttled to respect origin database capacity.
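One way to approach the static-versus-dynamic mapping is to capture the same request several times and compare the payloads field by field. The sketch below is an illustrative assumption about how that could be done, not a description of DataFlirt's internal tooling; the sample payloads and the `ts` field are invented.

```python
def flatten(obj, prefix=""):
    """Flatten nested dicts into {"a.b.c": value} pairs."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat

def classify_fields(samples):
    """Label each field 'static' if identical in every sample, else 'dynamic'."""
    flats = [flatten(s) for s in samples]
    paths = sorted(set().union(*flats))
    result = {}
    for path in paths:
        values = [f.get(path) for f in flats]
        result[path] = "static" if all(v == values[0] for v in values) else "dynamic"
    return result

# Two invented captures of the same search request, a few seconds apart.
captured = [
    {"category": "industrial_motors", "ts": 1716100000,
     "pagination": {"cursor": "eyJvZmZzZXQiOjEwMH0="}},
    {"category": "industrial_motors", "ts": 1716100007,
     "pagination": {"cursor": "eyJvZmZzZXQiOjE1MH0="}},
]
print(classify_fields(captured))
```

Fields marked dynamic are the ones a fetcher has to regenerate or extract per request. Telling a timestamp from a nonce from a pagination cursor still takes a human look at the values.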
05 Did you know?
Many developers use POST requests for data retrieval (like complex search queries) simply because URLs have practical length limits. HTTP itself sets no maximum, but browsers, servers, and proxies do, and around 2,048 characters is a commonly cited safe ceiling. If a search filter requires passing an array of 500 product IDs, the frontend usually has to use a POST request. Discovering these "search-via-POST" endpoints is one of the most effective ways to extract bulk catalog data without paginating through hundreds of HTML pages.
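The arithmetic behind the 500-ID case is easy to check. The sketch below encodes 500 made-up product IDs as a GET query string and as a POST body and compares the URL lengths. The ID format, parameter names, and URLs are assumptions for illustration only.

```python
import json
from urllib.parse import urlencode

# 500 made-up product IDs, purely for illustration.
product_ids = [f"SKU-{n:06d}" for n in range(500)]

# As a GET: every ID becomes a repeated query parameter.
query = urlencode({"id": product_ids}, doseq=True)
get_url = "https://example.com/api/products?" + query

# As a POST: the URL stays short and the IDs travel in the body.
post_url = "https://example.com/api/products/search"
post_body = json.dumps({"ids": product_ids}).encode("utf-8")

print(len(get_url), len(post_url), len(post_body))
```

With ten-character IDs the GET URL runs to several thousand characters, well past the 2,048 figure, while the POST URL length does not depend on how many IDs are requested.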
// 03 — the payload model

How expensive is a POST?

POST requests bypass edge caches by design. This means higher latency and stricter rate limits. DataFlirt models origin load to prevent pipeline throttling and maintain stealth.

Origin Latency: $T_{\text{origin}} = T_{\text{network}} + T_{\text{compute}} + T_{\text{db}}$
POSTs force database lookups. TTFB is inherently higher than cached GETs. (Source: Network fundamentals)
WAF Risk Score: $R = \text{payload\_entropy} \times \text{header\_anomaly}$
Malformed JSON or missing Content-Type headers spike detection risk instantly. (Source: DataFlirt WAF evasion model)
DataFlirt Concurrency Budget: $C = \text{target\_rps} / T_{\text{origin}}$
We scale POST concurrency inversely to origin latency to avoid 429s. (Source: Internal SLO)
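A short worked example shows how the origin latency and concurrency budget formulas above behave when taken at face value. All input numbers are invented for illustration; none are measured values, and the function names are assumptions.

```python
def origin_latency(t_network: float, t_compute: float, t_db: float) -> float:
    """T_origin = T_network + T_compute + T_db, all in seconds."""
    return t_network + t_compute + t_db

def concurrency_budget(target_rps: float, t_origin: float) -> float:
    """C = target_rps / T_origin, as the formula above states it."""
    return target_rps / t_origin

# Illustrative numbers only: same network and compute cost, slower database.
t_fast = origin_latency(0.05, 0.10, 0.25)   # about 0.4 s
t_slow = origin_latency(0.05, 0.10, 0.65)   # about 0.8 s

print(round(concurrency_budget(12, t_fast), 1))
print(round(concurrency_budget(12, t_slow), 1))
```

Doubling the origin latency halves the budget, which is the inverse scaling the caption describes: a slower origin gets fewer simultaneous POSTs.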
// 04 — payload inspection

Intercepting an undocumented API.

A live trace of a scraper replicating a frontend search filter. The pipeline submits a JSON payload to a hidden POST endpoint, bypassing the HTML render entirely.

HTTP/2 · application/json · undocumented API
edge.dataflirt.io — live
CAPTURED
// outbound POST request
method: POST
path: "/api/v2/inventory/search"
content-type: "application/json"
x-csrf-token: "a9f8b7c6d5e4f3g2h1" // dynamically extracted

// request body (payload)
payload.category: "industrial_motors"
payload.filters.in_stock: true
payload.pagination.cursor: "eyJvZmZzZXQiOjEwMH0="

// edge WAF inspection
waf.content_length: 128 MATCH
waf.json_valid: true
waf.action: PASS // routed to origin

// origin response
status: 200 OK
response.records: 50
response.next_cursor: "eyJvZmZzZXQiOjE1MH0="
pipeline.state: cursor updated, queuing next POST
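
The cursor values in the trace are base64-encoded JSON: the request cursor decodes to an offset of 100 and the response cursor to an offset of 150, which matches the 50 records returned. The sketch below decodes them and follows `next_cursor` in a loop. The `post` callable and `fake_post` stand-in are assumptions so the example runs offline; a real pipeline would send the payload over HTTP instead.

```python
import base64
import json

def decode_cursor(cursor: str) -> dict:
    """Decode a base64-encoded JSON cursor such as the one in the trace."""
    return json.loads(base64.b64decode(cursor))

def encode_cursor(offset: int) -> str:
    """Inverse of decode_cursor, used only by the offline stand-in below."""
    raw = json.dumps({"offset": offset}, separators=(",", ":"))
    return base64.b64encode(raw.encode("utf-8")).decode("ascii")

def paginate(post, first_cursor: str, max_pages: int = 100):
    """Yield each page's records, following next_cursor until it runs out.

    `post` is any callable that takes a payload dict and returns the
    parsed JSON response; it stands in for a real HTTP client.
    """
    cursor = first_cursor
    for _ in range(max_pages):
        page = post({"category": "industrial_motors",
                     "filters": {"in_stock": True},
                     "pagination": {"cursor": cursor}})
        yield page["records"]
        cursor = page.get("next_cursor")
        if not cursor:
            break

def fake_post(payload: dict) -> dict:
    """Offline stand-in for the server: 50 records per page, stops at 250."""
    offset = decode_cursor(payload["pagination"]["cursor"])["offset"]
    nxt = offset + 50
    return {"records": list(range(offset, nxt)),
            "next_cursor": encode_cursor(nxt) if nxt < 250 else None}

pages = list(paginate(fake_post, "eyJvZmZzZXQiOjEwMH0="))
print(decode_cursor("eyJvZmZzZXQiOjEwMH0="), len(pages))
```

Not every API uses transparent cursors like this one. Many are opaque or signed, in which case the only safe move is to pass `next_cursor` back unchanged, as `paginate` does.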
// 05 — failure modes

Where POST requests break.

Ranked by share of extraction failures across DataFlirt's API scraping pipelines. POST requests are brittle because they require exact state replication and strict formatting.

PIPELINES MONITORED · 180+ API targets
FAILURE WINDOW · 30d trailing
UPDATED · 2026-05-19
01 CSRF Token Mismatch · Missing or expired anti-forgery tokens
02 Content-Type Errors · Sending JSON but declaring form-urlencoded
03 Missing Required Fields · Backend expects fields the frontend hides
04 Origin Rate Limiting · Uncached POSTs trigger 429s faster than GETs
05 WAF Payload Inspection · Blocked due to anomalous JSON key ordering
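
The first three failure modes above can often be caught before a request leaves the scraper. The following is a minimal sketch of such a check, assuming a JSON API that expects an `X-CSRF-Token` header. The header name, the function name, and the required-field set are hypothetical and would differ per target.

```python
import json

def check_post(headers: dict, body: bytes, required_fields: set) -> list:
    """Return a list of likely problems before the request is sent."""
    problems = []
    h = {k.lower(): v for k, v in headers.items()}
    content_type = h.get("content-type", "")

    # Hypothetical header name; real sites use many different ones.
    if not h.get("x-csrf-token"):
        problems.append("missing CSRF token")

    try:
        parsed = json.loads(body)
        is_json = isinstance(parsed, dict)
    except ValueError:
        parsed, is_json = {}, False

    if is_json and "json" not in content_type:
        problems.append("body is JSON but Content-Type is " + (content_type or "unset"))
    if not is_json and "json" in content_type:
        problems.append("Content-Type says JSON but body does not parse")

    missing = required_fields - set(parsed) if is_json else set()
    if missing:
        problems.append("missing fields: " + ", ".join(sorted(missing)))
    return problems

print(check_post({"Content-Type": "application/x-www-form-urlencoded"},
                 b'{"category": "x"}', {"category", "cursor"}))
```

Rate limiting and WAF payload inspection cannot be checked this way, because they depend on how the target responds rather than on the shape of a single request.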
// 06 — our stack

Reverse-engineer the payload, replicate the state.

Scraping a POST endpoint requires exact replication of the client's state. Missing a single anti-forgery token, sending the wrong Content-Type, or failing to stringify a nested JSON object correctly can trigger a 400 Bad Request or a silent WAF block. DataFlirt's pipeline engine automatically intercepts browser traffic, extracts the dynamic payload schema, and maps it to our distributed fetchers, turning brittle form submissions into stable, high-throughput API calls.

post-fetcher.config

Live configuration for a stateful POST pipeline targeting a B2B directory.

endpoint.url          /api/graphql
method                POST
headers.content       application/json
auth.csrf_strategy    pre-flight GET
payload.schema        validated
rate_limit.origin     12 req/s max
pipeline.status       active · 99.9% success
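
A cap like the `rate_limit.origin` value of 12 req/s can be enforced with a simple interval throttle. This is a sketch under stated assumptions, not DataFlirt's fetcher code: the `Throttle` class is hypothetical, it handles a single worker only, and the clock and sleep functions are injectable so the behaviour can be tested without real waiting.

```python
import time

class Throttle:
    """Space calls at least 1/max_rps seconds apart (single worker only)."""

    def __init__(self, max_rps: float, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / max_rps
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next request slot; return the seconds slept."""
        now = self.clock()
        delay = max(0.0, self.next_allowed - now)
        if delay:
            self.sleep(delay)
        # Reserve the following slot one interval after this one.
        self.next_allowed = max(now, self.next_allowed) + self.interval
        return delay

throttle = Throttle(max_rps=12)  # matches rate_limit.origin above
throttle.wait()                  # call once before each POST
```

A distributed pipeline would need a shared limiter rather than one per process, since twelve workers each allowing 12 req/s would exceed the cap twelvefold.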

// 07 — FAQ

Common questions.

About HTTP methods, undocumented APIs, payload formatting, and how DataFlirt scales POST-heavy pipelines.

What is the difference between GET and POST in scraping?
GET requests encode parameters in the URL (e.g., ?page=2&sort=price) and are meant to be idempotent and cacheable. POST requests send parameters in the request body. In scraping, finding a POST endpoint often means you've found the backend API, allowing you to request clean JSON data instead of parsing messy HTML.
How do you find hidden POST APIs on a website?
Open the browser's Network tab, filter by Fetch/XHR, and perform an action on the site (like applying a search filter or clicking 'Load More'). Look for POST requests. The payload tab will show you exactly what JSON or form data the frontend sent to the server to retrieve the data.
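Browsers can also export the Network tab as a HAR file, which is plain JSON, so the same search can be scripted. The sketch below lists the POST calls in a HAR export. The sample data is a tiny hand-written fragment rather than a real capture, and the function name is an assumption.

```python
import json

def post_requests_from_har(har_text: str) -> list:
    """List the POST calls recorded in a HAR export from the Network tab."""
    har = json.loads(har_text)
    found = []
    for entry in har["log"]["entries"]:
        request = entry["request"]
        if request["method"].upper() != "POST":
            continue
        post_data = request.get("postData", {})
        found.append({
            "url": request["url"],
            "content_type": post_data.get("mimeType", ""),
            "body": post_data.get("text", ""),
        })
    return found

# A tiny hand-written HAR fragment; a real export has many more fields.
sample_har = json.dumps({"log": {"entries": [
    {"request": {"method": "GET", "url": "https://example.com/"}},
    {"request": {"method": "POST",
                 "url": "https://example.com/api/v2/inventory/search",
                 "postData": {"mimeType": "application/json",
                              "text": '{"category": "industrial_motors"}'}}},
]}})

for call in post_requests_from_har(sample_har):
    print(call["url"], call["content_type"], call["body"])
```

Running this over a capture of a full browsing session gives a quick inventory of candidate endpoints and the payload each one expects.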
Why do my POST requests return 403 Forbidden or 400 Bad Request?
Usually, it's a state mismatch. A 400 typically points to the request itself: the wrong Content-Type header, or a JSON payload missing a required field that the server expects. A 403 typically points to authorization or bot defenses: a missing or expired CSRF token (which must be scraped from a prior GET request), or a WAF rule that fired because the header order looks like a script rather than a browser.
Can you cache POST requests?
Technically yes, but practically no. HTTP allows a POST response to be cached only when it carries explicit freshness information, and CDNs and edge caches (like Cloudflare or Fastly) do not cache POST requests by default because POST implies a state change. This means nearly every POST request you send hits the target's origin server, which is why rate limits on POST endpoints are typically much stricter than on GET endpoints.
How does DataFlirt handle CSRF tokens for POST pipelines?
We use a two-step fetch architecture. A lightweight pre-flight worker executes a GET request to the target page, extracts the dynamic CSRF token from the DOM or cookies, and passes it to the state store. The primary worker then injects that token into the headers or payload of the subsequent POST request, ensuring perfect state replication.
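The extract-then-inject step can be sketched with the standard library alone. This example assumes the token sits in a `<meta name="csrf-token">` tag and is sent back in an `X-CSRF-Token` header. Both names are common conventions but still assumptions here, and the HTML is a hand-written stand-in for the page a pre-flight GET would return.

```python
from html.parser import HTMLParser

class CsrfMetaParser(HTMLParser):
    """Pick up <meta name="csrf-token" content="..."> from a page."""

    def __init__(self):
        super().__init__()
        self.token = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name") == "csrf-token":
            self.token = attrs.get("content")

def extract_csrf(html: str):
    """Return the token from the page, or None if no matching tag exists."""
    parser = CsrfMetaParser()
    parser.feed(html)
    return parser.token

def with_csrf(headers: dict, token: str) -> dict:
    """Return a copy of the headers with the token injected."""
    return {**headers, "X-CSRF-Token": token}

# Stand-in for the HTML returned by the pre-flight GET.
page = '<html><head><meta name="csrf-token" content="a9f8b7c6d5e4f3g2h1"></head></html>'
token = extract_csrf(page)
print(with_csrf({"Content-Type": "application/json"}, token))
```

Tokens are usually tied to a session cookie, so the GET and the POST need to share the same cookie jar for the injected token to validate.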
Are GraphQL endpoints always POST requests?
Almost always. While GraphQL can technically operate over GET, the queries are usually too large and complex for URL parameters. Scraping a GraphQL endpoint involves sending a POST request with a JSON body containing the query and variables. DataFlirt pipelines routinely reverse-engineer these queries to extract exact data shapes without over-fetching.
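As a minimal sketch of that JSON body, the example below builds a GraphQL POST without sending it. The schema is hypothetical: the `companies` field, its arguments, and the endpoint URL are invented for illustration and do not describe any real API.

```python
import json
from urllib.request import Request

# Hypothetical schema: the field and variable names are invented.
QUERY = """
query Companies($first: Int!, $after: String) {
  companies(first: $first, after: $after) {
    nodes { name website }
    pageInfo { endCursor hasNextPage }
  }
}
"""

def graphql_request(url: str, query: str, variables: dict) -> Request:
    """Build (but do not send) a GraphQL POST with a JSON body."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})

req = graphql_request("https://example.com/api/graphql", QUERY,
                      {"first": 50, "after": None})
print(req.get_method(), json.loads(req.data)["variables"])
```

Keeping the query text fixed and changing only `variables` between requests mirrors how most frontends call GraphQL, which keeps the replayed traffic close to what the server normally sees.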

hello@dataflirt.com  ·  Bengaluru  ·  IST  ·  typical reply < 4h