
United flight data, at warehouse scale.

We extract route schedules, dynamic pricing signals, fare class availability, and aircraft configurations from united.com. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.

Flights extracted: 412K/day
Fare updates: 1.8M/24h
Seat maps parsed: 89K/run
Active pipelines: 64
Uptime: 99.94%

Data Dictionary

Every field we extract from united.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Flight Schedules objects from united.com. All fields typed and schema-versioned.

flight_number, origin_airport, destination_airport, departure_time_local, arrival_time_local, flight_duration_minutes, aircraft_type, operating_carrier, stop_count, layover_details, distance_miles
flight_schedules
● 200 OK
{
  "flight_number": "UA412",
  "origin_airport": "SFO",
  "destination_airport": "EWR",
  "departure_time_local": "2026-08-14T08:30:00",
  "flight_duration_minutes": 325,
  "aircraft_type": "Boeing 777-200",
  "operating_carrier": "United Airlines",
  "stop_count": 0
}
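To illustrate what "typed" can mean on the consuming side, here is a minimal sketch that coerces the flight_schedules sample above into a Python dataclass. The class name `FlightSchedule` and the helper `parse_flight_schedule` are hypothetical names chosen for this example, not part of any delivered client library, and only the fields shown in the sample are covered.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class FlightSchedule:
    """Hypothetical typed view of one flight_schedules record."""
    flight_number: str
    origin_airport: str
    destination_airport: str
    departure_time_local: datetime  # naive datetime, local to the origin airport
    flight_duration_minutes: int
    aircraft_type: str
    operating_carrier: str
    stop_count: int


def parse_flight_schedule(raw: dict) -> FlightSchedule:
    """Coerce a raw record into typed fields; raises KeyError or ValueError on bad input."""
    return FlightSchedule(
        flight_number=str(raw["flight_number"]),
        origin_airport=str(raw["origin_airport"]).upper(),
        destination_airport=str(raw["destination_airport"]).upper(),
        departure_time_local=datetime.fromisoformat(raw["departure_time_local"]),
        flight_duration_minutes=int(raw["flight_duration_minutes"]),
        aircraft_type=str(raw["aircraft_type"]),
        operating_carrier=str(raw["operating_carrier"]),
        stop_count=int(raw["stop_count"]),
    )


# Values copied from the sample record in this section.
sample = {
    "flight_number": "UA412",
    "origin_airport": "SFO",
    "destination_airport": "EWR",
    "departure_time_local": "2026-08-14T08:30:00",
    "flight_duration_minutes": 325,
    "aircraft_type": "Boeing 777-200",
    "operating_carrier": "United Airlines",
    "stop_count": 0,
}
record = parse_flight_schedule(sample)
```

Failing loudly on a missing or malformed field keeps schema drift visible at load time rather than in a downstream query.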

Complete list of extractable fields for Pricing & Fares objects from united.com. All fields typed and schema-versioned.

flight_number, search_date, departure_date, basic_economy_price, standard_economy_price, premium_plus_price, polaris_business_price, currency, tax_breakdown, fare_basis_code, refundable_flag
pricing_fares
● 200 OK
{
  "flight_number": "UA412",
  "departure_date": "2026-08-14",
  "basic_economy_price": 249.0,
  "standard_economy_price": 299.0,
  "polaris_business_price": 1249.0,
  "currency": "USD",
  "fare_basis_code": "KAA2AQEN",
  "refundable_flag": false
}

Complete list of extractable fields for Seat Availability objects from united.com. All fields typed and schema-versioned.

flight_number, departure_date, total_capacity, economy_seats_available, premium_plus_seats_available, polaris_seats_available, exit_row_premium_fee, blocked_seats_count, seat_map_timestamp
seat_availability
● 200 OK
{
  "flight_number": "UA412",
  "departure_date": "2026-08-14",
  "total_capacity": 276,
  "economy_seats_available": 42,
  "polaris_seats_available": 4,
  "exit_row_premium_fee": 89.0,
  "blocked_seats_count": 12,
  "seat_map_timestamp": "2026-07-01T14:22:00Z"
}

Complete list of extractable fields for MileagePlus Rewards objects from united.com. All fields typed and schema-versioned.

flight_number, origin, destination, departure_date, saver_award_miles_economy, everyday_award_miles_economy, saver_award_miles_polaris, taxes_fees_cash, currency, mixed_cabin_flag, award_inventory_status
mileageplus_rewards
● 200 OK
{
  "flight_number": "UA412",
  "origin": "SFO",
  "destination": "EWR",
  "saver_award_miles_economy": 15000,
  "everyday_award_miles_economy": 32500,
  "saver_award_miles_polaris": 60000,
  "taxes_fees_cash": 5.6,
  "currency": "USD",
  "mixed_cabin_flag": false
}

Complete list of extractable fields for Flight Status objects from united.com. All fields typed and schema-versioned.

flight_number, flight_date, scheduled_departure, estimated_departure, actual_departure, status_code, gate_origin, terminal_origin, delay_minutes, cancellation_reason
flight_status
● 200 OK
{
  "flight_number": "UA412",
  "flight_date": "2026-08-14",
  "scheduled_departure": "2026-08-14T08:30:00",
  "estimated_departure": "2026-08-14T09:15:00",
  "status_code": "Delayed",
  "gate_origin": "G3",
  "terminal_origin": "3",
  "delay_minutes": 45
}

Capabilities

Everything you need from United, nothing you do not

Our United scraper navigates complex booking flows, dynamic inventory loading, and heavy edge protection to deliver accurate flight and pricing data.

Full Route Schedules

Extract origin, destination, layovers, flight duration, and operating carrier details for any specified route network.

Dynamic Fare Tracking

Capture real-time pricing across all cabins, from Basic Economy up to Polaris Business, including tax breakdowns.

MileagePlus Award Pricing

Track Saver versus Everyday award availability and monitor mileage requirements alongside cash copays.

Seat Map and Inventory

Parse seat maps to calculate remaining available seats per cabin, blocked inventory, and premium seating fees.

Aircraft Intelligence

Extract equipment type, tail number, WiFi availability, and cabin configuration for specific flight legs.

Multi-Currency Support

Extract point-of-sale specific pricing in USD, EUR, GBP, INR, and other supported currencies.

Flight Status and Punctuality

Monitor real-time delays, gate changes, and cancellations to build historical punctuality datasets.

Ancillary Fees

Capture checked baggage fee matrices, seat selection costs, and priority boarding upgrade prices.

Scheduled and Streaming Modes

Run one-off bulk route exports or configure continuous pipelines at hourly or daily cadences.

// engagement pipeline

From route list to warehouse record

Brief in. Clean data out.

Define Scope
Day 0

Provide origin-destination pairs, travel dates, or hub codes. We design the extraction schema together.

Pipeline Build
Days 2–4

We configure Scrapy and Playwright crawlers, proxy rotation, session management, and edge protection bypass for united.com.

Validation & QA
Days 4–6

Schema validation, null-rate checks, price-outlier detection, and sample route checks before full launch.
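As a rough illustration of two of these checks, the sketch below computes per-field null rates and flags price outliers with a median-based (MAD) modified z-score. The function names, the choice of MAD, and the 3.5 threshold are assumptions made for this example; the checks actually run in production may differ.

```python
from statistics import median


def null_rates(records, fields):
    """Fraction of records in which each field is missing or None."""
    total = len(records)
    return {
        field: (sum(1 for r in records if r.get(field) is None) / total) if total else 0.0
        for field in fields
    }


def price_outliers(records, field, threshold=3.5):
    """Return records whose price sits far from the median (modified z-score)."""
    prices = [r[field] for r in records if r.get(field) is not None]
    if len(prices) < 3:
        return []  # too few points to judge
    med = median(prices)
    mad = median(abs(p - med) for p in prices)
    if mad == 0:
        return []  # all prices (nearly) identical; nothing to flag
    return [
        r for r in records
        if r.get(field) is not None
        and abs(0.6745 * (r[field] - med) / mad) > threshold
    ]
```

A median-based score is used here because a single mis-parsed fare (for example a price scaled by ten) would distort a mean and standard deviation enough to hide itself.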

Delivery
Ongoing

JSON, CSV, or Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on the agreed cadence.

Under the hood

How our United pipeline handles the hard parts

Airlines invest heavily in scraping detection to protect yield management strategies. Here is how we stay resilient.

pipeline-monitor · united.com · live ● active
// fingerprinting
Identity rotation
TLS fingerprint: randomised
User-agent: rotated
IP pool: residential
Challenges blocked: 0
// pagination
Page coverage
48,291 pages queued (running)
// observability
Pipeline health
Uptime: 99.9%
p99 latency: 142ms
Null rate: 0.3%
Alerts: 2

Anti-bot layer
Edge protection bypass

United uses aggressive edge protection to block automated searches. Our crawlers use residential ISP proxies with realistic browser fingerprints and full cookie session management to bypass these challenges.

Complex state management
Multi-step booking flows

Flight searches require maintaining strict cookie state across multiple redirects. We handle the complete session lifecycle to ensure pricing data loads correctly without triggering security resets.

Dynamic rendering
Asynchronous fare loading

United loads fare matrices asynchronously via XHR after the initial page load. We use Playwright to intercept these specific network requests, extracting clean JSON payloads directly from the backend.
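A minimal sketch of this interception pattern with Playwright's async API follows. The path fragment in `FARE_PATH_HINT` is purely hypothetical (the real endpoint is not documented here), and the function names are illustrative; the example only shows the general shape of waiting for one matching JSON response while the page loads.

```python
from urllib.parse import urlparse

# Hypothetical path fragment: an assumption for this sketch, not a documented endpoint.
FARE_PATH_HINT = "/api/flight/"


def looks_like_fare_response(url: str, content_type: str) -> bool:
    """Pure predicate: a JSON response whose path contains the assumed hint."""
    return FARE_PATH_HINT in urlparse(url).path and "application/json" in content_type.lower()


async def capture_fare_payload(search_url: str, timeout_ms: int = 30_000) -> dict:
    """Load a search page and return the first matching XHR body as parsed JSON."""
    # Imported lazily so the predicate above stays usable without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch(headless=True)
        try:
            page = await browser.new_page()
            # Start waiting before navigation so the response cannot be missed.
            async with page.expect_response(
                lambda r: looks_like_fare_response(r.url, r.headers.get("content-type", "")),
                timeout=timeout_ms,
            ) as response_info:
                await page.goto(search_url)
            response = await response_info.value
            return await response.json()
        finally:
            await browser.close()
```

The coroutine would be driven with `asyncio.run(capture_fare_payload(url))`. Reading the JSON body directly avoids parsing rendered HTML, which tends to change more often than the payload behind it.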

Schema stability
Resilient DOM selectors

The United booking engine updates frequently. Our selector strategy uses multiple fallback chains so a minor layout change does not break your data pipeline overnight.
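The idea of a fallback chain can be sketched with the standard library alone. The selectors and markup below are invented for illustration, and the example uses `xml.etree.ElementTree` on well-formed snippets; a production crawler would more likely run the same logic against a real HTML parser or Playwright locators.

```python
import xml.etree.ElementTree as ET

# Hypothetical selectors, ordered from most to least specific.
PRICE_SELECTORS = [
    ".//span[@itemprop='price']",
    ".//span[@class='price']",
    ".//div[@class='fare']/span",
]


def first_match(root, selectors):
    """Return (selector, text) for the first selector yielding non-empty text, else None."""
    for selector in selectors:
        node = root.find(selector)
        if node is not None and node.text and node.text.strip():
            return selector, node.text.strip()
    return None


# Two invented layouts: the second has lost the attribute the primary selector relies on.
old_layout = ET.fromstring("<div><span itemprop='price'>$249</span></div>")
new_layout = ET.fromstring("<div><div class='fare'><span>$249</span></div></div>")

print(first_match(old_layout, PRICE_SELECTORS))
print(first_match(new_layout, PRICE_SELECTORS))
```

Returning the selector that matched, not just the value, makes it possible to alert when the primary selector stops firing even though extraction still succeeds.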

IP reputation
High-volume search distribution

High-frequency flight searches from a single IP trigger rate limits instantly. We distribute requests across thousands of residential IPs to maintain high throughput without burning proxy nodes.

Applications

Who uses United data and how

Teams across industries use united.com data to build competitive products and smarter operations.

01. Price Intelligence and Parity

Online Travel Agencies monitor direct-channel pricing to ensure parity and optimise their own commission structures.

02. Yield Management

Competitor airlines track United fare class inventory and pricing adjustments to calibrate their own revenue management algorithms.

03. Travel Aggregation

Metasearch engines populate their cache with high-frequency pricing updates to improve user search latency.

04. Corporate Travel Auditing

Enterprise travel managers validate negotiated corporate rates against public fares to ensure contract compliance.

05. Loyalty Program Analysis

Analysts track MileagePlus devaluation trends and Saver award availability to assess program liability and consumer value.

06. Route Network Planning

Aviation consultants analyse frequency, equipment deployment, and capacity on specific hubs to identify market opportunities.

Why DataFlirt

"Airline pricing is the original dynamic market. United adjusts fares and inventory thousands of times daily, data that remains invisible without automated, high-frequency extraction."

Extracting flight data from legacy carriers requires navigating aggressive edge protection, complex multi-step booking flows, and heavy asynchronous rendering. DataFlirt manages the proxy rotation, session state, and schema maintenance so your analysts can focus on yield optimisation rather than bot mitigation.

Technical Spec

United scraper technical capabilities

Everything supported by our united.com scraper — rendered SPA elements, auth walls, rate-limit evasion and beyond.

Capability | Details | Status
JavaScript rendering | Full Playwright sessions required for asynchronous fare matrices | Supported
Edge protection bypass | Automated solver integration for aggressive bot mitigation | Supported
Residential proxy rotation | ISP-grade residential IPs from US pools rotated per session | Supported
Multi-step search flows | Maintains cookie state across complex search and filter parameters | Supported
MileagePlus Award pricing | Extracts both cash and mileage requirements for award flights | Supported
Seat map parsing | Calculates available versus blocked seats per cabin class | Supported
Change detection | Hash-based diffing to emit only records with changed fares | Supported
Webhook delivery | HTTP POST per record for real-time pricing workflows | Supported
MileagePlus member profiles | Requires authenticated user sessions and violates terms of service | Partial
Corporate negotiated fares | Requires specific corporate login credentials and SSO bypass | Partial

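The hash-based change detection listed above can be sketched as follows. This is one plausible reading of the technique, not the production implementation: the key of (flight_number, departure_date), the choice to ignore `search_date`, and the in-memory `seen` dict (standing in for a store such as Redis) are all assumptions.

```python
import hashlib
import json


def record_key(record):
    """Identity of a fare record; assumed here to be flight plus departure date."""
    return (record["flight_number"], record["departure_date"])


def record_hash(record, ignore=("search_date",)):
    """Stable SHA-256 over the record's content, skipping volatile fields."""
    payload = {k: v for k, v in record.items() if k not in ignore}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_records(records, seen):
    """Yield records that are new or whose hash differs; updates `seen` in place."""
    for record in records:
        key, digest = record_key(record), record_hash(record)
        if seen.get(key) != digest:
            seen[key] = digest
            yield record
```

Serialising with sorted keys keeps the hash independent of field order, and excluding the search timestamp prevents every re-crawl from looking like a change.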
Infrastructure

Infrastructure powering the United pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus

Scrapy + Playwright Stack

Scrapy handles crawl orchestration and retry logic. Playwright handles JavaScript rendering, cookie sessions, and interaction flows for complex flight searches.

Residential Proxy Infrastructure

We maintain pools of residential ISP proxies across US regions. Rotation happens per-session to maintain state while avoiding rate limits and IP bans.
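Per-session rotation can be pictured as a small assignment table: a session keeps one proxy for its whole lifetime, and each new session takes the next proxy in the pool. The class below is a simplified sketch under that assumption; the class name, the round-robin policy, and the example proxy URLs are hypothetical.

```python
class StickyProxyPool:
    """Pins each session to one proxy; new sessions are assigned round-robin."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("proxy pool must not be empty")
        self._proxies = list(proxies)
        self._assigned = {}
        self._next = 0

    def proxy_for(self, session_id):
        """Return the proxy pinned to this session, assigning one on first use."""
        if session_id not in self._assigned:
            self._assigned[session_id] = self._proxies[self._next % len(self._proxies)]
            self._next += 1
        return self._assigned[session_id]

    def release(self, session_id):
        """Forget a finished session so its id can be reassigned later."""
        self._assigned.pop(session_id, None)


# Placeholder proxy URLs for illustration only.
pool = StickyProxyPool(["http://proxy-a.example:8000", "http://proxy-b.example:8000"])
```

Keeping the proxy fixed within a session matters because cookies issued to one IP are often invalidated when subsequent requests arrive from another.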

Cloud-Native Orchestration

Pipelines run on AWS Lambda and ECS. Airflow handles scheduling, dependency management, and SLA alerting. All state is stored in managed Postgres.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON: Newline-delimited or nested array format
CSV: Flat file with typed columns for tabular analysis
XLS: Excel compatible format for business users
Parquet: Columnar format for BigQuery, Snowflake, and Athena
AWS S3: Direct bucket delivery compatible with any data lake
Webhook: HTTP POST per record for real-time downstream processing
API: REST endpoint to query your extracted datasets
BigQuery: Streamed directly into your dataset with schema auto-detect
Snowflake: Stage and COPY INTO workflow for incremental updates
PostgreSQL: Upsert into your existing schema with conflict resolution
// faq

Common questions.

About united.com scraping, legality, and pipeline operations.

Is scraping united.com legal?

Scraping publicly available information from united.com is generally permissible under applicable law. DataFlirt targets only public, non-authenticated flight schedules, pricing, and seat availability. We do not extract personal data or circumvent authentication walls. Clients should review United's terms of service and consult legal counsel for specific use cases.

How do you handle United's bot protection?

We use residential ISP proxies, full Playwright browser sessions with realistic fingerprints, and request timing modelled on human behaviour. We monitor for blocking patterns in real time and trigger pool rotation automatically.

Can you track MileagePlus award availability?

Yes. We track Saver and Everyday award availability, extracting the required mileage and the associated cash copay for taxes and fees.

How fresh is the pricing data?

Real-time streaming pipelines achieve sub-30-minute latency for pricing signals on a defined route list. Full network refreshes at daily cadence complete within a defined execution window.

Do you extract seat maps?

Yes. We parse the seat map payload to calculate available seats, blocked seats, and premium seating fees for specific flights.

What is the minimum viable engagement?

Our packages start at a defined route list, typically 500 to 10,000 origin-destination pairs, with daily delivery. We price based on volume and delivery frequency.

Can I request a sample dataset before committing?

Yes. We provide a sample run of up to 50 routes as part of the pre-engagement scoping process so you can validate schema fit and data quality.

$ dataflirt scope --new-project --source=united.com ready

Tell us what to extract. We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two weeks. Whether you need a one-off route schedule dump or a continuous price-monitoring feed across thousands of flights, we scope, build, and operate the pipeline.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h