Zwift telemetry,
at warehouse scale.

We extract event schedules, route profiles, ZwiftPower race results, and club leaderboards. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.

Events tracked: 4,192 /day
Race results: 84.2K /run
Route metrics: 142 /sync
Active pipelines: 42
Uptime: 99.98%

Data Dictionary

Every field we extract from zwift.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Events & Races objects from zwift.com. All fields typed and schema-versioned.

Fields: event_id, title, world, route, distance_km, elevation_m, category_enforcement, start_time, organizer, registered_count

Sample record (Events & Races) ● 200 OK
{
  "event_id": "349102",
  "title": "Tour de Zwift Stage 2",
  "world": "Watopia",
  "distance_km": 42.5,
  "elevation_m": 342,
  "start_time": "2024-01-14T18:00:00Z",
  "category_enforcement": true,
  "registered_count": 1492
}

Complete list of extractable fields for Routes & Worlds objects from zwift.com. All fields typed and schema-versioned.

Fields: route_id, name, world, leadin_distance, total_distance, total_elevation, difficulty_rating, badge_xp, zwift_insider_link, sprint_segments

Sample record (Routes & Worlds) ● 200 OK
{
  "route_id": "watopia_figure_8",
  "name": "Figure 8",
  "world": "Watopia",
  "total_distance": 29.8,
  "total_elevation": 234,
  "badge_xp": 580,
  "difficulty_rating": 3.2,
  "leadin_distance": 0.2
}

Complete list of extractable fields for ZwiftPower Results objects from zwift.com. All fields typed and schema-versioned.

Fields: zwptid, rider_name, category, position, time_seconds, gap_seconds, avg_watts, wkg, avg_hr, weight_kg

Sample record (ZwiftPower Results) ● 200 OK
{
  "zwptid": "849201",
  "rider_name": "J. Doe",
  "category": "B",
  "position": 14,
  "time_seconds": 3412,
  "wkg": 3.8,
  "avg_watts": 284,
  "avg_hr": 165
}

Complete list of extractable fields for Segments objects from zwift.com. All fields typed and schema-versioned.

Fields: segment_id, name, type, distance_m, avg_gradient, max_gradient, world, route_associations, record_time

Sample record (Segments) ● 200 OK
{
  "segment_id": "alpe_du_zwift",
  "name": "Alpe du Zwift",
  "type": "KOM",
  "distance_m": 12200,
  "avg_gradient": 8.5,
  "record_time": 2014,
  "world": "Watopia",
  "max_gradient": 14.2
}

Complete list of extractable fields for Clubs objects from zwift.com. All fields typed and schema-versioned.

Fields: club_id, name, description, member_count, public_status, event_count, created_at, owner_name, tags

Sample record (Clubs) ● 200 OK
{
  "club_id": "99382",
  "name": "DIRT",
  "member_count": 14291,
  "public_status": true,
  "event_count": 42,
  "tags": ["racing", "social"],
  "owner_name": "M. Smith",
  "created_at": "2018-04-12T00:00:00Z"
}

Capabilities

Extract every metric from Watopia to Makuri Islands

Our Zwift scraper handles the complex web of Zwift.com event listings and ZwiftPower race telemetry. We parse route parameters, category enforcement rules, and historical results.

Event Discovery

Scrape the full calendar of upcoming rides and races, including category limits and registered rider counts.

Route Topography

Extract distance, elevation gain, lead-in metrics, and badge XP for every defined route.

ZwiftPower Integration

Pull race results, category upgrades, and 20-minute power metrics directly from ZwiftPower.

Segment Analysis

Track KOM, Sprint, and lap segment metrics across all worlds and map expansions.

World Rotation Tracking

Monitor active worlds and schedule changes to align with event planning.

Club Intelligence

Extract public club rosters, event histories, and membership counts.

Equipment Metadata

Map frames, wheels, and aerodynamic ratings from public databases to rider profiles.

Category Enforcement

Track A/B/C/D pen limits and zFTP/zMAP thresholds for competitive integrity.

Real-Time Streaming

Poll event participant counts leading up to the start gun for accurate grid sizing.

Historical Race Mining

Archive past event results for longitudinal rider analysis and team scouting.

// engagement pipeline

From event calendar to warehouse record

Brief in. Clean data out.

Define Scope
Day 0

Provide event filters, club IDs, or route targets. We design the extraction schema together.

Pipeline Build
Days 2–4

We configure crawlers for Zwift APIs and ZwiftPower web interfaces, and manage proxy rotation and session state.

Validation & QA
Days 4–6

Schema validation, null-rate checks on power metrics, and sample result sets before full launch.

Delivery
ongoing

JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.
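
The null-rate check named in the Validation & QA step can be illustrated with a minimal sketch. The field names come from the ZwiftPower Results sample; the function name `null_rates`, the extra sample rows, and the 5% threshold are assumptions for illustration, not our production values.

```python
# Power-related fields taken from the ZwiftPower Results field list.
POWER_FIELDS = ("avg_watts", "wkg", "avg_hr")

# Hypothetical threshold: flag any field missing in more than 5% of rows.
MAX_NULL_RATE = 0.05


def null_rates(rows, fields=POWER_FIELDS):
    """Return, per field, the share of rows where it is absent or None."""
    rows = list(rows)
    if not rows:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for row in rows if row.get(field) is None) / len(rows)
        for field in fields
    }


# Only the first row mirrors the published sample; the rest are made up.
sample = [
    {"zwptid": "849201", "avg_watts": 284, "wkg": 3.8, "avg_hr": 165},
    {"zwptid": "849202", "avg_watts": 301, "wkg": 4.1, "avg_hr": None},
    {"zwptid": "849203", "avg_watts": 256, "wkg": 3.4, "avg_hr": 158},
    {"zwptid": "849204", "avg_watts": 270, "wkg": 3.6, "avg_hr": 171},
]

rates = null_rates(sample)
failing = [field for field, rate in rates.items() if rate > MAX_NULL_RATE]
```

A check like this would typically run on the sample result set before full launch, so a field that is silently empty (heart rate is a common case, since not every rider pairs a monitor) is caught before it reaches the warehouse.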

Under the hood

Navigating Zwift's fragmented data ecosystem

Zwift data is split between the main event portal, the companion app API, and ZwiftPower. We unify these streams into a single structured schema.

pipeline-monitor · zwift.com · live ● active
// fingerprinting
Identity rotation
TLS fingerprint: randomised
User-agent: rotated
IP pool: residential
Challenges blocked: 0
// pagination
Page coverage
48,291 pages queued · running
// observability
Pipeline health
Uptime: 99.9%
p99 latency: 142ms
Null rate: 0.3%
Alerts: 2

Multi-source unification
Merging Zwift event data with ZwiftPower results

Event metadata lives on Zwift.com, while detailed race telemetry lives on ZwiftPower. We map IDs across both platforms to deliver a unified record containing both the route topography and the final sprint watts.
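
As a rough illustration of the ID mapping described above, the sketch below joins event metadata to result rows on a shared event key. It assumes each result row carries an `event_id` field, which the ZwiftPower sample record does not show; that key, the function name `unify`, and all rows other than the published samples are hypothetical.

```python
from collections import defaultdict


def unify(events, results):
    """Attach result rows to their event, ordered by finishing position."""
    by_event = defaultdict(list)
    for row in results:
        by_event[row["event_id"]].append(row)
    return [
        {
            **event,
            "results": sorted(
                by_event.get(event["event_id"], []),
                key=lambda row: row["position"],
            ),
        }
        for event in events
    ]


events = [
    {"event_id": "349102", "title": "Tour de Zwift Stage 2", "distance_km": 42.5},
]
results = [
    {"event_id": "349102", "zwptid": "849201", "position": 14, "avg_watts": 284},
    {"event_id": "349102", "zwptid": "849202", "position": 3, "avg_watts": 310},
    # No matching event: this row is left out of the unified output.
    {"event_id": "000000", "zwptid": "849203", "position": 1, "avg_watts": 330},
]

unified = unify(events, results)
```

Each unified record then carries both the route-side fields (distance, elevation) and the rider-side fields (position, watts) in one place.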

API rate limiting
Managing throughput against undocumented endpoints

Zwift limits request velocity on their public event schedules. We distribute polling across residential proxies to maintain high-frequency updates without triggering automated bans.
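
One common way to cap request velocity per proxy is a token bucket. The sketch below is a generic version of that technique, not a description of our production scheduler; the class name and the rate and capacity values are placeholders, not measured Zwift limits.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.tokens = float(capacity)
        self.updated = clock()

    def try_acquire(self):
        """Take one token if available; return False when the caller should wait."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Placeholder values: two requests per second, bursts of at most two.
bucket = TokenBucket(rate=2, capacity=2)
first_allowed = bucket.try_acquire()
```

Keeping one bucket per proxy (or per target host) means a burst on one route does not consume the budget of another.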

Dynamic rendering
Executing Playwright for complex result tables

ZwiftPower relies heavily on client-side rendering for race results. We use full browser sessions to execute JavaScript, ensuring all power data and category modifications are captured accurately.

Schema stability
Adapting to category enforcement updates

Zwift frequently updates their zFTP and zMAP logic. Our extraction schemas are versioned and monitored for field drift, ensuring your warehouse tables do not break when the platform changes its rules.
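
Field-drift monitoring can be as simple as comparing each record against a versioned schema. In the minimal sketch below, the Python types are inferred from the Events & Races sample record rather than taken from a published schema, and `EVENT_SCHEMA_V1` and `detect_drift` are hypothetical names.

```python
# Types inferred from the Events & Races sample record (an assumption).
EVENT_SCHEMA_V1 = {
    "event_id": str,
    "title": str,
    "world": str,
    "distance_km": float,
    "elevation_m": int,
    "start_time": str,
    "category_enforcement": bool,
    "registered_count": int,
}


def detect_drift(record, schema):
    """Return human-readable findings: missing, mistyped, or unexpected fields."""
    findings = []
    for field, expected in schema.items():
        if field not in record:
            findings.append(f"missing: {field}")
        elif not isinstance(record[field], expected):
            actual = type(record[field]).__name__
            findings.append(f"type: {field} expected {expected.__name__}, got {actual}")
    # Sorted so the report is deterministic across runs.
    for field in sorted(record.keys() - schema.keys()):
        findings.append(f"unexpected: {field}")
    return findings


record = {
    "event_id": "349102",
    "title": "Tour de Zwift Stage 2",
    "world": "Watopia",
    "distance_km": 42.5,
    "elevation_m": 342,
    "start_time": "2024-01-14T18:00:00Z",
    "category_enforcement": True,
    "registered_count": 1492,
}
findings = detect_drift(record, EVENT_SCHEMA_V1)
```

A non-empty findings list would raise an alert and hold the batch, rather than letting a renamed or retyped field land in a warehouse table.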

Change detection
Only pushing new race results

We hash race results and event schedules to detect modifications. Downstream pipelines only receive new events or updated rider placements, saving compute costs.
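
A minimal sketch of hash-based change detection, assuming records are JSON-serialisable dicts keyed by `event_id`. The function names and the in-memory `seen` store are illustrative; a real pipeline would persist the digests.

```python
import hashlib
import json


def record_hash(record):
    """Stable SHA-256 of a record, independent of key order."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def changed_records(records, seen, key="event_id"):
    """Return records that are new or modified, updating `seen` in place."""
    changed = []
    for record in records:
        digest = record_hash(record)
        if seen.get(record[key]) != digest:
            seen[record[key]] = digest
            changed.append(record)
    return changed


seen = {}
snapshot = [{"event_id": "349102", "registered_count": 1492}]
first_pass = changed_records(snapshot, seen)   # new record, so it is emitted
second_pass = changed_records(snapshot, seen)  # unchanged, so nothing is emitted
```

Serialising with sorted keys matters: without it, the same record scraped with a different field order would hash differently and be pushed downstream as a false change.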

Applications

Who uses Zwift data

Teams across industries use zwift.com data to build competitive products and smarter operations.

01
Esports Analytics

Teams analyse competitor w/kg, sprint timing, and historical race performance to build race strategies.

02
Event Organizers

Track participant turnout, category distribution, and peak engagement times to optimise calendar slots.

03
Coaching Platforms

Ingest race results to adjust athlete training zones and track fitness progression over the indoor season.

04
Equipment Manufacturers

Correlate virtual frame and wheel choices with race outcomes in top-tier events to guide marketing.

05
Community Portals

Build independent leaderboards and route completion trackers for specific clubs or regional groups.

06
Academic Research

Analyse large-scale human performance data across diverse demographics using historical race telemetry.

Why DataFlirt

"Zwift generates millions of data points every hour, but extracting historical race telemetry requires navigating a maze of undocumented APIs and legacy web portals."

Most teams struggle to unify Zwift's modern event schedule with ZwiftPower's legacy result structures. DataFlirt manages the proxy rotation, session handling, and schema normalisation required to turn virtual cycling into structured, queryable warehouse tables.

Technical Spec

Zwift scraper technical capabilities

Everything supported by our zwift.com scraper: rendered SPA elements, session handling, rate-limit management, and more.

ZwiftPower race results
Full result tables including w/kg, avg watts, and category placement
Supported
Event schedule polling
Continuous monitoring of the public event calendar
Supported
Route and segment topography
Distance, elevation, and segment markers for all worlds
Supported
Public club directories
Club names, member counts, and public event histories
Supported
Rider category history
Historical zFTP and zMAP metrics for registered riders
Supported
Automated proxy rotation
Distributed request routing to bypass rate limits
Supported
Change detection diffs
Hash-based diffs for event schedule modifications
Supported
Live race telemetry
Live X/Y coordinate tracking requires direct UDP packet sniffing or authenticated API access
Partial
Private user profiles
Weight and height data for non-ZwiftPower registered users is restricted by privacy settings
Partial
Infrastructure

Infrastructure powering the Zwift pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy · Playwright · Python 3.12 · Redis · PostgreSQL · Apache Airflow · AWS Lambda · S3 · CloudWatch · 2Captcha · CapSolver · Residential Proxies · Docker · Kubernetes · Grafana · Prometheus

Scrapy + Playwright Stack

Scrapy handles crawl orchestration, deduplication, and retry logic. Playwright handles JavaScript rendering, cookie sessions, and interaction flows. Combined via scrapy-playwright middleware.
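
For readers unfamiliar with scrapy-playwright, wiring it into a Scrapy project is mostly a settings change. The fragment below shows the settings that package documents; it is a generic starting point, not a copy of our project configuration.

```python
# settings.py fragment for a Scrapy project using scrapy-playwright.

# Route HTTP and HTTPS downloads through the Playwright-backed handler.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright requires the asyncio-based Twisted reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

# Browser engine to launch; chromium is the package default.
PLAYWRIGHT_BROWSER_TYPE = "chromium"

# Individual requests opt in to browser rendering with meta={"playwright": True};
# requests without that flag still go through Scrapy's normal downloader.
```

Because rendering is opt-in per request, only the pages that need JavaScript pay the cost of a browser session.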

Residential Proxy Infrastructure

We maintain pools of residential ISP proxies across multiple regions. Rotation happens per-request with sticky sessions where required. IP score monitoring prevents blacklisted pool contamination.
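
Per-request rotation with sticky sessions can be sketched as a round-robin picker that pins a proxy to a session ID on first use. The class name and proxy URLs below are hypothetical, and the sketch leaves out the IP score monitoring mentioned above.

```python
import itertools


class ProxyRotator:
    """Round-robin proxy selection with optional sticky sessions."""

    def __init__(self, proxies):
        self._cycle = itertools.cycle(proxies)
        self._sticky = {}  # session_id -> pinned proxy

    def pick(self, session_id=None):
        """Return the next proxy, or the pinned one for a known session."""
        if session_id is None:
            return next(self._cycle)
        if session_id not in self._sticky:
            self._sticky[session_id] = next(self._cycle)
        return self._sticky[session_id]


# Hypothetical pool; real pools would be loaded from configuration.
rotator = ProxyRotator([
    "http://proxy-a.example:8000",
    "http://proxy-b.example:8000",
    "http://proxy-c.example:8000",
])
stateless = rotator.pick()             # rotates on every call
pinned = rotator.pick("results-job")   # same proxy for the whole session
```

Sticky sessions are useful where a multi-page flow depends on cookies tied to one IP; stateless requests simply rotate.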

Cloud-Native Orchestration

Pipelines run on AWS Lambda (burst) and ECS (sustained). Airflow handles scheduling, dependency management, and SLA alerting. All state is stored in managed Postgres.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON
Newline-delimited or nested array structures
CSV
Flat file with typed columns for spreadsheet analysis
XLS
Legacy Excel format for offline reporting
Parquet
Columnar format for BigQuery, Snowflake, Athena
AWS S3
Direct bucket delivery compatible with any data lake
Webhook
HTTP POST per record for real-time downstream processing
API
REST endpoints for on-demand data retrieval
BigQuery
Streamed directly into your dataset with schema auto-detect
Snowflake
Stage and COPY INTO workflow for incremental updates
Postgres
Upsert into your existing schema with conflict resolution
// faq

Common questions.

About zwift.com scraping, legality, and pipeline operations.

Ask us directly →
Is scraping Zwift legal?

Scraping publicly available event schedules and race results is generally permissible. DataFlirt targets only public, non-authenticated data. We do not extract private telemetry or violate user privacy settings. Clients should review Zwift's ToS and consult legal counsel for specific use cases.

Do you extract data from Zwift or ZwiftPower?

Both. We merge event metadata from Zwift's primary schedule with the detailed race results and power metrics hosted on ZwiftPower.

Can I scrape live race telemetry?

No. Live rider coordinates and instantaneous power output require authenticated API access or UDP packet interception. We extract post-race results and aggregate metrics.

How frequently can you poll the event schedule?

We can poll the event calendar at sub-15-minute intervals to capture late additions or rider registration surges prior to the start time.

Do you capture w/kg and zFTP data?

Yes. We extract all public performance metrics from ZwiftPower race results, including 20-minute power, w/kg, and category enforcement thresholds.

What is the minimum viable engagement?

Our packages start based on event volume or route coverage. Contact us with your specific data requirements for a scoped quote.

$ dataflirt scope --new-project --source=zwift.com ready

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. From continuous event monitoring to historical race analysis, we scope, build, and operate the pipeline. Tell us what you need.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h