Zwift Scraper: Virtual Cycling Events, Routes & Race Data Extraction

Data Dictionary

Every field we extract from zwift.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Events & Races objects from zwift.com. All fields typed and schema-versioned.

event_id, title, world, route, distance_km, elevation_m, category_enforcement, start_time, organizer, registered_count

"event_id": "349102",
"title": "Tour de Zwift Stage 2",
"world": "Watopia",
"distance_km": 42.5,
"elevation_m": 342,
"start_time": "2024-01-14T18:00:00Z",
"category_enforcement": true,
"registered_count": 1492
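As an illustration of what "typed" can mean downstream, the sample record above could be loaded into a typed structure like the one below. This is a minimal sketch, not the delivered schema: the `EventRecord` class and `parse_event` helper are hypothetical names, and the type chosen for each field is inferred from the sample values only.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class EventRecord:
    """Hypothetical typed view of the sample Events & Races record."""
    event_id: str
    title: str
    world: str
    distance_km: float
    elevation_m: int
    start_time: datetime
    category_enforcement: bool
    registered_count: int


def parse_event(raw: dict) -> EventRecord:
    """Coerce a raw JSON object into an EventRecord; a missing key raises KeyError."""
    return EventRecord(
        event_id=str(raw["event_id"]),
        title=raw["title"],
        world=raw["world"],
        distance_km=float(raw["distance_km"]),
        elevation_m=int(raw["elevation_m"]),
        # Normalise the trailing "Z" so fromisoformat() also works before Python 3.11.
        start_time=datetime.fromisoformat(raw["start_time"].replace("Z", "+00:00")),
        category_enforcement=bool(raw["category_enforcement"]),
        registered_count=int(raw["registered_count"]),
    )


# The sample record from the data dictionary above.
sample = {
    "event_id": "349102",
    "title": "Tour de Zwift Stage 2",
    "world": "Watopia",
    "distance_km": 42.5,
    "elevation_m": 342,
    "start_time": "2024-01-14T18:00:00Z",
    "category_enforcement": True,
    "registered_count": 1492,
}
event = parse_event(sample)
```

Fields listed in the dictionary but absent from the sample (route, organizer) are left out here because their types are not shown.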

Complete list of extractable fields for Routes & Worlds objects from zwift.com. All fields typed and schema-versioned.

route_id, name, world, leadin_distance, total_distance, total_elevation, difficulty_rating, badge_xp, zwift_insider_link, sprint_segments

"route_id": "watopia_figure_8",
"name": "Figure 8",
"world": "Watopia",
"total_distance": 29.8,
"total_elevation": 234,
"badge_xp": 580,
"difficulty_rating": 3.2,
"leadin_distance": 0.2

Complete list of extractable fields for ZwiftPower Results objects. All fields typed and schema-versioned.

zwptid, rider_name, category, position, time_seconds, gap_seconds, avg_watts, wkg, avg_hr, weight_kg

"zwptid": "849201",
"rider_name": "J. Doe",
"category": "B",
"position": 14,
"time_seconds": 3412,
"wkg": 3.8,
"avg_watts": 284,
"avg_hr": 165

Complete list of extractable fields for Segments objects from zwift.com. All fields typed and schema-versioned.

segment_id, name, type, distance_m, avg_gradient, max_gradient, world, route_associations, record_time

"segment_id": "alpe_du_zwift",
"name": "Alpe du Zwift",
"type": "KOM",
"distance_m": 12200,
"avg_gradient": 8.5,
"record_time": 2014,
"world": "Watopia",
"max_gradient": 14.2

Complete list of extractable fields for Clubs objects from zwift.com. All fields typed and schema-versioned.

club_id, name, description, member_count, public_status, event_count, created_at, owner_name, tags

"club_id": "99382",
"name": "DIRT",
"member_count": 14291,
"public_status": true,
"event_count": 42,
"tags": ["racing", "social"],
"owner_name": "M. Smith",
"created_at": "2018-04-12T00:00:00Z"

Capabilities

Extract every metric from Watopia to Makuri Islands

Our Zwift scraper handles the complex web of Zwift.com event listings and ZwiftPower race results. We parse route parameters, category enforcement rules, and historical results with precision.

Event Discovery

Scrape the full calendar of upcoming rides and races, including category limits and registered rider counts.

Route Topography

Extract distance, elevation gain, lead-in metrics, and badge XP for every defined route.

ZwiftPower Integration

Pull race results, category upgrades, and 20-minute power metrics directly from ZwiftPower.

Segment Analysis

Track KOM, Sprint, and lap segment metrics across all worlds and map expansions.

World Rotation Tracking

Monitor active worlds and schedule changes to align with event planning.

Club Intelligence

Extract public club rosters, event histories, and membership counts.

Equipment Metadata

Map frames, wheels, and aerodynamic ratings from public databases to rider profiles.

Category Enforcement

Track A/B/C/D pen limits and zFTP/zMAP thresholds for competitive integrity.

Real-Time Streaming

Poll event participant counts leading up to the start gun for accurate grid sizing.

Historical Race Mining

Archive past event results for longitudinal rider analysis and team scouting.

// engagement pipeline

From event calendar to warehouse record

Brief in. Clean data out.

Define Scope

Day 0

Provide event filters, club IDs, or route targets. We design the extraction schema together.

Pipeline Build

Days 2–4

We configure crawlers for Zwift APIs and ZwiftPower web interfaces, managing proxy rotation and session states.

Validation & QA

Days 4–6

Schema validation, null-rate checks on power metrics, and sample result sets before full launch.
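A null-rate check of the kind mentioned above can be sketched as follows. The function names, the 5% threshold, and the sample rows are all illustrative assumptions rather than the production QA rules; the field names come from the ZwiftPower Results dictionary.

```python
def null_rates(records, fields):
    """Return, per field, the fraction of records where it is missing or None."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for rec in records if rec.get(field) is None) / total
        for field in fields
    }


def failing_fields(records, fields, max_null_rate=0.05):
    """List the fields whose null rate exceeds the (assumed) threshold."""
    rates = null_rates(records, fields)
    return sorted(field for field, rate in rates.items() if rate > max_null_rate)


# Illustrative rows: one of four riders has no heart-rate reading.
rows = [
    {"zwptid": "1", "avg_watts": 284, "avg_hr": 165},
    {"zwptid": "2", "avg_watts": 301, "avg_hr": None},
    {"zwptid": "3", "avg_watts": 250, "avg_hr": 158},
    {"zwptid": "4", "avg_watts": 275, "avg_hr": 171},
]
rates = null_rates(rows, ["avg_watts", "avg_hr"])
flagged = failing_fields(rows, ["avg_watts", "avg_hr"])
```

Here `avg_hr` would be flagged, since a quarter of the rows lack it, while `avg_watts` passes.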

Delivery

ongoing

JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.

Under the hood

Navigating Zwift's fragmented data ecosystem

Zwift data is split between the main event portal, the companion app API, and ZwiftPower. We unify these streams into a single structured schema.

// fingerprinting

Identity rotation

TLS fingerprint: randomised

User-agent: rotated

IP pool: residential

Challenges blocked: 0

// pagination

Page coverage

48,291 pages queued (running)

// observability

Pipeline health

Uptime: 99.9%

p99 latency: 142 ms

Null rate: 0.3%

Multi-source unification

Merging Zwift event data with ZwiftPower results

Event metadata lives on Zwift.com, while detailed race telemetry lives on ZwiftPower. We map IDs across both platforms to deliver a unified record containing both the route topography and the final sprint watts.
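The ID mapping described above amounts to a join. The sketch below assumes each ZwiftPower result row carries an `event_id` that matches the Zwift.com event; that shared key is an assumption made for illustration, since the published field list for results does not show one, and the `unify` helper and the second rider are hypothetical.

```python
from collections import defaultdict


def unify(events, results):
    """Attach each event's result rows, ordered by finishing position."""
    by_event = defaultdict(list)
    for row in results:
        by_event[row["event_id"]].append(row)  # assumed shared key

    unified = []
    for event in events:
        rows = sorted(by_event.get(event["event_id"], []), key=lambda r: r["position"])
        unified.append({**event, "results": rows})
    return unified


events = [{"event_id": "349102", "title": "Tour de Zwift Stage 2", "distance_km": 42.5}]
results = [
    {"event_id": "349102", "rider_name": "J. Doe", "position": 14, "avg_watts": 284},
    {"event_id": "349102", "rider_name": "A. Rider", "position": 3, "avg_watts": 310},
]
unified = unify(events, results)
```

An event with no matching results keeps an empty `results` list, so downstream consumers see a consistent shape either way.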

API rate limiting

Managing throughput against undocumented endpoints

Zwift limits request velocity on its public event schedules. We distribute polling across residential proxies to maintain high-frequency updates without triggering automated bans.
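One ingredient of throughput management is client-side pacing. The sketch below shows a minimum-interval throttle; the `Throttle` class is a hypothetical helper, the interval is an arbitrary example, and nothing here reflects Zwift's actual limits, which are undocumented. The clock and sleep functions are injectable so the behaviour can be checked without waiting.

```python
import time


class Throttle:
    """Enforce a minimum gap between consecutive requests from one worker."""

    def __init__(self, min_interval_s, clock=time.monotonic, sleep=time.sleep):
        self.min_interval_s = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        waited = 0.0
        if self._last is not None:
            remaining = self.min_interval_s - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
                now = self._clock()
        self._last = now
        return waited


# Demonstration with a fake clock, so no real time passes.
fake_now = [0.0]
throttle = Throttle(
    2.0,
    clock=lambda: fake_now[0],
    sleep=lambda s: fake_now.__setitem__(0, fake_now[0] + s),
)
first = throttle.wait()   # nothing to wait for on the first call
fake_now[0] += 0.5        # pretend half a second of work happened
second = throttle.wait()  # sleeps for the rest of the interval
```

In a real crawler, one throttle per proxy or per endpoint would call `wait()` immediately before each request.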

Dynamic rendering

Executing Playwright for complex result tables

ZwiftPower relies heavily on client-side rendering for race results. We use full browser sessions to execute JavaScript, ensuring all power data and category modifications are captured accurately.

Schema stability

Adapting to category enforcement updates

Zwift frequently updates its zFTP and zMAP logic. Our extraction schemas are versioned and monitored for field drift, ensuring your warehouse tables do not break when the platform changes its rules.
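Field-drift monitoring can be as simple as comparing each incoming record's keys against the expected field set. The sketch below uses the Segments field list from the data dictionary; `detect_drift` is a hypothetical helper, and `surface_type` is an invented field used only to show what an unexpected addition looks like.

```python
# Expected fields for Segments objects, taken from the data dictionary.
SEGMENT_FIELDS = {
    "segment_id", "name", "type", "distance_m", "avg_gradient",
    "max_gradient", "world", "route_associations", "record_time",
}


def detect_drift(record, expected):
    """Report fields that disappeared from, or newly appeared in, a record."""
    keys = set(record)
    return {
        "missing": sorted(expected - keys),
        "unexpected": sorted(keys - expected),
    }


# A record that lost route_associations and gained an invented field.
incoming = {
    "segment_id": "alpe_du_zwift",
    "name": "Alpe du Zwift",
    "type": "KOM",
    "distance_m": 12200,
    "avg_gradient": 8.5,
    "max_gradient": 14.2,
    "world": "Watopia",
    "record_time": 2014,
    "surface_type": "tarmac",  # hypothetical new field
}
drift = detect_drift(incoming, SEGMENT_FIELDS)
```

A non-empty report would typically trigger an alert and a schema version review before anything reaches the warehouse.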

Change detection

Only pushing new race results

We hash race results and event schedules to detect modifications. Downstream pipelines only receive new events or updated rider placements, saving compute costs.
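Hash-based change detection can be sketched as below: each record is serialised canonically, hashed, and compared with the digest stored from the previous run. The helper names, the choice of SHA-256, and the in-memory `seen` dictionary are assumptions for illustration; a real pipeline would persist the digests in a store such as Redis or Postgres.

```python
import hashlib
import json


def fingerprint(record):
    """Hash a record's canonical JSON form so key order does not matter."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_records(records, seen, key="event_id"):
    """Return records that are new or modified, updating `seen` in place."""
    changed = []
    for record in records:
        digest = fingerprint(record)
        if seen.get(record[key]) != digest:
            seen[record[key]] = digest
            changed.append(record)
    return changed


seen = {}
run_1 = [
    {"event_id": "349102", "registered_count": 1492},
    {"event_id": "349103", "registered_count": 210},
]
first_push = changed_records(run_1, seen)   # both records are new

run_2 = [
    {"registered_count": 1510, "event_id": "349102"},  # count changed
    {"registered_count": 210, "event_id": "349103"},   # same data, new key order
]
second_push = changed_records(run_2, seen)  # only the modified record
```

Sorting keys before hashing is what keeps a mere reordering of fields from registering as a change.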

Applications

Who uses Zwift data

Teams across industries use zwift.com data to build competitive products and smarter operations.

Esports Analytics

Teams analyse competitor w/kg, sprint timing, and historical race performance to build race strategies.

Event Organizers

Track participant turnout, category distribution, and peak engagement times to optimise calendar slots.

Coaching Platforms

Ingest race results to adjust athlete training zones and track fitness progression over the indoor season.

Equipment Manufacturers

Correlate virtual frame and wheel choices with race outcomes in top-tier events to guide marketing.

Community Portals

Build independent leaderboards and route completion trackers for specific clubs or regional groups.

Academic Research

Analyse large-scale human performance data across diverse demographics using historical race results.

Technical Spec

Zwift scraper technical capabilities

Everything supported by our zwift.com scraper — rendered SPA elements, auth walls, rate-limit evasion and beyond.

ZwiftPower race results

Full result tables including w/kg, avg watts, and category placement

Supported

Event schedule polling

Continuous monitoring of the public event calendar

Supported

Route and segment topography

Distance, elevation, and segment markers for all worlds

Supported

Public club directories

Club names, member counts, and public event histories

Supported

Rider category history

Historical zFTP and zMAP metrics for registered riders

Supported

Automated proxy rotation

Distributed request routing to bypass rate limits

Supported

Change detection diffs

Hash-based diffs for event schedule modifications

Supported

Live race telemetry

Live X/Y coordinate tracking requires direct UDP packet sniffing or authenticated API access

Partial

Private user profiles

Weight and height data for non-ZwiftPower registered users is restricted by privacy settings

Partial

Infrastructure

Infrastructure powering the Zwift pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus

Scrapy + Playwright Stack

Scrapy handles crawl orchestration, deduplication, and retry logic. Playwright handles JavaScript rendering, cookie sessions, and interaction flows. Combined via scrapy-playwright middleware.

Residential Proxy Infrastructure

We maintain pools of residential ISP proxies across multiple regions. Rotation happens per request, with sticky sessions where required. IP reputation monitoring keeps blacklisted addresses out of the pool.

Cloud-Native Orchestration

Pipelines run on AWS Lambda (burst) and ECS (sustained). Airflow handles scheduling, dependency management, and SLA alerting. All state is stored in managed Postgres.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON

Newline-delimited or nested array structures

CSV

Flat file with typed columns for spreadsheet analysis

XLS

Legacy Excel format for offline reporting

Parquet

Columnar format for BigQuery, Snowflake, Athena

AWS S3

Direct bucket delivery compatible with any data lake

Webhook

HTTP POST per record for real-time downstream processing

API

REST endpoints for on-demand data retrieval

BigQuery

Streamed directly into your dataset with schema auto-detect

Snowflake

Stage and COPY INTO workflow for incremental updates

Postgres

Upsert into your existing schema with conflict resolution
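An upsert with conflict resolution might look like the sketch below. It runs against Python's built-in `sqlite3` module purely so the example is self-contained; SQLite and PostgreSQL both accept this `INSERT ... ON CONFLICT ... DO UPDATE` form, but a Postgres driver such as psycopg uses a different placeholder style. The `events` table layout and the updated count of 1510 are illustrative assumptions.

```python
import sqlite3

# Insert a row, or update the existing row when event_id already exists.
UPSERT_SQL = """
INSERT INTO events (event_id, title, registered_count)
VALUES (:event_id, :title, :registered_count)
ON CONFLICT (event_id) DO UPDATE SET
    title = excluded.title,
    registered_count = excluded.registered_count
"""


def upsert_events(conn, rows):
    """Apply all rows in one transaction; commit on success, roll back on error."""
    with conn:
        conn.executemany(UPSERT_SQL, rows)


conn = sqlite3.connect(":memory:")  # stand-in for the client's Postgres
conn.execute(
    "CREATE TABLE events ("
    "event_id TEXT PRIMARY KEY, title TEXT NOT NULL, registered_count INTEGER NOT NULL)"
)

upsert_events(conn, [
    {"event_id": "349102", "title": "Tour de Zwift Stage 2", "registered_count": 1492},
])
# A later poll sees more sign-ups for the same event (illustrative value).
upsert_events(conn, [
    {"event_id": "349102", "title": "Tour de Zwift Stage 2", "registered_count": 1510},
])
stored = conn.execute("SELECT event_id, registered_count FROM events").fetchall()
```

The second call updates the existing row instead of inserting a duplicate, so the table still holds a single record for the event.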

// faq

Common questions.

About zwift.com scraping, legality, and pipeline operations.

Is scraping Zwift legal?

Scraping publicly available event schedules and race results is generally permissible. DataFlirt targets only public, non-authenticated data. We do not extract private telemetry or violate user privacy settings. Clients should review Zwift's ToS and consult legal counsel for specific use cases.

Do you extract data from Zwift or ZwiftPower?

Both. We merge event metadata from Zwift's primary schedule with the detailed race results and power metrics hosted on ZwiftPower.

Can I scrape live race telemetry?

No. Live rider coordinates and instantaneous power output require authenticated API access or UDP packet interception. We extract post-race results and aggregate metrics.

How frequently can you poll the event schedule?

We can poll the event calendar at sub-15-minute intervals to capture late additions or rider registration surges prior to the start time.

Do you capture w/kg and zFTP data?

Yes. We extract all public performance metrics from ZwiftPower race results, including 20-minute power, w/kg, and category enforcement thresholds.

What is the minimum viable engagement?

Our packages start based on event volume or route coverage. Contact us with your specific data requirements for a scoped quote.

Zwift telemetry, at warehouse scale.

Tell us what to extract. We do the rest.

Data Extraction for Every Industry
