We extract player leaderboards, advanced metrics (fWAR, wRC+), ZiPS/Steamer projections, and minor league stats from FanGraphs. Delivered as clean JSON, CSV, or Parquet to S3 or BigQuery on your daily cadence.
Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.
Complete list of extractable fields for Batting Leaderboards objects from fangraphs.com. All fields typed and schema-versioned.
"player_id": "10155", "name": "Mike Trout", "team": "LAA", "woba": 0.395, "wrc_plus": 155, "fwar": 6.4, "strikeout_pct": 0.231
| # | player_id | name | team | games | plate_appearances | home_runs |
|---|---|---|---|---|---|---|
| 1 | | | | | | |
| 2 | | | | | | |
| 3 | | | | | | |
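
As a rough illustration of what "typed" means in practice, the sample batting record above could be loaded into a Python dataclass along these lines. This is a minimal sketch, not the delivered schema: the `BattingRecord` class and `parse_batting_record` function are hypothetical names, and the field types are inferred from the sample values.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class BattingRecord:
    """Hypothetical typed view of one batting leaderboard row."""
    player_id: str
    name: str
    team: str
    woba: float
    wrc_plus: int
    fwar: float
    strikeout_pct: float


def parse_batting_record(raw: str) -> BattingRecord:
    """Parse one JSON object and coerce each field to its expected type."""
    data = json.loads(raw)
    return BattingRecord(
        player_id=str(data["player_id"]),
        name=str(data["name"]),
        team=str(data["team"]),
        woba=float(data["woba"]),
        wrc_plus=int(data["wrc_plus"]),
        fwar=float(data["fwar"]),
        strikeout_pct=float(data["strikeout_pct"]),
    )


# The sample record shown above, wrapped in braces to form valid JSON.
sample = (
    '{"player_id": "10155", "name": "Mike Trout", "team": "LAA", '
    '"woba": 0.395, "wrc_plus": 155, "fwar": 6.4, "strikeout_pct": 0.231}'
)
record = parse_batting_record(sample)
```

A missing key raises `KeyError` and a non-numeric value raises `ValueError`, so malformed rows fail loudly rather than passing through silently.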
Complete list of extractable fields for Pitching Stats objects from fangraphs.com. All fields typed and schema-versioned.
"player_id": "19361", "name": "Corbin Burnes", "era": 2.89, "fip": 3.12, "xfip": 3.25, "k_per_9": 10.4, "fwar": 5.1
| # | player_id | name | team | innings_pitched | era | fip |
|---|---|---|---|---|---|---|
| 1 | | | | | | |
| 2 | | | | | | |
| 3 | | | | | | |
Complete list of extractable fields for Projections objects from fangraphs.com. All fields typed and schema-versioned.
"player_id": "15640", "name": "Aaron Judge", "projection_system": "ZiPS", "projected_hr": 42, "projected_wrc_plus": 162, "projected_zips_war": 7.1, "updated_at": "2024-03-15T08:00:00Z"
| # | player_id | name | projection_system | year | projected_pa | projected_hr |
|---|---|---|---|---|---|---|
| 1 | | | | | | |
| 2 | | | | | | |
| 3 | | | | | | |
Complete list of extractable fields for RosterResource objects from fangraphs.com. All fields typed and schema-versioned.
"team_id": "NYY", "position": "CF", "player_name": "Aaron Judge", "roster_status": "Active 26-Man", "minor_league_option": false, "salary_estimate": 40000000
| # | team_id | team_name | position | player_id | player_name | roster_status |
|---|---|---|---|---|---|---|
| 1 | | | | | | |
| 2 | | | | | | |
| 3 | | | | | | |
Complete list of extractable fields for Prospect Board objects from fangraphs.com. All fields typed and schema-versioned.
"name": "Jackson Holliday", "organization": "BAL", "future_value": 70, "scouting_hit": 60, "scouting_game_power": 55, "risk": "Low", "eta": "2024"
| # | prospect_id | name | organization | current_level | future_value | scouting_hit |
|---|---|---|---|---|---|---|
| 1 | | | | | | |
| 2 | | | | | | |
| 3 | | | | | | |
Our FanGraphs scraper handles every layer of the platform: advanced leaderboards, minor league splits, RosterResource depth charts, and daily projection updates.
Extract wRC+, fWAR, xFIP, SIERA, and hundreds of other advanced metrics across standard, advanced, and batted ball dashboards.
Automated daily extraction of batting, pitching, and fielding leaderboards, run as soon as FanGraphs' overnight statistical processing completes.
Capture ZiPS, Steamer, ATC, and THE BAT projections for upcoming seasons and rest-of-season forecasts.
Monitor depth charts, payroll estimates, minor league options, and Rule 5 eligibility across all 30 MLB organisations.
Extract performance data from Rookie complex leagues up to Triple-A, including translated metrics.
Capture O-Swing%, Z-Contact%, pitch velocity, and pitch value metrics from the Pitch Info datasets.
Extract complete season-by-season player data back to 1871 for long-term sabermetric research.
Pull platoon splits, home/away performance, and high-leverage situation data for granular analysis.
Extract scouting grades (hit, power, speed) and Future Value (FV) scores directly from THE BOARD.
We map FanGraphs fg_id to MLBAM, Retrosheet, and Baseball-Reference IDs for immediate database joins.
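To show how an ID crosswalk makes joins straightforward, here is a minimal sketch in plain Python. It assumes the `player_id` field in extracted rows holds the FanGraphs `fg_id`; the `CROSSWALK` structure, its key names, and the `attach_ids` helper are hypothetical, and the ID values are shown for illustration only and should be checked against a maintained crosswalk.

```python
# Hypothetical crosswalk keyed by FanGraphs ID (values are illustrative).
CROSSWALK = {
    "10155": {
        "mlbam_id": "545361",
        "retrosheet_id": "troum001",
        "bbref_id": "troutmi01",
    },
}


def attach_ids(row: dict, crosswalk: dict) -> dict:
    """Return a copy of the row with external IDs attached when a match exists."""
    ids = crosswalk.get(str(row["player_id"]))
    merged = dict(row)
    # Flag unmatched rows explicitly so downstream joins can audit coverage.
    merged["id_matched"] = ids is not None
    if ids is not None:
        merged.update(ids)
    return merged


matched = attach_ids({"player_id": "10155", "name": "Mike Trout"}, CROSSWALK)
unmatched = attach_ids({"player_id": "0", "name": "Unknown Player"}, CROSSWALK)
```

Keeping an explicit `id_matched` flag, rather than dropping unmatched rows, makes gaps in the crosswalk visible before they affect a join.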
Brief in. Clean data out.
Provide specific leaderboards, player IDs, or projection systems. We design the extraction schema together.
We configure API interceptors and Playwright crawlers to navigate FanGraphs' complex data grids.
Schema validation, null-rate checks, and cross-reference ID mapping verification before full launch.
JSON / CSV / Parquet pushed to your S3 bucket or BigQuery dataset on your daily schedule.
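The validation step can be pictured with a short sketch like the one below. It is an illustration under assumptions rather than the actual checks: the `EXPECTED_TYPES` mapping and both helper functions are hypothetical, and the second sample row is deliberately broken to show what gets flagged.

```python
# Hypothetical expected types for a few batting fields.
EXPECTED_TYPES = {"player_id": str, "name": str, "woba": float, "wrc_plus": int}


def null_rates(rows: list, fields: list) -> dict:
    """Fraction of rows in which each field is missing or None."""
    total = len(rows)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for row in rows if row.get(field) is None) / total
        for field in fields
    }


def type_errors(rows: list, expected: dict) -> list:
    """List (row_index, field, actual_type_name) for non-null values of the wrong type."""
    errors = []
    for index, row in enumerate(rows):
        for field, expected_type in expected.items():
            value = row.get(field)
            if value is not None and not isinstance(value, expected_type):
                errors.append((index, field, type(value).__name__))
    return errors


# Made-up test rows: the second has a null woba and a string wrc_plus.
rows = [
    {"player_id": "10155", "name": "Mike Trout", "woba": 0.395, "wrc_plus": 155},
    {"player_id": "15640", "name": "Aaron Judge", "woba": None, "wrc_plus": "162"},
]
rates = null_rates(rows, list(EXPECTED_TYPES))
errors = type_errors(rows, EXPECTED_TYPES)
```

In a pilot run, thresholds on `rates` (for example, failing the batch when a required field exceeds an agreed null rate) would be set together during scoping.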
FanGraphs relies on complex JavaScript data tables and heavy client-side rendering. We extract the raw JSON payloads to ensure complete data capture.
FanGraphs uses complex React data grids. We intercept the backend API calls rather than scraping the DOM where possible, ensuring high-fidelity data extraction without missing columns.
Leaderboards load via dynamic XHR. We orchestrate pagination parameters to extract full historical datasets spanning tens of thousands of rows in seconds.
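The pagination logic can be sketched generically as follows. This is not FanGraphs' actual API: the `fetch_page(page, page_size)` callable is a hypothetical stand-in for whatever request returns one page of rows, and the stop condition assumes a short or empty page marks the end of the dataset.

```python
from typing import Callable, Iterator


def iter_rows(
    fetch_page: Callable[[int, int], list],
    page_size: int = 500,
    max_pages: int = 1000,
) -> Iterator[dict]:
    """Yield rows page by page until a short (or empty) page is returned."""
    for page in range(1, max_pages + 1):
        rows = fetch_page(page, page_size)
        yield from rows
        if len(rows) < page_size:
            break  # Last page reached.


def fake_fetch(page: int, page_size: int) -> list:
    """Offline stand-in for a real request: serves slices of 1,200 dummy rows."""
    data = [{"row": i} for i in range(1200)]
    start = (page - 1) * page_size
    return data[start:start + page_size]


all_rows = list(iter_rows(fake_fetch, page_size=500))
```

The `max_pages` cap is a safety net so that a misbehaving endpoint that never returns a short page cannot cause an endless loop.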
Heavy request volumes against FanGraphs endpoints can trigger IP bans. We distribute requests across US-based residential proxies to maintain high throughput without interruptions.
We map FanGraphs-specific fg_id identifiers to standard MLBAM and Retrosheet IDs, allowing you to easily join the extracted sabermetrics with your internal databases.
Stats change daily during the season. We run diffs to only update players whose stats have registered new events, optimising warehouse compute costs.
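One possible way to implement this kind of incremental diff is to fingerprint each row and compare against the previous run. The sketch below is an assumption about the approach, not a description of the production pipeline; `row_fingerprint` and `changed_rows` are hypothetical helpers and the stat values are made up.

```python
import hashlib
import json


def row_fingerprint(row: dict) -> str:
    """Stable SHA-256 hash of a row, independent of key order."""
    payload = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def changed_rows(today: list, previous: dict, key: str = "player_id") -> list:
    """Return rows that are new or whose fingerprint differs from the last run."""
    return [
        row for row in today
        if previous.get(row[key]) != row_fingerprint(row)
    ]


# Made-up values: only the first player's fwar changed between runs.
yesterday = [
    {"player_id": "10155", "fwar": 6.4},
    {"player_id": "15640", "fwar": 7.1},
]
previous = {row["player_id"]: row_fingerprint(row) for row in yesterday}
today = [
    {"player_id": "10155", "fwar": 6.5},
    {"player_id": "15640", "fwar": 7.1},
]
to_update = changed_rows(today, previous)
```

Only the rows in `to_update` need to be written to the warehouse, which is where the compute savings come from.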
Teams use extracted projections and minor league data to evaluate trade targets and free agents.
High-stakes fantasy players aggregate ZiPS and Steamer projections for draft preparation and in-season management.
Syndicates feed daily split data and pitcher xFIP into predictive models for MLB moneylines and prop bets.
Sports agencies use fWAR and wRC+ comparables to negotiate arbitration and free agent contracts.
Daily fantasy platforms use split data and RosterResource lineups to generate optimal player projections.
Researchers analyse historical pitch values and plate discipline metrics for sabermetric publications.
"FanGraphs holds the definitive public record of advanced baseball statistics, but extracting millions of daily data points from complex JavaScript grids requires dedicated infrastructure."
Most analysts waste hours manually exporting CSVs from FanGraphs leaderboards. DataFlirt automates this process entirely. We navigate the complex data grids, intercept the backend API calls, and deliver clean, structured sabermetrics directly to your data warehouse. You get the data daily, without the manual toil.
Everything supported by our fangraphs.com scraper: rendered SPA elements, rate-limit evasion, and beyond.
Open-source tooling on proven cloud infra — no vendor lock-in, full observability.
We bypass fragile DOM scraping by intercepting FanGraphs' backend XHR requests, extracting clean JSON payloads directly from their data grids.
We distribute requests across residential ISP proxies to bypass rate limits during large historical data backfills.
Pipelines are orchestrated with Apache Airflow and AWS Lambda, ensuring daily updates are delivered reliably by 3 AM ET.
Data delivered to where your team already works — no new tooling required.
About fangraphs.com scraping, legality, and pipeline operations.
Scraping publicly available statistical data is generally permissible under applicable law, as raw factual data (like baseball statistics) is not copyrightable. DataFlirt extracts only public, non-authenticated leaderboards and projections. We do not circumvent authentication walls for FanGraphs+ content.
Pipelines typically run daily, scheduled overnight after all MLB games conclude and FanGraphs updates its backend databases. We can also configure custom schedules for projection updates.
Yes. We maintain a cross-reference matrix mapping the FanGraphs fg_id to standard MLBAM, Retrosheet, and Baseball-Reference IDs for seamless integration with your existing datasets.
Yes, we extract data across all minor league levels, including Triple-A down to rookie complex leagues, as well as translated minor league statistics.
Instead of attempting to scrape the complex React/ExtJS DOM, we intercept the raw XHR/API responses feeding the grids. This ensures 100% data fidelity and prevents missing columns.
Yes, we can extract archived ZiPS and Steamer projections from past seasons, provided they are still accessible via the public leaderboards.
20-minute scoping call. Pilot dataset within the week. Production within two. Stop manually downloading CSVs. Get automated, daily sabermetrics delivered directly to your warehouse.