
Michelin Guide data, ready for analysis.

We extract restaurant ratings, inspector reviews, hotel Keys, cuisine metadata, and geolocation coordinates from the Michelin Guide. Delivered as clean JSON, CSV, or Parquet to S3 or BigQuery on your schedule.

Restaurants extracted: 16,843 per run
Hotels tracked: 5,192 per run
Inspector reviews: 22,035 total
Active pipelines: 41
Uptime: 99.98%

Data Dictionary

Every field we extract from michelinguide.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Restaurants & Awards objects from michelinguide.com. All fields typed and schema-versioned.

Fields: restaurant_id, name, award_type, cuisine_type, price_bracket, address, city, country, latitude, longitude, chef_name, website_url, phone_number

Sample record (Restaurants & Awards, partial):
{
  "restaurant_id": "349821",
  "name": "Osteria Francescana",
  "award_type": "3 Stars",
  "cuisine_type": "Creative",
  "price_bracket": "$$$$",
  "city": "Modena",
  "country": "Italy",
  "chef_name": "Massimo Bottura"
}
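
On the consuming side, a record like the sample above can be loaded into a typed structure before analysis. The following is a minimal sketch, not part of any DataFlirt client library: the `Restaurant` class and `parse_restaurant` helper are hypothetical, and the field types (string IDs, float coordinates) are assumptions based on the sample record.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Restaurant:
    # Field names follow the list above; the types are assumptions.
    restaurant_id: str
    name: str
    award_type: str
    cuisine_type: Optional[str] = None
    price_bracket: Optional[str] = None
    city: Optional[str] = None
    country: Optional[str] = None
    latitude: Optional[float] = None
    longitude: Optional[float] = None
    chef_name: Optional[str] = None


def parse_restaurant(raw: dict) -> Restaurant:
    """Build a Restaurant from one JSON record, ignoring unknown keys."""
    known = Restaurant.__dataclass_fields__
    fields = {key: value for key, value in raw.items() if key in known}
    # Coordinates may arrive as strings in CSV exports; coerce them to float.
    for coord in ("latitude", "longitude"):
        if fields.get(coord) is not None:
            fields[coord] = float(fields[coord])
    return Restaurant(**fields)
```

A record missing `restaurant_id`, `name`, or `award_type` raises `TypeError`, which surfaces schema drift early instead of letting partial rows through.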

Complete list of extractable fields for Inspector Reviews objects from michelinguide.com. All fields typed and schema-versioned.

Fields: restaurant_id, review_text, language, published_date, specialties, wine_list_notable, atmosphere_tags, inspector_notes

Sample record (Inspector Reviews, partial):
{
  "restaurant_id": "349821",
  "review_text": "A meal here is a memorable event, blending tradition with avant-garde techniques.",
  "language": "en",
  "published_date": "2023-11-14",
  "specialties": ["Five ages of Parmigiano Reggiano", "Oops! I Dropped the Lemon Tart"],
  "wine_list_notable": true
}

Complete list of extractable fields for Hotels & Keys objects from michelinguide.com. All fields typed and schema-versioned.

Fields: hotel_id, hotel_name, key_rating, description, address, latitude, longitude, price_per_night_start, amenities, booking_url, design_style

Sample record (Hotels & Keys, partial):
{
  "hotel_id": "88421",
  "hotel_name": "Aman Tokyo",
  "key_rating": "3 Keys",
  "price_per_night_start": 1200,
  "amenities": ["Spa", "Pool", "Fitness Centre", "Restaurant"],
  "design_style": "Minimalist Japanese",
  "booking_url": "https://guide.michelin.com/en/hotels/..."
}

Complete list of extractable fields for Facilities & Services objects from michelinguide.com. All fields typed and schema-versioned.

Fields: restaurant_id, wheelchair_accessible, valet_parking, air_conditioning, vegetarian_menu, great_view, counter_dining, terrace_dining, private_rooms

Sample record (Facilities & Services, partial):
{
  "restaurant_id": "349821",
  "wheelchair_accessible": true,
  "valet_parking": false,
  "air_conditioning": true,
  "vegetarian_menu": true,
  "private_rooms": true,
  "terrace_dining": false
}

Complete list of extractable fields for Search & Discovery objects from michelinguide.com. All fields typed and schema-versioned.

Fields: keyword, location, radius_km, filter_award, filter_cuisine, position, result_name, result_url, scraped_timestamp

Sample record (Search & Discovery, partial):
{
  "location": "Paris",
  "filter_award": "1 Star",
  "position": 1,
  "result_name": "Septime",
  "result_url": "/en/ile-de-france/paris/restaurant/septime",
  "scraped_timestamp": "2026-05-12T09:14:33Z"
}

Capabilities

Everything you need from the Guide - structured and clean

Our Michelin Guide scraper handles the platform's map-based pagination, Next.js hydration states, and multi-language routing to deliver accurate hospitality data.

Star & Award Extraction

Capture 1, 2, and 3 Michelin Stars, Bib Gourmand recognitions, Green Stars, and Selected restaurant status across all global regions.

Hotel Key Tracking

Extract the new Michelin Key ratings for hotels, including price estimates, design tags, and amenity lists.

Inspector Review Mining

Pull full written inspector reviews, atmosphere tags, and notable dish mentions for sentiment analysis and enrichment.

Precise Geolocation

Extract accurate latitude and longitude coordinates directly from the map state, bypassing standard address normalisation issues.

Multi-Language Support

Scrape content across en, fr, it, ja, and other regional subdirectories to capture localised inspector notes.

Price & Cuisine Metadata

Track price brackets, primary cuisine classifications, and specific dietary offerings like vegetarian or vegan menus.

Booking & Contact Links

Extract official website URLs, phone numbers, and integrated booking partner links for lead generation.

Chef Identification

Capture head chef names associated with Starred and Selected restaurants for industry mapping.

Change Detection

Run continuous pipelines that detect newly added restaurants, upgraded Stars, or removed listings during annual announcements.

// engagement pipeline

From global map to warehouse record

Brief in. Clean data out.

Define Scope (day 0)

Select target regions, award categories, or specific data types like hotels vs restaurants. We design the schema.

Pipeline Build (days 2–4)

We configure crawlers to handle Next.js state extraction, map pagination, and regional routing.

Validation & QA (days 4–6)

Schema validation, coordinate accuracy checks, and translation consistency verification before launch.

Delivery (ongoing)

JSON / CSV / Parquet pushed to your S3 bucket or BigQuery dataset on an agreed schedule.
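
To make the Validation & QA step concrete, here is a small sketch of the kind of per-record check it describes. It is illustrative only: `validate_record` and the `REQUIRED` tuple are hypothetical names, and the choice of required fields is an assumption. The coordinate check only confirms values fall inside valid latitude and longitude ranges; it does not verify that a point matches the address.

```python
# Assumption: these three fields are mandatory for a restaurant record.
REQUIRED = ("restaurant_id", "name", "award_type")


def validate_record(record: dict) -> list:
    """Return a list of human-readable problems; an empty list means the record passed."""
    errors = []
    for field in REQUIRED:
        if not record.get(field):
            errors.append(f"missing {field}")

    lat, lon = record.get("latitude"), record.get("longitude")
    if lat is None or lon is None:
        errors.append("missing coordinates")
    elif not (isinstance(lat, (int, float)) and isinstance(lon, (int, float))):
        errors.append("coordinates not numeric")
    elif not (-90 <= lat <= 90 and -180 <= lon <= 180):
        errors.append("coordinates out of range")
    return errors
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a batch during QA.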

Under the hood

How our Michelin pipeline handles the hard parts

Extracting data from modern map-driven SPA architectures requires specific techniques. Here is how we ensure reliable delivery.

pipeline-monitor · michelinguide.com · live ● active
// fingerprinting
Identity rotation
TLS fingerprint: randomised
User-agent: rotated
IP pool: residential
Challenges blocked: 0
// pagination
Page coverage
48,291 pages queued, running
// observability
Pipeline health
Uptime: 99.9%
p99 latency: 142ms
Null rate: 0.3%
Alerts: 2

State extraction
Next.js hydration parsing

The Michelin Guide relies on Next.js. Instead of brittle DOM scraping, we extract the raw JSON payloads embedded in the page hydration state, so coordinates, awards, and IDs are read exactly as the site serves them rather than reconstructed from rendered markup.
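
As a rough illustration of hydration parsing, the sketch below pulls the JSON out of a `<script id="__NEXT_DATA__">` tag, which is where Next.js pages-router applications embed their initial state. It uses only the Python standard library. The class and function names are hypothetical, and the keys inside the payload vary by site, so nothing here should be read as the actual Michelin Guide payload layout.

```python
import json
from html.parser import HTMLParser


class NextDataParser(HTMLParser):
    """Collect the text inside <script id="__NEXT_DATA__">."""

    def __init__(self) -> None:
        super().__init__()
        self._inside = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == "__NEXT_DATA__":
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._inside = False

    def handle_data(self, data):
        if self._inside:
            self._chunks.append(data)

    def payload(self):
        """Return the decoded JSON, or None if the tag was not found."""
        return json.loads("".join(self._chunks)) if self._chunks else None


def extract_next_data(html: str):
    parser = NextDataParser()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return parser.payload()
```

Calling `extract_next_data(page_html)` returns a plain `dict`, so fields can then be read by key instead of by CSS selector.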

Pagination
Map-based boundary traversal

Standard pagination is limited on map-centric sites. Our crawlers systematically divide global regions into coordinate bounding boxes, querying the backend API to ensure zero dropped listings.
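
One common way to implement this kind of traversal is recursive subdivision: query a box, and if the response looks truncated, split the box into four and retry. The sketch below assumes the backend caps results per query at `cap`, which is our assumption rather than a documented limit. `fetch` is a hypothetical callable standing in for the backend query, and each listing is assumed to carry an `id` key.

```python
from typing import Callable, NamedTuple


class BBox(NamedTuple):
    south: float
    west: float
    north: float
    east: float


def split(box: BBox) -> list:
    """Divide a box into four equal quadrants."""
    mid_lat = (box.south + box.north) / 2
    mid_lon = (box.west + box.east) / 2
    return [
        BBox(box.south, box.west, mid_lat, mid_lon),
        BBox(box.south, mid_lon, mid_lat, box.east),
        BBox(mid_lat, box.west, box.north, mid_lon),
        BBox(mid_lat, mid_lon, box.north, box.east),
    ]


def traverse(box: BBox, fetch: Callable, cap: int, min_span: float = 0.01) -> dict:
    """Collect listings keyed by id, subdividing any box whose result count hits the cap."""
    found = {}
    stack = [box]
    while stack:
        current = stack.pop()
        results = fetch(current)
        too_small = (current.north - current.south) <= min_span
        if len(results) >= cap and not too_small:
            # The response may be truncated, so query the four quadrants instead.
            stack.extend(split(current))
        else:
            # Below the cap (or at minimum size, where truncation is accepted).
            for item in results:
                found[item["id"]] = item
    return found
```

Keying results by `id` removes the duplicates that appear when a listing sits on the shared edge of two boxes.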

Language routing
Consistent locale normalisation

Michelin uses complex regional subdirectories (e.g., /en/ile-de-france/paris). We force consistent locale headers and map regional URLs to a unified schema, preventing duplicate entries across languages.
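
A simplified sketch of the deduplication idea: separate the locale prefix from the path, then merge per-locale records under one `restaurant_id`. The `KNOWN_LOCALES` set only covers the locales named on this page, and both helper functions are hypothetical. Merging on `restaurant_id` assumes the ID is stable across languages, which should be verified against real data.

```python
from typing import Optional
from urllib.parse import urlsplit

# Assumption: only the locales mentioned on this page; extend as needed.
KNOWN_LOCALES = {"en", "fr", "it", "ja"}


def split_locale(url: str) -> tuple:
    """Return (locale or None, remaining path) for a URL or bare path."""
    parts = [p for p in urlsplit(url).path.split("/") if p]
    if parts and parts[0] in KNOWN_LOCALES:
        return parts[0], "/" + "/".join(parts[1:])
    return None, "/" + "/".join(parts)


def merge_locales(records: list) -> dict:
    """Collapse per-locale review records into one entry per restaurant_id."""
    merged = {}
    for rec in records:
        rid = rec["restaurant_id"]
        entry = merged.setdefault(rid, {"restaurant_id": rid, "reviews": {}})
        entry["reviews"][rec["language"]] = rec["review_text"]
    return merged
```

For example, `split_locale("/en/ile-de-france/paris/restaurant/septime")` yields the locale `"en"` and the path `"/ile-de-france/paris/restaurant/septime"`.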

Anti-bot layer
Residential proxy rotation

To prevent IP bans during full-catalogue sweeps, we route traffic through residential proxy pools, rotating IPs per request and managing session cookies effectively.

Change tracking
Annual release monitoring

Michelin announces awards regionally throughout the year. We monitor specific regional indexes daily, pushing diffs immediately when new Stars, Bib Gourmands, or Keys are published.
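
Diffing between runs can be done by fingerprinting the fields that matter and comparing against the previous run's fingerprints. The sketch below is one way to do that with SHA-256 from the standard library. The function names, the `restaurant_id` key, and the default watched fields are illustrative assumptions, not a description of the production pipeline.

```python
import hashlib
import json


def fingerprint(record: dict, fields: tuple) -> str:
    """Hash only the watched fields so unrelated changes do not trigger a diff."""
    subset = {f: record.get(f) for f in fields}
    blob = json.dumps(subset, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


def diff(previous: dict, current: list, key: str = "restaurant_id",
         fields: tuple = ("name", "award_type")) -> tuple:
    """Compare a run against stored fingerprints.

    Returns (changes, new_state), where new_state maps id -> fingerprint
    and should be persisted for the next run.
    """
    added, changed, new_state = [], [], {}
    for rec in current:
        rid = rec[key]
        new_state[rid] = fingerprint(rec, fields)
        if rid not in previous:
            added.append(rid)
        elif previous[rid] != new_state[rid]:
            changed.append(rid)
    removed = [rid for rid in previous if rid not in new_state]
    return {"added": added, "changed": changed, "removed": removed}, new_state
```

On the first run, pass an empty dict as `previous`; every record is then reported as added and the returned state seeds later comparisons.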

Applications

Who uses Michelin Guide data - and how

Teams across industries use michelinguide.com data to build competitive products and smarter operations.

01. F&B Market Research

Hospitality groups track cuisine trends, pricing models, and geographic density of premium dining to inform expansion strategies.

02. Travel Aggregator Enrichment

OTAs and luxury travel platforms ingest Michelin Star and Key data to badge premium inventory on their own platforms.

03. Supplier Targeting

Premium food and beverage distributors use the database to identify and target high-end restaurants and executive chefs.

04. Real Estate Investment

Commercial real estate analysts correlate Michelin Star density with neighbourhood gentrification and property value trends.

05. Competitor Benchmarking

Hotel groups monitor the new Michelin Key awards to benchmark their properties against local luxury competitors.

06. AI Training Data

LLM developers use the highly structured, multilingual inspector reviews to train sentiment and culinary classification models.

Why DataFlirt

"The Michelin Guide remains the gold standard for global hospitality data, but extracting data from its map-based Next.js architecture requires dedicated infrastructure."

Most teams underestimate the investment required: reliable Michelin scraping requires handling map-driven pagination, extracting Next.js hydration states, and managing rate limits across global regions. DataFlirt absorbs that complexity so your engineers can focus on the analysis, not the pipeline.

Technical Spec

Michelin Guide scraper - technical capabilities

What our michelinguide.com scraper supports, from rendered SPA elements to rate-limit handling, and where authentication limits coverage.

Next.js state extraction [Supported]: Direct parsing of __NEXT_DATA__ JSON for perfect data fidelity
Map-based pagination [Supported]: Bounding box coordinate traversal for complete geographic coverage
Multi-language support [Supported]: Extraction across en, fr, it, ja, and other regional locales
Coordinate extraction [Supported]: Precise latitude and longitude for every restaurant and hotel
Change detection (diffs) [Supported]: Hash-based diff that emits only records with changed awards or details
Webhook delivery [Supported]: HTTP POST per record or batch for immediate updates
Proxy rotation [Supported]: ISP-grade residential IPs to bypass rate limiting during deep crawls
User saved lists [Partial]: Extracting private user favourite lists requires account authentication
Premium booking portal auth [Partial]: Accessing backend reservation availability requires partner credentials

Infrastructure

Infrastructure powering the Michelin pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus

Scrapy + Playwright Stack

Scrapy handles regional crawl orchestration and deduplication. Playwright is deployed selectively to handle complex map interactions and trigger hydration states.

Residential Proxy Infrastructure

We maintain pools of residential ISP proxies across global regions. Rotation happens per-request to ensure uninterrupted access to regional Michelin directories.

Cloud-Native Orchestration

Pipelines run on AWS ECS. Airflow handles scheduling, dependency management, and SLA alerting. All state is stored in managed Postgres.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON: Newline-delimited or nested arrays
CSV: Flat file with typed columns
XLS: Excel-compatible format for business teams
Parquet: Columnar format for BigQuery, Snowflake, Athena
AWS S3: Direct bucket delivery, compatible with any data lake
Webhook: HTTP POST per record for real-time downstream processing
API: REST endpoint to query your extracted datasets
BigQuery: Streamed directly into your dataset
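
For teams consuming the newline-delimited JSON option, the sketch below shows what writing and reading that format looks like with the Python standard library. The helper names are hypothetical, and the actual file naming and delivery layout are agreed per engagement, so treat this as an illustration of the format only.

```python
import json
from typing import Iterable, TextIO


def write_ndjson(records: Iterable, out: TextIO) -> int:
    """Write one JSON object per line; return the number of records written."""
    count = 0
    for rec in records:
        out.write(json.dumps(rec, ensure_ascii=False, sort_keys=True))
        out.write("\n")
        count += 1
    return count


def read_ndjson(src: TextIO) -> list:
    """Parse a newline-delimited JSON stream, skipping blank lines."""
    return [json.loads(line) for line in src if line.strip()]
```

Because each line is an independent object, large files can be streamed record by record instead of being loaded whole, which is the main reason to prefer this layout over a single nested array.
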
// faq

Common questions.

About michelinguide.com scraping, legality, and pipeline operations.

Is scraping the Michelin Guide legal?

Scraping publicly available information from the Michelin Guide is generally permissible under applicable law. DataFlirt targets only public, non-authenticated restaurant, hotel, and award data. We do not extract personal user data or circumvent authentication walls. Clients should review the site's ToS and consult legal counsel for specific use cases.

Can you extract precise geographic coordinates?

Yes. The Michelin Guide relies heavily on map-based discovery. We extract the exact latitude and longitude coordinates embedded in the application state for every listed property.

How frequently can you update the data?

While the global catalogue is relatively static, regional awards are announced on specific dates throughout the year. We can configure pipelines to run daily change-detection sweeps to catch new additions immediately.

Do you extract the new Michelin Hotel Keys?

Yes. We extract 1, 2, and 3 Key ratings, along with hotel descriptions, amenities, price estimates, and booking URLs.

How do you handle multiple languages?

We can target specific regional subdirectories (e.g., /en, /fr, /ja) to extract inspector reviews in your preferred language, or scrape multiple locales simultaneously and map them to a unified schema.

What is the minimum viable engagement?

Our packages start at full extractions of specific regions or award tiers (e.g., all Starred restaurants globally). Contact us with your specific data requirements for a scoped quote.

Can I request a sample dataset?

Yes. We provide a sample run of up to 500 records as part of the pre-engagement scoping process, allowing you to validate the schema and coordinate accuracy before committing.

$ dataflirt scope --new-project --source=michelinguide.com ready

Tell us what to extract. We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a one-off global restaurant dump or continuous tracking of new Star additions, we scope, build, and operate the pipeline. Tell us what you need.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h