RUN · 42 active pipelines · cntraveler.com live

CNTraveler data,
at warehouse scale.

We extract hotel reviews, destination guides, restaurant recommendations, and Readers' Choice rankings from Condé Nast Traveler. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.


Hotels extracted: 32,410 /run
Destination guides: 8,941 /run
Editorial articles: 142K /total
Active pipelines: 42
Uptime: 99.98%

◆ Hotel Reviews ◆ Destination Guides ◆ Readers' Choice Awards ◆ Restaurant Recommendations ◆ Curated Itineraries ◆ Cruise Ship Ratings ◆ Airline Reviews ◆ Gold List Properties ◆ Hot List Winners ◆ Editorial Metadata ◆ Managed Pipeline ◆ S3 Delivery ◆ Bengaluru HQ

Data Dictionary

Every field we extract from cntraveler.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Hotel Reviews objects from cntraveler.com. All fields typed and schema-versioned.

hotel_name, location, rating, editor_review, price_range, amenities, readers_choice_winner, gold_list_status, booking_url, image_urls

{
  "hotel_name": "The Ritz-Carlton, Kyoto",
  "location": "Kyoto, Japan",
  "rating": 98.4,
  "price_range": "$$$$",
  "readers_choice_winner": true,
  "gold_list_status": true,
  "editor_review": "A riverside sanctuary blending traditional ryokan aesthetics with modern luxury."
}

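As a rough illustration of what "typed and schema-versioned" could look like on the consuming side, the sketch below models the Hotel Reviews sample as a Python dataclass. The class name `HotelReview`, the `SCHEMA_VERSION` constant, and the concrete type chosen for each field are assumptions made for illustration, not a published DataFlirt schema.

```python
from dataclasses import asdict, dataclass, field
from typing import Optional

# Hypothetical version tag; the real versioning scheme is not documented here.
SCHEMA_VERSION = "1.0"


@dataclass
class HotelReview:
    """Assumed types for the Hotel Reviews fields listed above."""
    hotel_name: str
    location: str
    rating: float
    editor_review: str
    price_range: str
    readers_choice_winner: bool
    gold_list_status: bool
    amenities: list = field(default_factory=list)
    booking_url: Optional[str] = None
    image_urls: list = field(default_factory=list)


def parse_hotel(raw: dict) -> HotelReview:
    """Build a typed record, coercing the rating to float."""
    return HotelReview(**{**raw, "rating": float(raw["rating"])})


# The sample record from the data dictionary.
sample = {
    "hotel_name": "The Ritz-Carlton, Kyoto",
    "location": "Kyoto, Japan",
    "rating": 98.4,
    "price_range": "$$$$",
    "readers_choice_winner": True,
    "gold_list_status": True,
    "editor_review": "A riverside sanctuary blending traditional ryokan "
                     "aesthetics with modern luxury.",
}
hotel = parse_hotel(sample)
```

Fields missing from a record (here `amenities`, `booking_url`, and `image_urls`) fall back to empty defaults, so downstream code can rely on every attribute being present.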

Complete list of extractable fields for Destination Guides objects from cntraveler.com. All fields typed and schema-versioned.

destination_name, region, country, best_time_to_visit, currency, language, top_hotels, top_restaurants, things_to_do, author

{
  "destination_name": "Amalfi Coast",
  "country": "Italy",
  "best_time_to_visit": "May to September",
  "top_hotels": ["Le Sirenuse", "Hotel Santa Caterina"],
  "top_restaurants": ["La Sponda", "Lo Scoglio"],
  "language": "Italian"
}


Complete list of extractable fields for Readers' Choice Awards objects from cntraveler.com. All fields typed and schema-versioned.

award_year, category, region, rank, entity_name, score, previous_rank, description, url

{
  "award_year": 2023,
  "category": "Top 50 Hotels in the World",
  "rank": 1,
  "entity_name": "Ballyfin",
  "score": 99.2,
  "previous_rank": 4
}


Complete list of extractable fields for Restaurant Reviews objects from cntraveler.com. All fields typed and schema-versioned.

restaurant_name, city, cuisine, price_tier, chef, must_order, atmosphere, editor_rating, address

{
  "restaurant_name": "Pujol",
  "city": "Mexico City",
  "cuisine": "Mexican",
  "price_tier": "$$$",
  "chef": "Enrique Olvera",
  "must_order": "Mole Madre"
}


Complete list of extractable fields for Editorial Articles objects from cntraveler.com. All fields typed and schema-versioned.

article_title, author, publish_date, update_date, category, tags, content_body, hero_image_url, related_articles

{
  "article_title": "The 21 Best Places to Go in 2024",
  "author": "CN Traveler Editors",
  "publish_date": "2023-11-15",
  "category": "Inspiration",
  "tags": ["Travel Guide", "2024"],
  "update_date": "2024-01-05"
}


Capabilities

Everything you need from Condé Nast Traveler

Our CNTraveler scraper handles every layer of the platform, extracting structured data from complex editorial layouts, infinite scroll feeds, and interactive maps.

Full Hotel Directory Extraction

Extract property details, editor reviews, amenities, and pricing tiers across all global regions.

Readers' Choice Data Mining

Capture historical and current rankings for hotels, resorts, cities, islands, and airlines.

Destination Guide Aggregation

Compile curated itineraries, best-time-to-visit recommendations, and local laws for thousands of cities.

Restaurant & Bar Curation

Extract editor-approved dining spots, signature dishes, and atmosphere descriptors.

Gold List & Hot List Tracking

Monitor the properties that make Condé Nast Traveler's highly coveted annual editor lists.

Cruise & Airline Ratings

Scrape detailed reviews of cruise itineraries, cabin classes, and airline lounge experiences.

Editorial Article Parsing

Extract full text, author metadata, publication dates, and embedded media from travel features.

High-Resolution Image Capture

Extract CDN links for professional photography galleries associated with properties and destinations.

Scheduled Content Syncing

Run one-off bulk exports or configure continuous pipelines on a weekly or monthly cadence to pick up new content.

// engagement pipeline

From URL list to warehouse record

Brief in. Clean data out.

Define Scope

Day 0

Provide destination URLs, hotel categories, or award years. We design the extraction schema together.

Pipeline Build

Days 2–4

We configure Scrapy crawlers, proxy rotation, and JavaScript rendering to handle media-heavy page loads.

Validation & QA

Days 4–6

Schema validation, null-rate checks, and sample data review before full launch.
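The checks named in this step can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the production QA suite: the function names, the `EXPECTED_TYPES` mapping, and the "Example Hotel" rows are hypothetical, and only the Ballyfin row comes from the data dictionary sample.

```python
# Assumed types for a subset of the Readers' Choice Awards fields.
EXPECTED_TYPES = {"award_year": int, "rank": int, "entity_name": str, "score": float}


def null_rates(records, fields):
    """Fraction of records where each field is missing, None, or empty."""
    total = len(records)
    if total == 0:
        return {name: 0.0 for name in fields}
    rates = {}
    for name in fields:
        missing = sum(1 for rec in records if rec.get(name) in (None, "", []))
        rates[name] = missing / total
    return rates


def type_errors(record, expected):
    """List fields whose non-null value does not match the expected type."""
    errors = []
    for name, typ in expected.items():
        value = record.get(name)
        if value is not None and not isinstance(value, typ):
            errors.append(
                f"{name}: expected {typ.__name__}, got {type(value).__name__}"
            )
    return errors


# First row is the sample record; the rest are made-up rows with defects.
records = [
    {"award_year": 2023, "rank": 1, "entity_name": "Ballyfin", "score": 99.2},
    {"award_year": 2023, "rank": 2, "entity_name": "Example Hotel A", "score": None},
    {"award_year": "2023", "rank": 3, "entity_name": "Example Hotel B", "score": 97.0},
    {"award_year": 2023, "rank": 4, "entity_name": "", "score": 96.5},
]
rates = null_rates(records, ["entity_name", "score"])
bad_types = type_errors(records[2], EXPECTED_TYPES)
```

A run would typically be held back when any field's null rate exceeds an agreed threshold or when `type_errors` returns a non-empty list for sampled rows.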

Delivery

ongoing

JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.

Under the hood

How our CNTraveler pipeline handles the hard parts

Modern media sites rely on heavy JavaScript frameworks and aggressive caching. Here is how we extract structured data reliably.

// fingerprinting

Identity rotation

TLS fingerprint: randomised
User-agent: rotated
IP pool: residential
Challenges blocked: 0

// pagination

Page coverage

48,291 pages queued, running

// observability

Pipeline health

Uptime: 99.9%
p99 latency: 142ms
Null rate: 0.3%

alerts

JavaScript rendering

Full Playwright execution for Next.js hydration

CNTraveler uses modern frontend frameworks. We run full browser sessions to execute JavaScript, hydrate React components, and trigger lazy-loaded image galleries that plain HTTP clients miss entirely.

Anti-bot layer

Residential proxy rotation

Media publishers deploy Web Application Firewalls to block automated scraping. Our crawlers use residential ISP proxies with realistic browser fingerprints to bypass rate limits.
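Per-request rotation of this kind can be pictured as a shuffled round-robin over a proxy pool, paired with a randomly chosen user agent. The sketch below is an assumption-laden illustration: the proxy URLs, user-agent strings, and the `ProxyRotator` and `pick_identity` names are placeholders, not DataFlirt's actual pool or API.

```python
import itertools
import random

# Placeholder endpoints and user agents, for illustration only.
PROXY_POOL = [
    "http://proxy-a.example.net:8000",
    "http://proxy-b.example.net:8000",
    "http://proxy-c.example.net:8000",
]
USER_AGENTS = [
    "ExampleBrowser/1.0 (placeholder UA A)",
    "ExampleBrowser/2.0 (placeholder UA B)",
]


class ProxyRotator:
    """Hand out proxies in a shuffled round-robin order."""

    def __init__(self, proxies, seed=None):
        pool = list(proxies)
        if not pool:
            raise ValueError("proxy pool must not be empty")
        random.Random(seed).shuffle(pool)
        self._cycle = itertools.cycle(pool)

    def next_proxy(self):
        return next(self._cycle)


def pick_identity(rotator, rng):
    """Choose the proxy and user agent for a single outgoing request."""
    return {"proxy": rotator.next_proxy(), "user_agent": rng.choice(USER_AGENTS)}


rotator = ProxyRotator(PROXY_POOL, seed=7)
first_pass = [rotator.next_proxy() for _ in range(len(PROXY_POOL))]
identity = pick_identity(ProxyRotator(PROXY_POOL, seed=7), random.Random(7))
```

Round-robin guarantees that every proxy is used once before any is reused, which spreads request volume evenly across the pool.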

Infinite scroll handling

Automated pagination extraction

Destination guides and article feeds rely on infinite scrolling. Our scripts simulate user scrolling behaviour to capture the complete document tree before extraction.
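One common way to handle such feeds is to scroll until the page height stops growing. The sketch below shows that loop in isolation, with the page abstracted behind two callables so it can run without a browser. The function name, the `FakeFeed` stand-in, and the default limits are assumptions. With Playwright's sync API the callables could wrap `page.evaluate("document.body.scrollHeight")` and `page.mouse.wheel(0, 10000)`, with a short wait between rounds.

```python
from typing import Callable


def scroll_until_stable(
    get_height: Callable[[], int],
    scroll_to_bottom: Callable[[], None],
    max_rounds: int = 50,
    stable_rounds: int = 2,
) -> int:
    """Scroll until the height is unchanged for `stable_rounds` rounds.

    Returns the number of scroll actions performed.
    """
    last_height = get_height()
    unchanged = 0
    scrolls = 0
    while scrolls < max_rounds and unchanged < stable_rounds:
        scroll_to_bottom()
        scrolls += 1
        height = get_height()
        if height == last_height:
            unchanged += 1
        else:
            unchanged = 0
            last_height = height
    return scrolls


class FakeFeed:
    """Stand-in for a page that loads a fixed number of extra batches."""

    def __init__(self, batches: int):
        self.height = 1000
        self.remaining = batches

    def scroll(self) -> None:
        if self.remaining > 0:
            self.height += 1000
            self.remaining -= 1


feed = FakeFeed(batches=3)
scrolls = scroll_until_stable(lambda: feed.height, feed.scroll)
```

Requiring two unchanged rounds rather than one makes the loop less likely to stop early when a batch is slow to load, and `max_rounds` caps the work on feeds that never end.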

Schema stability

Resilient selectors for editorial layouts

Editorial content formats vary wildly between standard articles, galleries, and interactive maps. We use multi-layer fallback chains to ensure consistent data extraction across all template types.
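A fallback chain of this kind can be sketched as an ordered list of extractors, each tried until one returns a value. The example below uses regular expressions on raw HTML purely to stay dependency-free. A real crawler would more likely use CSS or XPath selectors, and the extractor names, the choice of title as the target field, and the HTML snippets are all hypothetical.

```python
import json
import re

LD_JSON_RE = re.compile(
    r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>', re.DOTALL
)


def from_json_ld(html):
    """Layer 1: structured metadata embedded as JSON-LD."""
    for block in LD_JSON_RE.findall(html):
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue
        if isinstance(data, dict) and data.get("headline"):
            return data["headline"]
    return None


def from_og_title(html):
    """Layer 2: Open Graph meta tag."""
    match = re.search(r'<meta[^>]*property="og:title"[^>]*content="([^"]*)"', html)
    return match.group(1) if match else None


def from_h1(html):
    """Layer 3: first <h1>, with nested tags stripped."""
    match = re.search(r"<h1[^>]*>(.*?)</h1>", html, re.DOTALL)
    return re.sub(r"<[^>]+>", "", match.group(1)).strip() if match else None


EXTRACTORS = [from_json_ld, from_og_title, from_h1]


def extract_title(html):
    """Return (title, extractor_name) from the first layer that succeeds."""
    for extractor in EXTRACTORS:
        value = extractor(html)
        if value:
            return value, extractor.__name__
    return None, None


article_html = (
    '<script type="application/ld+json">'
    '{"@type": "Article", "headline": "The 21 Best Places to Go in 2024"}'
    "</script><h1>Ignored</h1>"
)
gallery_html = '<meta property="og:title" content="Gallery title"><h1>Other</h1>'
map_html = '<h1 class="x"><span>Map title</span></h1>'
```

Returning the name of the layer that matched makes it possible to monitor how often each fallback fires, which is an early signal that a template has changed.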

Change detection

Only re-scrape updated articles

We maintain a hash index of last-seen publication and modification dates. Subsequent runs only pull newly published or updated guides, reducing compute overhead.
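The idea can be illustrated with a small hash index keyed by URL. This is a minimal sketch: the function names, the in-memory dict standing in for a persistent index, and the placeholder URLs are assumptions. For brevity it updates the index as soon as a change is seen, whereas a real pipeline would probably do so only after a successful re-scrape.

```python
import hashlib


def date_fingerprint(publish_date, update_date):
    """Hash the publication and modification dates into one fingerprint."""
    raw = f"{publish_date}|{update_date or ''}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def select_changed(listing, index):
    """Return URLs that are new or whose dates changed, updating the index.

    `listing` is an iterable of (url, publish_date, update_date) tuples and
    `index` maps each URL to its last-seen fingerprint.
    """
    changed = []
    for url, publish_date, update_date in listing:
        fingerprint = date_fingerprint(publish_date, update_date)
        if index.get(url) != fingerprint:
            changed.append(url)
            index[url] = fingerprint
    return changed


# Placeholder URLs; the dates mirror the Editorial Articles sample.
URL_A = "https://www.cntraveler.com/story/example-a"
URL_B = "https://www.cntraveler.com/story/example-b"

index = {}
first_run = select_changed(
    [(URL_A, "2023-11-15", None), (URL_B, "2023-11-15", "2024-01-05")], index
)
second_run = select_changed(
    [(URL_A, "2023-11-15", None), (URL_B, "2023-11-15", "2024-01-05")], index
)
third_run = select_changed(
    [(URL_A, "2023-11-15", None), (URL_B, "2023-11-15", "2024-02-01")], index
)
```

The first run queues both URLs, the second queues nothing, and the third queues only the article whose update date moved.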

Applications

Who uses CNTraveler data

Teams across industries use cntraveler.com data to build competitive products and smarter operations.

Travel Aggregator Enrichment

OTAs and booking platforms enrich their property listings with third-party editor reviews and award badges to drive conversion.

Luxury Brand Intelligence

Hospitality brands monitor their inclusion in the Gold List, Hot List, and Readers' Choice Awards against competitor sets.

Market Research

Tourism boards analyse destination sentiment, recommended itineraries, and trending regions to inform marketing spend.

AI Training Data

LLM developers use high-quality, editorially curated travel guides and hotel descriptions to fine-tune travel recommendation models.

Sentiment Analysis

Hospitality holding companies track qualitative descriptors used by professional travel writers to assess property positioning.

Content Curation

Travel agents and concierge services ingest curated restaurant and activity data to build bespoke client itineraries.

Why DataFlirt

"Condé Nast Traveler holds decades of the most authoritative hospitality curation on the internet. Extracting it from unstructured editorial layouts requires purpose-built infrastructure."

Most teams underestimate the complexity of scraping modern media publications. Extracting clean, structured data from disparate editorial templates, interactive maps, and infinite-scroll galleries requires full JavaScript rendering and resilient selector strategies. DataFlirt manages the infrastructure so your data science team can focus on analysis.

Technical Spec

CNTraveler scraper technical capabilities

Everything supported by our cntraveler.com scraper — rendered SPA elements, auth walls, rate-limit evasion and beyond.

JavaScript rendering

Full Playwright sessions for React hydration and lazy-loaded galleries

Supported

Residential proxy rotation

ISP-grade residential IPs to bypass WAF rate limits

Supported

Readers' Choice historical data

Extract award data across all historical years available on the site

Supported

Interactive map extraction

Capture coordinates and POI metadata from embedded Mapbox widgets

Supported

Gallery image CDN links

Extract high-resolution image URLs without watermarks where available

Supported

Author & metadata parsing

Extract publication dates, update timestamps, and author bios

Supported

Change detection

Hash-based diff to only emit records with changed fields since last run

Supported

Subscriber-only premium content

Articles locked behind the Condé Nast digital subscription paywall

Partial

User saved itineraries

Private lists and saved places tied to individual user accounts

Partial

Infrastructure

Infrastructure powering the CNTraveler pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus

Scrapy + Playwright Stack

Scrapy handles crawl orchestration and deduplication. Playwright handles JavaScript rendering, infinite scrolling, and React hydration. The two are combined via the scrapy-playwright download handler.
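For readers unfamiliar with the combination, scrapy-playwright plugs into Scrapy as a download handler that is enabled through project settings. The fragment below is a generic sketch of such a `settings.py`, not DataFlirt's configuration. The browser type, launch options, and timeout value are assumptions.

```python
# settings.py (sketch): route HTTP(S) downloads through Playwright.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright requires the asyncio-based Twisted reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

# Assumed values, chosen for illustration.
PLAYWRIGHT_BROWSER_TYPE = "chromium"
PLAYWRIGHT_LAUNCH_OPTIONS = {"headless": True}
PLAYWRIGHT_DEFAULT_NAVIGATION_TIMEOUT = 30_000  # milliseconds

# Individual requests opt in to browser rendering via their meta dict,
# e.g. scrapy.Request(url, meta=PLAYWRIGHT_META).
PLAYWRIGHT_META = {"playwright": True}
```

Requests without the `playwright` meta key still go through Scrapy's regular downloader, so only pages that need rendering pay the cost of a browser session.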

Residential Proxy Infrastructure

We maintain pools of residential ISP proxies. Rotation happens per-request to bypass media publisher WAFs and rate limits without triggering blocks.

Cloud-Native Orchestration

Pipelines run on AWS Lambda and ECS. Airflow handles scheduling, dependency management, and SLA alerting. All state is stored in managed Postgres.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON

Newline-delimited or nested, schema versioned per run
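As an illustration of the newline-delimited variant, the sketch below writes one JSON object per line and stamps each row with a version tag. The `schema_version` field name, the `write_ndjson` helper, and the second record are assumptions. Only the Pujol record comes from the data dictionary sample.

```python
import io
import json


def write_ndjson(records, stream, schema_version):
    """Write one JSON object per line; return the number of rows written."""
    count = 0
    for record in records:
        row = {"schema_version": schema_version, **record}
        stream.write(json.dumps(row, ensure_ascii=False) + "\n")
        count += 1
    return count


records = [
    {"restaurant_name": "Pujol", "city": "Mexico City", "chef": "Enrique Olvera"},
    # Made-up row, present only to show the line-per-record layout.
    {"restaurant_name": "Example Bistro", "city": "Example City", "chef": None},
]

buffer = io.StringIO()
written = write_ndjson(records, buffer, schema_version="1.0")
lines = buffer.getvalue().splitlines()
```

Because every line is a complete JSON document, consumers can stream the file record by record instead of loading it whole.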

CSV

Flat file with typed columns, Excel/Sheets compatible

Parquet

Columnar format for BigQuery, Snowflake, Athena

AWS S3

Direct bucket delivery, compatible with any data lake

Webhook

HTTP POST per record for real-time downstream processing
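On the sending side, a per-record webhook amounts to serialising one record and POSTing it as JSON. The sketch below only builds the request with the standard library and does not send it. The endpoint URL and the `build_webhook_request` helper are placeholders, and the real payload envelope and any authentication headers are not specified here.

```python
import json
import urllib.request

# Placeholder endpoint; a real deployment would use the client's own URL.
WEBHOOK_URL = "https://example.com/hooks/cntraveler"


def build_webhook_request(url, record):
    """Build (but do not send) a JSON POST request for a single record."""
    body = json.dumps(record).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


record = {
    "award_year": 2023,
    "category": "Top 50 Hotels in the World",
    "rank": 1,
    "entity_name": "Ballyfin",
}
request = build_webhook_request(WEBHOOK_URL, record)

# Sending would look like this (commented out so the sketch has no side effects):
# with urllib.request.urlopen(request, timeout=10) as response:
#     response.read()
```

A receiver can treat each POST as an independent event, which suits low-latency downstream processing but calls for idempotent handling in case a delivery is retried.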

API

RESTful endpoint to query extracted destination data on demand

XLS

Legacy spreadsheet format for non-technical business teams

BigQuery

Streamed directly into your dataset with schema auto-detect


// faq

Common questions.

About cntraveler.com scraping, legality, and pipeline operations.


Is scraping Condé Nast Traveler legal?

Scraping publicly available editorial content is generally permissible under applicable law, provided it does not infringe on copyright for republication. DataFlirt extracts factual data like hotel names, locations, ratings, award status, and snippets for analytical use. We do not bypass subscription paywalls. Clients should consult legal counsel regarding fair use and copyright.

How do you handle different article layouts?

CNTraveler uses various templates for standard articles, galleries, and listicles. Our selector strategy uses multi-layer fallback chains that combine CSS selectors, XPath, and JSON-LD metadata to normalise unstructured editorial text into clean, structured schemas.

Can you extract data from the Readers' Choice Awards?

Yes. We extract the complete hierarchy of Readers' Choice data, including award year, category, regional rank, property name, and numerical score across all available historical data.

Do you capture high-resolution images?

We extract the CDN URLs for all property and destination images embedded in articles and galleries. We do not download the binary files directly, but provide the links for your systems to ingest.

How fresh is the data?

For editorial sites like CNTraveler, we typically configure weekly or monthly pipeline runs to capture newly published guides, updated hotel reviews, and annual award releases. One-off historical extractions are also available.

What is the minimum viable engagement?

Our smallest packages start at a defined category extraction, such as all European hotel reviews, with monthly delivery. For full-site extraction, we price based on volume and delivery frequency.

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a one-off dump of historical Readers' Choice winners or a continuous feed of new hotel reviews, we scope, build, and operate the pipeline. Tell us what you need.


hello@dataflirt.com · Bengaluru · IST · typical reply < 4h

