
Travel intelligence,
structured for scale.

We extract destination hierarchies, hotel reviews, restaurant ratings, and attraction data from Frommers. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.

Destinations mapped: 4,892 /run
POI records: 184K /run
Expert reviews: 212K /total
Active pipelines: 42
Uptime: 99.98%

Data Dictionary

Every field we extract from frommers.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Destinations objects from frommers.com. All fields typed and schema-versioned.

Fields: destination_id, continent, country, region, city, description_html, best_time_to_visit, currency, language, parent_destination_id, page_url, scraped_at

Sample response (destinations, 200 OK):
{
  "destination_id": "D-18492",
  "continent": "Europe",
  "country": "Italy",
  "city": "Rome",
  "currency": "Euro",
  "language": "Italian",
  "page_url": "https://www.frommers.com/destinations/rome"
}

Complete list of extractable fields for Hotels objects from frommers.com. All fields typed and schema-versioned.

Fields: poi_id, name, destination_id, expert_rating, price_tier, address, phone, website, amenities, expert_review_text, neighborhood, latitude, longitude

Sample response (hotels, 200 OK):
{
  "poi_id": "H-93812",
  "name": "Hotel Hassler Roma",
  "expert_rating": 3,
  "price_tier": "$$$$",
  "neighborhood": "Spanish Steps",
  "address": "Piazza Trinità dei Monti 6, Rome",
  "latitude": 41.9061,
  "longitude": 12.4833
}

Complete list of extractable fields for Restaurants objects from frommers.com. All fields typed and schema-versioned.

Fields: poi_id, name, destination_id, cuisine_type, expert_rating, price_tier, address, opening_hours, expert_review_text, neighborhood, latitude, longitude

Sample response (restaurants, 200 OK):
{
  "poi_id": "R-44102",
  "name": "Roscioli Salumeria con Cucina",
  "cuisine_type": "Roman",
  "expert_rating": 2,
  "price_tier": "$$$",
  "neighborhood": "Campo de' Fiori",
  "address": "Via dei Giubbonari 21/22, Rome"
}

Complete list of extractable fields for Attractions objects from frommers.com. All fields typed and schema-versioned.

Fields: poi_id, name, destination_id, category, expert_rating, admission_fee, opening_hours, address, expert_review_text, neighborhood, latitude, longitude

Sample response (attractions, 200 OK):
{
  "poi_id": "A-11094",
  "name": "Colosseum",
  "category": "Historic Site",
  "expert_rating": 3,
  "admission_fee": "16 EUR",
  "neighborhood": "Ancient Rome",
  "address": "Piazza del Colosseo 1, Rome"
}

Complete list of extractable fields for Itineraries objects from frommers.com. All fields typed and schema-versioned.

Fields: itinerary_id, title, destination_id, duration_days, author, description, target_audience, day_by_day_plan, page_url, scraped_at

Sample response (itineraries, 200 OK):
{
  "itinerary_id": "I-5521",
  "title": "Rome in 3 Days",
  "duration_days": 3,
  "author": "Donald Strachan",
  "target_audience": "First-time visitors",
  "page_url": "https://www.frommers.com/destinations/rome/itineraries/in-3-days"
}

Capabilities

Extract the entire travel guide catalogue

Our infrastructure parses deeply nested destination hierarchies, unstructured expert reviews, and dynamic map layers into a normalised schema.

Destination Taxonomy Mapping

Maintain parent-child relationships across continents, countries, regions, cities, and neighborhoods.

Hotel & Accommodation Data

Extract property names, expert star ratings, price tiers, amenities, and full editorial reviews.

Restaurant Intelligence

Capture cuisine types, pricing indicators, opening hours, and location data for dining POIs.

Attraction Cataloguing

Aggregate historic sites, museums, and nightlife venues with admission fees and operating schedules.

Expert Rating Normalisation

Standardise the proprietary Frommers star rating system into queryable numerical fields.

Geo-coordinate Extraction

Intercept map API payloads to extract exact latitude and longitude coordinates for POIs.

Itinerary Parsing

Deconstruct suggested itineraries into structured day-by-day arrays with linked POIs.

Article & Blog Mining

Scrape travel tips, news, and best-of lists with author attribution and publication dates.

Content Change Detection

Monitor guide update timestamps to only re-scrape content that has been modified by editors.
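
To make the Expert Rating Normalisation capability above concrete, here is a minimal sketch of turning a raw star rating into a numeric field. It assumes the rating arrives either as star glyphs or as a label such as "2 stars", and that the scale runs from 0 to 3. The input formats, the `MAX_STARS` bound, and the `normalise_rating` helper are illustrative assumptions, not the production implementation.

```python
import re
from typing import Optional

MAX_STARS = 3  # assumed upper bound of the star scale


def normalise_rating(raw: Optional[str]) -> Optional[int]:
    """Convert a raw star rating string into an integer, or None if unparseable."""
    if raw is None:
        return None
    text = raw.strip()
    if not text:
        return None
    # Case 1: star glyphs such as "★★"
    stars = text.count("★")
    if stars:
        return stars if stars <= MAX_STARS else None
    # Case 2: a single standalone digit such as "2 stars" or "3"
    match = re.search(r"\b(\d)\b", text)
    if match:
        value = int(match.group(1))
        return value if value <= MAX_STARS else None
    return None
```

Out-of-range or unparseable values come back as None rather than being clamped, so they surface in null-rate checks instead of silently skewing averages.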

// engagement pipeline

From target destinations to warehouse records

Brief in. Clean data out.

Define Scope
Day 0

Provide specific continents, countries, or cities. We design the extraction schema together.

Pipeline Build
Days 2–4

We configure Scrapy crawlers, handle pagination, and map the destination taxonomy.

Validation & QA
Days 4–6

Schema validation, null-rate checks, and nested relationship mapping before full launch.

Delivery
ongoing

JSON, CSV, or Parquet pushed to your S3 bucket or warehouse on an agreed cadence.
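
The null-rate checks mentioned in the Validation & QA step can be sketched roughly as follows. This is an illustration only: the `null_rates` and `failing_fields` helpers, the treatment of empty strings and empty lists as nulls, and the 5% default threshold are all assumptions.

```python
from collections import Counter
from typing import Any, Iterable


def null_rates(records: Iterable[dict[str, Any]], fields: Iterable[str]) -> dict[str, float]:
    """Return the fraction of records where each field is missing or empty."""
    fields = tuple(fields)
    missing: Counter = Counter()
    total = 0
    for record in records:
        total += 1
        for field in fields:
            # Assumption: None, "" and [] all count as null.
            if record.get(field) in (None, "", []):
                missing[field] += 1
    if total == 0:
        return {field: 0.0 for field in fields}
    return {field: missing[field] / total for field in fields}


def failing_fields(rates: dict[str, float], threshold: float = 0.05) -> list[str]:
    """List fields whose null rate exceeds the threshold, sorted by name."""
    return sorted(field for field, rate in rates.items() if rate > threshold)
```

A run could be blocked from launch whenever `failing_fields` returns a non-empty list for the fields a client marks as required.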

Under the hood

How we handle complex travel data structures

Extracting editorial travel guides requires structural awareness. Here is how we parse unstructured text and nested hierarchies.

pipeline-monitor · frommers.com · live (active)
// fingerprinting
Identity rotation
TLS fingerprint: randomised
User-agent: rotated
IP pool: residential
Challenges blocked: 0
// pagination
Page coverage: 48,291 pages queued (running)
// observability
Pipeline health
Uptime: 99.9%
p99 latency: 142ms
Null rate: 0.3%
Alerts: 2

Hierarchical crawling
Maintaining parent-child destination relationships

Travel guides rely on taxonomy. We inject parent destination IDs into every child POI record, ensuring your database understands that the Colosseum belongs to Ancient Rome, which belongs to Rome, which belongs to Italy.
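
A minimal sketch of the parent-ID idea described above: each destination points at its parent through `parent_destination_id`, and a POI is enriched by walking that chain. The destination table, the IDs "D-1" and "D-2", and the `destination_path` output field are hypothetical and exist only for illustration.

```python
from typing import Any, Optional

# Hypothetical destination table keyed by destination_id.
# Only "D-18492" appears in the sample data; the other IDs are invented.
DESTINATIONS: dict[str, dict[str, Optional[str]]] = {
    "D-1": {"name": "Italy", "parent_destination_id": None},
    "D-18492": {"name": "Rome", "parent_destination_id": "D-1"},
    "D-2": {"name": "Ancient Rome", "parent_destination_id": "D-18492"},
}


def ancestor_chain(destination_id: str, destinations: dict) -> list[str]:
    """Return destination IDs from the given node up to the root."""
    chain: list[str] = []
    seen: set[str] = set()
    current: Optional[str] = destination_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at {current}")
        seen.add(current)
        chain.append(current)
        current = destinations[current]["parent_destination_id"]
    return chain


def attach_hierarchy(poi: dict[str, Any], destinations: dict) -> dict[str, Any]:
    """Copy a POI record and add the names of all ancestor destinations."""
    enriched = dict(poi)
    enriched["destination_path"] = [
        destinations[d]["name"] for d in ancestor_chain(poi["destination_id"], destinations)
    ]
    return enriched
```

With the table above, a Colosseum record whose `destination_id` is "D-2" gains the path Ancient Rome, Rome, Italy.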

Unstructured text parsing
Extracting entities from editorial paragraphs

Frommers relies heavily on prose. We use custom regex pipelines and NLP post-processing to extract implicit amenities, opening hours, and pricing details buried within expert review text.
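
As a rough illustration of the regex stage only (the NLP post-processing is not shown), the sketch below pulls opening hours, prices, and a few amenities out of a review paragraph. The patterns, the amenity keyword table, and the `extract_entities` helper are assumptions chosen for the example, not the actual pipeline rules.

```python
import re

# Matches ranges such as "9am to 7:30pm" or "10 am - 6 pm".
HOURS_RE = re.compile(
    r"\b(\d{1,2}(?::\d{2})?\s*(?:am|pm))\s*(?:-|–|to)\s*(\d{1,2}(?::\d{2})?\s*(?:am|pm))\b",
    re.IGNORECASE,
)
# Matches "€16", "$12.50" or "16 EUR".
PRICE_RE = re.compile(
    r"(?:[€$£]\s?\d+(?:\.\d{2})?)|(?:\b\d+(?:\.\d{2})?\s?(?:EUR|USD|GBP)\b)"
)
# Hypothetical keyword table mapping phrases to normalised amenity tags.
AMENITY_KEYWORDS = {
    "rooftop terrace": "rooftop_terrace",
    "spa": "spa",
    "free wi-fi": "wifi",
    "pool": "pool",
}


def extract_entities(review_text: str) -> dict[str, list[str]]:
    """Extract opening hours, prices, and amenity tags from free text."""
    lowered = review_text.lower()
    hours = [f"{start} to {end}" for start, end in HOURS_RE.findall(review_text)]
    prices = PRICE_RE.findall(review_text)
    amenities = sorted(
        tag
        for phrase, tag in AMENITY_KEYWORDS.items()
        # Word boundaries stop "spa" from matching inside "space".
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered)
    )
    return {"opening_hours": hours, "prices": prices, "amenities": amenities}
```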

Map data extraction
Intercepting background API calls for coordinates

Exact lat/long coordinates are often hidden within dynamic map renders. Our Playwright instances intercept the background XHR requests to extract precise spatial data for every POI.
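
The interception step could look something like the sketch below, which uses Playwright's synchronous API to listen for XHR and fetch responses and scans any JSON bodies for coordinate pairs. The payload key names `latitude` and `longitude` and both helper functions are assumptions; real map payloads may use different keys or nesting.

```python
from typing import Any, Iterator


def find_coordinates(payload: Any) -> Iterator[tuple[float, float]]:
    """Recursively yield (latitude, longitude) pairs from decoded JSON."""
    if isinstance(payload, dict):
        lat, lng = payload.get("latitude"), payload.get("longitude")
        if isinstance(lat, (int, float)) and isinstance(lng, (int, float)):
            yield float(lat), float(lng)
        for value in payload.values():
            yield from find_coordinates(value)
    elif isinstance(payload, list):
        for item in payload:
            yield from find_coordinates(item)


def capture_coordinates(url: str) -> list[tuple[float, float]]:
    """Load a page and collect coordinates from background JSON responses."""
    # Imported lazily so the parsing helper works without Playwright installed.
    from playwright.sync_api import sync_playwright

    found: list[tuple[float, float]] = []

    def on_response(response) -> None:
        if response.request.resource_type not in ("xhr", "fetch"):
            return
        try:
            found.extend(find_coordinates(response.json()))
        except Exception:
            pass  # body was not JSON or could not be read

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.on("response", on_response)
        page.goto(url, wait_until="networkidle")
        browser.close()
    return found
```

Keeping the payload parsing separate from the browser code means the extraction logic can be unit-tested against saved payloads without launching a browser.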

Change detection
Only updating modified guides

Travel content updates sporadically. We monitor publication timestamps and maintain a hash index of last-seen values, pushing only diffs to reduce downstream processing load.
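
The hash-index part of this approach can be sketched as below. It assumes records are keyed by `poi_id`, that `scraped_at` should be ignored when comparing versions, and that the index is a plain in-memory dict; a production index would more likely live in a store such as Redis or PostgreSQL. The helper names are hypothetical.

```python
import hashlib
import json
from typing import Any


def record_hash(record: dict[str, Any], ignore: tuple[str, ...] = ("scraped_at",)) -> str:
    """Hash a record's content, skipping fields that change on every crawl."""
    stable = {k: v for k, v in record.items() if k not in ignore}
    encoded = json.dumps(stable, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def diff_records(
    records: list[dict[str, Any]],
    hash_index: dict[str, str],
    key: str = "poi_id",
) -> list[dict[str, Any]]:
    """Return only new or modified records and update the index in place."""
    changed = []
    for record in records:
        digest = record_hash(record)
        if hash_index.get(record[key]) != digest:
            hash_index[record[key]] = digest
            changed.append(record)
    return changed
```

Sorting keys before hashing keeps the digest stable when field order varies between crawls.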

Pagination handling
Deep crawling of POI lists

Major cities have hundreds of hotels and restaurants split across complex pagination structures. Our crawlers traverse every page to ensure complete catalogue extraction.
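
A simplified, framework-free sketch of exhaustive pagination follows. It assumes each fetched page can be reduced to a dict with `items` and an optional `next_url`; that shape, the `fetch_page` callable, and the `max_pages` guard are illustrative assumptions. In a Scrapy spider the same loop would typically be expressed by following the next-page link from the parse callback.

```python
from typing import Any, Callable, Iterator, Optional


def crawl_all_pages(
    start_url: str,
    fetch_page: Callable[[str], dict[str, Any]],
    max_pages: int = 10_000,
) -> Iterator[Any]:
    """Yield every item across a paginated list, following next_url links."""
    seen: set[str] = set()
    url: Optional[str] = start_url
    # Stop on the last page, on a repeated URL (loop guard), or at the page cap.
    while url is not None and url not in seen and len(seen) < max_pages:
        seen.add(url)
        page = fetch_page(url)
        yield from page["items"]
        url = page.get("next_url")
```

The `seen` set guards against pagination loops, which can occur when a last page links back to an earlier one.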

Applications

Who uses Frommers data and how

Teams across industries use frommers.com data to build competitive products and smarter operations.

01
OTA & Booking Platforms

Enrich existing hotel and attraction listings with trusted editorial reviews and expert ratings to increase conversion.

02
Travel Planning Apps

Feed structured POI data and suggested itineraries into proprietary trip planning algorithms and AI assistants.

03
Location Intelligence

Map the density of highly rated restaurants and attractions to evaluate real estate or retail opportunities.

04
Market Research

Analyse destination popularity trends and pricing tiers across different global regions.

05
Content Aggregators

Build comprehensive travel portals by combining Frommers expert reviews with user-generated content from other sources.

06
AI Training Data

Train large language models on high-quality, professionally edited travel writing and destination descriptions.

Why DataFlirt

"Frommers holds decades of curated travel intelligence, but accessing it programmatically requires parsing deeply nested destination hierarchies."

Extracting travel guides at scale involves more than simple HTTP requests. It requires maintaining complex parent-child relationships across continents, countries, and cities while parsing unstructured expert reviews into clean, queryable fields. DataFlirt handles this structural complexity natively.

Technical Spec

Frommers scraper technical capabilities

Everything supported by our frommers.com scraper: rendered dynamic elements, map API interception, rate-limit handling, and change detection. Content behind authentication is out of scope.

Destination taxonomy mapping
Maintains exact hierarchical relationships from continent down to neighborhood
Supported
Expert review extraction
Captures full editorial text and normalises proprietary star ratings
Supported
Map API interception
Extracts exact latitude and longitude from dynamic map payloads
Supported
Price tier capture
Records qualitative price indicators for normalisation
Supported
Change detection (diffs)
Monitors update timestamps to only emit modified records
Supported
Webhook delivery
HTTP POST per record or batch integration
Supported
User account profiles
Personal saved trips and user profile data are gated behind authentication and are not scraped
Not supported
Bookings and transactions
Third-party booking integrations and live availability are not scraped
Not supported
Infrastructure

Infrastructure powering the pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus

Scrapy + Playwright Stack

Scrapy handles crawl orchestration and dependency mapping. Playwright executes JavaScript to intercept map APIs and dynamic content.

Residential Proxy Infrastructure

We route requests through ISP-grade residential IPs to prevent rate limiting during deep catalogue traversal.

Cloud-Native Orchestration

Airflow handles scheduling and dependency management, with crawl workloads running on AWS Lambda and ECS for sustained throughput.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON
Nested structures perfect for hierarchical destination data
CSV
Flat file with typed columns for POI lists
XLS
Excel compatible output for editorial teams
Parquet
Columnar format for BigQuery and Snowflake
AWS S3
Direct bucket delivery on an automated schedule, compatible with any data lake
Webhook
HTTP POST per record for immediate ingestion
API
REST endpoints to query extracted datasets
PostgreSQL
Direct upsert into your relational database
// faq

Common questions.

About frommers.com scraping, legality, and pipeline operations.

Is scraping Frommers legal?

Scraping publicly available travel guides and POI data is generally permissible. We target only public, non-authenticated editorial content. We do not circumvent authentication walls or extract personal data. Clients should consult their own legal counsel regarding specific commercial use cases.

How do you handle destination hierarchies?

We build a relational map during the crawl. Every POI (hotel, restaurant, attraction) is tagged with a parent destination ID, allowing you to easily query all locations within a specific city, region, or country.

How fresh is the data?

Travel guides do not require real-time streaming. We typically run full catalogue refreshes on a weekly or monthly cadence, relying on publication timestamps to detect changes and emit diffs.

Can you extract precise geo-coordinates?

Yes. While coordinates are sometimes missing from the raw HTML, our Playwright integration intercepts the background map API calls to extract accurate latitude and longitude for mapping applications.

What is the minimum viable engagement?

We typically scope engagements starting at a single continent or major country level. Contact us with your target regions for a specific volume estimate.

Can I request a sample dataset?

Yes. We provide a sample extraction of a single major city (e.g. Rome or Paris) including all associated POIs and itineraries so you can validate the schema before committing.

$ dataflirt scope --new-project --source=frommers.com ready

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a specific country guide or the entire global catalogue, we scope, build, and operate the pipeline. Tell us what you need.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h