
French wedding data,
at warehouse scale.

We extract venue catalogues, vendor pricing models, capacity metrics, and review corpora from Mariages.net. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.

Vendors extracted: 48,219 /run
Reviews processed: 312K /run
Price updates: 14.2K /week
Active pipelines: 18
Uptime: 99.98%

Data Dictionary

Every field we extract from mariages.net

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Venues objects from mariages.net. All fields typed and schema-versioned.

vendor_id, name, category, region, city, address, capacity_min, capacity_max, price_starting, spaces_available, accommodation, menu_types, rating, review_count, url

venues · 200 OK
{
  "vendor_id": "v149201",
  "name": "Chateau de la Ligne",
  "capacity_max": 250,
  "price_starting": 4500.0,
  "rating": 4.9,
  "review_count": 84,
  "region": "Gironde"
}
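
As a rough illustration of what a typed venue record could look like on the consuming side, the sketch below models it as a Python dataclass. Only the field names come from the list above. The class name `Venue`, the `from_json` helper, the choice of which fields are optional, and the types given to `spaces_available`, `accommodation`, and `menu_types` are assumptions made for the example.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class Venue:
    """Illustrative venue record; field names follow the list above."""
    vendor_id: str
    name: str
    category: Optional[str] = None
    region: Optional[str] = None
    city: Optional[str] = None
    address: Optional[str] = None
    capacity_min: Optional[int] = None
    capacity_max: Optional[int] = None
    price_starting: Optional[float] = None
    spaces_available: Optional[str] = None   # assumed type
    accommodation: Optional[bool] = None     # assumed type
    menu_types: Optional[str] = None         # assumed type
    rating: Optional[float] = None
    review_count: Optional[int] = None
    url: Optional[str] = None

    @classmethod
    def from_json(cls, payload: str) -> "Venue":
        """Parse one JSON object, ignoring keys that are not in the schema."""
        raw = json.loads(payload)
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in raw.items() if k in known})


sample = '''{"vendor_id": "v149201", "name": "Chateau de la Ligne",
"capacity_max": 250, "price_starting": 4500.0, "rating": 4.9,
"review_count": 84, "region": "Gironde"}'''
venue = Venue.from_json(sample)
```

Fields missing from a record stay `None` rather than raising, which keeps partially filled profiles loadable.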

Complete list of extractable fields for Vendors objects from mariages.net. All fields typed and schema-versioned.

vendor_id, name, category, sub_category, region, city, price_starting, rating, review_count, response_time, awards_won, description, faq_count, url

vendors · 200 OK
{
  "vendor_id": "p83912",
  "name": "Studio Marie Photographie",
  "category": "Photo et vidéo",
  "price_starting": 1200.0,
  "response_time": "24h",
  "rating": 5.0,
  "review_count": 42
}

Complete list of extractable fields for Reviews objects from mariages.net. All fields typed and schema-versioned.

review_id, vendor_id, author_name, wedding_date, rating_overall, rating_quality, rating_response, rating_value, rating_flexibility, review_text, response_text, date_posted

reviews · 200 OK
{
  "review_id": "r991023",
  "vendor_id": "v149201",
  "rating_overall": 5.0,
  "review_text": "Lieu magique pour notre mariage. Le domaine est magnifiquement entretenu.",
  "wedding_date": "2025-06-14",
  "date_posted": "2025-07-02"
}

Complete list of extractable fields for Promotions objects from mariages.net. All fields typed and schema-versioned.

offer_id, vendor_id, title, discount_pct, discount_abs, description, conditions, valid_until, claim_count, url

promotions · 200 OK
{
  "offer_id": "o4412",
  "vendor_id": "p83912",
  "title": "Remise Hivernale",
  "discount_pct": 10,
  "valid_until": "2026-03-31",
  "conditions": "Valable pour les mariages entre novembre et mars."
}

Complete list of extractable fields for Search Rankings objects from mariages.net. All fields typed and schema-versioned.

keyword, region, category, position, vendor_id, name, rating, review_count, featured_badge, price_starting, scraped_at

search_rankings · 200 OK
{
  "keyword": "traiteur mariage",
  "region": "Ile-de-France",
  "position": 3,
  "vendor_id": "c55190",
  "name": "Gourmet Reception",
  "featured_badge": true,
  "scraped_at": "2026-05-12T08:14:00Z"
}

Capabilities

Extract the French wedding economy

Our Mariages.net scraper captures the entire vendor directory: venue specifications, dynamic pricing, promotional offers, and the review corpus. We handle pagination, layout variations, and regional filtering.

Venue Capacity & Specs

Extract maximum guest counts, available spaces (indoor/outdoor), accommodation availability, and catering restrictions for every listed venue.

Pricing Baselines

Capture starting prices, menu costs per head, and package structures across all vendor categories.

Review & Reputation Mining

Full review text, category ratings (quality, response, value, flexibility), wedding dates, and vendor replies, collected from every review page of every listing.

Vendor Responsiveness

Track stated response times and engagement metrics, useful for identifying active versus dormant vendor profiles.

Wedding Awards Tracking

Extract historical Wedding Awards badges and recognition years to identify top-performing regional vendors.

Promotional Offers

Monitor active discounts, percentage reductions, and specific terms and conditions tied to seasonal bookings.

Regional Coverage

Crawl by specific departments or regions to build targeted datasets for local market analysis.

FAQ Extraction

Extract structured Q&A pairs from vendor profiles covering payment terms, travel policies, and minimum booking requirements.

Scheduled Updates

Configure pipelines at monthly or quarterly cadences to track new vendor registrations and pricing adjustments over time.

// engagement pipeline

From category list to warehouse record

Brief in. Clean data out.

Define Scope · day 0

Provide target regions, vendor categories, or specific profile URLs. We design the extraction schema together.

Pipeline Build · days 2–4

We configure Scrapy crawlers, handle regional pagination, and manage request routing to bypass rate limits.

Validation & QA · days 4–6

Schema validation, null-rate checks, and location normalisation before full launch.

Delivery · ongoing

JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.
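
The null-rate check mentioned under Validation & QA could be sketched roughly as follows. The function names, the 5% threshold, and the rule that an empty string counts as missing are assumptions for illustration, not a description of the production checks.

```python
from typing import Iterable


def null_rates(records: Iterable[dict], required: list[str]) -> dict[str, float]:
    """Return the share of records where each required field is missing or empty."""
    rows = list(records)
    if not rows:
        return {field: 0.0 for field in required}
    rates = {}
    for field in required:
        missing = sum(1 for row in rows if row.get(field) in (None, ""))
        rates[field] = missing / len(rows)
    return rates


def failing_fields(rates: dict[str, float], threshold: float = 0.05) -> list[str]:
    """List fields whose null rate exceeds the (assumed) threshold."""
    return sorted(f for f, r in rates.items() if r > threshold)


batch = [
    {"vendor_id": "v1", "price_starting": 4500.0},
    {"vendor_id": "v2", "price_starting": None},
    {"vendor_id": "v3"},
    {"vendor_id": "v4", "price_starting": 1200.0},
]
rates = null_rates(batch, ["vendor_id", "price_starting"])
```

A batch whose `failing_fields` list is non-empty would be held back for review instead of being delivered.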

Under the hood

Navigating the Mariages.net architecture

Directory scraping involves deep pagination and inconsistent vendor profile structures. Here is how we ensure data completeness.

pipeline-monitor · mariages.net · live · active
// fingerprinting
Identity rotation
TLS fingerprint: randomised
User-agent: rotated
IP pool: residential
Challenges blocked: 0
// pagination
Page coverage: 48,291 pages queued (running)
// observability
Pipeline health
Uptime: 99.9%
p99 latency: 142ms
Null rate: 0.3%
Alerts: 2

Deep pagination
Exhaustive category crawling

Directory categories often span hundreds of pages. We implement stateful crawlers that handle pagination tokens and parameter variations to ensure no vendor is missed deep in the results.
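
A minimal sketch of token-following pagination with de-duplication is shown below. The page shape (a list of vendor dicts plus a next-page token) and the injected `fetch_page` callable are hypothetical, since the actual response format is not described here.

```python
from typing import Callable, Iterator, Optional

# Hypothetical page shape: (list of vendor dicts, next-page token or None).
Page = tuple[list[dict], Optional[str]]


def crawl_category(fetch_page: Callable[[Optional[str]], Page],
                   max_pages: int = 1000) -> Iterator[dict]:
    """Yield each vendor once, following next-page tokens until exhausted."""
    seen_ids: set[str] = set()
    seen_tokens: set[Optional[str]] = set()
    token: Optional[str] = None  # None stands for the first page
    for _ in range(max_pages):
        if token in seen_tokens:  # guard against a token that loops back
            break
        seen_tokens.add(token)
        vendors, next_token = fetch_page(token)
        for vendor in vendors:
            if vendor["vendor_id"] not in seen_ids:
                seen_ids.add(vendor["vendor_id"])
                yield vendor
        if not vendors or next_token is None:
            break
        token = next_token
```

Tracking both seen vendor IDs and seen tokens means a listing that repeats across pages, or a pagination link that points backwards, does not inflate or stall the crawl.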

Profile variations
Handling inconsistent vendor layouts

A photographer profile differs structurally from a venue profile. Our schema uses conditional parsing logic based on the vendor sub-category to normalise fields like capacity, pricing, and amenities.
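
One way to picture conditional parsing is a dispatch table keyed on sub-category, as in the sketch below. The sub-category keys (`"venue"`, `"photographer"`), the parser names, and the raw-record shape are all hypothetical.

```python
from typing import Callable

COMMON_FIELDS = ("vendor_id", "name", "sub_category", "price_starting")


def _common(raw: dict) -> dict:
    """Copy the fields every profile type is expected to share."""
    return {key: raw.get(key) for key in COMMON_FIELDS}


def parse_venue(raw: dict) -> dict:
    record = _common(raw)
    record["capacity_max"] = raw.get("capacity_max")
    return record


def parse_service(raw: dict) -> dict:
    record = _common(raw)
    record["capacity_max"] = None  # capacity does not apply to service vendors
    return record


# Hypothetical sub-category keys mapped to their parser.
PARSERS: dict[str, Callable[[dict], dict]] = {
    "venue": parse_venue,
    "photographer": parse_service,
}


def normalise(raw: dict) -> dict:
    """Pick a parser by sub-category so every record ends up with the same keys."""
    parser = PARSERS.get(raw.get("sub_category"), parse_service)
    return parser(raw)
```

Every output record carries the same set of keys, so downstream tables keep one schema regardless of the profile type.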

JavaScript interactions
Revealing hidden contact data

Certain fields, such as phone numbers or full descriptions, require click-to-reveal interactions. We use Playwright to trigger these DOM events and extract the hydrated data.
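
A click-to-reveal step might look roughly like the sketch below. It relies only on `click`, `wait_for_selector`, and `inner_text` from Playwright's synchronous `Page` API. The two CSS selectors are hypothetical placeholders, and the function accepts any object exposing those three methods so it can be exercised without a browser.

```python
from typing import Optional, Protocol


class PageLike(Protocol):
    """The subset of Playwright's sync Page API this sketch relies on."""
    def click(self, selector: str) -> None: ...
    def wait_for_selector(self, selector: str, timeout: float = ...) -> object: ...
    def inner_text(self, selector: str) -> str: ...


# Hypothetical selectors; the real markup would need to be inspected.
REVEAL_BUTTON = "button.reveal-phone"
PHONE_NODE = "span.phone-number"


def reveal_phone(page: PageLike, timeout_ms: float = 5000) -> Optional[str]:
    """Click the reveal control, wait for the number to render, return its text."""
    try:
        page.click(REVEAL_BUTTON)
        page.wait_for_selector(PHONE_NODE, timeout=timeout_ms)
        return page.inner_text(PHONE_NODE).strip() or None
    except Exception:  # Playwright raises its own Error/TimeoutError; kept broad in this sketch
        return None
```

With a real browser, the `page` argument would come from `sync_playwright()`, `chromium.launch()`, and `browser.new_page()`, after a `page.goto(...)` to the profile URL.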

Rate limiting
Distributed request routing

Aggressive crawling triggers IP blocks. We distribute requests across French residential and mobile proxy pools, matching local user request patterns to maintain high throughput.
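
The retry-through-another-IP idea can be sketched as a small rotation loop. The `Blocked` exception, the injected `fetch` callable, and the backoff values are assumptions for illustration. A real deployment would plug in an HTTP client and a managed proxy pool.

```python
import itertools
import random
import time
from typing import Callable, Optional, Sequence


class Blocked(Exception):
    """Raised by the (hypothetical) fetch function when a proxy is blocked."""


def fetch_with_rotation(url: str,
                        proxies: Sequence[str],
                        fetch: Callable[[str, str], str],
                        max_attempts: int = 3,
                        base_delay: float = 1.0,
                        sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Try the URL through successive proxies, pausing with jitter between attempts."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    pool = itertools.cycle(proxies)
    for attempt in range(max_attempts):
        proxy = next(pool)
        try:
            return fetch(url, proxy)
        except Blocked:
            # Exponential backoff plus random jitter before switching proxy.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
    return None
```

Passing `sleep` as a parameter keeps the delays real in production while letting the logic be checked instantly in a dry run.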

Data normalisation
Cleaning unstructured text

Vendor descriptions and FAQs contain varied formatting. We strip HTML, normalise whitespace, and standardise price formats before the data reaches your warehouse.
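
The cleaning steps named above could be approximated with the standard library alone, as in this sketch. It assumes prices are displayed French-style (spaces as thousands separators, a comma before the decimals) and does not handle dot-separated thousands. The function names are illustrative.

```python
import html
import re
from typing import Optional

TAG_RE = re.compile(r"<[^>]+>")
# A digit, then digits or whitespace (including no-break spaces), then optional ",dd".
PRICE_RE = re.compile(r"\d[\d\s]*(?:,\d{1,2})?")


def clean_text(fragment: str) -> str:
    """Drop tags, decode entities, and collapse runs of whitespace."""
    no_tags = TAG_RE.sub(" ", fragment)
    return " ".join(html.unescape(no_tags).split())


def parse_price(text: str) -> Optional[float]:
    """Parse a French-formatted price such as '1 200,50 €' into a float."""
    match = PRICE_RE.search(text)
    if not match:
        return None
    digits = re.sub(r"\s", "", match.group(0)).replace(",", ".")
    return float(digits)
```

Text with no digits at all, such as a "price on request" note, yields `None` rather than a misleading zero.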

Applications

Who uses Mariages.net data

Teams across industries use mariages.net data to build competitive products and smarter operations.

01 · B2B Lead Generation

Wholesale suppliers, software vendors, and insurance providers extract vendor details to build targeted outreach lists across France.

02 · Market & Pricing Analysis

Event planners and venue operators analyse regional pricing baselines and package structures to optimise their own commercial positioning.

03 · Venue Investment

Real estate and private equity firms track venue density, review velocity, and capacity constraints to identify underserved regions for acquisition.

04 · Competitor Intelligence

Caterers and photographers monitor competitor promotional offers, response times, and new award acquisitions.

05 · Sentiment Analysis

Aggregating review corpora allows hospitality groups to identify common pain points in specific vendor categories.

06 · Directory Aggregation

Global event platforms sync French vendor data to build out local market presence and enrich their own catalogues.

Why DataFlirt

"Mariages.net holds the definitive dataset for the French wedding economy, mapping out pricing, capacity, and reputation across tens of thousands of local businesses."

Extracting this data reliably requires handling aggressive rate limits, JavaScript-rendered contact details, and deeply paginated directory structures. DataFlirt manages the proxy rotation and session handling, delivering structured regional vendor intelligence directly to your warehouse.

Technical Spec

Mariages.net scraper — technical capabilities

Everything supported by our mariages.net scraper — rendered SPA elements, auth walls, rate-limit evasion and beyond.

Capability | Details | Status
JavaScript rendering | Playwright sessions required for click-to-reveal elements and dynamic maps | Supported
Review pagination | Extraction of all historical reviews, not just the front page | Supported
Phone number reveal | Triggering DOM events to capture unmasked vendor contact numbers | Supported
Geospatial extraction | Capture of latitude/longitude coordinates from embedded map widgets | Supported
Promo code extraction | Capture of active discounts and seasonal offer terms | Supported
Wedding Awards history | Extraction of all historical badges and award years | Supported
Category rank tracking | Recording vendor position within specific regional search queries | Supported
Private user messages | Direct messages sent between users and vendors via the platform portal | Partial
Vendor backend analytics | Profile view counts, lead conversion rates, and internal dashboard metrics | Partial

Infrastructure

Infrastructure powering the pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus

Scrapy + Playwright Stack

Scrapy handles crawl orchestration, deduplication, and retry logic. Playwright handles JavaScript rendering and interaction flows for hidden data.
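
For readers unfamiliar with Scrapy, the retry, throttling, and de-duplication behaviour described here is driven by settings such as the ones below. The setting names are standard Scrapy settings. The values are illustrative guesses and not DataFlirt's actual configuration.

```python
# Illustrative values only; could be used as a spider's `custom_settings`.
SCRAPY_SETTINGS = {
    "CONCURRENT_REQUESTS_PER_DOMAIN": 4,
    "DOWNLOAD_DELAY": 1.5,               # seconds between requests to one domain
    "RANDOMIZE_DOWNLOAD_DELAY": True,    # vary the delay around DOWNLOAD_DELAY
    "AUTOTHROTTLE_ENABLED": True,        # adapt the delay to server latency
    "AUTOTHROTTLE_START_DELAY": 1.0,
    "AUTOTHROTTLE_MAX_DELAY": 30.0,
    "RETRY_ENABLED": True,
    "RETRY_TIMES": 3,                    # retries in addition to the first attempt
    "RETRY_HTTP_CODES": [429, 500, 502, 503, 504],
    "DUPEFILTER_CLASS": "scrapy.dupefilters.RFPDupeFilter",  # Scrapy's default request de-duplication
}
```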

Proxy Infrastructure

We route requests through French residential IPs to match expected geographic traffic patterns and avoid automated blocking.

Cloud-Native Orchestration

Pipelines run on AWS Lambda and ECS. Airflow handles scheduling, dependency management, and SLA alerting. All state stored in managed Postgres.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON: Newline-delimited or nested lists
CSV: Flat file with typed columns
XLS: Excel format for immediate business use
Parquet: Columnar format for data warehouses
AWS S3: Direct bucket delivery, compatible with any data lake
Webhook: HTTP POST per record
API: REST endpoint for programmatic access
BigQuery: Streamed directly into your dataset
Snowflake: Stage and COPY INTO workflow
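
To make the two simplest formats concrete, here is a small sketch that serialises records as newline-delimited JSON and as flat CSV using only the standard library. The helper names and the fixed column order are assumptions for the example.

```python
import csv
import io
import json
from typing import Iterable


def to_ndjson(records: Iterable[dict]) -> str:
    """One JSON object per line; ensure_ascii=False keeps accented characters readable."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


def to_csv(records: list[dict], columns: list[str]) -> str:
    """Flat CSV with a fixed column order; missing values become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


records = [{"vendor_id": "p83912", "name": "Studio Marie Photographie",
            "category": "Photo et vidéo"}]
ndjson = to_ndjson(records)
flat = to_csv(records, ["vendor_id", "rating"])
```

Newline-delimited output suits streaming loads into a warehouse, while the CSV path drops keys outside the agreed column list so the file layout stays stable.
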
// faq

Common questions.

About mariages.net scraping, legality, and pipeline operations.

Is scraping Mariages.net legal?

Scraping publicly available vendor information and reviews is generally permissible under applicable law, provided it does not breach specific copyright protections or extract personally identifiable information of private users. We target public business directory data. Clients should review terms of service and consult legal counsel for specific use cases.

Can you extract direct email addresses?

Mariages.net primarily uses internal contact forms rather than exposing direct email addresses. We extract phone numbers, website URLs, and physical addresses. Emails are only captured if a vendor explicitly writes them in their public description.

Do you support other Knot Worldwide properties?

Yes. The underlying pipeline architecture can be adapted for Bodas.net, Matrimonio.com, WeddingWire, and TheKnot, allowing you to build unified datasets across multiple European and US markets.

How do you handle rate limits?

We use French residential proxy pools and enforce request delays that mimic human browsing patterns. If a block occurs, the request is automatically retried through a clean IP address.

How fresh is the data?

For full directory crawls, we typically recommend a monthly or quarterly cadence. Delta updates for specific regional subsets or targeted vendor lists can be configured to run daily or weekly.

Can you track vendor search rankings?

Yes. We can input specific keywords and regions to track which vendors appear in top positions, capturing featured badges and organic rank over time.
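
Rank movement between two crawls of the same keyword and region could be derived as sketched below. The records use the `vendor_id` and `position` fields from the search_rankings object. The function name and the sign convention (positive means the vendor moved up) are assumptions.

```python
from typing import Optional


def rank_changes(previous: list[dict], current: list[dict]) -> dict[str, Optional[int]]:
    """Map vendor_id to positions gained (positive) or lost (negative).

    Vendors absent from the previous snapshot map to None (new entrant).
    """
    before = {row["vendor_id"]: row["position"] for row in previous}
    changes: dict[str, Optional[int]] = {}
    for row in current:
        old = before.get(row["vendor_id"])
        changes[row["vendor_id"]] = None if old is None else old - row["position"]
    return changes


last_week = [{"vendor_id": "c55190", "position": 5}, {"vendor_id": "c1", "position": 1}]
this_week = [{"vendor_id": "c55190", "position": 3}, {"vendor_id": "c1", "position": 2},
             {"vendor_id": "c9", "position": 4}]
movement = rank_changes(last_week, this_week)
```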

Can I request a sample dataset?

Yes. We provide a sample run of up to 500 vendor profiles as part of the pre-engagement scoping process to validate field completeness and data quality.

$ dataflirt scope --new-project --source=mariages.net ready

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a one-off directory export or continuous tracking of vendor pricing and reviews — we scope, build, and operate the pipeline. Tell us what you need.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h