
RH catalogue data, at warehouse scale.

We extract luxury furniture specifications, fabric and finish matrices, member pricing tiers, and gallery stock from Restoration Hardware. Delivered as clean JSON, CSV, or Parquet to your warehouse.

Products extracted: 84K /run
SKU variations: 1.2M /run
Price updates: 45K /24h
Active pipelines: 31
Uptime: 99.98%

Data Dictionary

Every field we extract from restorationhardware.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Product Specs objects from restorationhardware.com. All fields typed and schema-versioned.

Fields: product_id, name, collection, category, sub_category, regular_price_range, member_price_range, dimensions, care_instructions, materials, overview_text, primary_image_url

Sample product_specs record:
"product_id": "prod123456",
"name": "Maxwell Leather Sofa",
"collection": "Maxwell",
"category": "Living",
"regular_price_range": "4500.00 - 8500.00",
"member_price_range": "3375.00 - 6375.00",
"materials": "Italian Brompton Leather",
"primary_image_url": "https://media.restorationhardware.com/..."

Complete list of extractable fields for SKU Variations objects from restorationhardware.com. All fields typed and schema-versioned.

Fields: sku, parent_product_id, length, depth, fill_type, leather_or_fabric_category, colour, finish, regular_price, member_price, stock_status, lead_time_weeks

Sample sku_variations record:
"sku": "sku987654",
"parent_product_id": "prod123456",
"length": "84 inch",
"depth": "Classic 40 inch",
"fill_type": "Standard",
"leather_or_fabric_category": "Italian Brompton",
"colour": "Cocoa",
"regular_price": 5200.0,
"member_price": 3900.0,
"stock_status": "In Stock"

Complete list of extractable fields for Pricing Data objects from restorationhardware.com. All fields typed and schema-versioned.

Fields: sku, regular_price, member_price, discount_pct, currency, sale_badge, final_sale_flag, shipping_surcharge, price_timestamp

Sample pricing_data record:
"sku": "sku987654",
"regular_price": 5200.0,
"member_price": 3900.0,
"discount_pct": 25,
"currency": "USD",
"sale_badge": false,
"final_sale_flag": false,
"shipping_surcharge": 299.0,
"price_timestamp": "2026-05-12T10:15:00Z"

Complete list of extractable fields for Gallery Inventory objects from restorationhardware.com. All fields typed and schema-versioned.

Fields: gallery_id, gallery_name, address, city, state, zip_code, phone, sku, display_status, pickup_available

Sample gallery_inventory record:
"gallery_id": "gal_042",
"gallery_name": "RH New York, The Gallery",
"city": "New York",
"state": "NY",
"zip_code": "10014",
"sku": "sku987654",
"display_status": "On Display",
"pickup_available": true

Complete list of extractable fields for Source Books objects from restorationhardware.com. All fields typed and schema-versioned.

Fields: book_id, title, year, season, page_number, featured_skus, lifestyle_image_url, shoppable_links

Sample source_books record:
"book_id": "sb_2025_modern",
"title": "RH Modern 2025",
"year": 2025,
"season": "Spring",
"page_number": 42,
"featured_skus": "['sku111', 'sku222']",
"lifestyle_image_url": "https://media.restorationhardware.com/...",
"shoppable_links": 3
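
Because every sku_variations row carries a parent_product_id, the flat object types above can be joined back together downstream. The following is a minimal sketch, assuming records arrive as plain dictionaries shaped like the samples; the function name attach_parent is illustrative and not part of any DataFlirt client library.

```python
def attach_parent(products, variations):
    """Join each SKU variation to its parent product by product_id."""
    by_id = {p["product_id"]: p for p in products}
    joined = []
    for v in variations:
        parent = by_id.get(v["parent_product_id"])
        if parent is None:
            continue  # orphaned SKU: skipped in this sketch, worth logging in practice
        joined.append({**v, "name": parent["name"], "collection": parent["collection"]})
    return joined


# Values copied from the sample records above.
products = [{"product_id": "prod123456", "name": "Maxwell Leather Sofa", "collection": "Maxwell"}]
variations = [{"sku": "sku987654", "parent_product_id": "prod123456", "colour": "Cocoa",
               "regular_price": 5200.0, "member_price": 3900.0}]
rows = attach_parent(products, variations)
```

In a warehouse the same join would normally be a SQL JOIN on parent_product_id; the dictionary lookup here only illustrates the key relationship.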

Capabilities

Extracting the complexity of luxury retail

Restoration Hardware relies on highly nested product configurations. Our pipeline flattens these matrices, capturing every fabric, finish, and dimension permutation alongside dual-tier pricing.

Variant Matrix Flattening

Extract every combination of length, depth, fill, fabric, and finish. We map thousands of child SKUs back to their parent product ID.

Dual-Tier Price Extraction

Capture both Regular and RH Member pricing for every SKU, including clearance and final sale flags.

Dimension & Spec Parsing

Extract structured dimensional data, weight, and material composition from unstructured overview descriptions.

Gallery Display Status

Track which specific SKUs are on display at which physical RH Gallery locations across the country.

High-Res Image Capture

Extract clean URLs for lifestyle imagery, isolated product shots, and high-resolution fabric/finish swatches.

Lead Time & Delivery

Capture estimated shipping windows and freight surcharges per SKU based on destination zip codes.

Source Book Scraping

Map shoppable links and featured SKUs directly from digital RH Source Book pages.

Change Detection

Run continuous pipelines that only emit records when a price changes, a finish is discontinued, or lead times shift.

International Catalogues

Support for RH US, RH Canada, and RH UK storefronts with localised currency and availability.

// engagement pipeline

From catalogue to warehouse record

Brief in. Clean data out.

Define Scope (day 0)

Provide target categories, collections, or specific SKUs. We design the extraction schema for the variant matrices.

Pipeline Build (days 2-4)

We configure Playwright crawlers to handle dynamic fabric selectors and proxy rotation for restorationhardware.com.

Validation & QA (days 4-6)

Schema validation, checking for missing variant permutations, and price-tier accuracy before full launch.

Delivery (ongoing)

JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.
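
The Validation & QA step above mentions price-tier accuracy and missing variant permutations. A rough sketch of what such checks could look like, assuming pricing records shaped like the pricing_data sample; the function names and the 0.5-point tolerance are assumptions made for illustration, not a description of the production checks.

```python
def validate_pricing(record, tolerance=0.5):
    """Return a list of problems found in one pricing record (empty if clean)."""
    problems = [f"missing {f}" for f in ("sku", "regular_price", "member_price")
                if record.get(f) is None]
    if problems:
        return problems
    regular, member = record["regular_price"], record["member_price"]
    if member > regular:
        problems.append("member_price exceeds regular_price")
    stated = record.get("discount_pct")
    if stated is not None and regular > 0:
        implied = (regular - member) / regular * 100
        if abs(implied - stated) > tolerance:
            problems.append(f"discount_pct {stated} does not match implied {implied:.1f}")
    return problems


def missing_permutations(expected_skus, extracted_skus):
    """SKUs that the variant matrix predicts but the run did not capture."""
    return sorted(set(expected_skus) - set(extracted_skus))


sample = {"sku": "sku987654", "regular_price": 5200.0, "member_price": 3900.0, "discount_pct": 25}
issues = validate_pricing(sample)
```

For the sample record the implied discount is (5200 - 3900) / 5200 = 25%, which matches the stated discount_pct, so no problems are reported.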

Under the hood

How our RH pipeline handles the hard parts

Extracting data from RH requires navigating heavy JavaScript selectors and massive variant arrays. Here is how we maintain stability.

pipeline-monitor · restorationhardware.com · live (active)
// fingerprinting
Identity rotation
TLS fingerprint: randomised
User-agent: rotated
IP pool: residential
Challenges blocked: 0
// pagination
Page coverage: 48,291 pages queued, running
// observability
Pipeline health
Uptime: 99.9%
p99 latency: 142ms
Null rate: 0.3%
Alerts: 2

JavaScript rendering
Handling dynamic configuration selectors

RH product pages use complex JavaScript to load pricing and images only after a user selects length, depth, fabric, and colour. We use full Playwright sessions to programmatically iterate through these dropdowns, capturing the specific SKU and price for every permutation.
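
The dropdown iteration described above can be sketched as a loop over option combinations. This is an illustrative outline only: it relies on the select_option and inner_text methods that Playwright's sync Page object provides, but the CSS selectors, option values, and the StubPage stand-in (included so the sketch runs without a browser) are all hypothetical.

```python
from itertools import product


def capture_permutations(page, selectors, options, sku_sel, price_sel):
    """Select every option combination on a page-like object and read SKU and price.

    `page` needs select_option(selector, value) and inner_text(selector),
    which Playwright's sync Page offers.
    """
    names = list(options)
    rows = []
    for combo in product(*(options[n] for n in names)):
        for name, value in zip(names, combo):
            page.select_option(selectors[name], value)
        rows.append({
            **dict(zip(names, combo)),
            "sku": page.inner_text(sku_sel).strip(),
            "price": page.inner_text(price_sel).strip(),
        })
    return rows


class StubPage:
    """Stand-in for a browser page; returns made-up SKU and price text."""

    def __init__(self):
        self.state = {}

    def select_option(self, selector, value):
        self.state[selector] = value

    def inner_text(self, selector):
        key = "-".join(self.state[s] for s in sorted(self.state))
        return f"sku-{key}" if selector == "#sku" else "5200.00"


rows = capture_permutations(
    StubPage(),
    selectors={"length": "#length", "colour": "#colour"},  # hypothetical selectors
    options={"length": ["84", "96"], "colour": ["Cocoa", "Slate"]},
    sku_sel="#sku",
    price_sel="#price",
)
```

Against a real page, each selection would also need a wait for the price and SKU elements to refresh before they are read; that step is omitted here.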

Variant explosion
Managing massive SKU matrices

A single RH sofa can have over 1,500 distinct SKUs based on customisation options. Our crawlers recursively map these option trees, flattening them into relational tables so you can query exact configurations without dealing with nested JSON arrays.
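
To illustrate the flattening step, here is a small recursive sketch. It assumes an option tree in which each node names one attribute and maps each choice to either a further node or a leaf holding SKU-level fields; that tree shape, the function name, and the second and third SKU values are invented for the example.

```python
def flatten_options(tree, path=None):
    """Yield one flat dict per leaf of a nested option tree.

    A node looks like {"attr": name, "choices": {value: child}};
    a leaf is any dict without an "attr" key and holds SKU-level fields.
    """
    path = path or {}
    if "attr" not in tree:  # leaf reached: emit the accumulated choices plus SKU fields
        yield {**path, **tree}
        return
    for value, child in tree["choices"].items():
        yield from flatten_options(child, {**path, tree["attr"]: value})


tree = {
    "attr": "length",
    "choices": {
        "84 inch": {"attr": "colour", "choices": {
            "Cocoa": {"sku": "sku987654"},
            "Slate": {"sku": "sku987655"},
        }},
        "96 inch": {"attr": "colour", "choices": {
            "Cocoa": {"sku": "sku987656"},
        }},
    },
}
rows = list(flatten_options(tree))
```

Each resulting row is one configuration with every chosen attribute as its own column, which is what makes the output queryable as a relational table rather than nested JSON.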

Anti-bot layer
Residential proxy routing

Frequent requests to RH pricing endpoints trigger rate limits and blocks. We route traffic through US-based residential ISP proxies with realistic browser fingerprints to ensure uninterrupted catalogue extraction.

Data normalisation
Standardising dimensions and materials

RH often buries critical specifications in unstructured HTML text blocks. We apply regex and parsing rules to extract clean, standard dimensional fields (width, depth, height) and material tags for downstream analysis.
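
As a simplified example of the regex approach, the sketch below pulls width, depth, and height out of a free-text line. The input format (a number, an optional inch mark, then W, D, or H) and the output field names are assumptions chosen for illustration; real overview text varies and would need more rules.

```python
import re

# Matches e.g. 84"W, 40 in. D, 32 inches H (case-insensitive axis letter).
DIM_PATTERN = re.compile(
    r'(?P<value>\d+(?:\.\d+)?)\s*(?:"|in(?:ch(?:es)?)?\.?)?\s*(?P<axis>[WDH])\b',
    re.IGNORECASE,
)
AXIS_NAMES = {"W": "width", "D": "depth", "H": "height"}


def parse_dimensions(text):
    """Return a dict such as {"width_in": 84.0, ...} for each axis found in text."""
    dims = {}
    for match in DIM_PATTERN.finditer(text):
        field = AXIS_NAMES[match.group("axis").upper()] + "_in"
        dims[field] = float(match.group("value"))
    return dims


dims = parse_dimensions('Overall: 84"W x 40"D x 32"H')
# dims == {"width_in": 84.0, "depth_in": 40.0, "height_in": 32.0}
```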

Monitoring
Detecting schema drift

When RH updates their frontend framework or changes how member pricing is displayed, our observability stack flags the DOM change immediately. We maintain the selectors so your data feed remains uninterrupted.
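
One common signal for a broken selector is a sudden rise in the share of empty values for a field. The sketch below shows that idea only; it is not a description of the observability stack mentioned above, and the function names and the 5% threshold are assumptions.

```python
def null_rates(records, fields):
    """Fraction of records in which each field is missing or None."""
    total = len(records)
    if total == 0:
        return {f: 0.0 for f in fields}
    return {f: sum(1 for r in records if r.get(f) is None) / total for f in fields}


def drift_alerts(records, fields, threshold=0.05):
    """Fields whose null rate exceeds the threshold, sorted by name."""
    rates = null_rates(records, fields)
    return sorted(f for f, rate in rates.items() if rate > threshold)


# Made-up batch in which member_price stops being captured for half the records.
batch = [
    {"sku": "a", "regular_price": 5200.0, "member_price": 3900.0},
    {"sku": "b", "regular_price": 4100.0, "member_price": None},
    {"sku": "c", "regular_price": 6100.0},
    {"sku": "d", "regular_price": 3000.0, "member_price": 2250.0},
]
alerts = drift_alerts(batch, ["sku", "regular_price", "member_price"])
```

Here member_price is empty in two of four records, so it is the only field flagged.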

Applications

Who uses RH data - and how

Teams across industries use restorationhardware.com data to build competitive products and smarter operations.

01
Competitor Price Benchmarking

Luxury furniture retailers monitor RH Member pricing and promotional cadences to adjust their own pricing strategies.

02
Assortment Planning

Merchandising teams analyse RH category depth, tracking the introduction of new collections, fabrics, and finishes.

03
Supply Chain & Lead Time Tracking

Analysts track estimated delivery windows across different fabric grades to infer supply chain bottlenecks and material shortages.

04
Interior Design Catalogue Sync

Design platforms ingest exact dimensions, high-res imagery, and current pricing to populate 3D rendering and procurement software.

05
Luxury Retail Market Research

Private equity and investment analysts monitor gallery expansion and display inventory to evaluate capital expenditure and brand footprint.

06
Material & Trend Analysis

Trend forecasters aggregate fabric, leather, and finish availability to quantify shifts in luxury interior design preferences.

Why DataFlirt

"Restoration Hardware maintains one of the most complex product matrices in retail. Extracting their fabric, finish, and size permutations requires a pipeline built specifically for highly nested variant data."

Most extraction attempts fail on RH due to the sheer volume of SKU permutations per product. A single sofa can have over 1,500 fabric and finish combinations. DataFlirt handles the JavaScript rendering and recursive variant mapping required to output a flat, queryable catalogue without the maintenance overhead.

Technical Spec

RH scraper - technical capabilities

Everything supported by our restorationhardware.com scraper — rendered SPA elements, auth walls, rate-limit evasion and beyond.

JavaScript rendering (Supported): Full Playwright sessions required for dynamic fabric selectors and pricing
Fabric/finish matrix mapping (Supported): Recursive extraction of all child SKUs from parent product configurations
Member pricing extraction (Supported): Capture of both Regular and RH Member price tiers per SKU
Gallery stock by zip code (Supported): Inventory and display status checked against specific gallery locations
High-res image URLs (Supported): Extraction of uncompressed product and lifestyle imagery assets
Source book parsing (Supported): Mapping digital catalogue pages to shoppable product IDs
Change detection (diffs) (Supported): Hash-based diff: only emit records with changed fields since last run
Trade discount account pricing (Partial): Requires authenticated sessions tied to specific trade accounts
User wishlist data (Partial): Extraction of private user saved items and cart configurations
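
The hash-based change detection listed above can be sketched as follows. This is one possible approach under stated assumptions: records are keyed by sku, the previous run's hashes are available as a dictionary, and price_timestamp is excluded from the hash so that a re-scrape with identical prices does not count as a change. None of these choices is confirmed as the production behaviour.

```python
import hashlib
import json


def record_hash(record, ignore=("price_timestamp",)):
    """Stable SHA-256 over a record's fields, skipping volatile ones."""
    payload = {k: v for k, v in record.items() if k not in ignore}
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def changed_records(previous_hashes, current, key="sku"):
    """Return (records that are new or changed since the last run, new hash map)."""
    new_hashes, changed = {}, []
    for record in current:
        digest = record_hash(record)
        new_hashes[record[key]] = digest
        if previous_hashes.get(record[key]) != digest:
            changed.append(record)
    return changed, new_hashes


run1 = [{"sku": "sku987654", "member_price": 3900.0, "price_timestamp": "2026-05-12T10:15:00Z"}]
first_changes, hashes = changed_records({}, run1)
```

On the first run every record is emitted because no previous hash exists; later runs emit only records whose non-volatile fields differ.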
Infrastructure

Infrastructure powering the RH pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus

Scrapy + Playwright Stack

Scrapy handles crawl orchestration and deduplication. Playwright executes the JavaScript required to iterate through RH's complex product configuration menus.

Residential Proxy Infrastructure

We maintain pools of residential ISP proxies across US regions. Rotation happens per-request to avoid rate limits on pricing API endpoints.

Cloud-Native Orchestration

Pipelines run on AWS Lambda and ECS. Airflow handles scheduling, dependency management, and SLA alerting. All state stored in managed Postgres.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON: Newline-delimited or nested, schema versioned per run
CSV: Flat file with typed columns, Excel/Sheets compatible
Parquet: Columnar format for BigQuery, Snowflake, Athena
S3: Direct bucket delivery, compatible with any data lake
BigQuery: Streamed directly into your dataset with schema auto-detect
Webhook: HTTP POST per record for real-time downstream processing
Postgres: Upsert into your existing schema with conflict resolution
Snowflake: Stage + COPY INTO workflow, incremental or full-replace
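
For readers unfamiliar with the first two formats, the sketch below serialises the same records as newline-delimited JSON and as CSV using only the Python standard library. The helper names are illustrative; Parquet is left out because it needs a third-party library.

```python
import csv
import io
import json


def to_ndjson(records):
    """One JSON object per line."""
    return "".join(json.dumps(r, sort_keys=True) + "\n" for r in records)


def to_csv(records, fieldnames):
    """Header row plus one row per record; keys outside fieldnames are dropped."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


records = [{"sku": "sku987654", "regular_price": 5200.0, "member_price": 3900.0}]
ndjson_text = to_ndjson(records)
csv_text = to_csv(records, ["sku", "regular_price", "member_price"])
```

Newline-delimited JSON keeps nested values intact and loads line by line, while CSV suits spreadsheet use but requires nested values to be flattened first.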
// faq

Common questions.

About restorationhardware.com scraping, legality, and pipeline operations.

Is scraping restorationhardware.com legal?

Scraping publicly available information from retail websites is generally permissible. DataFlirt targets only public, non-authenticated catalogue, pricing, and gallery data. We do not extract personal data or bypass authentication walls. Clients should review target website terms and consult legal counsel for specific use cases.

How do you handle the fabric and finish permutations?

Our Playwright scripts programmatically select every valid combination of dimensions, fabrics, and finishes on the product page, capturing the unique SKU, price, and lead time for each specific configuration. We output this as a flattened relational dataset.

Can you extract both Member and Regular pricing?

Yes. Every SKU record includes both the standard retail price and the RH Member price, along with any clearance or final sale indicators.

How fresh is the inventory and lead time data?

Data freshness depends on your pipeline schedule. We can configure daily or weekly runs across the catalogue to track shifts in lead times and gallery display status.

Do you scrape the RH Source Books?

Yes. We can parse the digital Source Books to extract page numbers, lifestyle imagery, and the specific SKUs featured in those curated layouts.

Can I request a sample dataset before committing?

Absolutely. We provide a sample run of up to 50 parent products (which typically yields thousands of child SKUs) as part of the scoping process, allowing you to validate schema fit and data quality.

$ dataflirt scope --new-project --source=restorationhardware.com ready

Tell us what to extract. We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a one-off catalogue dump or a continuous price-monitoring feed across every SKU variation - we scope, build, and operate the pipeline. Tell us what you need.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h