SYSTEM all green source cb2.com queue 4,192 pages p99 latency 184ms dataflirt.com · scraper/cb2-com

RUN - 14 active pipelines - cb2.com live

CB2 catalogue data,
at warehouse scale.

We extract furniture listings, designer collaborations, material specifications, and real-time inventory from CB2. Delivered as clean JSON, CSV, or Parquet to S3 or BigQuery on your cadence.

Get data from cb2.com → See how it works

Products extracted

14.2K /day

Price updates

28.5K /24h

Lookbooks parsed

412 /run

Active pipelines

14

Uptime

99.98%

◆ CB2 Product Data ◆ Furniture Dimensions ◆ Designer Collaborations ◆ Material & Finish Specs ◆ Stock Availability ◆ Lookbook & Room Ideas ◆ Assembly Instructions ◆ Pricing History ◆ Variant Mapping ◆ Category Hierarchy ◆ Managed Pipeline ◆ S3 / BigQuery Delivery

Data Dictionary

Every field we extract from cb2.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.
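Some samples in this section show list-valued fields (materials, featured_skus, tagged_skus, certifications) serialised as strings such as "['Boucle', 'Brass', 'Marble']". If your export arrives in that form, a minimal sketch for turning them back into real lists might look like this. The helper name decode_list_fields and the choice of fields are assumptions for illustration, not part of the delivered schema.

```python
import ast

# Field names are taken from the samples in this section; treating them as the
# full set of list-valued fields is an assumption.
LIST_FIELDS = ("materials", "featured_skus", "tagged_skus", "certifications")


def decode_list_fields(record: dict) -> dict:
    """Return a copy of the record with string-encoded lists decoded."""
    decoded = dict(record)
    for field in LIST_FIELDS:
        value = record.get(field)
        if isinstance(value, str):
            # literal_eval parses Python literals only; it never executes code.
            parsed = ast.literal_eval(value)
            if not isinstance(parsed, list):
                raise ValueError(f"{field!r} did not decode to a list: {value!r}")
            decoded[field] = parsed
    return decoded


collection = {
    "collection_name": "Goop x CB2",
    "materials": "['Boucle', 'Brass', 'Marble']",
    "featured_skus": "['439182', '439185']",
}
print(decode_list_fields(collection))
```

Fields that are already lists, or that are absent, pass through unchanged, so the helper can run on every record type.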

Complete list of extractable fields for Product Listings objects from cb2.com. All fields typed and schema-versioned.

sku · name · category · sub_category · base_price · designer · colour · dimensions · image_urls · overview · url

"sku": "439182",
"name": "Boucle Sofa",
"category": "Furniture",
"sub_category": "Sofas",
"base_price": 1999.0,
"designer": "Gwyneth Paltrow",
"colour": "Ivory",
"dimensions": "84 W x 36 D x 30 H"

Complete list of extractable fields for Pricing & Inventory objects from cb2.com. All fields typed and schema-versioned.

sku · current_price · original_price · clearance_flag · in_stock · stock_status_message · delivery_estimate · zip_code · currency · timestamp

"sku": "439182",
"current_price": 1799.0,
"original_price": 1999.0,
"clearance_flag": true,
"in_stock": true,
"stock_status_message": "In Stock and Ready to Ship",
"delivery_estimate": "3-5 Business Days",
"currency": "USD"

Complete list of extractable fields for Designer Collections objects from cb2.com. All fields typed and schema-versioned.

collection_name · designer_name · exclusive_collaboration · collection_url · item_count · description · active_dates · materials · featured_skus

"collection_name": "Goop x CB2",
"designer_name": "Gwyneth Paltrow",
"exclusive_collaboration": true,
"item_count": 42,
"description": "A curated collection of modern elegance.",
"materials": "['Boucle', 'Brass', 'Marble']",
"featured_skus": "['439182', '439185']"

Complete list of extractable fields for Lookbooks & Rooms objects from cb2.com. All fields typed and schema-versioned.

lookbook_id · title · room_type · aesthetic · image_url · tagged_skus · total_room_cost · designer_notes · season

"lookbook_id": "LB-2023-Fall-04",
"title": "Modern Parisian Living Room",
"room_type": "Living Room",
"aesthetic": "Modern Parisian",
"tagged_skus": "['439182', '882104', '119283']",
"total_room_cost": 4550.0,
"season": "Fall 2023"

Complete list of extractable fields for Specifications objects from cb2.com. All fields typed and schema-versioned.

sku · material · finish · care_instructions · assembly_required · weight · origin · warning_text · certifications

"sku": "439182",
"material": "Polyester Boucle",
"finish": "Matte Black Legs",
"care_instructions": "Spot clean with mild detergent",
"assembly_required": false,
"weight": "125 lbs",
"origin": "Imported",
"certifications": "['FSC Certified Wood']"

Capabilities

Extracting CB2 data with architectural precision

Our CB2 scraper navigates complex product variants, designer collections, and dynamic inventory systems. We handle the rendering and session management required to extract complete specification data.

Variant & Fabric Mapping

Extract every colour, fabric, and configuration option for made-to-order furniture, linking parent SKUs to specific variant pricing and lead times.

Dimensional Data Parsing

Capture width, depth, height, and seat height as structured numeric fields rather than raw text blocks.

Designer Collaborations

Track exclusive collections from Kravitz Design, Goop, Paul McCobb, and others, mapping items back to their respective campaigns.

Inventory & Lead Times

Monitor stock availability, backorder dates, and delivery estimates based on specific zip codes and fulfilment centres.

Lookbook Deconstruction

Parse 'Shop the Room' and Lookbook pages to extract tagged SKUs, room aesthetics, and aggregate pricing for curated spaces.

Clearance & Sale Tracking

Identify clearance items, promotional pricing, and limited-time discounts across the entire catalogue.

Material Specifications

Extract detailed material compositions, finish types, care instructions, and origin data for every product.

Regional Pricing

Capture pricing variations across different geographic regions and shipping zones.

Automated Diffing

Receive only updated records when prices change, new items are added, or stock statuses shift, reducing processing overhead.
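To make the diffing idea concrete, here is a minimal sketch that compares two runs keyed by SKU and keeps only new or changed records. The function name diff_snapshots, the output shape, and the second SKU's price are hypothetical; the sketch does not report records that disappeared between runs.

```python
def diff_snapshots(previous: dict[str, dict], current: dict[str, dict]) -> list[dict]:
    """Return only the records that are new or have changed since the last run."""
    changes = []
    for sku, record in current.items():
        old = previous.get(sku)
        if old is None:
            changes.append({"sku": sku, "change": "added", "record": record})
            continue
        # Collect the names of fields whose values differ from the last run.
        changed = sorted(key for key in record if record[key] != old.get(key))
        if changed:
            changes.append(
                {"sku": sku, "change": "updated", "fields": changed, "record": record}
            )
    return changes


previous = {
    "439182": {"sku": "439182", "current_price": 1999.0, "in_stock": True},
}
current = {
    "439182": {"sku": "439182", "current_price": 1799.0, "in_stock": True},
    # Hypothetical price, for illustration only.
    "439185": {"sku": "439185", "current_price": 249.0, "in_stock": True},
}

for change in diff_snapshots(previous, current):
    print(change["sku"], change["change"], change.get("fields", []))
```

Emitting the list of changed field names alongside the record lets downstream jobs react only to the changes they care about, such as price moves.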

// engagement pipeline

From category URL to warehouse record

Brief in. Clean data out.

Define Scope

d 0

Provide CB2 categories, specific designer collections, or search terms. We define the schema together.

Pipeline Build

d 2–4

We configure crawlers to handle CB2's dynamic loading, variant selectors, and image galleries.

Validation & QA

d 4–6

Schema validation, null-rate checks, and dimension parsing tests before full launch.

Delivery

ongoing

Structured data pushed to your S3 bucket, BigQuery dataset, or via Webhook on your defined schedule.
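The null-rate check mentioned under Validation & QA can be pictured with a short sketch like the one below. It assumes records arrive as plain dictionaries and treats None and empty strings as missing; the 5% threshold and both function names are illustrative assumptions, not our production settings.

```python
def null_rates(records: list[dict], fields: list[str]) -> dict[str, float]:
    """Share of records in which each field is missing, None, or an empty string."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    rates = {}
    for field in fields:
        missing = sum(1 for record in records if record.get(field) in (None, ""))
        rates[field] = missing / total
    return rates


def failing_fields(rates: dict[str, float], threshold: float = 0.05) -> list[str]:
    """Fields whose null rate exceeds the (hypothetical) threshold."""
    return sorted(field for field, rate in rates.items() if rate > threshold)


records = [
    {"sku": "439182", "designer": "Gwyneth Paltrow"},
    {"sku": "439185", "designer": None},
    {"sku": "882104", "designer": ""},
    {"sku": "119283", "designer": "Kravitz Design"},
]
rates = null_rates(records, ["sku", "designer"])
print(rates, failing_fields(rates))
```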

Under the hood

Navigating CB2's digital storefront

Extracting home decor data requires handling complex product configurations and visual-heavy pages. Here is how we maintain data integrity.

// fingerprinting

Identity rotation

TLS fingerprint: randomised

User-agent: rotated

IP pool: residential

Challenges blocked: 0

// pagination

Page coverage

48,291 pages queued running

// observability

Pipeline health

Uptime: 99.9%

p99 latency: 142ms

Null rate: 0.3%

Dynamic variants

Handling made-to-order configurations

CB2 sofas and beds often have dozens of fabric and leg combinations. We execute JavaScript to trigger variant selections, capturing the specific price, SKU, and lead time for every possible configuration.
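As a rough illustration of why configuration counts grow quickly, the sketch below enumerates every combination of a product's option groups. The option names and values are hypothetical, and the sketch only builds the list of configurations to visit; it does not drive a browser or read prices.

```python
from itertools import product


def variant_combinations(options: dict[str, list[str]]) -> list[dict[str, str]]:
    """Expand option groups into one dictionary per possible configuration."""
    names = list(options)
    return [
        dict(zip(names, combo))
        for combo in product(*(options[name] for name in names))
    ]


# Hypothetical option groups for a made-to-order sofa.
sofa_options = {
    "fabric": ["Ivory Boucle", "Grey Velvet", "Black Leather"],
    "legs": ["Matte Black", "Brass"],
}

combos = variant_combinations(sofa_options)
print(len(combos))  # 3 fabrics x 2 leg finishes = 6 configurations
```

Each resulting dictionary can then serve as a work item: select these options on the page, then record the price, SKU, and lead time shown.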

Data structuring

Normalising dimensional text

Furniture dimensions are often presented as unstructured text. Our pipeline uses regex and NLP to parse '84"Wx36"Dx30"H' into distinct, numeric width, depth, and height columns.
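A minimal regex-only sketch of this kind of parsing is shown below. It handles the two layouts that appear on this page, '84"Wx36"Dx30"H' and '84 W x 36 D x 30 H', and assumes the values are inches. The function name and output keys are hypothetical, and other layouts (seat height, metric units) would need more rules.

```python
import re

# A number, an optional inch mark, then an axis letter followed by "x",
# whitespace, or the end of the string.
_DIMENSION = re.compile(r'(\d+(?:\.\d+)?)\s*"?\s*([WDH])(?=\s|x|$)', re.IGNORECASE)

_AXIS_NAMES = {"W": "width_in", "D": "depth_in", "H": "height_in"}


def parse_dimensions(text: str) -> dict[str, float]:
    """Parse a W x D x H string into numeric fields (inches assumed)."""
    found = {}
    for number, axis in _DIMENSION.findall(text):
        found[_AXIS_NAMES[axis.upper()]] = float(number)
    missing = set(_AXIS_NAMES.values()) - set(found)
    if missing:
        raise ValueError(f"could not find {sorted(missing)} in {text!r}")
    return found


print(parse_dimensions('84"Wx36"Dx30"H'))
print(parse_dimensions("84 W x 36 D x 30 H"))
```

Raising on incomplete input, rather than returning partial results, keeps malformed dimension strings visible in QA instead of silently producing null columns.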

Visual data

High-resolution image extraction

We extract the highest resolution image URLs for primary photos, lifestyle shots, and detailed fabric swatches, bypassing lazy-loaded thumbnails.

Inventory tracking

Location-based availability

Stock status on CB2 varies by delivery location. We inject specific zip codes into the session to extract accurate delivery estimates and backorder dates for your target regions.

Schema stability

Resilient DOM parsing

Retail sites update their front-end frequently. We rely on underlying JSON APIs and structured data objects where possible, using DOM parsing only as a secondary fallback.
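The "structured data first, DOM second" approach can be sketched with the standard library alone. The example assumes a page embeds a schema.org Product object in a JSON-LD script tag and falls back to the first h1 heading when it does not; whether a given CB2 page exposes such an object is an assumption, and the class and function names are hypothetical.

```python
import json
from html.parser import HTMLParser


class _ProductPageParser(HTMLParser):
    """Collects JSON-LD script bodies and the text of the first <h1>."""

    def __init__(self) -> None:
        super().__init__()
        self.json_ld_blocks: list[str] = []
        self.h1_text = ""
        self._in_json_ld = False
        self._in_h1 = False
        self._seen_h1 = False

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self.json_ld_blocks.append("")
        elif tag == "h1" and not self._seen_h1:
            self._in_h1 = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_json_ld = False
        elif tag == "h1" and self._in_h1:
            self._in_h1 = False
            self._seen_h1 = True

    def handle_data(self, data):
        if self._in_json_ld:
            self.json_ld_blocks[-1] += data
        elif self._in_h1:
            self.h1_text += data


def extract_product(html: str) -> dict:
    """Prefer embedded JSON-LD; fall back to DOM text only if none is usable."""
    parser = _ProductPageParser()
    parser.feed(html)
    for block in parser.json_ld_blocks:
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue  # Malformed block: try the next one, then the DOM.
        if isinstance(data, dict) and data.get("@type") == "Product":
            return {"name": data.get("name"), "sku": data.get("sku"), "source": "json_ld"}
    return {"name": parser.h1_text.strip() or None, "sku": None, "source": "dom"}


structured = (
    '<html><head><script type="application/ld+json">'
    '{"@type": "Product", "name": "Boucle Sofa", "sku": "439182"}'
    "</script></head><body><h1>Boucle Sofa</h1></body></html>"
)
print(extract_product(structured))
print(extract_product("<html><body><h1> Boucle Sofa </h1></body></html>"))
```

Recording which path produced each record (the source key here) makes it easy to spot when a front-end change has pushed the pipeline onto the fallback.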

Applications

Who uses CB2 data - and why

Teams across industries use cb2.com data to build competitive products and smarter operations.

Competitor Price Tracking

Home decor retailers monitor CB2 pricing, clearance cycles, and promotional events to adjust their own merchandising strategies.

Assortment Planning

Merchandisers analyse category depth, material trends, and colour palettes across CB2 collections to inform product development.

Interior Design Platforms

Aggregators and design apps ingest CB2 product catalogues to offer accurate 3D modelling, pricing, and purchasing options to their users.

Trend Forecasting

Analysts track the introduction of new designer collaborations and material shifts to identify emerging interior design trends.

Supply Chain Analysis

Logistics teams monitor backorder dates and out-of-stock rates to gauge macroeconomic supply chain health in the furniture sector.

Marketplace Aggregation

Affiliate sites and home goods aggregators maintain synchronised listings with accurate pricing and availability.

Why DataFlirt

"In the furniture sector, dimensions, materials, and lead times are just as critical as price. Extracting this data accurately requires a pipeline built for complex retail structures."

Scraping a modern furniture retailer involves navigating endless variant combinations, dynamic inventory checks, and unstructured specification text. DataFlirt manages the JavaScript rendering, session state, and data normalisation required to deliver clean, structured interior design catalogues.

Technical Spec

CB2 scraper - technical specifications

Everything supported by our cb2.com scraper — rendered SPA elements, auth walls, rate-limit evasion and beyond.

JavaScript rendering

Required for variant selection, pricing updates, and image galleries

Supported

Variant mapping

Extracts all fabric, colour, and size combinations per product

Supported

Dimension parsing

Converts text dimensions into structured numeric fields

Supported

Zip code injection

Session modification to check stock for specific regions

Supported

Lookbook extraction

Maps tagged products to curated room scenes

Supported

Review extraction

Captures customer ratings, review text, and helpful votes

Supported

Change detection

Emits only records with changed fields since the last run

Supported

CB2 Trade Program pricing

Exclusive discounts requiring an approved interior designer account

Partial

User Wishlists & Carts

Private user data requiring authentication

Partial

Infrastructure

Infrastructure powering the CB2 pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy · Playwright · Python 3.12 · Redis · PostgreSQL · Apache Airflow · AWS Lambda · S3 · CloudWatch · 2Captcha · CapSolver · Residential Proxies · Docker · Kubernetes · Grafana · Prometheus

Playwright Integration

Executes full browser sessions to interact with fabric selectors and dynamic pricing modules, capturing data hidden from standard HTTP requests.

Data Normalisation

Custom Python parsing logic converts inconsistent retail text into strict numeric types for dimensions, weights, and pricing.

Orchestrated Delivery

Airflow manages the dependency chain, ensuring categories are scraped, variants mapped, and diffs calculated before warehouse delivery.
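The dependency chain described above can be illustrated without Airflow itself. The sketch below uses Python's standard-library graphlib as a stand-in to show how declaring dependencies yields a safe execution order; it is not an Airflow DAG, and the task names are hypothetical labels derived from the sentence above.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it can start.
PIPELINE = {
    "scrape_categories": set(),
    "map_variants": {"scrape_categories"},
    "calculate_diffs": {"map_variants"},
    "deliver_to_warehouse": {"calculate_diffs"},
}


def execution_order(graph: dict[str, set[str]]) -> list[str]:
    """Return the tasks in an order that respects every declared dependency."""
    return list(TopologicalSorter(graph).static_order())


print(execution_order(PIPELINE))
# ['scrape_categories', 'map_variants', 'calculate_diffs', 'deliver_to_warehouse']
```

A scheduler such as Airflow adds retries, scheduling, and monitoring on top of this ordering, which is why delivery never runs against a half-built dataset.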

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON

Nested structures ideal for complex variant mapping

CSV

Flat files for immediate analyst use

XLS

Spreadsheet format for merchandising teams

Parquet

Columnar storage for efficient warehouse querying

AWS S3

Direct bucket upload on completion

Webhook

HTTP POST for real-time inventory alerts

API

REST endpoints to query your extracted dataset

BigQuery

Direct streaming into your GCP environment

Direct bucket delivery — compatible with any data lake

// faq

Common questions.

About cb2.com scraping, legality, and pipeline operations.

Ask us directly →

Can you extract all fabric options for a single sofa?

Yes. Our pipeline iterates through all available fabric and colour combinations on a product page, capturing the specific price, SKU, and lead time for each variant.

How do you handle dimensions?

We use custom parsing logic to extract width, depth, height, and seat height from the unstructured text blocks on CB2, delivering them as clean numeric fields in your database.

Can you check inventory for specific locations?

Yes. We can inject target zip codes into the scraping session to extract accurate delivery estimates and stock availability for specific regions.

Do you extract data from Lookbooks and 'Shop the Room' pages?

Yes. We map the curated lifestyle images to their tagged product SKUs, allowing you to reconstruct the room aesthetic and calculate aggregate room costs.

How frequently can you update pricing and stock?

We can configure pipelines to run daily for the entire catalogue, or at higher frequencies for specific high-priority SKUs.

Do you scrape customer reviews?

Yes. We extract star ratings, review text, submission dates, and helpful votes across all paginated review sections on a product page.

Can I get historical pricing data?

We begin tracking pricing history from the moment your pipeline is activated. We do not have access to historical pricing prior to pipeline initiation.

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. From dimensional specs to real-time inventory, we build and manage the pipeline. Tell us your data requirements and delivery cadence.

Start a cb2.com pipeline → View pricing

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h
