We extract designer lighting catalogues, finish matrices, trade pricing, spec sheets, and stock availability from YLighting. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.
Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.
Complete list of extractable fields for Product Core objects from ylighting.com. All fields typed and schema-versioned.
{"sku": "YLI-FLOS-IC-T1", "brand": "Flos", "designer": "Michael Anastassiades", "title": "IC T1 Table Lamp", "category": "Lighting", "base_price": 795.0, "rating": 4.8}
| # | sku | brand | designer | title | category | sub_category |
|---|---|---|---|---|---|---|
Complete list of extractable fields for Technical Specs objects from ylighting.com. All fields typed and schema-versioned.
{"sku": "YLI-FLOS-IC-T1", "dimensions": "10.8 W x 15 H", "voltage": "120V", "bulb_type": "Halogen", "wattage": "60W", "dimmable": true}
| # | sku | material | dimensions | weight | voltage | bulb_type |
|---|---|---|---|---|---|---|
Complete list of extractable fields for Variants & Finishes objects from ylighting.com. All fields typed and schema-versioned.
{"variant_id": "FLS-8492-BRS", "parent_sku": "YLI-FLOS-IC-T1", "finish_name": "Brushed Brass", "size": "Small", "price": 795.0, "stock_status": "In Stock"}
| # | variant_id | parent_sku | finish_name | finish_family | size | price |
|---|---|---|---|---|---|---|
Complete list of extractable fields for Assets & Documents objects from ylighting.com. All fields typed and schema-versioned.
{"sku": "YLI-FLOS-IC-T1", "primary_image": "https://cdn.ylighting.com/images/ic-t1-main.jpg", "spec_sheet_pdf": "https://cdn.ylighting.com/docs/ic-t1-spec.pdf", "installation_pdf": "https://cdn.ylighting.com/docs/ic-t1-install.pdf", "gallery_images": ["img1.jpg", "img2.jpg"], "video_url": null}
| # | sku | primary_image | gallery_images | spec_sheet_pdf | installation_pdf | model_3d_url |
|---|---|---|---|---|---|---|
Complete list of extractable fields for Reviews & Ratings objects from ylighting.com. All fields typed and schema-versioned.
{"review_id": "REV-99214", "sku": "YLI-FLOS-IC-T1", "rating": 5, "title": "Beautiful ambient light", "date": "2023-10-12", "verified_buyer": true}
| # | review_id | sku | author | rating | title | body |
|---|---|---|---|---|---|---|
High-end lighting catalogues are complex. We handle the JavaScript rendering for finish matrices, normalise technical specifications, and extract PDF documentation automatically.
Extract every SKU across all categories, capturing brand attribution, designer names, and collection hierarchies.
Iterate through finish, size, and lamping combinations to capture specific pricing and imagery for every possible variant.
Extract voltage, wattage, bulb types, dimmability, and UL listing status into structured, queryable fields.
Locate and store URLs for PDF specification sheets, installation guides, and CAD models associated with each fixture.
Capture stock status and dynamic lead times, allowing you to track shipping delays across different brands and finishes.
Track base retail pricing, promotional discounts, and clearance markdowns across the entire YLighting catalogue.
Extract height, width, depth, and weight, standardising the text output for easier downstream database ingestion.
Extract clean, unwatermarked URLs for primary product images, lifestyle shots, and variant-specific galleries.
Paginate through customer reviews to capture star ratings, detailed feedback, and verified buyer status.
Brief in. Clean data out.
Provide specific brands, categories, or designer collections. We map the extraction schema to your database requirements.
We configure Playwright to handle JavaScript variant hydration and Scrapy to traverse the catalogue quickly.
Schema validation checks null rates on critical fields such as dimensions, voltage, and variant pricing.
JSON, CSV, or Parquet pushed directly to your S3 bucket, BigQuery dataset, or Snowflake stage.
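The schema validation step above can be sketched as a null-rate check. This is a minimal illustration rather than our production validator: the function names, the `CRITICAL_FIELDS` tuple, and the 5% threshold are assumptions made for the example, with field names borrowed from the sample records on this page.

```python
# Hypothetical list of critical fields; names follow the sample records above.
CRITICAL_FIELDS = ("dimensions", "voltage", "price")


def null_rates(records, fields=CRITICAL_FIELDS):
    """Return, per field, the fraction of records where it is missing, None, or empty."""
    records = list(records)
    if not records:
        return {field: 0.0 for field in fields}
    rates = {}
    for field in fields:
        missing = sum(1 for rec in records if rec.get(field) in (None, ""))
        rates[field] = missing / len(records)
    return rates


def failing_fields(rates, max_null_rate=0.05):
    """List the fields whose null rate exceeds the (assumed) threshold."""
    return sorted(field for field, rate in rates.items() if rate > max_null_rate)


batch = [
    {"dimensions": "10.8 W x 15 H", "voltage": "120V", "price": 795.0},
    {"dimensions": "", "voltage": "120V", "price": None},
]
rates = null_rates(batch)
problems = failing_fields(rates)
```

A batch with any failing field would typically be held back for review instead of being delivered.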
Extracting data from high-end decor sites requires handling dynamic frontends and multi-dimensional product matrices. Here is how we build resilience.
YLighting uses JavaScript to dynamically load prices, stock status, and images when a user selects a finish or size. We run full Playwright sessions to trigger these DOM events and capture the hydrated data for every variant.
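As a rough sketch of the enumeration behind this, the snippet below expands a set of option axes into every combination a browser session would need to select. It covers only the combinatorics; the Playwright interaction itself is omitted, and the function name and the option values shown are illustrative assumptions.

```python
from itertools import product


def variant_combinations(options):
    """Yield one dict per combination of option values, keeping the axis order."""
    axes = list(options)
    for values in product(*(options[axis] for axis in axes)):
        yield dict(zip(axes, values))


# Hypothetical option axes as they might be read from a product page.
options = {
    "finish": ["Brushed Brass", "Black"],
    "size": ["Small", "Large"],
}
combos = list(variant_combinations(options))
```

Each resulting dict describes one selection to apply on the page before reading the hydrated price, stock status, and images.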
Specification sheets are critical for architectural lighting. Our parsers locate the document nodes within the DOM, extracting clean URLs for PDFs and installation guides even when they are buried in tabbed interfaces.
Product dimensions are often listed in unstructured strings. We apply regex patterns during the extraction phase to isolate height, width, and depth into distinct numerical fields.
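A minimal sketch of this kind of normalisation is shown below. It assumes dimension strings label each value with W, H, or D and optionally include an inch unit, as in the examples on this page; the pattern and the `parse_dimensions` name are illustrative, and real listings will need more patterns than this.

```python
import re

# A number, an optional inch unit, then an axis letter: "10.5 in W" or "15 H".
_DIM = re.compile(r'(\d+(?:\.\d+)?)\s*(?:in|")?\s*([WHD])\b', re.IGNORECASE)
_AXES = {"W": "width", "H": "height", "D": "depth"}


def parse_dimensions(text):
    """Split a labelled dimension string into numeric width, height, and depth."""
    parsed = {"width": None, "height": None, "depth": None}
    for value, axis in _DIM.findall(text):
        parsed[_AXES[axis.upper()]] = float(value)
    return parsed


with_units = parse_dimensions("10.5 in W x 12 in H")
without_units = parse_dimensions("10.8 W x 15 H")
```

Axes that do not appear in the source string stay as `None`, so downstream checks can tell a missing measurement from a zero.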
To prevent IP bans and rate limiting, we route all requests through US-based residential proxies, rotating IPs and spoofing TLS fingerprints to mimic legitimate browsing behaviour.
For clients monitoring supply chains, we maintain a state file of previous lead times. The pipeline only emits records when a shipping estimate or stock status changes, reducing unnecessary data transfer.
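The change-detection idea can be illustrated with a small state-file comparison. This is a sketch under stated assumptions, not the production pipeline: the `changed_records` function, the JSON state format, and the `lead_time` field name are hypothetical, and the watched values are assumed to be plain strings or nulls.

```python
import json
from pathlib import Path


def changed_records(current, state_path, key="variant_id",
                    watched=("stock_status", "lead_time")):
    """Return records whose watched fields differ from the saved state, then save the new state."""
    path = Path(state_path)
    previous = json.loads(path.read_text()) if path.exists() else {}
    new_state, changed = {}, []
    for rec in current:
        record_id = str(rec[key])  # JSON object keys are always strings
        snapshot = {field: rec.get(field) for field in watched}
        new_state[record_id] = snapshot
        if previous.get(record_id) != snapshot:
            changed.append(rec)
    path.write_text(json.dumps(new_state))
    return changed
```

On the first run every record counts as changed because no state exists yet. Later runs emit only records whose stock status or lead time moved.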
Retailers track YLighting's pricing, promotional events, and clearance discounts to adjust their own merchandising strategies.
Procurement platforms ingest dimensions, finishes, and spec sheets to build unified catalogues for architects and designers.
Analysts monitor stock availability and lead times across specific brands to identify manufacturing delays and supply bottlenecks.
Lighting manufacturers audit YLighting's retail prices to ensure compliance with Minimum Advertised Price agreements.
Researchers analyse category expansion, new designer additions, and popular finishes to forecast interior design trends.
Smaller retailers use extracted specification data to backfill missing technical details in their own product databases.
"YLighting holds the most structured catalogue of modern designer lighting on the web, but extracting accurate variant matrices and technical specs requires a pipeline built for complex eCommerce architectures."
Scraping YLighting is not just about grabbing titles and prices. High-end lighting involves multi-dimensional variants, PDF specification sheets, and dynamic lead times. DataFlirt handles the JavaScript rendering and schema normalisation so your team receives clean, structured data ready for analysis.
Everything supported by our ylighting.com scraper — rendered SPA elements, auth walls, rate-limit evasion and beyond.
Open-source tooling on proven cloud infra — no vendor lock-in, full observability.
We use Playwright to programmatically select finish and size options, triggering the necessary network requests to capture variant-specific pricing and stock data.
Custom Python middleware parses unstructured technical specifications, converting varied dimensional formats into clean, queryable database columns.
Airflow manages the crawl schedules, dispatching containerised Scrapy spiders across our Kubernetes cluster to ensure rapid catalogue traversal.
Data delivered to where your team already works — no new tooling required.
About ylighting.com scraping, legality, and pipeline operations.
Yes. Our pipeline interacts with the product page to select every available variant, capturing the specific price, SKU, lead time, and image URL associated with that exact configuration.
We extract the direct URLs to all PDF assets, including spec sheets, installation guides, and warranty documents, delivering them as structured fields linked to the parent SKU.
We can run full catalogue sweeps weekly or daily. For pricing and stock monitoring on specific high-priority brands, we can configure hourly change-detection runs.
Extracting trade-specific pricing requires authenticating with a valid trade account. If you provide the credentials, we can configure the pipeline to log in and capture the discounted rates.
We apply regex-based normalisation during the extraction process to parse strings like '10.5 in W x 12 in H' into distinct numerical fields for width and height, plus depth where it is listed.
We typically start with a defined scope, such as extracting a specific list of brands or categories on a weekly schedule. Contact us to scope your specific data requirements.
20-minute scoping call. Pilot dataset within the week. Production within two. From complete catalogue dumps to daily lead-time monitoring across specific designer brands. We build and operate the infrastructure. Tell us what you need.