
YLighting data,
at warehouse scale.

We extract designer lighting catalogues, finish matrices, trade pricing, spec sheets, and stock availability from YLighting. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.

Products extracted: 42.1K /run
Variant matrices: 185K /run
Spec sheets: 38.4K /total
Active pipelines: 31
Uptime: 99.94%

Data Dictionary

Every field we extract from ylighting.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Product Core objects from ylighting.com. All fields typed and schema-versioned.

sku, brand, designer, title, category, sub_category, base_price, retail_price, description, rating, review_count, page_url
product_core
● 200 OK
{
  "sku": "YLI-FLOS-IC-T1",
  "brand": "Flos",
  "designer": "Michael Anastassiades",
  "title": "IC T1 Table Lamp",
  "category": "Lighting",
  "base_price": 795.0,
  "rating": 4.8
}
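
As a rough illustration of what "typed" can mean for the Product Core record, the sketch below models the listed fields as a Python dataclass. The field names come from the list above; the types, the choice of which fields are required, and the `from_raw` helper are assumptions made for illustration, not the delivered schema.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class ProductCore:
    # Field names follow the data dictionary; the types are assumed.
    sku: str
    brand: str
    designer: Optional[str]
    title: str
    category: str
    sub_category: Optional[str]
    base_price: float
    retail_price: Optional[float]
    description: Optional[str]
    rating: Optional[float]
    review_count: Optional[int]
    page_url: Optional[str]

    @classmethod
    def from_raw(cls, raw: dict) -> "ProductCore":
        # Keys absent from the raw record become None.
        values = {f.name: raw.get(f.name) for f in fields(cls)}
        # Hypothetical rule: these five fields must always be present.
        for required in ("sku", "brand", "title", "category", "base_price"):
            if values[required] is None:
                raise ValueError(f"missing required field: {required}")
        values["base_price"] = float(values["base_price"])
        return cls(**values)


record = ProductCore.from_raw({
    "sku": "YLI-FLOS-IC-T1",
    "brand": "Flos",
    "designer": "Michael Anastassiades",
    "title": "IC T1 Table Lamp",
    "category": "Lighting",
    "base_price": 795.0,
    "rating": 4.8,
})
```

Rejecting a record at construction time keeps malformed rows out of the warehouse load rather than surfacing them later as query errors.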

Complete list of extractable fields for Technical Specs objects from ylighting.com. All fields typed and schema-versioned.

sku, material, dimensions, weight, voltage, bulb_type, wattage, dimmable, ul_listed, certification, cord_length
technical_specs
● 200 OK
{
  "sku": "YLI-FLOS-IC-T1",
  "dimensions": "10.8 W x 15 H",
  "voltage": "120V",
  "bulb_type": "Halogen",
  "wattage": "60W",
  "dimmable": true
}

Complete list of extractable fields for Variants & Finishes objects from ylighting.com. All fields typed and schema-versioned.

variant_id, parent_sku, finish_name, finish_family, size, price, stock_status, lead_time_days, image_url, upc
variants_finishes
● 200 OK
{
  "variant_id": "FLS-8492-BRS",
  "parent_sku": "YLI-FLOS-IC-T1",
  "finish_name": "Brushed Brass",
  "size": "Small",
  "price": 795.0,
  "stock_status": "In Stock"
}

Complete list of extractable fields for Assets & Documents objects from ylighting.com. All fields typed and schema-versioned.

sku, primary_image, gallery_images, spec_sheet_pdf, installation_pdf, model_3d_url, video_url, lifestyle_images
assets_documents
● 200 OK
{
  "sku": "YLI-FLOS-IC-T1",
  "primary_image": "https://cdn.ylighting.com/images/ic-t1-main.jpg",
  "spec_sheet_pdf": "https://cdn.ylighting.com/docs/ic-t1-spec.pdf",
  "installation_pdf": "https://cdn.ylighting.com/docs/ic-t1-install.pdf",
  "gallery_images": ["img1.jpg", "img2.jpg"],
  "video_url": null
}

Complete list of extractable fields for Reviews & Ratings objects from ylighting.com. All fields typed and schema-versioned.

review_id, sku, author, rating, title, body, date, verified_buyer, helpful_votes, location
reviews_ratings
● 200 OK
{
  "review_id": "REV-99214",
  "sku": "YLI-FLOS-IC-T1",
  "rating": 5,
  "title": "Beautiful ambient light",
  "date": "2023-10-12",
  "verified_buyer": true
}

Capabilities

Extract designer lighting data with engineering precision

High-end lighting catalogues are complex. We handle the JavaScript rendering for finish matrices, normalise technical specifications, and extract PDF documentation automatically.

Full Catalogue Extraction

Extract every SKU across all categories, capturing brand attribution, designer names, and collection hierarchies.

Complex Variant Mapping

Iterate through finish, size, and lamping combinations to capture specific pricing and imagery for every possible variant.

Technical Specification Parsing

Extract voltage, wattage, bulb types, dimmability, and UL listing status into structured, queryable fields.

Document Asset Scraping

Locate and store URLs for PDF specification sheets, installation guides, and CAD models associated with each fixture.

Real-Time Availability

Capture stock status and dynamic lead times, allowing you to track shipping delays across different brands and finishes.

Pricing Intelligence

Track base retail pricing, promotional discounts, and clearance markdowns across the entire YLighting catalogue.

Dimensional Normalisation

Extract height, width, depth, and weight, standardising the text output for easier downstream database ingestion.

High-Resolution Imagery

Extract clean, unwatermarked URLs for primary product images, lifestyle shots, and variant-specific galleries.

Review & Sentiment Mining

Paginate through customer reviews to capture star ratings, detailed feedback, and verified buyer status.

// engagement pipeline

From brand list to warehouse record

Brief in. Clean data out.

Define Scope
Day 0

Provide specific brands, categories, or designer collections. We map the extraction schema to your database requirements.

Pipeline Build
Days 2–4

We configure Playwright to handle JavaScript variant hydration and Scrapy for rapid catalogue traversal.

Validation & QA
Days 4–6

Schema validation checks for null rates on critical fields like dimensions, voltage, and variant pricing.

Delivery
ongoing

JSON, CSV, or Parquet pushed directly to your S3 bucket, BigQuery dataset, or Snowflake stage.
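
The null-rate check in the Validation & QA step could look roughly like the sketch below. It is a minimal example only: the function names, the sample records, and the 0.3 threshold are hypothetical, and a record counts as null for a field when the value is missing, `None`, or an empty string.

```python
def null_rates(records, fields):
    """Return the share of records where each field is missing, None, or empty."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for rec in records if rec.get(field) in (None, "")) / total
        for field in fields
    }


def failing_fields(records, fields, threshold):
    """List the fields whose null rate exceeds the (assumed) threshold."""
    return [f for f, rate in null_rates(records, fields).items() if rate > threshold]


# Illustrative batch: one missing voltage, two missing dimensions.
batch = [
    {"sku": "A", "voltage": "120V", "dimensions": "10.8 W x 15 H"},
    {"sku": "B", "voltage": None, "dimensions": ""},
    {"sku": "C", "voltage": "120V"},
    {"sku": "D", "voltage": "120V", "dimensions": "10.5 in W x 12 in H"},
]
rates = null_rates(batch, ["voltage", "dimensions"])
flagged = failing_fields(batch, ["voltage", "dimensions"], threshold=0.3)
```

A run whose flagged list is non-empty would typically be held back from delivery until the affected selectors are reviewed.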

Under the hood

Handling YLighting's complex catalogue structure

Extracting data from high-end decor sites requires handling dynamic frontends and multi-dimensional product matrices. Here is how we build resilience.

pipeline-monitor · ylighting.com · live ● active
// fingerprinting
Identity rotation
TLS fingerprint: randomised
User-agent: rotated
IP pool: residential
Challenges blocked: 0
// pagination
Page coverage
48,291 pages queued · running
// observability
Pipeline health
Uptime: 99.9%
p99 latency: 142ms
Null rate: 0.3%
Alerts: 2

JavaScript hydration
Executing SPA logic for variants

YLighting uses JavaScript to dynamically load prices, stock status, and images when a user selects a finish or size. We run full Playwright sessions to trigger these DOM events and capture the hydrated data for every variant.
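
The variant walk described above can be sketched as follows, assuming Playwright's synchronous Python API. The option labels other than "Brushed Brass" and "Small", the button-by-label lookup, and the `price_selector` argument are placeholders; the real page markup would dictate the actual selectors.

```python
from itertools import product


def variant_combinations(options):
    """Yield one dict per combination of option values, e.g. finish x size."""
    names = list(options)
    for values in product(*(options[name] for name in names)):
        yield dict(zip(names, values))


def scrape_variants(url, options, price_selector):
    """Click through every combination and read the hydrated price text.

    Hypothetical selectors: option buttons are found by their visible label,
    and price_selector must match the real price element on the page.
    """
    # Imported lazily so the combination helper works without Playwright installed.
    from playwright.sync_api import sync_playwright

    records = []
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)
        for combo in variant_combinations(options):
            for label in combo.values():
                page.get_by_role("button", name=label).click()
            # Wait for the variant's network requests to settle before reading.
            page.wait_for_load_state("networkidle")
            price_text = page.locator(price_selector).first.inner_text()
            records.append({**combo, "price_text": price_text})
        browser.close()
    return records


combos = list(variant_combinations({
    "finish": ["Brushed Brass", "Chrome"],
    "size": ["Small", "Large"],
}))
```

Enumerating combinations separately from the browser session makes it easy to estimate how many interactions a product page will need before launching a browser.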

PDF extraction
Locating hidden documentation links

Specification sheets are critical for architectural lighting. Our parsers locate the document nodes within the DOM, extracting clean URLs for PDFs and installation guides even when they are buried in tabbed interfaces.
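
One way to collect document links from rendered HTML, including anchors inside hidden tab panels, is sketched below using only the standard library. The sample markup is invented for illustration and the class and function names are hypothetical; the production parsers may work differently.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PdfLinkParser(HTMLParser):
    """Collect absolute URLs of every <a href> whose path ends in .pdf."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Check the URL path only, so query strings such as ?v=2 are tolerated.
        if href and urlparse(href).path.lower().endswith(".pdf"):
            absolute = urljoin(self.base_url, href)
            if absolute not in self.links:
                self.links.append(absolute)


def extract_pdf_urls(html, base_url):
    parser = PdfLinkParser(base_url)
    parser.feed(html)
    return parser.links


# Hypothetical markup: links sit inside a tab panel that is hidden by default.
sample = """
<div class="tabs">
  <div class="tab" hidden>
    <a href="/docs/ic-t1-spec.pdf">Spec sheet</a>
    <a href="https://cdn.ylighting.com/docs/ic-t1-install.pdf?v=2">Installation</a>
    <a href="/lighting/table-lamps">Back</a>
  </div>
</div>
"""
pdf_urls = extract_pdf_urls(sample, "https://cdn.ylighting.com/")
```

Because the parser reads the DOM rather than the visible page, a link inside a collapsed tab is found the same way as a visible one.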

Dimensional normalisation
Cleaning unstructured text specs

Product dimensions are often listed in unstructured strings. We apply regex patterns during the extraction phase to isolate height, width, and depth into distinct numerical fields.
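
A minimal sketch of this kind of regex normalisation is shown below. It assumes dimensions are given in inches and tagged with W, H, or D, as in the sample strings on this page; the pattern, the output field names, and the function name are illustrative rather than the production rules.

```python
import re

# A number, an optional inch marker, then an axis letter (W, H, or D).
_DIM = re.compile(r'(\d+(?:\.\d+)?)\s*(?:in(?:ches)?\.?|")?\s*([WHD])\b', re.IGNORECASE)
_AXES = {"W": "width_in", "H": "height_in", "D": "depth_in"}


def parse_dimensions(text):
    """Split a free-text dimension string into numeric width, height, and depth."""
    result = {"width_in": None, "height_in": None, "depth_in": None}
    for value, axis in _DIM.findall(text):
        result[_AXES[axis.upper()]] = float(value)
    return result


with_units = parse_dimensions("10.5 in W x 12 in H")
without_units = parse_dimensions("10.8 W x 15 H")
```

Axes that do not appear in the string stay `None`, so a missing depth is reported as missing rather than as zero.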

Bot mitigation
Residential proxies and fingerprinting

To prevent IP bans and rate limiting, we route all requests through US-based residential proxies, rotating IPs and spoofing TLS fingerprints to mimic legitimate browsing behaviour.

Change detection
Tracking lead time fluctuations

For clients monitoring supply chains, we maintain a state file of previous lead times. The pipeline only emits records when a shipping estimate or stock status changes, reducing unnecessary data transfer.
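
The state-file approach can be sketched roughly as below. The JSON state format, the function name, and the sample lead times are assumptions for illustration; the field names `variant_id`, `lead_time_days`, and `stock_status` come from the data dictionary.

```python
import json
import tempfile
from pathlib import Path

WATCHED = ("lead_time_days", "stock_status")


def changed_records(records, state_path):
    """Return only records whose watched fields differ from the saved state."""
    path = Path(state_path)
    previous = json.loads(path.read_text()) if path.exists() else {}
    current, changed = {}, []
    for rec in records:
        snapshot = {key: rec.get(key) for key in WATCHED}
        current[rec["variant_id"]] = snapshot
        if previous.get(rec["variant_id"]) != snapshot:
            changed.append(rec)
    # Keep state for variants not seen in this run, then overlay the new values.
    path.write_text(json.dumps({**previous, **current}))
    return changed


with tempfile.TemporaryDirectory() as tmp:
    state = Path(tmp) / "state.json"
    rec = {"variant_id": "FLS-8492-BRS", "lead_time_days": 14, "stock_status": "In Stock"}
    first = changed_records([rec], state)    # no prior state, so the record is emitted
    second = changed_records([rec], state)   # nothing changed, so nothing is emitted
    third = changed_records([{**rec, "lead_time_days": 21}], state)  # lead time moved
```

Comparing only the watched fields means cosmetic changes elsewhere on the page, such as a reworded description, do not trigger a delivery.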

Applications

Who uses YLighting data

Teams across industries use ylighting.com data to build competitive products and smarter operations.

01
Competitor Price Monitoring

Retailers track YLighting's pricing, promotional events, and clearance discounts to adjust their own merchandising strategies.

02
Interior Design Aggregation

Procurement platforms ingest dimensions, finishes, and spec sheets to build unified catalogues for architects and designers.

03
Supply Chain Tracking

Analysts monitor stock availability and lead times across specific brands to identify manufacturing delays and supply bottlenecks.

04
Brand MAP Enforcement

Lighting manufacturers audit YLighting's retail prices to ensure compliance with Minimum Advertised Price agreements.

05
Market Trend Analysis

Researchers analyse category expansion, new designer additions, and popular finishes to forecast interior design trends.

06
Catalogue Enrichment

Smaller retailers use extracted specification data to backfill missing technical details in their own product databases.

Why DataFlirt

"YLighting holds the most structured catalogue of modern designer lighting on the web, but extracting accurate variant matrices and technical specs requires a pipeline built for complex eCommerce architectures."

Scraping YLighting is not just about grabbing titles and prices. High-end lighting involves multi-dimensional variants, PDF specification sheets, and dynamic lead times. DataFlirt handles the JavaScript rendering and schema normalisation so your team receives clean, structured data ready for analysis.

Technical Spec

YLighting extraction capabilities

Everything supported by our ylighting.com scraper — rendered SPA elements, auth walls, rate-limit evasion and beyond.

JavaScript rendering (Supported): Playwright sessions required to hydrate variant pricing and imagery
Variant matrices (Supported): Extraction of all finish, size, and lamping combinations per SKU
PDF URL extraction (Supported): Capture of spec sheets and installation guides
Residential proxies (Supported): US-based IP rotation to bypass basic rate limiting
Change detection (Supported): Diff-based delivery for pricing and stock status updates
Review pagination (Supported): Iterating through all customer reviews and ratings
Trade Advantage pricing (Partial): Requires authenticated trade account credentials to access
User cart data (Partial): Session-specific cart totals and shipping calculations

Infrastructure

Infrastructure powering the pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus

Playwright for Variant Hydration

We use Playwright to programmatically select finish and size options, triggering the necessary network requests to capture variant-specific pricing and stock data.

Normalisation Pipeline

Custom Python middleware parses unstructured technical specifications, converting varied dimensional formats into clean, queryable database columns.

Cloud-Native Orchestration

Airflow manages the crawl schedules, dispatching containerised Scrapy spiders across our Kubernetes cluster to ensure rapid catalogue traversal.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON: Nested structures ideal for complex variant matrices
CSV: Flat files for immediate use in spreadsheet applications
XLS: Excel format for non-technical procurement teams
Parquet: Columnar storage optimised for data warehouse ingestion
AWS S3: Direct delivery to your cloud storage buckets
Webhook: Real-time HTTP POST payloads for immediate updates
API: REST endpoints to query extracted datasets on demand
BigQuery: Direct streaming into Google Cloud data warehouses
PostgreSQL: Direct database inserts with conflict resolution
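
To illustrate the difference between the nested JSON and flat CSV outputs listed above, the sketch below flattens a product with a nested variant list into one CSV row per variant. The `variants` key, the second variant (`FLS-8492-CHR`, Chrome), and the function names are hypothetical; the delivered layout is agreed during scoping.

```python
import csv
import io


def flatten_variants(product):
    """Repeat the parent fields on every variant to produce flat rows."""
    parent = {k: v for k, v in product.items() if k != "variants"}
    return [{**parent, **variant} for variant in product.get("variants", [])]


def to_csv(rows):
    """Serialise flat rows to CSV text, using the first row's keys as the header."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Illustrative nested record: one product with two finish variants.
nested = {
    "sku": "YLI-FLOS-IC-T1",
    "brand": "Flos",
    "variants": [
        {"variant_id": "FLS-8492-BRS", "finish_name": "Brushed Brass", "price": 795.0},
        {"variant_id": "FLS-8492-CHR", "finish_name": "Chrome", "price": 795.0},
    ],
}
rows = flatten_variants(nested)
csv_text = to_csv(rows)
```

The nested form stores each parent field once, while the flat form trades that compactness for rows that open directly in a spreadsheet.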
// faq

Common questions.

About ylighting.com scraping, legality, and pipeline operations.

Can you extract data for every finish and size combination?

Yes. Our pipeline interacts with the product page to select every available variant, capturing the specific price, SKU, lead time, and image URL associated with that exact configuration.

Do you extract PDF specification sheets?

We extract the direct URLs to all PDF assets, including spec sheets, installation guides, and warranty documents, delivering them as structured fields linked to the parent SKU.

How frequently can the catalogue be updated?

We can run full catalogue sweeps weekly or daily. For pricing and stock monitoring on specific high-priority brands, we can configure hourly change-detection runs.

Can you access YLighting Trade Advantage pricing?

Extracting trade-specific pricing requires authenticating with a valid trade account. If you provide the credentials, we can configure the pipeline to log in and capture the discounted rates.

How do you handle unstructured dimension text?

We apply regex-based normalisation during the extraction process to parse strings like '10.5 in W x 12 in H' into distinct numerical fields for width, height, and depth.

What is the minimum viable engagement?

We typically start with a defined scope, such as extracting a specific list of brands or categories on a weekly schedule. Contact us to scope your specific data requirements.

$ dataflirt scope --new-project --source=ylighting.com ready

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. From complete catalogue dumps to daily lead-time monitoring across specific designer brands. We build and operate the infrastructure. Tell us what you need.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h