wlw.de supplier data,
at warehouse scale.

We extract B2B company profiles, product portfolios, certifications, and contact details from Wer liefert was. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.

Suppliers extracted
482K /month
Products mapped
3.7M /run
Contact updates
19K /week
Active pipelines
41
Uptime
99.94%
Data Dictionary

Every field we extract from wlw.de

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Company Profiles objects from wlw.de. All fields typed and schema-versioned.

company_id, company_name, wlw_url, description, year_founded, employee_count, revenue_bracket, legal_form, address_street, address_city, address_zip, address_country
company_profiles
● 200 OK
{
  "company_id": "WLW-98234",
  "company_name": "Muller CNC GmbH",
  "wlw_url": "https://www.wlw.de/de/firma/muller-cnc-gmbh",
  "year_founded": 1985,
  "employee_count": "50-99",
  "address_city": "Stuttgart",
  "address_country": "Germany"
}

Complete list of extractable fields for Contact Information objects from wlw.de. All fields typed and schema-versioned.

company_id, website_url, phone_number, fax_number, email_address, contact_person_name, contact_person_role, social_links, hq_location
contact_information
● 200 OK
{
  "company_id": "WLW-98234",
  "website_url": "https://muller-cnc.de",
  "phone_number": "+49 711 123456",
  "email_address": "info@muller-cnc.de",
  "contact_person_name": "Hans Muller",
  "hq_location": "Stuttgart, DE"
}

Complete list of extractable fields for Product Portfolio objects from wlw.de. All fields typed and schema-versioned.

company_id, product_category, product_name, product_description, product_image_url, wlw_product_url, is_manufacturer, is_distributor, is_service_provider
product_portfolio
● 200 OK
{
  "company_id": "WLW-98234",
  "product_category": "CNC Machining",
  "product_name": "5-Axis Milling Service",
  "is_manufacturer": true,
  "is_distributor": false,
  "is_service_provider": true
}

Complete list of extractable fields for Certifications objects from wlw.de. All fields typed and schema-versioned.

company_id, certification_name, certification_body, valid_until, iso_9001, iso_14001, din_standards, custom_certifications
certifications
● 200 OK
{
  "company_id": "WLW-98234",
  "certification_name": "ISO 9001:2015",
  "certification_body": "TUV SUD",
  "iso_9001": true,
  "iso_14001": false,
  "din_standards": "['DIN EN ISO 9001']"
}

Complete list of extractable fields for Market & Export objects from wlw.de. All fields typed and schema-versioned.

company_id, target_markets, export_countries, languages_spoken, trade_shows, associations_memberships, brands_carried, delivery_terms
market_export
● 200 OK
{
  "company_id": "WLW-98234",
  "export_countries": "['Austria', 'Switzerland', 'France']",
  "languages_spoken": "['German', 'English']",
  "trade_shows": "['Hannover Messe 2024']",
  "brands_carried": "['Siemens', 'Fanuc']",
  "delivery_terms": "EXW"
}
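
Every sample record above carries the same company_id, and several list-like fields (din_standards, export_countries and similar) arrive as stringified lists. The following is a minimal consumer-side sketch, not part of the delivered pipeline, showing one way to decode those fields and group records by company. The outer dictionary keys and the LIST_FIELDS set are assumptions made for illustration.

```python
import ast
from collections import defaultdict

# Assumption: these are the fields that arrive as stringified lists,
# based only on the sample records shown in the data dictionary.
LIST_FIELDS = {"din_standards", "export_countries", "languages_spoken",
               "trade_shows", "brands_carried"}


def parse_list_fields(record: dict) -> dict:
    """Return a copy of the record with stringified list fields decoded."""
    parsed = dict(record)
    for field in LIST_FIELDS.intersection(record):
        value = record[field]
        if isinstance(value, str):
            # literal_eval only evaluates Python literals, never arbitrary code.
            parsed[field] = ast.literal_eval(value)
    return parsed


def group_by_company(objects: dict[str, list[dict]]) -> dict[str, dict]:
    """Group records of every object type under their shared company_id."""
    merged: dict[str, dict] = defaultdict(dict)
    for object_type, records in objects.items():
        for record in records:
            bucket = merged[record["company_id"]].setdefault(object_type, [])
            bucket.append(parse_list_fields(record))
    return dict(merged)


# Records trimmed from the samples above; the outer key names are illustrative.
sample = {
    "company_profiles": [
        {"company_id": "WLW-98234", "company_name": "Muller CNC GmbH"},
    ],
    "certifications": [
        {"company_id": "WLW-98234", "din_standards": "['DIN EN ISO 9001']"},
    ],
    "market_export": [
        {"company_id": "WLW-98234",
         "export_countries": "['Austria', 'Switzerland', 'France']"},
    ],
}
suppliers = group_by_company(sample)
```

Each object type maps to a list because a company can have several product or certification records.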

Capabilities

Everything you need from wlw.de

Our wlw.de scraper handles every layer of the directory: firmographic profiles, nested product catalogues, and protected contact details, with JavaScript rendering and anti-bot circumvention built in.

Full Company Profiles

Extract legal names, founding years, employee brackets, and descriptions for hundreds of thousands of DACH suppliers.

Product & Service Catalogues

Map exact product offerings, distinguishing between primary manufacturers, distributors, and service providers.

Contact Data Resolution

Render JavaScript to extract protected phone numbers, website links, and available email addresses from supplier pages.

Certification Tracking

Capture ISO standards, DIN norms, and quality certifications to filter suppliers meeting strict procurement requirements.

Export & Market Intelligence

Identify target markets, supported languages, and export capabilities for cross-border sourcing.

Deep Category Crawling

Navigate wlw.de's complex nested category taxonomy to ensure complete coverage of niche industrial sectors.

Legal Form & Structure

Extract corporate structures (GmbH, AG, GmbH & Co. KG) for compliance and risk assessment workflows.

Change Detection

Monitor supplier profiles for updated contact details, new product lines, or lapsed certifications.

Anti-Bot Circumvention

Bypass strict rate limits and Cloudflare challenges using residential proxies and humanised request patterns.

// engagement pipeline

From target sector to warehouse record

Brief in. Clean data out.

Define Scope
Day 0

Specify target categories, keywords, or location filters. We map the required data fields to your schema.

Pipeline Build
Days 2–4

We configure Playwright crawlers, proxy rotation, and interaction scripts to expose hidden contact data.

Validation & QA
Days 4–6

Automated checks for null rates, missing phone numbers, and category misalignments before production deployment.
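
As a rough illustration of the null-rate check described above, the sketch below computes per-field null rates for a batch and flags fields that exceed a threshold. The function names and the threshold values are hypothetical and are not taken from the production pipeline.

```python
def null_rates(records: list[dict], fields: list[str]) -> dict[str, float]:
    """Fraction of records in which each field is missing, None, or empty."""
    if not records:
        return {field: 0.0 for field in fields}
    rates = {}
    for field in fields:
        missing = sum(1 for r in records if r.get(field) in (None, "", []))
        rates[field] = missing / len(records)
    return rates


def failing_fields(rates: dict[str, float],
                   thresholds: dict[str, float],
                   default: float = 0.0) -> list[str]:
    """Fields whose null rate exceeds the allowed threshold."""
    return sorted(f for f, rate in rates.items()
                  if rate > thresholds.get(f, default))


# Illustrative batch: one of four records has no phone number.
batch = [
    {"company_id": "A", "phone_number": "+49 711 123456"},
    {"company_id": "B", "phone_number": ""},
    {"company_id": "C", "phone_number": "+49 30 123456"},
    {"company_id": "D", "phone_number": "+49 89 123456"},
]
rates = null_rates(batch, ["company_id", "phone_number"])
# Hypothetical threshold: tolerate up to 10% missing phone numbers.
blocked = failing_fields(rates, {"phone_number": 0.10})
```

A check like this can gate deployment: if the list of failing fields is non-empty, the run is held back for review.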

Delivery
ongoing

Structured records pushed to your S3 bucket, Snowflake stage, or PostgreSQL database on a defined schedule.

Under the hood

How our wlw.de pipeline handles the hard parts

B2B directories protect their supplier data fiercely. Here is how we maintain reliable extraction against aggressive bot mitigation.

pipeline-monitor · wlw.de · live ● active
// fingerprinting
Identity rotation
TLS fingerprint: randomised
User-agent: rotated
IP pool: residential
Challenges blocked: 0
// pagination
Page coverage
48,291 pages queued (running)
// observability
Pipeline health
Uptime: 99.9%
p99 latency: 142ms
Null rate: 0.3%
Alerts: 2
Click to reveal
JavaScript execution for contact data

Phone numbers and external website links on wlw.de are often masked behind JavaScript events. Our Playwright nodes execute the necessary clicks and state changes to capture the underlying DOM nodes.

Rate limiting
DACH residential proxy rotation

Requests originating outside the DACH region or from known data centre IPs face immediate blocks. We route traffic exclusively through German, Austrian, and Swiss residential ISP proxies.

Pagination limits
Deep category slicing

Search results truncate after a specific page count. We programmatically slice broad categories by granular geographic regions and subcategories to extract the entire supplier list without hitting pagination walls.
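
The slicing idea can be sketched as a recursive split on postal code prefixes: whenever a query reports more results than the display cap, it is subdivided by the next digit. This is an illustrative sketch only. The count_results callback, the cap value, and the listing index below are stand-ins for illustration, not the actual crawler interface.

```python
from typing import Callable


def slice_query(category: str,
                count_results: Callable[[str, str], int],
                cap: int,
                prefix: str = "",
                max_len: int = 5) -> list[tuple[str, str]]:
    """Return (category, postal_prefix) slices whose result count fits the cap."""
    total = count_results(category, prefix)
    if total == 0:
        return []
    # Stop when the slice fits, or when a full 5-digit postal code is reached.
    if total <= cap or len(prefix) == max_len:
        return [(category, prefix)]
    slices = []
    for digit in "0123456789":
        slices.extend(
            slice_query(category, count_results, cap, prefix + digit, max_len))
    return slices


# Stand-in index: the postal code of each listing in one category.
FAKE_INDEX = ["70173", "70174", "70565", "80331", "10115"]


def fake_count(category: str, prefix: str) -> int:
    """Pretend search endpoint: number of listings under a postal prefix."""
    return sum(code.startswith(prefix) for code in FAKE_INDEX)


slices = slice_query("Mechanical Engineering", fake_count, cap=2)
```

With a cap of 2, the five listings split into the prefixes 1, 701, 705 and 8. A slice that still exceeds the cap at a full postal code would need a second dimension, such as a subcategory filter.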

Cloudflare protection
TLS fingerprinting & CAPTCHA solving

We spoof JA3/JA4 fingerprints and manage browser headers to bypass Cloudflare Turnstile challenges, falling back to automated solvers when interactive challenges are presented.

Schema drift
Resilient DOM targeting

Directory layouts change frequently to disrupt scrapers. We use fallback selector chains combining XPath, CSS, and regex patterns to ensure continuous data flow when primary elements shift.
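
A fallback chain can be pictured as an ordered list of extractors that are tried until one returns a value. The sketch below uses only regular expressions to stay within the standard library; the text above also mentions XPath and CSS selectors, which would slot into the same chain. All patterns and markup here are invented for illustration and do not reflect the real wlw.de page structure.

```python
import re
from typing import Callable, Optional

Extractor = Callable[[str], Optional[str]]


def by_regex(pattern: str) -> Extractor:
    """Build an extractor that returns the first capture group, if any."""
    compiled = re.compile(pattern, re.IGNORECASE | re.DOTALL)

    def extract(html: str) -> Optional[str]:
        match = compiled.search(html)
        return match.group(1).strip() if match else None

    return extract


# Ordered from most specific to most permissive (hypothetical patterns).
PHONE_CHAIN: list[tuple[str, Extractor]] = [
    ("data-attribute", by_regex(r'data-phone="([^"]+)"')),
    ("tel-link", by_regex(r'href="tel:([^"]+)"')),
    ("loose-text", by_regex(r"(\+49[\d\s/-]{6,})")),
]


def first_match(html: str,
                chain: list[tuple[str, Extractor]]) -> Optional[tuple[str, str]]:
    """Try each extractor in order; return (extractor_name, value) or None."""
    for name, extractor in chain:
        value = extractor(html)
        if value:
            return name, value
    return None


sample_html = '<a class="contact" href="tel:+49 711 123456">Call</a>'
result = first_match(sample_html, PHONE_CHAIN)
```

Returning the name of the extractor that fired makes drift visible: a rising share of matches from the later fallbacks suggests the primary selector needs attention.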

Applications

Who uses wlw.de data and how

Teams across industries use wlw.de data to build competitive products and smarter operations.

01
Supplier Discovery & Sourcing

Procurement teams build custom supplier databases to find alternative manufacturers for critical components.

02
Lead Generation

B2B sales teams extract highly targeted lists of industrial companies based on specific machinery or service requirements.

03
Market Mapping

Consultancies analyse regional density of specific industries and manufacturing capabilities across the DACH region.

04
Competitor Analysis

Distributors monitor competitor product portfolios, brand representations, and target markets.

05
Compliance & Risk

Audit teams continuously track supplier certifications, ISO standards, and legal entity changes.

06
Master Data Enrichment

CRM administrators append missing firmographic data, employee counts, and revenue brackets to existing account records.

Why DataFlirt

"wlw.de holds the definitive map of DACH industrial manufacturing, but extracting it requires navigating strict rate limits and aggressive bot protection."

B2B directories actively defend their core asset: contact data. Simple HTTP scrapers fail immediately against Cloudflare challenges and JavaScript-rendered phone numbers. DataFlirt deploys localised residential proxies and headless browsers to extract complete supplier profiles reliably, delivering clean firmographic data directly to your warehouse.

Technical Spec

wlw.de scraper technical capabilities

Everything supported by our wlw.de scraper — rendered SPA elements, auth walls, rate-limit evasion and beyond.

JavaScript rendering · Executes clicks to reveal masked phone numbers and external links · Supported
DACH proxy targeting · Routes requests through local IPs to avoid geographic blocking · Supported
Category deep crawling · Bypasses pagination limits by slicing searches geographically · Supported
Certification extraction · Parses structured quality standard and ISO certification data · Supported
Change detection · Identifies new suppliers or updated contact details since last run · Supported
Multi-format delivery · Pushes data as JSON, CSV, or Parquet to cloud storage · Supported
Direct messaging · Automated sending of RFQs or messages through the wlw.de portal · Partial
Premium analytics · Access to supplier profile visitor statistics and premium dashboard data · Partial
Infrastructure

Infrastructure powering the wlw.de pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus
Scrapy + Playwright Stack

Scrapy handles crawl orchestration and deduplication. Playwright executes JavaScript to reveal hidden contact information and interact with Cloudflare Turnstile challenges.

Localised Proxy Infrastructure

We maintain dedicated pools of residential ISP proxies across Germany, Austria, and Switzerland to ensure high success rates and avoid geo-blocks.

Cloud-Native Orchestration

Pipelines run on Kubernetes clusters. Airflow manages scheduling and dependencies, while Prometheus and Grafana provide real-time observability.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON
Newline-delimited or nested, schema-versioned per run
CSV
Flat file with typed columns for CRM imports
Parquet
Columnar format optimised for BigQuery and Snowflake
AWS S3
Direct bucket delivery compatible with any data lake
Webhook
HTTP POST per record for real-time downstream processing
API
REST endpoints to query extracted supplier data on demand
PostgreSQL
Direct upsert into your existing relational database schema
XLS
Standard Excel format for manual procurement review workflows
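
For readers unfamiliar with the two text formats listed above, here is a small sketch of how the same records might be serialised as newline-delimited JSON and as a flat CSV. The function names and the schema_version value are hypothetical, and the exact layout of delivered files may differ.

```python
import csv
import io
import json


def to_ndjson(records: list[dict], schema_version: str) -> str:
    """One JSON object per line, each tagged with a schema version."""
    return "".join(
        json.dumps({"schema_version": schema_version, **record},
                   ensure_ascii=False) + "\n"
        for record in records
    )


def to_csv(records: list[dict], columns: list[str]) -> str:
    """Flat CSV with a fixed column order; missing values become empty cells."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


records = [{"company_id": "WLW-98234", "address_city": "Stuttgart"}]
ndjson_text = to_ndjson(records, schema_version="1.0")  # version is illustrative
csv_text = to_csv(records, ["company_id", "address_city", "address_zip"])
```

Fixing the column order in the CSV keeps CRM import mappings stable even when a record lacks a field.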
// faq

Common questions.

About wlw.de scraping, legality, and pipeline operations.

Ask us directly →
Is scraping wlw.de legal?

Scraping publicly available firmographic data is generally permissible under EU law, provided it targets corporate entities rather than personal data and adheres to the GDPR. We extract public company profiles, generic contact details, and product catalogues. Clients must ensure their downstream use cases, such as cold outreach, comply with local regulations like the UWG in Germany.

How do you extract hidden phone numbers?

wlw.de masks contact details behind JavaScript events to deter basic scrapers. Our pipeline uses headless Playwright browsers to simulate human interaction, clicking the necessary elements to render the full text before extraction.

Can you bypass the pagination limits on large categories?

Yes. When a broad category like 'Mechanical Engineering' hits the maximum displayable results, our crawlers automatically subdivide the query by postal code ranges and city filters to extract the complete dataset without truncation.

Do you provide historical data or track changes?

We can configure pipelines to run continuously, diffing new extractions against historical runs. This allows us to deliver delta payloads containing only new suppliers, updated contact details, or newly acquired certifications.
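
The diffing step can be sketched as a comparison of two runs keyed on company_id. This is a simplified illustration under the assumption that each run is a flat list of records; the payload shape (added, removed, changed) and every ID other than the sample WLW-98234 are invented for the example.

```python
def diff_runs(previous: list[dict], current: list[dict],
              key: str = "company_id") -> dict:
    """Compare two runs and return only what changed between them."""
    prev = {record[key]: record for record in previous}
    curr = {record[key]: record for record in current}

    added = [curr[k] for k in curr if k not in prev]
    removed = [k for k in prev if k not in curr]

    changed = []
    for k in sorted(curr.keys() & prev.keys()):
        fields = {
            field: (prev[k].get(field), curr[k].get(field))
            for field in prev[k].keys() | curr[k].keys()
            if prev[k].get(field) != curr[k].get(field)
        }
        if fields:
            # Each change is recorded as an (old_value, new_value) pair.
            changed.append({key: k, "changes": fields})

    return {"added": added, "removed": removed, "changed": changed}


last_week = [
    {"company_id": "WLW-98234", "phone_number": "+49 711 123456"},
    {"company_id": "WLW-00001", "phone_number": "+49 30 000000"},
]
this_week = [
    {"company_id": "WLW-98234", "phone_number": "+49 711 654321"},
    {"company_id": "WLW-00002", "phone_number": "+49 89 000000"},
]
delta = diff_runs(last_week, this_week)
```

Only the delta needs to be shipped downstream, which keeps weekly payloads small when most profiles are unchanged.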

How do you handle Cloudflare bot protection?

We utilise residential proxies originating from the DACH region combined with sophisticated TLS fingerprint spoofing and humanised request delays. When interactive challenges occur, automated solvers clear the hurdles without pipeline interruption.

What is the typical delivery frequency?

Pipelines can be scheduled according to your requirements. Most CRM enrichment and procurement use cases rely on weekly or monthly full catalogue refreshes to balance data freshness with compute efficiency.

$ dataflirt scope --new-project --source=wlw.de ready

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a complete dump of the wlw.de directory or targeted weekly updates for specific industrial sectors, we build and maintain the infrastructure. Specify your requirements.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h