
Global B2B directory data, at warehouse scale.

We extract company profiles, executive contacts, product classifications, and financial indicators from Kompass. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.

Companies extracted: 1.2M per run
Executive contacts: 3.4M per month
Classification codes: 850K per run
Active pipelines: 112
Uptime: 99.94%

Data Dictionary

Every field we extract from kompass.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Company Profiles objects from kompass.com. All fields typed and schema-versioned.

company_id, name, country, city, address, phone, website, year_established, description, legal_form, registration_number

company_profiles
● 200 OK
{
  "company_id": "FR1234567",
  "name": "TechCorp Solutions SAS",
  "country": "France",
  "city": "Paris",
  "year_established": 1998,
  "legal_form": "SAS",
  "phone": "+33 1 23 45 67 89"
}

Complete list of extractable fields for Executive Contacts objects from kompass.com. All fields typed and schema-versioned.

contact_id, company_id, first_name, last_name, job_title, department, management_level, linkedin_url, phone_extension

executive_contacts
● 200 OK
{
  "contact_id": "CNT-98765",
  "company_id": "FR1234567",
  "first_name": "Jean",
  "last_name": "Dupont",
  "job_title": "Chief Technology Officer",
  "department": "IT",
  "management_level": "C-Level"
}

Complete list of extractable fields for Products & Services objects from kompass.com. All fields typed and schema-versioned.

company_id, category_name, kompass_code, description, is_producer, is_distributor, is_service_provider, parent_category

Products & Services
● 200 OK
{
  "company_id": "FR1234567",
  "category_name": "Software Development Services",
  "kompass_code": "85210",
  "is_service_provider": true,
  "is_producer": false,
  "is_distributor": false
}

Complete list of extractable fields for Financial & Activity Data objects from kompass.com. All fields typed and schema-versioned.

company_id, turnover_range, employee_count, import_regions, export_regions, bank_name, capital_amount, fiscal_year

Financial & Activity Data
● 200 OK
{
  "company_id": "FR1234567",
  "turnover_range": "10M - 50M EUR",
  "employee_count": "250-499",
  "export_regions": ["Europe", "North America"],
  "capital_amount": "500000 EUR",
  "fiscal_year": 2024
}

Complete list of extractable fields for Search & Category Results objects from kompass.com. All fields typed and schema-versioned.

keyword, category_path, position, company_id, company_name, location, verified_badge, scraped_at

Search & Category Results
● 200 OK
{
  "keyword": "industrial valves",
  "position": 1,
  "company_id": "DE7654321",
  "company_name": "ValveTech GmbH",
  "location": "Berlin, Germany",
  "verified_badge": true
}
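
To make "typed and schema-versioned" concrete, here is a minimal sketch that models the Company Profiles sample above as a Python dataclass. The class name, the `SCHEMA_VERSION` constant, and the split between required and optional fields are assumptions for illustration only, not the actual delivery schema.

```python
from dataclasses import dataclass, fields
from typing import Optional

SCHEMA_VERSION = "1.0"  # hypothetical version tag, not taken from the page


@dataclass(frozen=True)
class CompanyProfile:
    # Assumed required: identifiers and location present in the sample.
    company_id: str
    name: str
    country: str
    city: str
    # Assumed optional: fields that may be absent on a public profile.
    address: Optional[str] = None
    phone: Optional[str] = None
    website: Optional[str] = None
    year_established: Optional[int] = None
    description: Optional[str] = None
    legal_form: Optional[str] = None
    registration_number: Optional[str] = None


def parse_company(record: dict) -> CompanyProfile:
    """Build a CompanyProfile, rejecting unknown keys and a non-integer year."""
    known = {f.name for f in fields(CompanyProfile)}
    unknown = set(record) - known
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    year = record.get("year_established")
    if year is not None and not isinstance(year, int):
        raise TypeError("year_established must be an integer")
    return CompanyProfile(**record)


sample = {
    "company_id": "FR1234567",
    "name": "TechCorp Solutions SAS",
    "country": "France",
    "city": "Paris",
    "year_established": 1998,
    "legal_form": "SAS",
    "phone": "+33 1 23 45 67 89",
}
profile = parse_company(sample)
```

Fields missing from a record simply stay `None`, while an unknown key fails loudly, which is one way to surface upstream layout changes early.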

Capabilities

Extract global B2B intelligence at scale

Our Kompass scraper handles complex category hierarchies, geographic pagination, and multilingual directories to deliver unified B2B company data.

Comprehensive Company Profiles

Extract business names, registration numbers, addresses, contact details, and descriptions across 70 countries.

Executive Directory Mapping

Capture decision makers, job titles, and management hierarchies associated with each company profile.

Kompass Classification Codes

Map businesses to specific Kompass product and service codes, identifying them as producers, distributors, or service providers.

Multilingual Support

Scrape directory data across different regional subdomains to capture localised business information.

Financial Indicators

Extract turnover ranges, employee headcount brackets, and registered capital figures where publicly available.

Import and Export Intelligence

Identify active trading regions and countries for companies engaged in international commerce.

Subsidiary and Branch Mapping

Link parent companies to their regional branches and subsidiaries using internal directory references.

Scheduled Refresh Cycles

Run monthly or quarterly updates to track changes in executive personnel, address relocations, or new product classifications.

Standardised Schema Delivery

Normalise inconsistent address formats and telephone numbers into a strict, queryable JSON or CSV schema.

// engagement pipeline

From target criteria to warehouse record

Brief in. Clean data out.

Define Scope
Day 0

Provide target countries, Kompass classification codes, or keyword sets. We design the extraction schema together.

Pipeline Build
Days 2–4

We configure Scrapy crawlers, regional proxy rotation, session management, and CAPTCHA handling for kompass.com.

Validation & QA
Days 4–6

Schema validation, null-rate checks, and sample data delivery before full launch.

Delivery
Ongoing

JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.
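
The null-rate check mentioned in the Validation & QA step can be sketched as follows. The function names, the sample records, and the 20% threshold are illustrative assumptions; real thresholds would be agreed per field during scoping.

```python
from collections import Counter


def null_rates(records, field_names):
    """Return, per field, the fraction of records where it is missing, None, or empty."""
    if not records:
        return {name: 0.0 for name in field_names}
    missing = Counter()
    for rec in records:
        for name in field_names:
            if rec.get(name) in (None, ""):
                missing[name] += 1
    return {name: missing[name] / len(records) for name in field_names}


def failing_fields(rates, threshold):
    """List the fields whose null rate exceeds the allowed threshold."""
    return sorted(name for name, rate in rates.items() if rate > threshold)


# Hypothetical sample batch: one of four records has no phone number.
batch = [
    {"company_id": "A1", "name": "Alpha", "phone": "+33 1 23 45 67 89"},
    {"company_id": "A2", "name": "Beta", "phone": ""},
    {"company_id": "A3", "name": "Gamma", "phone": "+49 30 1234567"},
    {"company_id": "A4", "name": "Delta", "phone": "+39 02 1234567"},
]
rates = null_rates(batch, ["company_id", "name", "phone"])
flagged = failing_fields(rates, threshold=0.2)
```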

Under the hood

Overcoming B2B directory scraping constraints

Kompass restricts access through pagination limits, IP tracking, and CAPTCHAs. We manage the infrastructure required for deep extraction.

pipeline-monitor · kompass.com · live (active)

// fingerprinting
Identity rotation
TLS fingerprint: randomised
User-agent: rotated
IP pool: residential
Challenges blocked: 0

// pagination
Page coverage
48,291 pages queued (running)

// observability
Pipeline health
Uptime: 99.9%
p99 latency: 142ms
Null rate: 0.3%
Alerts: 2

Pagination bypass
Deep category crawling

Directory search results often truncate after a specific number of pages. We bypass this by recursively querying sub-categories, postal codes, and employee size filters to extract the full underlying dataset.
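
A minimal sketch of the recursive splitting idea, assuming a hypothetical `count` callable that reports how many results a filter set returns. Here it is simulated against an in-memory list; no real Kompass endpoint, parameter name, or result cap is implied.

```python
from typing import Callable, Dict, Iterator, Optional, Sequence, Tuple

Filters = Dict[str, str]


def partition_queries(
    count: Callable[[Filters], int],
    dimensions: Sequence[Tuple[str, Sequence[str]]],
    cap: int,
    filters: Optional[Filters] = None,
) -> Iterator[Filters]:
    """Yield filter sets whose result count fits under `cap`.

    A query over the cap is split on the next unused dimension. If no
    dimensions remain, the query is yielded anyway and stays truncated.
    """
    filters = filters or {}
    n = count(filters)
    if n == 0:
        return
    if n <= cap or not dimensions:
        yield filters
        return
    (name, values), rest = dimensions[0], dimensions[1:]
    for value in values:
        yield from partition_queries(count, rest, cap, {**filters, name: value})


# Simulated directory: five companies, and a search that shows at most two.
companies = [
    {"postal_code": "75001", "size": "1-9"},
    {"postal_code": "75001", "size": "1-9"},
    {"postal_code": "75001", "size": "10-49"},
    {"postal_code": "75002", "size": "1-9"},
    {"postal_code": "75002", "size": "10-49"},
]


def count(filters: Filters) -> int:
    return sum(all(c.get(k) == v for k, v in filters.items()) for c in companies)


dimensions = [("postal_code", ["75001", "75002"]), ("size", ["1-9", "10-49"])]
partitions = list(partition_queries(count, dimensions, cap=2))
```

In this toy run the unfiltered query exceeds the cap, so it is split by postal code, and only the oversized postal code is split again by employee size. Records that lack a value for a split dimension would be missed by this simple version, so a real crawl would need a catch-all bucket per dimension.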

Anti-bot layer
Residential proxy rotation

High-volume requests from datacenter IPs trigger immediate blocks. We route traffic through residential ISP proxies matching the target region, rotating IPs to distribute the request load.

Data normalisation
Cleaning inconsistent user inputs

B2B directories contain varied formats for phone numbers, addresses, and legal entities. Our pipeline applies regular expressions and standardisation logic to ensure clean, warehouse-ready data.
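
As one example of regex-based standardisation, the sketch below collapses free-form phone numbers into a `+<digits>` form. The function name and the leading-zero rule are simplifying assumptions (some countries keep the trunk zero), and a production pipeline might prefer a dedicated phone-number library.

```python
import re


def normalise_phone(raw: str, default_country_code: str = "") -> str:
    """Collapse a free-form phone number to '+<digits>' where possible."""
    cleaned = re.sub(r"[^\d+]", "", raw.strip())  # keep digits and '+'
    has_plus = cleaned.startswith("+")
    digits = cleaned.replace("+", "")
    if has_plus:
        return "+" + digits
    if digits.startswith("00"):
        # International prefix written as 00 instead of '+'.
        return "+" + digits[2:]
    if default_country_code:
        # Assumption: a national number drops its leading trunk zero.
        return "+" + default_country_code + digits.lstrip("0")
    return digits


a = normalise_phone("+33 1 23 45 67 89")
b = normalise_phone("0033 1 23 45 67 89")
c = normalise_phone("01 23 45 67 89", default_country_code="33")
```

All three spellings of the sample number end up as the same string, which is what makes the column joinable downstream.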

Monitoring & alerting
24/7 pipeline health

Every run emits structured logs to our observability stack. We alert on null-rate spikes, schema drift, and coverage drops, ensuring reliable data delivery.
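
Schema drift detection can be illustrated with a small comparison against an expected field-to-type mapping. The mapping below is assumed from the Search & Category Results sample, and the function name is hypothetical.

```python
EXPECTED_SEARCH_RESULT = {
    "keyword": str,
    "position": int,
    "company_id": str,
    "company_name": str,
    "location": str,
    "verified_badge": bool,
}


def detect_schema_drift(expected: dict, record: dict) -> dict:
    """Report missing fields, unexpected fields, and fields with the wrong type."""
    missing = sorted(set(expected) - set(record))
    unexpected = sorted(set(record) - set(expected))
    wrong_type = sorted(
        name
        for name, typ in expected.items()
        if name in record
        and record[name] is not None
        and not isinstance(record[name], typ)
    )
    return {"missing": missing, "unexpected": unexpected, "wrong_type": wrong_type}


# A drifted record: position became a string, location vanished, rating appeared.
drifted = {
    "keyword": "industrial valves",
    "position": "1",
    "company_id": "DE7654321",
    "company_name": "ValveTech GmbH",
    "verified_badge": True,
    "rating": 4.5,
}
report = detect_schema_drift(EXPECTED_SEARCH_RESULT, drifted)
```

Any non-empty list in the report is the kind of signal an alert could be raised on.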

CAPTCHA handling
Automated solver integration

Directory sites frequently deploy CAPTCHAs during sustained navigation. We integrate automated solving services to maintain pipeline throughput without manual intervention.

Applications

Who uses Kompass data and how

Teams across industries use kompass.com data to build competitive products and smarter operations.

01. B2B Lead Generation

Sales teams extract targeted lists of companies and executives based on specific industry codes and geographic regions.

02. Supplier Sourcing

Procurement departments map alternative suppliers by filtering manufacturers and distributors within specific Kompass product categories.

03. Market Mapping

Analysts assess industry concentration, company size distributions, and geographic clusters for specific B2B sectors.

04. CRM Enrichment

RevOps teams update stale CRM records with current addresses, phone numbers, and executive contacts from the directory.

05. Competitor Analysis

Strategy teams track competitor branch expansions, product category additions, and export market activity.

06. Master Data Management

Enterprise data teams cross-reference internal vendor lists against Kompass profiles to validate legal entities and registration numbers.

Why DataFlirt

"B2B directories hold critical firmographic data, but extracting complete category hierarchies requires navigating complex pagination and strict anti-bot measures."

Most teams struggle with directory scraping because results are artificially truncated and heavily monitored for automated access. DataFlirt manages the proxy rotation, deep-search querying, and schema normalisation so your data engineering team receives clean firmographics without the operational overhead.

Technical Spec

Kompass scraper technical capabilities

Everything supported by our kompass.com scraper, from profile and contact extraction to pagination bypass, proxy rotation, and CAPTCHA handling, along with what is only partially covered.

Company profile extraction · Full capture of public firmographic data, addresses, and descriptions · Supported
Executive directory mapping · Extraction of named contacts, job titles, and management levels · Supported
Kompass classification codes · Mapping of companies to specific producer/distributor categories · Supported
Geographic targeting · Country and region-specific scraping across localized subdomains · Supported
Pagination bypass · Recursive querying via sub-filters to extract beyond standard limits · Supported
Residential proxy rotation · ISP-grade IPs to circumvent datacenter IP blocks · Supported
Automated CAPTCHA solving · Integration with 2Captcha and CapSolver for uninterrupted runs · Supported
Premium executive email addresses · Direct email contacts locked behind the paid Kompass Booster/Premium account · Partial
Credit risk reports · Detailed financial risk assessments provided by third-party paywalls · Partial

Infrastructure

Infrastructure powering the Kompass pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus

Scrapy Orchestration

Scrapy handles high-concurrency crawl orchestration, request deduplication, and retry logic for deep directory traversal.

Residential Proxy Infrastructure

We maintain pools of residential ISP proxies mapped to target directory regions. Rotation happens per-request to avoid rate limits.

Cloud-Native Orchestration

Pipelines run on AWS Lambda and ECS. Airflow handles scheduling, dependency management, and SLA alerting. All state is stored in managed Postgres.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON: Newline-delimited or nested array format
CSV: Flat file with typed columns for spreadsheet analysis
XLS: Excel-compatible format for business users
Parquet: Columnar format for BigQuery, Snowflake, Athena
AWS S3: Direct bucket delivery compatible with any data lake
Webhook: HTTP POST per record for immediate downstream processing
API: REST endpoint access to query recent extraction runs
BigQuery: Streamed directly into your dataset with schema auto-detect
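
The two JSON layouts listed above differ mainly in how they are read back. As a small sketch using only the standard library (function names and sample records are illustrative), newline-delimited JSON holds one record per line and can be streamed, while the nested array is a single document.

```python
import json

records = [
    {"company_id": "FR1234567", "name": "TechCorp Solutions SAS", "city": "Paris"},
    {"company_id": "DE7654321", "name": "ValveTech GmbH", "city": "Berlin"},
]


def to_ndjson(rows) -> str:
    """One JSON object per line, each line terminated by a newline."""
    return "".join(json.dumps(row, ensure_ascii=False) + "\n" for row in rows)


def to_json_array(rows) -> str:
    """A single JSON document holding every record in one array."""
    return json.dumps(rows, ensure_ascii=False, indent=2)


def read_ndjson(text: str) -> list:
    """Parse NDJSON line by line; split on '\\n' only, never on other separators."""
    return [json.loads(line) for line in text.split("\n") if line.strip()]


ndjson_text = to_ndjson(records)
array_text = to_json_array(records)
```

Newline-delimited output tends to suit warehouse loaders and append-style delivery, while the array form is convenient for small one-off files.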
// faq

Common questions.

About kompass.com scraping, legality, and pipeline operations.

Is scraping Kompass legal?

Scraping publicly available firmographic information is generally permissible. DataFlirt extracts only public company profiles and directory listings. We do not circumvent authentication walls to access premium paid data. Clients should review Kompass terms of service and consult legal counsel for specific commercial use cases.

How do you extract beyond the pagination limit?

Directory searches often limit results to a few thousand records per query. We bypass this by systematically applying granular filters, such as postal codes, specific employee size brackets, and sub-categories, ensuring we capture the entire underlying dataset.

Can you extract direct email addresses for executives?

We extract contact information that is publicly visible on the company profile page. Direct executive email addresses are typically gated behind Kompass premium subscriptions and are not included in public scraping pipelines.

Do you support scraping specific countries only?

Yes. We can target specific Kompass regional subdomains and apply geographic filters to extract companies registered only in your target markets.

How do you handle inconsistent data formats?

Our pipeline includes a normalisation layer that standardises telephone numbers, formats addresses into constituent parts, and maps varying legal entity types to a consistent schema.
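
Mapping legal entity types to a consistent value can be as simple as a lookup on a punctuation-free key. The table below is a tiny hypothetical example, not the mapping used in production, and unmapped values return `None` so they can be reviewed rather than guessed.

```python
import re
from typing import Optional

# Hypothetical canonical forms keyed by lowercase letters only.
CANONICAL_LEGAL_FORMS = {
    "sas": "SAS",
    "sarl": "SARL",
    "gmbh": "GmbH",
    "ltd": "Ltd",
    "limited": "Ltd",
}


def normalise_legal_form(raw: str) -> Optional[str]:
    """Map spelling variants such as 'S.A.S.' or 'G.m.b.H' to one canonical label."""
    key = re.sub(r"[^a-z]", "", raw.lower())
    return CANONICAL_LEGAL_FORMS.get(key)


variants = ["S.A.S.", " sas ", "G.m.b.H", "Limited", "Cooperative"]
mapped = [normalise_legal_form(v) for v in variants]
```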

What is the typical delivery cadence?

For B2B directories, clients typically request monthly or quarterly full-refreshes to capture new registrations and updated executive contacts. One-off historical extractions are also supported.
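
To show what a refresh can surface between two runs, here is a minimal snapshot diff keyed on `company_id`. The function name and the sample records are assumptions for illustration; the second and third IDs are made up.

```python
def diff_snapshots(previous, current, key="company_id"):
    """Compare two extraction runs and list added, removed, and changed records."""
    prev = {rec[key]: rec for rec in previous}
    curr = {rec[key]: rec for rec in current}
    added = sorted(curr.keys() - prev.keys())
    removed = sorted(prev.keys() - curr.keys())
    changed = {}
    for record_id in sorted(curr.keys() & prev.keys()):
        old, new = prev[record_id], curr[record_id]
        # Names of fields whose value differs between the two runs.
        differing = sorted(f for f in old.keys() | new.keys() if old.get(f) != new.get(f))
        if differing:
            changed[record_id] = differing
    return {"added": added, "removed": removed, "changed": changed}


last_quarter = [
    {"company_id": "FR1234567", "city": "Paris", "phone": "+33123456789"},
    {"company_id": "DE7654321", "city": "Berlin", "phone": "+49301234567"},
]
this_quarter = [
    {"company_id": "FR1234567", "city": "Lyon", "phone": "+33123456789"},
    {"company_id": "IT0000001", "city": "Milan", "phone": "+39021234567"},
]
changes = diff_snapshots(last_quarter, this_quarter)
```

Here the French company relocated, the German listing disappeared, and an Italian one is new, so a downstream CRM sync would only need to touch three records.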

$ dataflirt scope --new-project --source=kompass.com ready

Tell us what to extract. We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a one-off extraction of specific industry sectors or a continuous feed of firmographic updates, we scope, build, and operate the pipeline. Tell us what you need.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h