
Compass data,
at warehouse scale.

We extract property listings, agent directories, transaction histories, and market dynamics from Compass. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.

Listings extracted: 142K/day
Agent profiles: 84K/run
Price updates: 32K/24h
Active pipelines: 84
Uptime: 99.94%
Data Dictionary

Every field we extract from compass.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Property Listings objects from compass.com. All fields typed and schema-versioned.

listing_id, address, neighbourhood, city, state, zip_code, price, bedrooms, bathrooms, square_feet, property_type, status, days_on_market, compass_exclusive, agent_id, description, image_urls
property_listings
● 200 OK
"listing_id": "123456789",
"address": "15 Central Park West, Apt 14D",
"neighbourhood": "Upper West Side",
"price": 4500000,
"bedrooms": 3,
"bathrooms": 3.5,
"property_type": "Condo",
"status": "Active",
"compass_exclusive": true
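A minimal sketch of how a consumer might load the sample record above into a typed structure. The `PropertyListing` class and `parse_listing` helper are hypothetical names, and the field types are inferred from the sample values rather than taken from a published schema; fields not shown in the sample are left optional.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PropertyListing:
    # Fields present in the sample record; types inferred from its values.
    listing_id: str
    address: str
    neighbourhood: str
    price: int
    bedrooms: int
    bathrooms: float
    property_type: str
    status: str
    compass_exclusive: bool
    # Fields listed in the data dictionary but absent from the sample
    # (types here are assumptions).
    city: Optional[str] = None
    state: Optional[str] = None
    zip_code: Optional[str] = None
    square_feet: Optional[int] = None
    days_on_market: Optional[int] = None
    agent_id: Optional[str] = None
    description: Optional[str] = None
    image_urls: Optional[List[str]] = None


def parse_listing(raw: str) -> PropertyListing:
    """Parse one JSON record; unknown keys raise TypeError (strict by design)."""
    return PropertyListing(**json.loads(raw))


SAMPLE = """{
  "listing_id": "123456789",
  "address": "15 Central Park West, Apt 14D",
  "neighbourhood": "Upper West Side",
  "price": 4500000,
  "bedrooms": 3,
  "bathrooms": 3.5,
  "property_type": "Condo",
  "status": "Active",
  "compass_exclusive": true
}"""

listing = parse_listing(SAMPLE)
```

Failing loudly on unknown keys is one way to surface schema changes early; a more tolerant loader could drop unrecognised keys instead.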

Complete list of extractable fields for Agent Profiles objects from compass.com. All fields typed and schema-versioned.

agent_id, name, title, team_name, phone, email, office_location, total_sales_volume, active_listings_count, past_sales_count, bio, social_links, languages_spoken, profile_image_url
agent_profiles
● 200 OK
"agent_id": "987654321",
"name": "Jane Doe",
"title": "Principal Agent",
"team_name": "The Doe Team",
"office_location": "New York, NY",
"active_listings_count": 12,
"past_sales_count": 84,
"languages_spoken": ["English", "Spanish"]

Complete list of extractable fields for Transaction History objects from compass.com. All fields typed and schema-versioned.

property_id, event_date, event_type, price, price_per_sqft, source, buyer_agent_id, seller_agent_id, price_change_pct
transaction_history
● 200 OK
"property_id": "123456789",
"event_date": "2023-11-14",
"event_type": "Sold",
"price": 4350000,
"price_per_sqft": 2150,
"source": "ACRIS",
"buyer_agent_id": "987654321",
"price_change_pct": -3.33
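The `price_change_pct` value in the sample can be reproduced with simple arithmetic. The sketch below assumes the percentage is measured against the previous price for the same property (here the 4,500,000 list price from the Property Listings sample) and rounded to two decimals; the function name is illustrative, and the actual derivation used in the pipeline is not stated on this page.

```python
def price_change_pct(previous_price: int, new_price: int) -> float:
    """Percentage change from previous_price to new_price, rounded to 2 dp."""
    if previous_price <= 0:
        raise ValueError("previous_price must be positive")
    return round((new_price - previous_price) / previous_price * 100, 2)


# List price 4,500,000 -> sold price 4,350,000
change = price_change_pct(4_500_000, 4_350_000)
print(change)  # -3.33
```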

Complete list of extractable fields for Building Data objects from compass.com. All fields typed and schema-versioned.

building_id, name, address, total_units, year_built, amenities, active_sales, active_rentals, past_sales, pet_policy, management_company
building_data
● 200 OK
"building_id": "B-4567",
"name": "The Century",
"total_units": 426,
"year_built": 1931,
"active_sales": 5,
"active_rentals": 2,
"pet_policy": "Pets Allowed",
"amenities": ["Doorman", "Elevator", "Gym"]

Complete list of extractable fields for Open Houses objects from compass.com. All fields typed and schema-versioned.

property_id, address, price, date, start_time, end_time, agent_id, tour_type, virtual_tour_url, rsvp_required
open_houses
● 200 OK
"property_id": "123456789",
"address": "15 Central Park West, Apt 14D",
"date": "2023-12-03",
"start_time": "13:00",
"end_time": "15:00",
"tour_type": "In-Person",
"rsvp_required": false,
"agent_id": "987654321"

Capabilities

Everything you need from Compass

Our Compass scraper handles property listings, price changes, building directories, and agent intelligence. We read the Next.js hydration payload directly and handle anti-bot systems automatically.

Full Property Data Extraction

Address, price, beds, baths, square footage, amenities, HOA fees, and description text scraped at the individual listing level.

Agent Directory Mining

Extract agent names, contact details, team affiliations, active inventory, and historical sales volume across all operating regions.

Transaction & Price History

Capture price drops, delistings, and final sale prices timestamped per event to track market velocity and asset depreciation.

Building & Condo Directories

Extract building-level metadata including total units, year built, amenities, pet policies, and aggregated active inventory.

Open House Schedules

Track upcoming open houses, virtual tour links, and RSVP requirements mapped to specific agents and properties.

Media & Floorplan Links

Extract high-resolution image URLs, 3D tour links, and floorplan assets for automated property valuation models.

Geospatial & Neighbourhood Data

Capture latitude, longitude, neighbourhood boundaries, and assigned school districts for precise mapping integrations.

Scheduled + Streaming Modes

Run one-off bulk exports or configure continuous pipelines at hourly, daily, or real-time cadences with change-detection diffing.

Multi-Region Support

Scrape inventory across all Compass markets including NYC, Los Angeles, Miami, Chicago, and San Francisco from a unified schema.

// engagement pipeline

From URL list to warehouse record

Brief in. Clean data out.

Define Scope · day 0

Provide zip codes, neighbourhoods, agent IDs, or building URLs. We design the extraction schema together.

Pipeline Build · days 2–4

We configure Scrapy / Playwright crawlers, proxy rotation, session management, and CAPTCHA handling for compass.com.

Validation & QA · days 4–6

Schema validation, null-rate checks, price-outlier detection, and sample exports before full launch.

Delivery · ongoing

JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.
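The Validation & QA step above mentions null-rate checks and price-outlier detection. A minimal sketch of both follows, assuming records are plain dicts and using the common 1.5 × IQR rule for outliers; the function names, the threshold, and the choice of the IQR rule are illustrative assumptions, not a description of the production checks.

```python
import statistics
from typing import Dict, Iterable, List


def null_rates(records: List[dict], fields: Iterable[str]) -> Dict[str, float]:
    """Fraction of records in which each field is missing or None."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for r in records if r.get(field) is None) / total
        for field in fields
    }


def price_outliers(records: List[dict], k: float = 1.5) -> List[dict]:
    """Records whose price lies outside [Q1 - k*IQR, Q3 + k*IQR]."""
    prices = [r["price"] for r in records if r.get("price") is not None]
    if len(prices) < 4:
        return []  # too few points for meaningful quartiles
    q1, _, q3 = statistics.quantiles(prices, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [
        r for r in records
        if r.get("price") is not None and not (low <= r["price"] <= high)
    ]


# Eight plausible prices plus one suspicious value (made-up sample data).
batch = [
    {"listing_id": str(i + 1), "price": 1_000_000 + i * 100_000, "bedrooms": 2}
    for i in range(8)
]
batch.append({"listing_id": "9", "price": 45_000_000, "bedrooms": None})

flagged = price_outliers(batch)
rates = null_rates(batch, ["price", "bedrooms"])
```

In this batch only listing 9 is flagged, and `bedrooms` has a null rate of one in nine. Flagged records would typically be reviewed rather than dropped, since genuine luxury listings can sit far from the median.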

Under the hood

How our Compass pipeline handles the hard parts

Compass invests in scraping detection and dynamic rendering. Here is how we stay resilient and deliver clean data.

pipeline-monitor · compass.com · live ● active
// fingerprinting
Identity rotation
TLS fingerprint: randomised
User-agent: rotated
IP pool: residential
Challenges blocked: 0
// pagination
Page coverage
48,291 pages queued · running
// observability
Pipeline health
Uptime: 99.9%
p99 latency: 142ms
Null rate: 0.3%
Alerts: 2
Anti-bot layer
Residential proxy rotation + fingerprint spoofing

Compass uses advanced bot detection based on TLS fingerprints and IP reputation. Our crawlers use US residential ISP proxies with realistic browser fingerprints and full cookie session management.

JavaScript rendering
Next.js hydration and state extraction

Compass is a React-heavy application. We extract the raw JSON state directly from the Next.js hydration payload, bypassing fragile DOM parsing and capturing complete property metadata instantly.
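As an illustration of hydration-state extraction, the sketch below pulls the JSON out of a `<script id="__NEXT_DATA__">` tag, which is where Next.js Pages Router applications embed their initial state. Whether compass.com exposes this exact tag, and the `props.pageProps.listing` key path used in the sample HTML, are assumptions made for the example; App Router pages, for instance, serialise state differently.

```python
import json
import re

# Matches the script tag that Next.js (Pages Router) uses for hydration state.
NEXT_DATA_RE = re.compile(
    r'<script[^>]*\bid="__NEXT_DATA__"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def extract_next_data(html: str) -> dict:
    """Return the parsed hydration payload, or raise if the tag is absent."""
    match = NEXT_DATA_RE.search(html)
    if match is None:
        raise ValueError("no __NEXT_DATA__ script tag found")
    return json.loads(match.group(1))


# Made-up page source standing in for a listing page.
sample_html = """<html><body><div id="__next"></div>
<script id="__NEXT_DATA__" type="application/json">
{"props": {"pageProps": {"listing": {"listing_id": "123456789", "price": 4500000}}}}
</script></body></html>"""

state = extract_next_data(sample_html)
listing = state["props"]["pageProps"]["listing"]  # hypothetical key path
```

Reading the payload this way avoids depending on CSS class names, but the key path inside the JSON can still change between deployments.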

Schema stability
Resilient selectors with fallback chains

Real estate platforms update their layouts frequently. Our selector strategy uses multiple fallback chains per field, ensuring a frontend redesign does not break your data pipeline.
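One way to picture a fallback chain is an ordered list of candidate paths per field, tried until one yields a value. The sketch below works on nested dicts such as a parsed hydration payload; the helper names and the three paths in `PRICE_CHAIN` are hypothetical and only illustrate the idea.

```python
from typing import Any, Sequence

_MISSING = object()  # sentinel: distinguishes "path absent" from a stored None


def get_path(data: Any, path: Sequence[Any]) -> Any:
    """Walk nested dicts/lists along path; return _MISSING if any step fails."""
    current = data
    for key in path:
        if isinstance(current, dict) and key in current:
            current = current[key]
        elif (
            isinstance(current, list)
            and isinstance(key, int)
            and 0 <= key < len(current)
        ):
            current = current[key]
        else:
            return _MISSING
    return current


def first_match(data: Any, chain: Sequence[Sequence[Any]], default: Any = None) -> Any:
    """Return the first non-null value found by trying each path in order."""
    for path in chain:
        value = get_path(data, path)
        if value is not _MISSING and value is not None:
            return value
    return default


# Hypothetical fallback chain for the price field, newest layout first.
PRICE_CHAIN = [
    ("listing", "pricing", "listPrice"),
    ("listing", "price"),
    ("property", "price"),
]

new_layout = {"listing": {"pricing": {"listPrice": 4500000}}}
old_layout = {"listing": {"price": 4500000}}

price_new = first_match(new_layout, PRICE_CHAIN)
price_old = first_match(old_layout, PRICE_CHAIN)
```

Both layouts resolve to the same price, so a redesign that moves the field only requires adding a new path at the front of the chain.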

Change detection
Only re-scrape what has changed

For large market monitoring, we maintain a hash index of last-seen values per listing. Subsequent runs only push diffs, reducing compute cost and downstream processing load.
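A minimal sketch of hash-based change detection, assuming each record is a JSON-serialisable dict keyed by `listing_id`. An in-memory dict stands in for the persistent hash index, and the function names are illustrative.

```python
import hashlib
import json
from typing import Dict, Iterable, List


def record_hash(record: dict) -> str:
    """Stable SHA-256 digest of a record (key order does not matter)."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_run(
    records: Iterable[dict],
    hash_index: Dict[str, str],
    key: str = "listing_id",
) -> List[dict]:
    """Return records that are new or changed, updating hash_index in place."""
    changed = []
    for record in records:
        digest = record_hash(record)
        if hash_index.get(record[key]) != digest:
            changed.append(record)
            hash_index[record[key]] = digest
    return changed


index: Dict[str, str] = {}
run_1 = [
    {"listing_id": "1", "price": 4500000},
    {"listing_id": "2", "price": 1200000},
]
run_2 = [
    {"listing_id": "1", "price": 4350000},  # price dropped
    {"listing_id": "2", "price": 1200000},  # unchanged
]

first_diff = diff_run(run_1, index)   # both records are new
second_diff = diff_run(run_2, index)  # only listing 1 changed
```

Serialising with sorted keys keeps the digest stable when field order varies between runs, so only genuine value changes produce a diff.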

Monitoring & alerting
24/7 pipeline health with anomaly detection

Every run emits structured logs to our observability stack. We alert on null-rate spikes, inventory drops, and schema drift. SLA uptime is contractual, not aspirational.
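Two of the alert conditions named above, schema drift and inventory drops, can be sketched as small checks run after each crawl. The function names, the expected field set, and the 20% drop threshold are assumptions chosen for the example.

```python
from typing import Dict, List, Set


def schema_drift(record: dict, expected_fields: Set[str]) -> Dict[str, List[str]]:
    """Report fields missing from, or unexpectedly present in, a record."""
    seen = set(record)
    return {
        "missing": sorted(expected_fields - seen),
        "unexpected": sorted(seen - expected_fields),
    }


def inventory_drop(previous_count: int, current_count: int, threshold: float = 0.2) -> bool:
    """True when the record count fell by more than threshold (as a fraction)."""
    if previous_count <= 0:
        return False  # nothing to compare against
    return (previous_count - current_count) / previous_count > threshold


EXPECTED = {"listing_id", "price", "status"}  # hypothetical subset of the schema

drift = schema_drift({"listing_id": "1", "price": 1, "list_price": 2}, EXPECTED)
dropped = inventory_drop(previous_count=1000, current_count=700)
```

Here `drift` reports `status` as missing and `list_price` as unexpected, and `dropped` is true because inventory fell by 30%.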

Applications

Who uses Compass data

Teams across industries use compass.com data to build competitive products and smarter operations.

01 · Investment Analysis

Institutional investors track price reductions, days on market, and inventory levels to identify distressed assets and buying opportunities.

02 · Agent Recruiting

Brokerages monitor competitor agent performance, sales volume, and active listings to identify top producers for recruitment.

03 · Market Benchmarking

Real estate analysts aggregate neighbourhood-level pricing and transaction velocity to build predictive market models.

04 · Property Valuation

PropTech companies ingest raw listing data and transaction histories to train automated valuation models (AVMs).

05 · Lead Generation

Service providers extract agent contact details and new listing events to target real estate professionals with relevant offerings.

06 · PropTech Integration

Startups sync active inventory and building directories to populate their own consumer-facing real estate applications.

Why DataFlirt

"Compass holds the most accurate luxury real estate inventory and agent transaction records, but extracting it requires bypassing aggressive rate limits and dynamic Next.js rendering."

Most teams underestimate the investment required: reliable Compass scraping requires residential proxies, full JavaScript rendering, CAPTCHA handling, daily selector maintenance, and anomaly monitoring. DataFlirt absorbs that complexity so your engineers can focus on the analysis, not the infrastructure.

Technical Spec

Compass scraper — technical capabilities

Everything supported by our compass.com scraper: rendered SPA elements, rate-limit handling, and partial coverage of authenticated areas.

Capability · Details · Status
JavaScript rendering · Full Playwright sessions and Next.js state extraction · Supported
CAPTCHA bypass · Automated 2Captcha + CapSolver integration · Supported
Residential proxy rotation · ISP-grade residential IPs from US pools rotated per request · Supported
Historical transactions · Extract full ledger of past sales and price changes per property · Supported
Agent directory pagination · Traverse all agent profiles across specified regions · Supported
Change detection (diffs) · Hash-based diff: only emit records with changed fields since last run · Supported
Client-only private portal data · Documents and communications restricted to the Compass client dashboard · Partial
Saved searches and favorites · User-specific saved collections requiring authenticated consumer accounts · Partial
Infrastructure

Infrastructure powering the Compass pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy · Playwright · Python 3.12 · Redis · PostgreSQL · Apache Airflow · AWS Lambda · S3 · CloudWatch · 2Captcha · CapSolver · Residential Proxies · Docker · Kubernetes · Grafana · Prometheus
Scrapy + Playwright Stack

Scrapy handles crawl orchestration, deduplication, and retry logic. Playwright handles JavaScript rendering, cookie sessions, and interaction flows. The two are combined via the scrapy-playwright download handler.
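For readers unfamiliar with the integration, the fragment below shows the settings that scrapy-playwright documents for routing Scrapy requests through Playwright. It is a generic sketch of a `settings.py`, not the configuration used for this pipeline; the retry count and the `LISTING_REQUEST_META` name are illustrative.

```python
# settings.py (sketch): route HTTP(S) downloads through scrapy-playwright.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright requires the asyncio-based Twisted reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

# Browser engine Playwright should launch.
PLAYWRIGHT_BROWSER_TYPE = "chromium"

# Standard Scrapy retry setting; the value 3 is an arbitrary example.
RETRY_TIMES = 3

# Rendering is opt-in per request: pass this dict as the request's meta,
# e.g. scrapy.Request(url, meta=LISTING_REQUEST_META) inside a spider.
LISTING_REQUEST_META = {"playwright": True}
```

Requests without `"playwright": True` in their meta still go through Scrapy's normal downloader, which keeps browser sessions for the pages that actually need rendering.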

Residential Proxy Infrastructure

We maintain pools of residential ISP proxies across US regions. Rotation happens per-request with sticky sessions where required. IP score monitoring prevents blacklisted pool contamination.
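Per-request rotation with optional sticky sessions can be sketched as a round-robin pool plus a session-to-proxy map. The `ProxyRotator` class and the proxy URLs below are hypothetical; IP score monitoring is left out to keep the example short.

```python
import itertools
from typing import Dict, Optional, Sequence


class ProxyRotator:
    """Round-robin proxy pool with optional sticky sessions."""

    def __init__(self, proxies: Sequence[str]) -> None:
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._cycle = itertools.cycle(proxies)
        self._sticky: Dict[str, str] = {}

    def next_proxy(self, session_id: Optional[str] = None) -> str:
        """Rotate per request, or reuse the proxy pinned to session_id."""
        if session_id is None:
            return next(self._cycle)
        if session_id not in self._sticky:
            self._sticky[session_id] = next(self._cycle)
        return self._sticky[session_id]

    def release(self, session_id: str) -> None:
        """Forget a sticky session so its next request gets a fresh proxy."""
        self._sticky.pop(session_id, None)


# Placeholder endpoints, not real proxies.
PROXIES = [
    "http://proxy-a.example:8000",
    "http://proxy-b.example:8000",
    "http://proxy-c.example:8000",
]

rotator = ProxyRotator(PROXIES)
first = rotator.next_proxy()                  # per-request rotation
second = rotator.next_proxy()
sticky_a = rotator.next_proxy("session-1")    # pinned on first use
sticky_b = rotator.next_proxy("session-1")    # same proxy again
```

Sticky sessions matter when cookies or challenge tokens are bound to an IP; stateless requests can rotate freely.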

Cloud-Native Orchestration

Pipelines run on AWS Lambda (burst) and ECS (sustained). Airflow handles scheduling, dependency management, and SLA alerting. All state stored in managed Postgres.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON: Newline-delimited or nested, schema versioned per run
CSV: Flat file with typed columns
XLS: Excel format for non-technical stakeholders
Parquet: Columnar format for BigQuery, Snowflake, Athena
AWS S3: Direct bucket delivery, compatible with any data lake
Webhook: HTTP POST per record for real-time downstream processing
API: REST endpoint to query your extracted datasets
BigQuery: Streamed directly into your dataset
Snowflake: Stage + COPY INTO workflow
PostgreSQL: Upsert into your existing schema
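As an illustration of the newline-delimited JSON option, the sketch below writes one JSON object per line and stamps each record with a schema version. The `schema_version` field name and the `"v1"` value are assumptions; the actual versioning scheme is agreed during scoping.

```python
import io
import json
from typing import IO, Iterable


def write_ndjson(records: Iterable[dict], stream: IO[str], schema_version: str) -> int:
    """Write records as newline-delimited JSON; return the number written."""
    count = 0
    for record in records:
        line = json.dumps({"schema_version": schema_version, **record}, sort_keys=True)
        stream.write(line + "\n")
        count += 1
    return count


buffer = io.StringIO()  # stands in for a file or an upload stream
written = write_ndjson(
    [
        {"listing_id": "123456789", "price": 4500000},
        {"listing_id": "987650000", "price": 1200000},
    ],
    buffer,
    schema_version="v1",
)
lines = buffer.getvalue().splitlines()
```

Because each line is a complete JSON document, downstream loaders can process the file as a stream without parsing it whole.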
// faq

Common questions.

About compass.com scraping, legality, and pipeline operations.

Is scraping Compass legal?

Scraping publicly available real estate listings and agent profiles is generally permissible under applicable law. DataFlirt targets only public, non-authenticated data. We do not extract personal client data or circumvent authentication walls. Clients should review Compass ToS and consult legal counsel for specific use cases.

How do you handle Next.js dynamic rendering?

Instead of relying purely on fragile DOM parsing, our crawlers intercept and parse the Next.js hydration state (JSON) embedded in the page source. This allows us to extract complete, structured property and agent metadata instantly and reliably.

Can you extract transaction history for a specific property?

Yes. We extract the full historical ledger available on the listing page, including past sales, price reductions, and delistings, complete with dates and associated agents.

How fresh is the listing data?

Pipeline cadences are configurable. We can run daily full-market refreshes or hourly delta updates for specific high-value zip codes to ensure you capture new inventory and price changes immediately.

Do you extract agent contact information?

Yes. We extract public phone numbers, email addresses, office locations, and social media links present on the agent profile pages and listing directories.

Can I request a sample dataset?

Absolutely. We provide a sample run of up to 500 listings or agent profiles as part of the pre-engagement scoping process so you can validate schema fit and data quality.

$ dataflirt scope --new-project --source=compass.com ready

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a one-off market dump or a continuous inventory feed across multiple cities, we scope, build, and operate the pipeline. Tell us what you need.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h