
Crexi property data,
at warehouse scale.

We extract commercial properties for sale and lease, auction schedules, cap rates, zoning details, and broker intelligence from Crexi. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.

Listings extracted: 340K/day
Status updates: 85K/24h
Broker records: 12K/run
Active pipelines: 42
Uptime: 99.98%

Data Dictionary

Every field we extract from crexi.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for For Sale Listings objects from crexi.com. All fields typed and schema-versioned.

property_id, title, property_type, sub_type, price, cap_rate, noi, occupancy, square_footage, lot_size_acres, year_built, zoning, apn, address, city, state, zip
Sample response: for_sale listings (200 OK)
{
  "property_id": "PRP-849201",
  "title": "Downtown Retail Center",
  "property_type": "Retail",
  "price": 4500000.0,
  "cap_rate": 6.5,
  "noi": 292500.0,
  "occupancy": 95.0,
  "apn": "042-192-04-000"
}

Complete list of extractable fields for Lease Listings objects from crexi.com. All fields typed and schema-versioned.

property_id, title, space_available_sqft, min_divisible_sqft, max_contiguous_sqft, lease_rate, lease_type, term, date_available, space_use, floor, condition
Sample response: lease_listings (200 OK)
{
  "property_id": "LSE-392011",
  "space_available_sqft": 12500,
  "min_divisible_sqft": 2500,
  "lease_rate": 24.5,
  "lease_type": "NNN",
  "space_use": "Medical Office",
  "condition": "White Box",
  "date_available": "2026-08-01"
}

Complete list of extractable fields for Auction Properties objects from crexi.com. All fields typed and schema-versioned.

auction_id, property_id, title, starting_bid, reserve_met, auction_start_date, auction_end_date, bidders_count, current_bid, deposit_required, due_diligence_vault_url
Sample response: auction_properties (200 OK)
{
  "auction_id": "AUC-99382",
  "starting_bid": 1500000.0,
  "reserve_met": false,
  "auction_start_date": "2026-09-15T14:00:00Z",
  "auction_end_date": "2026-09-17T14:00:00Z",
  "deposit_required": 50000.0,
  "current_bid": 1650000.0
}

Complete list of extractable fields for Broker Intelligence objects from crexi.com. All fields typed and schema-versioned.

broker_id, name, agency, title, phone_number, email, active_listings_count, total_sales_volume, profile_url, licenses, specialties, regions_served
Sample response: broker_intelligence (200 OK)
{
  "broker_id": "BRK-4829",
  "name": "Sarah Jenkins",
  "agency": "CBRE",
  "active_listings_count": 14,
  "specialties": "['Industrial', 'Logistics']",
  "licenses": "['DRE 01928374']",
  "regions_served": "['Southern California']"
}
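In the broker sample above, list-valued fields (specialties, licenses, regions_served) appear as strings holding Python-style list literals rather than JSON arrays. If a delivered record looks like that, a consumer could decode the fields before loading them. The sketch below is one possible approach using only the standard library. The helper name parse_list_fields is hypothetical, and treating exactly these three fields as stringified lists is an assumption drawn from the sample.

```python
import ast

# Field names come from the sample record; treating them as stringified
# lists is an assumption based on how that sample is serialised.
LIST_FIELDS = ("specialties", "licenses", "regions_served")


def parse_list_fields(record: dict) -> dict:
    """Return a copy of `record` with stringified list fields decoded."""
    parsed = dict(record)
    for field in LIST_FIELDS:
        value = parsed.get(field)
        if not isinstance(value, str):
            continue
        try:
            decoded = ast.literal_eval(value)  # safe: literals only, no code execution
        except (ValueError, SyntaxError):
            continue  # leave strings that are not literals untouched
        if isinstance(decoded, list):
            parsed[field] = decoded
    return parsed


broker = {
    "broker_id": "BRK-4829",
    "specialties": "['Industrial', 'Logistics']",
    "licenses": "['DRE 01928374']",
    "regions_served": "['Southern California']",
}
clean = parse_list_fields(broker)
```

ast.literal_eval is used instead of eval because it only accepts literals, so a malformed or hostile string cannot run code.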

Complete list of extractable fields for Demographics & Traffic objects from crexi.com. All fields typed and schema-versioned.

property_id, radius_miles, population, median_income, average_age, households, projected_growth, traffic_count, walk_score, transit_score
Sample response: demographics & traffic (200 OK)
{
  "property_id": "PRP-849201",
  "radius_miles": 3.0,
  "population": 142890,
  "median_income": 84500.0,
  "projected_growth": 2.4,
  "traffic_count": 34500,
  "walk_score": 82,
  "transit_score": 65
}

Capabilities

Commercial real estate data, structured for analysis

Our Crexi scraper navigates map-based search interfaces, extracts nested financial models, and normalises lease rates across millions of commercial assets.

Asset Class Extraction

Extract specific data models for retail, industrial, office, multifamily, and special purpose properties.

Financial Metrics

Capture Cap Rate, Net Operating Income (NOI), Gross Rent Multiplier (GRM), and occupancy percentages.

Map-Based Scraping

We intercept Crexi map tile APIs to extract properties by polygon, radius, or specific MSA boundaries.

Auction Tracking

Monitor starting bids, auction windows, deposit requirements, and reserve status for distressed assets.

Lease Normalisation

Standardise Triple Net (NNN), Modified Gross (MG), and Full Service Gross (FSG) lease structures.

APN & Zoning Data

Extract Assessor's Parcel Numbers, zoning codes, lot dimensions, and year-built metadata.

Broker Directory Mining

Extract listing agents, brokerage firms, contact numbers, and active listing portfolios.

Demographic Context

Pull 1-mile, 3-mile, and 5-mile radius demographic models and traffic counts attached to listings.

Status Change Detection

Track when properties move from Active to Under Contract, Sold, or Off-Market.
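
The financial metrics listed above are related by simple formulas: cap rate is NOI divided by price, and GRM is price divided by gross annual rent. A consumer could use these to sanity-check extracted records. The sketch below uses the price, noi, and cap_rate values from the For Sale sample record. The function names and the 0.05 percentage-point tolerance are hypothetical choices, not part of the delivered schema.

```python
def cap_rate_pct(noi: float, price: float) -> float:
    """Capitalisation rate as a percentage: NOI / price * 100."""
    if price <= 0:
        raise ValueError("price must be positive")
    return noi / price * 100


def gross_rent_multiplier(price: float, gross_annual_rent: float) -> float:
    """GRM: price divided by gross annual rental income."""
    if gross_annual_rent <= 0:
        raise ValueError("gross_annual_rent must be positive")
    return price / gross_annual_rent


def cap_rate_consistent(record: dict, tolerance: float = 0.05) -> bool:
    """True when the listed cap_rate agrees with NOI / price.

    `tolerance` is in percentage points and is an illustrative default.
    """
    derived = cap_rate_pct(record["noi"], record["price"])
    return abs(derived - record["cap_rate"]) <= tolerance


# Values taken from the For Sale sample record.
listing = {"price": 4500000.0, "cap_rate": 6.5, "noi": 292500.0}
is_consistent = cap_rate_consistent(listing)
```

For the sample, 292,500 / 4,500,000 is 6.5%, which matches the listed cap_rate.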

// engagement pipeline

From target MSA to warehouse record

Brief in. Clean data out.

Define Scope (day 0)

Provide target MSAs, asset classes, or specific brokerages. We design the extraction schema together.

Pipeline Build (days 2–4)

We configure Scrapy crawlers, map API interception, and session management for crexi.com.

Validation & QA (days 4–6)

Schema validation, null-rate checks, and lease rate normalisation before full launch.

Delivery (ongoing)

JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.
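
The null-rate check in the Validation & QA step can be illustrated with a few lines of Python. This is a minimal sketch, not the production check: the function names, the rule that None and empty strings count as null, and the 5% threshold are all assumptions made for the example.

```python
from collections import Counter


def null_rates(records: list[dict], fields: list[str]) -> dict[str, float]:
    """Fraction of records where each field is missing, None, or an empty string."""
    if not records:
        return {field: 0.0 for field in fields}
    nulls = Counter()
    for record in records:
        for field in fields:
            if record.get(field) in (None, ""):
                nulls[field] += 1
    return {field: nulls[field] / len(records) for field in fields}


def failing_fields(rates: dict[str, float], threshold: float = 0.05) -> list[str]:
    """Fields whose null rate exceeds `threshold` (a hypothetical 5% default)."""
    return sorted(field for field, rate in rates.items() if rate > threshold)


# Illustrative batch; IDs and values are made up for the example.
batch = [
    {"property_id": "PRP-1", "price": 100.0, "cap_rate": 6.5},
    {"property_id": "PRP-2", "price": 200.0, "cap_rate": None},
    {"property_id": "PRP-3", "price": 300.0},
    {"property_id": "PRP-4", "price": 400.0, "cap_rate": 7.0},
]
rates = null_rates(batch, ["price", "cap_rate"])
blocked = failing_fields(rates)
```

A run could be held back from delivery whenever failing_fields returns a non-empty list.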

Under the hood

Overcoming Crexi extraction challenges

Commercial real estate platforms use complex map interfaces and gated interactions. Here is how we extract the underlying data.

Pipeline monitor (crexi.com, live, active)
Identity rotation: TLS fingerprint randomised, user-agent rotated, residential IP pool, 0 challenges blocked
Page coverage: 48,291 pages queued (running)
Pipeline health: 99.9% uptime, 142 ms p99 latency, 0.3% null rate, 2 alerts

Map Interface
GeoJSON API Interception

Crexi limits traditional pagination in favour of map-based browsing. We intercept the backend GraphQL and REST APIs driving the map layers, allowing us to query by specific bounding boxes and extract thousands of points without manual zooming.
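
Querying by bounding box usually means splitting a large area into smaller tiles so that each request covers a manageable number of listings. The sketch below shows only that geometry step, under stated assumptions: the BBox type, the tile_bbox helper, and the coordinates are illustrative, and the request format of the map endpoints is not shown in this document and is not modelled here.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Bounding box in decimal degrees (hypothetical helper type)."""
    west: float
    south: float
    east: float
    north: float


def tile_bbox(bbox: BBox, rows: int, cols: int) -> list[BBox]:
    """Split `bbox` into a rows x cols grid of equal-sized tiles."""
    if rows < 1 or cols < 1:
        raise ValueError("rows and cols must be at least 1")
    lon_step = (bbox.east - bbox.west) / cols
    lat_step = (bbox.north - bbox.south) / rows
    tiles = []
    for r in range(rows):
        for c in range(cols):
            tiles.append(
                BBox(
                    west=bbox.west + c * lon_step,
                    south=bbox.south + r * lat_step,
                    east=bbox.west + (c + 1) * lon_step,
                    north=bbox.south + (r + 1) * lat_step,
                )
            )
    return tiles


# Illustrative one-degree square; not a real MSA boundary.
area = BBox(west=-118.0, south=33.0, east=-117.0, north=34.0)
tiles = tile_bbox(area, rows=2, cols=2)
```

If a tile returns a capped number of results, one option is to call tile_bbox again on that tile alone, which gives the same effect as zooming in.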

Dynamic Content
React Hydration Parsing

Property details are rendered client-side via React. We parse the underlying Next.js hydration state (JSON embedded in the DOM) to extract cap rates and NOI directly, bypassing the need for fragile CSS selectors.
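
Next.js pages rendered with the Pages Router embed their hydration state as JSON inside a script tag with the id __NEXT_DATA__. The sketch below shows how such a payload could be read with the Python standard library. The sample HTML and the listing key inside pageProps are hypothetical; the actual structure of a Crexi page is not shown in this document.

```python
import json
from html.parser import HTMLParser


class NextDataParser(HTMLParser):
    """Collects the text inside <script id="__NEXT_DATA__">."""

    def __init__(self):
        super().__init__()
        self._capturing = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("id") == "__NEXT_DATA__":
            self._capturing = True

    def handle_endtag(self, tag):
        if tag == "script":
            self._capturing = False

    def handle_data(self, data):
        if self._capturing:
            self._chunks.append(data)

    def payload(self):
        text = "".join(self._chunks).strip()
        return json.loads(text) if text else None


def extract_next_data(html: str):
    """Return the parsed hydration JSON, or None if the tag is absent."""
    parser = NextDataParser()
    parser.feed(html)
    parser.close()
    return parser.payload()


# Made-up page; the "listing" key is an assumption for illustration.
sample_html = (
    '<html><body><div id="__next"></div>'
    '<script id="__NEXT_DATA__" type="application/json">'
    '{"props": {"pageProps": {"listing": {"cap_rate": 6.5, "noi": 292500.0}}}}'
    "</script></body></html>"
)
state = extract_next_data(sample_html)
listing = state["props"]["pageProps"]["listing"]
```

Reading the embedded JSON avoids depending on class names or DOM layout, which tend to change more often than the data model.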

Contact Gating
Automated Interaction Flows

Broker contact details and offering memorandums often require a click to reveal. Our Playwright nodes simulate user interaction patterns to expose phone numbers and email addresses while managing session cookies.

Data Normalisation
Standardising Lease Types

Lease terms on Crexi are highly variable. We parse and normalise text strings to categorise leases into NNN, MG, or FSG, and convert monthly vs annual rates into a unified annualised metric.
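
As a rough illustration of this normalisation, the sketch below maps free-text lease descriptions to NNN, MG, or FSG and converts monthly rates to annual ones. The keyword patterns and function names are hypothetical, and real listings would likely need a broader pattern set.

```python
import re
from typing import Optional

# Hypothetical keyword patterns, checked in order.
LEASE_PATTERNS = [
    ("NNN", re.compile(r"\b(nnn|triple\s+net)\b", re.IGNORECASE)),
    ("MG", re.compile(r"\b(mg|modified\s+gross)\b", re.IGNORECASE)),
    ("FSG", re.compile(r"\b(fsg|full[\s-]+service(\s+gross)?)\b", re.IGNORECASE)),
]


def classify_lease(text: str) -> Optional[str]:
    """Return "NNN", "MG", or "FSG", or None when no pattern matches."""
    for label, pattern in LEASE_PATTERNS:
        if pattern.search(text):
            return label
    return None


def annualise_rate(rate: float, period: str) -> float:
    """Convert a per-square-foot rate to an annual figure."""
    period = period.strip().lower()
    if period in ("year", "yr", "annual"):
        return rate
    if period in ("month", "mo", "monthly"):
        return rate * 12
    raise ValueError(f"unknown period: {period!r}")


lease_type = classify_lease("Triple Net lease, tenant pays taxes")
annual_rate = annualise_rate(2.0, "month")
```

Returning None for unmatched text keeps unknown lease structures visible for review instead of forcing them into one of the three buckets.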

State Management
Tracking Off-Market Assets

Crexi removes sold properties from primary search results. We maintain historical hashes of all seen APNs and query them directly to determine if an asset has sold, expired, or been delisted.
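
One way to picture this state tracking is to fingerprint each record and compare consecutive runs. The sketch below is illustrative only: keying on property_id, the status field, the set of tracked fields, and the function names are assumptions and do not describe the production schema.

```python
import hashlib
import json

# Hypothetical choice of fields whose changes should be detected.
TRACKED_FIELDS = ("price", "status")


def fingerprint(record: dict) -> str:
    """Stable SHA-256 hash over the tracked fields of a record."""
    subset = {field: record.get(field) for field in TRACKED_FIELDS}
    canonical = json.dumps(subset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_runs(previous: dict[str, str], current: list[dict]) -> dict[str, list[str]]:
    """Compare stored fingerprints with the current run, keyed by property_id."""
    current_hashes = {r["property_id"]: fingerprint(r) for r in current}
    return {
        "new": sorted(set(current_hashes) - set(previous)),
        "changed": sorted(
            pid for pid, digest in current_hashes.items()
            if pid in previous and previous[pid] != digest
        ),
        # No longer in search results: candidates for a direct lookup.
        "missing": sorted(set(previous) - set(current_hashes)),
    }


# Made-up records for two consecutive runs.
last_run = [
    {"property_id": "PRP-1", "price": 100.0, "status": "Active"},
    {"property_id": "PRP-2", "price": 200.0, "status": "Active"},
]
this_run = [
    {"property_id": "PRP-1", "price": 90.0, "status": "Active"},
    {"property_id": "PRP-3", "price": 300.0, "status": "Active"},
]
stored = {r["property_id"]: fingerprint(r) for r in last_run}
result = diff_runs(stored, this_run)
```

IDs in the missing bucket are the ones that would then be queried directly to tell sold, expired, and delisted assets apart.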

Applications

Who uses Crexi data

Teams across industries use crexi.com data to build competitive products and smarter operations.

01 Investment Analysis

Private equity firms monitor cap rates and NOI across specific MSAs to identify mispriced assets and yield opportunities.

02 Brokerage Market Share

National brokerages track competitor listing volume, time-on-market, and asset class dominance by region.

03 Appraisal & Valuation

Commercial appraisers build automated valuation models (AVMs) using active listing prices, APN data, and zoning codes.

04 Lead Generation

Lenders, title companies, and contractors extract broker contact details and new listings to pitch commercial services.

05 Retail Site Selection

Corporate expansion teams analyse traffic counts, demographic models, and lease rates to determine new store locations.

06 PropTech Data Enrichment

Real estate software platforms backfill their databases with active inventory and historical auction data.

Why DataFlirt

"Crexi centralises the commercial real estate market, but extracting structured financial models and zoning data requires bypassing complex map-layer APIs."

Most teams underestimate the investment involved. Reliable Crexi scraping means intercepting map-bound API responses, parsing React hydration state, and managing session cookies to reveal broker details. DataFlirt absorbs that complexity so your engineers can focus on yield analysis, not infrastructure.

Technical Spec

Crexi scraper - technical capabilities

Everything supported by our crexi.com scraper — rendered SPA elements, auth walls, rate-limit evasion and beyond.

Map API interception: Extract listings via bounding box or polygon coordinates (Supported)
React state parsing: Direct extraction from Next.js hydration objects (Supported)
Broker contact extraction: Automated interaction to reveal hidden phone numbers and emails (Supported)
Auction bid tracking: Capture current bid amounts and reserve status during live auctions (Supported)
APN extraction: Capture Assessor's Parcel Numbers for cross-referencing public records (Supported)
Change detection: Hash-based diffing to track price drops and status changes (Supported)
Webhook delivery: HTTP POST per listing for real-time CRM ingestion (Supported)
Gated Due Diligence Vaults: Requires executed NDA and verified user authentication (Partial)
PRO-only Comp Data: Historical sales comps locked behind paid Crexi PRO subscription (Partial)

Infrastructure

Infrastructure powering the Crexi pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus

Scrapy + Playwright Stack

Scrapy handles crawl orchestration and map API pagination. Playwright handles JavaScript rendering and interaction flows for gated contact details.

Residential Proxy Infrastructure

We maintain pools of residential ISP proxies across US regions to prevent IP bans while querying Crexi backend APIs at high concurrency.

Cloud-Native Orchestration

Pipelines run on AWS Lambda and ECS. Airflow handles scheduling, dependency management, and SLA alerting. All state is stored in managed Postgres.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON: Newline-delimited or nested, schema-versioned per run
CSV: Flat file with typed columns for financial modelling
XLS: Excel format for immediate analyst consumption
Parquet: Columnar format for BigQuery, Snowflake, Athena
AWS S3: Direct bucket delivery, compatible with any data lake
Webhook: HTTP POST per record for real-time CRM ingestion
API: REST endpoint to query your extracted Crexi dataset
Snowflake: Stage + COPY INTO workflow for enterprise warehouses

// faq

Common questions.

About crexi.com scraping, legality, and pipeline operations.

Is scraping Crexi legal?

Scraping publicly available real estate listings is generally permissible. DataFlirt targets only public, non-authenticated property data, broker details, and auction schedules. We do not bypass NDA walls or extract PRO-only comp data. Clients should review Crexi ToS and consult legal counsel for specific use cases.

How do you extract data from the map interface?

We intercept the underlying API requests that populate the map tiles. By supplying specific polygon coordinates or bounding boxes, we can extract all properties within a target MSA without relying on brittle UI automation.

Can you extract broker contact information?

Yes. While phone numbers and emails are often hidden behind a 'click to reveal' button, our pipeline uses Playwright to simulate the necessary interactions and extract the contact details into the structured payload.

Do you track property status changes over time?

Yes. We maintain a state database of all seen APNs and property IDs. Subsequent pipeline runs check these IDs to determine if a property has dropped in price, gone under contract, or been removed from the market.

Can you download Offering Memorandums (OMs)?

We can extract public flyers and brochures attached to listings. However, access to the Due Diligence Vault, which contains detailed OMs and rent rolls, typically requires an executed NDA and manual approval by the listing broker, which we do not automate.

How frequently can the data be updated?

For targeted MSAs or specific asset classes, we can run daily or hourly pipelines. Full national sweeps of all active inventory typically run on a weekly cadence to manage compute costs and avoid rate limits.

$ dataflirt scope --new-project --source=crexi.com ready

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a one-off export of industrial assets in Texas or a continuous feed of national multifamily listings, we scope, build, and operate the pipeline.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h