
Fannie Mae REO data,
at warehouse scale.

We extract distressed property listings, First Look status, pricing signals, and listing agent details from Homepath. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.

Properties extracted
84.2K /run
Price updates
12.4K /24h
First Look properties
18.9K /run
Active pipelines
42
Uptime
99.98%
Data Dictionary

Every field we extract from homepath.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Property Listings objects from homepath.com. All fields typed and schema-versioned.

mls_id, address, city, state, zip_code, beds, baths, sqft, lot_size_acres, year_built, property_type, listing_status, days_on_market, first_look_eligible, latitude, longitude

property_listings · 200 OK
{
  "mls_id": "RX-10892341",
  "address": "142 Maple St",
  "city": "Orlando",
  "state": "FL",
  "zip_code": "32801",
  "beds": 3,
  "baths": 2,
  "sqft": 1850,
  "first_look_eligible": true,
  "property_type": "Single Family"
}

Complete list of extractable fields for Pricing & Financials objects from homepath.com. All fields typed and schema-versioned.

mls_id, current_price, original_price, price_drop_amount, price_drop_pct, hoa_fees, estimated_taxes, tax_year, earnest_money_required, financing_types_allowed, price_per_sqft

pricing & financials · 200 OK
{
  "mls_id": "RX-10892341",
  "current_price": 185000.0,
  "original_price": 210000.0,
  "price_drop_amount": 25000.0,
  "hoa_fees": 0.0,
  "estimated_taxes": 2450.0,
  "earnest_money_required": 1000.0
}
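
The derived pricing fields follow from the two list prices. The sketch below is illustrative only: it assumes price_drop_pct is a percentage of original_price (the page does not define it), and `price_drop` is a hypothetical helper name, not part of any delivered schema or tooling.

```python
def price_drop(original_price: float, current_price: float) -> tuple[float, float]:
    """Return (price_drop_amount, price_drop_pct) for one listing."""
    amount = original_price - current_price
    # Assumption: the percentage is measured against the original list price.
    pct = round(amount / original_price * 100, 2) if original_price else 0.0
    return amount, pct


# Values taken from the sample payload above.
amount, pct = price_drop(210000.0, 185000.0)
print(amount, pct)  # 25000.0 11.9
```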

Complete list of extractable fields for First Look & Status objects from homepath.com. All fields typed and schema-versioned.

mls_id, first_look_eligible, first_look_end_date, listing_status, foreclosure_status, auction_date, auction_venue, property_condition, occupancy_status, reo_id

first look & status · 200 OK
{
  "mls_id": "RX-10892341",
  "first_look_eligible": true,
  "first_look_end_date": "2024-05-12T23:59:59Z",
  "listing_status": "Active",
  "foreclosure_status": "REO",
  "property_condition": "Needs Repair",
  "occupancy_status": "Vacant"
}

Complete list of extractable fields for Agent & Broker Data objects from homepath.com. All fields typed and schema-versioned.

mls_id, listing_agent_name, agent_phone, agent_email, brokerage_name, brokerage_address, brokerage_phone, agent_license_number, broker_license_number

agent & broker data · 200 OK
{
  "mls_id": "RX-10892341",
  "listing_agent_name": "Sarah Jenkins",
  "agent_phone": "555-019-8472",
  "brokerage_name": "Premier REO Brokers",
  "agent_license_number": "RE-99482",
  "brokerage_phone": "555-019-8000",
  "agent_email": "sarah@premier-reo.com"
}

Complete list of extractable fields for Property Features objects from homepath.com. All fields typed and schema-versioned.

mls_id, heating_type, cooling_type, parking_spaces, garage_type, basement_type, roof_type, exterior_material, flooring_type, appliances_included

property_features · 200 OK
{
  "mls_id": "RX-10892341",
  "heating_type": "Forced Air",
  "cooling_type": "Central",
  "parking_spaces": 2,
  "garage_type": "Attached",
  "basement_type": "Unfinished",
  "roof_type": "Asphalt Shingles"
}

Capabilities

Everything you need from Homepath — nothing you don't

Our Homepath scraper handles every layer of the platform: property details, First Look countdowns, pricing history, and agent directories — with geospatial subdivision and anti-bot circumvention built in.

Full Property Data Extraction

Beds, baths, square footage, lot size, year built, and structural features extracted for every REO listing.

First Look Tracking

Monitor expiration dates for Fannie Mae's First Look program to time investor offers perfectly.

Price Reduction Monitoring

Track current price, original list price, and price drop percentages across the REO lifecycle.

Agent Directory Extraction

Capture listing agent names, phone numbers, emails, and brokerage details for REO networking.

HOA & Tax Data

Extract carrying costs including estimated annual taxes and monthly HOA fees.

Geospatial Filtering

Define extraction boundaries by zip code, county, state, or custom latitude/longitude bounding boxes.

Image & Media Scraping

Extract high-resolution property photos and floorplan image URLs for offline analysis.

Auction Detail Capture

Track auction dates, venues, starting bids, and property condition flags.

Scheduled Diffs

Run daily sweeps that only emit records for listings that changed status or price since the last run.

// engagement pipeline

From target counties to warehouse records

Brief in. Clean data out.

Define Scope
Day 0

Provide target states, counties, or zip codes. We design the extraction schema and frequency together.

Pipeline Build
Days 2–4

We configure Scrapy / Playwright crawlers, proxy rotation, and geospatial subdivision logic for homepath.com.

Validation & QA
Days 4–6

Schema validation, null-rate checks, and coordinate verification before full launch.

Delivery
ongoing

JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.

Under the hood

How our Homepath pipeline handles the hard parts

Real estate portals invest heavily in scraping detection. Here's how we stay resilient — and why teams choose managed infrastructure over DIY.

pipeline-monitor · homepath.com · live · active
// fingerprinting
Identity rotation
TLS fingerprint: randomised
User-agent: rotated
IP pool: residential
Challenges blocked: 0
// pagination
Page coverage
48,291 pages queued · running
// observability
Pipeline health
Uptime: 99.9%
p99 latency: 142ms
Null rate: 0.3%
Alerts: 2

Anti-bot layer
Residential proxy rotation + fingerprint spoofing

Property portals use strict WAFs to block datacenter IPs. Our crawlers use US-based residential ISP proxies with realistic browser fingerprints and full cookie session management.

Geospatial subdivision
Bypassing map pagination limits

Homepath limits search results per map view. We programmatically subdivide large geographic areas into smaller bounding boxes to ensure 100% listing capture without hitting pagination caps.
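
One way to picture the subdivision is a recursive quadrant split. This is a minimal sketch, not the production crawler: `count_results` is a hypothetical callback standing in for whatever request reports how many listings a map view contains, and the cap and coordinates are made-up illustration values.

```python
from typing import Callable, Iterator

# (min_lat, min_lon, max_lat, max_lon)
BBox = tuple[float, float, float, float]


def split_into_quadrants(box: BBox) -> list[BBox]:
    """Split a bounding box into four equal quadrants."""
    min_lat, min_lon, max_lat, max_lon = box
    mid_lat = (min_lat + max_lat) / 2
    mid_lon = (min_lon + max_lon) / 2
    return [
        (min_lat, min_lon, mid_lat, mid_lon),
        (min_lat, mid_lon, mid_lat, max_lon),
        (mid_lat, min_lon, max_lat, mid_lon),
        (mid_lat, mid_lon, max_lat, max_lon),
    ]


def subdivide(
    box: BBox,
    count_results: Callable[[BBox], int],
    cap: int,
    max_depth: int = 12,
) -> Iterator[BBox]:
    """Yield boxes whose result count is at or below cap (or max_depth is hit)."""
    if count_results(box) <= cap or max_depth == 0:
        yield box
        return
    for quadrant in split_into_quadrants(box):
        yield from subdivide(quadrant, count_results, cap, max_depth - 1)


# Toy stand-in for the portal: count made-up points inside a half-open box.
points = [(28.5, -81.4), (28.6, -81.3), (30.3, -81.7), (25.8, -80.2), (27.9, -82.5)]


def count_in_box(box: BBox) -> int:
    min_lat, min_lon, max_lat, max_lon = box
    return sum(
        min_lat <= lat < max_lat and min_lon <= lon < max_lon for lat, lon in points
    )


boxes = list(subdivide((24.0, -88.0, 32.0, -80.0), count_in_box, cap=2))
```

Empty boxes are yielded too; a real crawl would likely skip them. The half-open comparison in the toy counter keeps a point on a shared edge from being counted twice.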

JavaScript rendering
Full Playwright execution for dynamic maps

Homepath relies heavily on client-side rendering for map clusters and property details. We run full Playwright browser sessions to hydrate the DOM and intercept API payloads directly.
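
As a rough illustration of payload interception, the sketch below uses Playwright's synchronous Python API to collect JSON responses while a page loads. The URL filter is an assumption: the "/search" path fragment is hypothetical, and the real endpoints are not documented on this page.

```python
def looks_like_listing_payload(url: str, content_type: str) -> bool:
    """Hypothetical filter: JSON responses whose URL has an assumed path fragment."""
    return "application/json" in content_type and "/search" in url


def capture_json_payloads(page_url: str) -> list:
    """Load page_url in headless Chromium and return the matching JSON bodies."""
    # Imported lazily so the filter above is usable without Playwright installed.
    from playwright.sync_api import sync_playwright

    captured: list = []

    def on_response(response) -> None:
        content_type = response.headers.get("content-type", "")
        if looks_like_listing_payload(response.url, content_type):
            try:
                captured.append(response.json())
            except Exception:
                pass  # Body unavailable or not valid JSON; skip it.

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.on("response", on_response)
        page.goto(page_url, wait_until="networkidle")
        browser.close()
    return captured
```

Reading the JSON that the page itself requests avoids parsing rendered HTML, which is usually more brittle when the DOM changes.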

Change detection
Only re-scrape what's changed

For daily market sweeps, we maintain a hash index of last-seen values per property. Subsequent runs push only diffs, so a status change from Active to Pending surfaces in the first run after it happens.
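
A hash index of this kind could look roughly like the following. It is a sketch under assumptions: the tracked fields are picked from the data dictionary above for illustration, and a plain dict stands in for whatever store actually holds the index.

```python
import hashlib
import json

# Assumption: only these fields decide whether a listing counts as "changed".
TRACKED_FIELDS = ("listing_status", "current_price")


def fingerprint(record: dict) -> str:
    """Hash the tracked fields of one record into a stable digest."""
    tracked = {field: record.get(field) for field in TRACKED_FIELDS}
    payload = json.dumps(tracked, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def diff_run(records: list[dict], index: dict[str, str]) -> list[dict]:
    """Return new or changed records and update the index in place."""
    changed = []
    for record in records:
        digest = fingerprint(record)
        if index.get(record["mls_id"]) != digest:
            changed.append(record)
            index[record["mls_id"]] = digest
    return changed


index: dict[str, str] = {}
active = {"mls_id": "RX-10892341", "listing_status": "Active", "current_price": 185000.0}
pending = {**active, "listing_status": "Pending"}

first = diff_run([active], index)    # new listing, emitted
second = diff_run([active], index)   # unchanged, nothing emitted
third = diff_run([pending], index)   # status changed, emitted
```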

Monitoring & alerting
24/7 pipeline health

Every run emits structured logs to our observability stack. We alert on null-rate spikes, missing First Look dates, and DOM structure changes — fixing selectors before you notice.
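
A null-rate check of the sort described above can be sketched in a few lines. The 0.25 threshold and the sample batch are made-up illustration values, not the pipeline's real alerting configuration.

```python
def null_rates(records: list[dict], fields: list[str]) -> dict[str, float]:
    """Return the fraction of records where each field is missing or None."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for record in records if record.get(field) is None) / total
        for field in fields
    }


def fields_over_threshold(rates: dict[str, float], threshold: float) -> list[str]:
    """Return the fields whose null rate exceeds the threshold, sorted by name."""
    return sorted(field for field, rate in rates.items() if rate > threshold)


# Hypothetical batch: two of four records lack a First Look end date.
batch = [
    {"mls_id": "A1", "first_look_end_date": "2024-05-12T23:59:59Z", "current_price": 185000.0},
    {"mls_id": "A2", "first_look_end_date": None, "current_price": 99000.0},
    {"mls_id": "A3", "current_price": 120000.0},
    {"mls_id": "A4", "first_look_end_date": "2024-06-01T23:59:59Z", "current_price": 150000.0},
]

rates = null_rates(batch, ["first_look_end_date", "current_price"])
alerts = fields_over_threshold(rates, threshold=0.25)
print(alerts)  # ['first_look_end_date']
```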

Applications

Who uses Homepath data — and how

Teams across industries use homepath.com data to build competitive products and smarter operations.

01
Real Estate Investment

Flippers and investors track First Look expiration dates to submit offers the moment properties open to non-owner occupants.

02
Institutional Buying

REITs and funds aggregate REO inventory data across multiple states to identify acquisition targets at scale.

03
Broker Lead Generation

B2B service providers extract active REO listing agents to build targeted marketing lists for staging, repair, or title services.

04
PropTech Valuation Models

AVM platforms ingest distressed property pricing and foreclosure status to refine neighborhood valuation algorithms.

05
Market Analysis

Analysts track REO inventory volume, days on market, and price reduction velocity by county to gauge localized housing market distress.

06
Title & Escrow Services

Settlement service providers monitor pending REO transactions to identify upcoming volume in specific jurisdictions.

Why DataFlirt

"Homepath contains the definitive inventory of Fannie Mae distressed assets, but tracking First Look expirations across thousands of counties requires automated infrastructure."

Most investment firms rely on manual portal checks or delayed MLS feeds to track REO inventory. DataFlirt automates the extraction of Homepath listings, First Look countdowns, and price reductions, delivering clean property datasets directly to your warehouse so your acquisition team can act first.

Technical Spec

Homepath scraper — technical capabilities

Everything supported by our homepath.com scraper: rendered SPA elements, rate-limit handling, and more. Features that require an authenticated Homepath session are marked Partial.

Feature | Description | Status
Geospatial polygon search | Extract properties strictly within custom latitude/longitude bounding boxes | Supported
First Look expiration tracking | Capture exact expiration timestamps for owner-occupant priority periods | Supported
High-res image extraction | Extract all property photo URLs at maximum available resolution | Supported
Agent contact info scraping | Extract listing agent phone numbers, emails, and brokerages | Supported
Historical status changes | Track transitions from Active to Pending to Sold | Supported
Daily price diffing | Emit records only when list prices change | Supported
Document download | Extract public disclosures or property condition reports attached to listings | Supported
Automated offer submission | Submitting bids via the Homepath portal requires authenticated user sessions | Partial
Saved search alert configuration | Configuring internal portal alerts requires user authentication | Partial

Infrastructure

Infrastructure powering the Homepath pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, PostGIS, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus

Map-Based Crawling

We intercept internal JSON payloads from map tile requests, bypassing frontend rendering limits and capturing raw coordinate data.

Residential Proxy Infrastructure

We maintain pools of US-based residential ISP proxies to bypass real estate portal WAFs. Rotation happens per-request with sticky sessions where required.

Cloud-Native Orchestration

Pipelines run on AWS Lambda and ECS. Airflow handles scheduling for daily market sweeps, dependency management, and SLA alerting.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON
Newline-delimited or nested — schema versioned per run
CSV
Flat file with typed columns — Excel/Sheets compatible
XLS
Excel format for non-technical acquisition teams
Parquet
Columnar format for BigQuery, Snowflake, Athena
AWS S3
Direct bucket delivery — compatible with any data lake
Webhook
HTTP POST per record for real-time downstream processing
API
Queryable REST endpoint for your internal tools
PostgreSQL
Upsert into your existing schema with PostGIS support
// faq

Common questions.

About homepath.com scraping, legality, and pipeline operations.

Is scraping Homepath legal?

Scraping publicly available real estate listings is generally permissible. DataFlirt targets only public, non-authenticated property data, pricing, and agent contact info. We do not circumvent authentication walls or submit automated offers. Clients should review Homepath's ToS and consult legal counsel for specific use cases.

How do you handle map pagination limits?

Homepath limits search results on map views. We bypass this by programmatically subdividing large geographic areas (like entire states) into smaller latitude/longitude bounding boxes, ensuring the result count per box stays under the pagination cap.

Can you track First Look expiration dates?

Yes. We extract the exact First Look expiration date for every eligible property, allowing your acquisition team to time their offers precisely when the property opens to investors.
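
On the consuming side, the time left in a First Look window can be computed from the delivered fields. This is a minimal sketch, assuming first_look_end_date is an ISO 8601 UTC timestamp as in the sample payload; `first_look_remaining` is a hypothetical helper name.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(timestamp.replace("Z", "+00:00"))


def first_look_remaining(record: dict, now: datetime) -> Optional[timedelta]:
    """Return time left in the First Look window, or None if not applicable."""
    if not record.get("first_look_eligible") or not record.get("first_look_end_date"):
        return None
    remaining = parse_utc(record["first_look_end_date"]) - now
    # Clamp to zero once the window has closed.
    return max(remaining, timedelta(0))


record = {
    "mls_id": "RX-10892341",
    "first_look_eligible": True,
    "first_look_end_date": "2024-05-12T23:59:59Z",
}
now = datetime(2024, 5, 10, 12, 0, 0, tzinfo=timezone.utc)
print(first_look_remaining(record, now))  # 2 days, 11:59:59
```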

How fresh is the data?

Most clients opt for daily sweeps of their target markets. We can configure pipelines to run hourly for specific high-priority zip codes, so new REO listings are picked up within an hour of going live.

Do you extract listing agent contact information?

Yes. We capture the listing agent's name, phone number, email address, brokerage name, and license numbers where publicly available on the listing page.

What delivery formats do you support?

We deliver data in JSON, CSV, XLS, or Parquet. We can push directly to AWS S3, Google Cloud Storage, BigQuery, Snowflake, or trigger Webhooks for real-time ingestion.

Can I request a sample dataset before committing?

Absolutely. We provide a sample run of up to 500 properties in your target market as part of the pre-engagement scoping process so you can validate schema fit and data quality.

$ dataflirt scope --new-project --source=homepath.com ready

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need daily sweeps of Fannie Mae inventory across three states or a national pipeline tracking First Look expirations — we build and operate the infrastructure.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h