We extract sales and lettings listings, floorplans, agent directories, and 'Only With Us' early properties from OnTheMarket. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.
Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.
Complete list of extractable fields for Property Listings objects from onthemarket.com. All fields typed and schema-versioned.
{ "property_id": "13482910", "title": "3 bedroom semi-detached house for sale", "price": 425000, "price_qualifier": "Offers in excess of", "bedrooms": 3, "epc_rating": "C", "tenure": "Freehold", "only_with_us": true }
| # | property_id | title | price | price_qualifier | property_type | bedrooms |
|---|---|---|---|---|---|---|
Complete list of extractable fields for Agent Directory objects from onthemarket.com. All fields typed and schema-versioned.
{ "agent_id": "AG-7482", "branch_name": "Dexters London Bridge", "company_name": "Dexters", "postcode": "SE1 9SG", "phone_number": "020 7483 9281", "properties_for_sale": 142, "properties_to_rent": 89 }
| # | agent_id | branch_name | company_name | address | postcode | phone_number |
|---|---|---|---|---|---|---|
Complete list of extractable fields for Pricing & History objects from onthemarket.com. All fields typed and schema-versioned.
{ "property_id": "13482910", "current_price": 425000, "original_price": 450000, "price_reduced_date": "2023-11-14", "price_reduction_pct": 5.6, "last_sale_date": "2018-06-22", "last_sale_price": 385000 }
| # | property_id | current_price | original_price | price_reduced_date | price_reduction_pct | historical_sold_prices |
|---|---|---|---|---|---|---|
Complete list of extractable fields for Features & Media objects from onthemarket.com. All fields typed and schema-versioned.
{ "property_id": "13482910", "floorplan_url": "https://media.onthemarket.com/floorplans/13482910.pdf", "garden": true, "parking": "Off-street", "broadband_speed_mbps": 1000, "council_tax_band": "D", "nearest_station_1": "London Bridge", "nearest_station_distance": "0.4 miles" }
| # | property_id | floorplan_url | virtual_tour_url | epc_certificate_url | garden | parking |
|---|---|---|---|---|---|---|
Complete list of extractable fields for New Developments objects from onthemarket.com. All fields typed and schema-versioned.
{ "development_id": "DEV-9921", "developer_name": "Barratt Homes", "site_name": "Riverside Quarter", "units_available": 14, "starting_price": 350000, "max_price": 850000, "show_home_status": "Open daily" }
| # | development_id | developer_name | site_name | location | units_available | completion_date |
|---|---|---|---|---|---|---|
Our OnTheMarket scraper captures every layer of the portal: residential sales, lettings, agent directories, and exclusive early listings — bypassing bot protection and pagination limits automatically.
Extract full property metadata including price, bedrooms, bathrooms, tenure, description, and agent details across all UK regions.
Identify and track properties listed exclusively on OnTheMarket 24 hours before they syndicate to Rightmove or Zoopla.
Scrape the full agent directory to monitor market share, branch locations, and total stock volume per agency.
Capture URLs for high-resolution images, PDF floorplans, EPC certificates, and virtual tour links.
Extract precise coordinates, nearest railway stations, distance metrics, and local broadband speed estimates.
Track original listing prices against current prices, capturing reduction dates and percentage drops.
Monitor new residential sites, tracking developer names, phase completions, and unit pricing bands.
Extract Land Registry sold price history associated with specific postcodes and property records.
Run continuous pipelines that output only new listings, removed listings, or price adjustments to minimise processing overhead.
Brief in. Clean data out.
Provide target regions, postcodes, or agent IDs. We configure the extraction schema and frequency.
We deploy Scrapy crawlers with UK residential proxies and automated CAPTCHA solvers to bypass portal defences.
Schema validation, coordinate normalisation, and null-rate checks run before production deployment.
Clean JSON, CSV, or Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on schedule.
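The null-rate check in the validation step can be illustrated with a minimal Python sketch. The function names, the sample rows, and the per-field thresholds below are assumptions made for this example, not the production configuration.

```python
from typing import Dict, Iterable, List


def null_rates(rows: Iterable[dict], fields: List[str]) -> Dict[str, float]:
    """Share of rows where each field is missing, None, or an empty string."""
    rows = list(rows)
    if not rows:
        raise ValueError("cannot compute null rates for an empty batch")
    return {
        field: sum(1 for row in rows if row.get(field) in (None, "")) / len(rows)
        for field in fields
    }


def failing_fields(rates: Dict[str, float], max_rates: Dict[str, float]) -> List[str]:
    """Fields whose null rate exceeds its allowed maximum (hypothetical thresholds)."""
    return sorted(f for f, rate in rates.items() if rate > max_rates.get(f, 0.0))


# Illustrative batch using field names from the Property Listings schema.
batch = [
    {"property_id": "1", "price": 425000, "epc_rating": "C"},
    {"property_id": "2", "price": 310000, "epc_rating": None},
    {"property_id": "3", "price": None, "epc_rating": ""},
    {"property_id": "4", "price": 289950},
]
rates = null_rates(batch, ["property_id", "price", "epc_rating"])
blocked = failing_fields(rates, {"property_id": 0.0, "price": 0.0, "epc_rating": 0.8})
print(rates, blocked)
```

In this sketch a batch with any missing `price` would be held back, while `epc_rating` is allowed a high null rate because many listings do not publish one. Where to set each threshold is a per-client decision.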
UK property portals employ aggressive bot mitigation and pagination limits. Here is how our infrastructure maintains continuous extraction.
OnTheMarket uses advanced TLS fingerprinting and challenge pages. We utilise UK-based residential proxies and Playwright sessions with spoofed hardware concurrency and canvas fingerprints to maintain high success rates.
Search results are capped at 42 pages (approx 1,000 results). To extract entire regions, our pipeline dynamically splits large search areas into smaller geographic polygons, ensuring zero missed properties.
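The splitting idea can be sketched in Python as a recursive subdivision. This is a simplified illustration: it uses rectangular bounding boxes rather than arbitrary polygons, `count_results` is a hypothetical callable that returns the number of results the portal reports for a tile, and the cap of 1,000 is the approximate figure quoted above.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class BBox:
    """Axis-aligned search area in degrees (hypothetical representation)."""
    west: float
    south: float
    east: float
    north: float

    def quadrants(self) -> List["BBox"]:
        """Split the box into four equal tiles around its midpoint."""
        mid_lon = (self.west + self.east) / 2
        mid_lat = (self.south + self.north) / 2
        return [
            BBox(self.west, self.south, mid_lon, mid_lat),  # south-west
            BBox(mid_lon, self.south, self.east, mid_lat),  # south-east
            BBox(self.west, mid_lat, mid_lon, self.north),  # north-west
            BBox(mid_lon, mid_lat, self.east, self.north),  # north-east
        ]


def split_until_under_cap(
    area: BBox,
    count_results: Callable[[BBox], int],
    cap: int = 1000,
    max_depth: int = 12,
) -> List[BBox]:
    """Return tiles whose reported result count is at or below the cap.

    max_depth stops the recursion if a single point holds more than `cap`
    listings; such a tile is returned as-is and needs another strategy.
    """
    if count_results(area) <= cap or max_depth == 0:
        return [area]
    tiles: List[BBox] = []
    for quadrant in area.quadrants():
        tiles.extend(split_until_under_cap(quadrant, count_results, cap, max_depth - 1))
    return tiles
```

Each returned tile can then be paginated in full. Listings that sit on a tile boundary may appear in two tiles, so results would still need de-duplicating on `property_id`.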
Precise latitude and longitude coordinates are often obfuscated or require map interaction. We execute the required JavaScript payloads to extract exact location data for spatial analysis.
Property detail pages frequently undergo A/B testing. We employ fallback selector chains targeting embedded JSON-LD and Next.js state objects rather than relying solely on brittle CSS classes.
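A fallback chain of this kind can be sketched with the Python standard library alone. The script id `__NEXT_DATA__` is the standard Next.js state container, but the JSON paths in `CANDIDATE_PATHS` are hypothetical: the real page structure is not documented here and would need to be confirmed against live pages.

```python
import json
from html.parser import HTMLParser
from typing import Any, List, Optional


class ScriptCollector(HTMLParser):
    """Collect the text of JSON-LD and __NEXT_DATA__ script tags."""

    def __init__(self) -> None:
        super().__init__()
        self.json_ld: List[str] = []
        self.next_data: List[str] = []
        self._target: Optional[List[str]] = None

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        attr = dict(attrs)
        if attr.get("type") == "application/ld+json":
            self._target = self.json_ld
        elif attr.get("id") == "__NEXT_DATA__":
            self._target = self.next_data
        if self._target is not None:
            self._target.append("")

    def handle_data(self, data):
        if self._target is not None:
            self._target[-1] += data

    def handle_endtag(self, tag):
        if tag == "script":
            self._target = None


def dig(obj: Any, *keys: str) -> Any:
    """Follow nested dict keys, returning None as soon as one is missing."""
    for key in keys:
        if not isinstance(obj, dict):
            return None
        obj = obj.get(key)
    return obj


# Ordered by preference; both paths are assumptions for illustration.
CANDIDATE_PATHS = [
    ("json_ld", ("offers", "price")),
    ("next_data", ("props", "pageProps", "property", "price")),
]


def extract_price(html: str) -> Optional[int]:
    """Try each embedded JSON source in turn; return None if all fail."""
    parser = ScriptCollector()
    parser.feed(html)
    parser.close()
    sources = {"json_ld": parser.json_ld, "next_data": parser.next_data}
    for source, path in CANDIDATE_PATHS:
        for blob in sources[source]:
            try:
                return int(float(dig(json.loads(blob), *path)))
            except (json.JSONDecodeError, TypeError, ValueError):
                continue  # malformed blob or missing field: try the next one
    return None
```

Returning `None` rather than raising lets a caller fall through to a CSS-selector strategy as the last link in the chain, and makes a sudden rise in `None` results easy to alert on.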
For daily market monitoring, we maintain state across runs. The pipeline outputs only new instructions (newly listed properties), price changes, and properties marked as sold or let, which keeps downstream ingestion volumes and costs low.
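A delta step of this kind can be sketched as a comparison of two snapshots keyed by `property_id`. The event names, the `status` field, and the sample values are assumptions for this example. A production pipeline would persist the previous snapshot in a database rather than in memory.

```python
from typing import Dict, List


def diff_snapshots(previous: Dict[str, dict], current: Dict[str, dict]) -> List[dict]:
    """Compare two runs keyed by property_id and return change events only."""
    events: List[dict] = []
    for pid, row in current.items():
        old = previous.get(pid)
        if old is None:
            events.append({"property_id": pid, "event": "new", "price": row["price"]})
            continue
        if old["price"] != row["price"]:
            events.append({
                "property_id": pid,
                "event": "price_change",
                "old_price": old["price"],
                "new_price": row["price"],
            })
        if old.get("status") != row.get("status"):
            events.append({
                "property_id": pid,
                "event": "status_change",
                "status": row.get("status"),
            })
    # Listings present in the last run but absent now are reported as removed.
    for pid in sorted(previous.keys() - current.keys()):
        events.append({"property_id": pid, "event": "removed"})
    return events


yesterday = {
    "13482910": {"price": 450000, "status": "available"},
    "13482911": {"price": 300000, "status": "available"},
    "13482912": {"price": 200000, "status": "available"},
}
today = {
    "13482910": {"price": 425000, "status": "available"},
    "13482912": {"price": 200000, "status": "sold"},
    "13482913": {"price": 500000, "status": "available"},
}
events = diff_snapshots(yesterday, today)
for event in events:
    print(event)
```

Unchanged listings produce no output at all, which is where the saving in ingestion volume comes from.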
AVM (Automated Valuation Model) providers ingest pricing, floor area, and feature data to train property valuation algorithms.
Institutional landlords track asking rents against capital values to calculate gross yields across different UK postcodes.
Estate agencies monitor local competitors to calculate market share, instruction velocity, and price reduction frequencies.
Green energy firms extract EPC ratings and property types to target households requiring boiler upgrades or insulation.
Property sourcers identify slow-moving stock with multiple price reductions to target motivated sellers.
Consultancies analyse housing density, new development pipelines, and transit proximity for infrastructure planning.
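The gross-yield use case above reduces to one formula: annual asking rent divided by capital value. A minimal sketch follows, assuming asking rents arrive with a frequency code such as `pcm` or `pw`. The codes and function names are illustrative and are not fields from the schemas listed earlier.

```python
# Multipliers to annualise an asking rent (frequency codes are assumptions).
PERIODS_PER_YEAR = {"pcm": 12, "pw": 52, "pa": 1}


def annual_rent(amount: float, frequency: str) -> float:
    """Convert a rent quoted per month, per week, or per year to an annual figure."""
    return amount * PERIODS_PER_YEAR[frequency]


def gross_yield_pct(rent: float, frequency: str, capital_value: float) -> float:
    """Gross yield as a percentage: annual rent / capital value * 100."""
    if capital_value <= 0:
        raise ValueError("capital_value must be positive")
    return annual_rent(rent, frequency) / capital_value * 100


# A rent of 1,800 pcm against a 425,000 asking price.
print(round(gross_yield_pct(1800, "pcm", 425000), 2))  # 5.08
```

Gross yield ignores voids, management fees, and maintenance, so it works as a screening metric across postcodes rather than a return forecast.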
"OnTheMarket represents a critical segment of the UK property ecosystem, carrying exclusive listings 24 hours before they hit Rightmove or Zoopla."
Extracting property data at scale requires bypassing sophisticated bot protection, managing complex map-based pagination, and maintaining selectors across frequent front-end updates. DataFlirt absorbs that complexity so your engineering team can focus on building valuation models and market analysis — not maintaining scrapers.
Everything supported by our onthemarket.com scraper: rendered SPA elements, rate-limit evasion, and beyond.
Open-source tooling on proven cloud infra — no vendor lock-in, full observability.
Scrapy manages request queues and deduplication. Playwright handles JavaScript execution for map rendering and Next.js hydration.
Dedicated pools of UK residential ISP proxies ensure requests appear as legitimate local traffic, preventing geo-blocks and rate limits.
Pipelines execute on AWS ECS with Airflow handling scheduling, retry logic, and delivery to downstream data warehouses.
Data delivered to where your team already works — no new tooling required.
About onthemarket.com scraping, legality, and pipeline operations.
Scraping publicly accessible property data is generally permissible for analytical purposes. DataFlirt extracts only public listings and agent directory information. We do not bypass authentication walls or extract personal user data. Clients should review portal terms of service and consult legal counsel regarding their specific data usage.
We utilise UK residential proxies, full Playwright browser execution, and automated solvers to navigate challenge pages. Our request headers, TLS fingerprints, and concurrency rates are configured to mimic legitimate user behaviour.
Yes. We specifically capture the 'Only With Us' flag, allowing clients to track properties listed on OnTheMarket before they are syndicated to other major portals.
OnTheMarket limits search pagination. To extract entire cities or regions, our system dynamically generates small geographic polygons, ensuring result counts remain below the pagination threshold and capturing 100% of available stock.
Pipeline frequency is configurable. We support daily full-market sweeps, or high-frequency intra-day checks on specific postcodes for real-time new instruction alerting.
Yes. We extract the direct URLs for high-resolution images, PDF floorplans, and EPC documents, which can be downloaded or stored in your data lake.
Our minimum engagement typically starts with daily extraction of a defined set of UK regions or postcodes. Contact our technical team to scope your specific geographic requirements and delivery cadence.
20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a daily feed of new London instructions or a historical price dataset for the entire UK — we scope, build, and operate the pipeline. Tell us what you need.