We extract residential and commercial listings, pricing histories, energy certificates, and broker details from Immowelt.de. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.
Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.
Complete list of extractable fields for Residential Rent objects from immowelt.de. All fields typed and schema-versioned.
{"expose_id": "2d3f4a9", "title": "Helle 3-Zimmer Wohnung mit Balkon in Mitte", "kaltmiete": 1250.0, "warmmiete": 1450.0, "nebenkosten": 200.0, "wohnflaeche": 85.5, "zimmer": 3, "plz": "10115", "city": "Berlin"}
| # | expose_id | title | kaltmiete | warmmiete | nebenkosten | heizkosten |
|---|---|---|---|---|---|---|
Complete list of extractable fields for Residential Buy objects from immowelt.de. All fields typed and schema-versioned.
{"expose_id": "8b7c6d5", "title": "Modernes Einfamilienhaus im Grünen", "kaufpreis": 850000.0, "kaufpreis_pro_qm": 5666.67, "provision": "3.57% inkl. MwSt.", "wohnflaeche": 150.0, "grundstuecksflaeche": 600.0, "baujahr": 2018, "city": "München"}
| # | expose_id | title | kaufpreis | kaufpreis_pro_qm | hausgeld | provision |
|---|---|---|---|---|---|---|
Complete list of extractable fields for Building & Energy objects from immowelt.de. All fields typed and schema-versioned.
{"expose_id": "2d3f4a9", "heizungsart": "Zentralheizung", "befeuerungsart": "Fernwärme", "energieausweistyp": "Bedarfsausweis", "endenergiebedarf": "65.4 kWh/(m²*a)", "energieeffizienzklasse": "B", "baujahr": 2015, "objektzustand": "Neuwertig"}
| # | expose_id | heizungsart | befeuerungsart | energieausweistyp | endenergiebedarf | energieeffizienzklasse |
|---|---|---|---|---|---|---|
Complete list of extractable fields for Broker & Agency objects from immowelt.de. All fields typed and schema-versioned.
{"broker_id": "mkl_99421", "firmenname": "Müller Immobilien GmbH", "makler_name": "Thomas Müller", "telefonnummer": "+49 30 1234567", "adresse": "Kurfürstendamm 10, 10719 Berlin", "active_listings_count": 42, "bewertungen_score": 4.8, "bewertungen_count": 156}
| # | broker_id | makler_name | firmenname | telefonnummer | adresse | |
|---|---|---|---|---|---|---|
Complete list of extractable fields for Search Results objects from immowelt.de. All fields typed and schema-versioned.
{"plz_input": "20457", "radius_km": 5, "position": 1, "expose_id": "5x9y2z1", "is_top_listing": true, "price": 1800.0, "wohnflaeche": 110.0, "scraped_at": "2026-05-12T10:15:30Z"}
| # | search_id | keyword | plz_input | radius_km | position | expose_id |
|---|---|---|---|---|---|---|
Our Immowelt scraper bypasses bot protection and renders dynamic map interfaces to extract highly structured property data, pricing histories, and broker intelligence across all German postal codes.
Title, description texts, amenities, floorplans, image arrays, and every metadata field Immowelt surfaces - scraped at the individual listing level.
Capture Kaltmiete, Warmmiete, Nebenkosten, Kaufpreis, Hausgeld, and broker commission rates accurately parsed into numeric fields.
Extract EPC details including Energieeffizienzklasse, Endenergiebedarf, heating types, and build year for ESG compliance and valuation models.
Identify agency names, contact details, active listing counts, and rating scores to map the competitive broker landscape in any region.
Extract PLZ, city, district, street names, and coordinate data from map layers to power hyper-local market analysis.
Monitor time-on-market and price drops for individual Exposé IDs over time to gauge market liquidity and seller motivation.
Extract office spaces, retail locations, and industrial properties with commercial-specific fields like divisible floor space and net rents.
Replicate complex user searches using PLZ inputs and radius parameters to ensure comprehensive coverage without hitting pagination limits.
Run one-off bulk exports or configure continuous pipelines at hourly, daily, or weekly cadences with change-detection diffing.
Brief in. Clean data out.
Provide postal codes, property types, or specific agency URLs. We design the extraction schema together.
We configure Scrapy / Playwright crawlers, German proxy rotation, session management, and CAPTCHA handling for immowelt.de.
Schema validation, null-rate checks, price-outlier detection, and sample listings before full launch.
JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.
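The null-rate and price-outlier checks mentioned in the validation step could look roughly like the sketch below. It is an illustration only, not the production QA suite: the function names, the median-absolute-deviation rule, and the threshold `k` are assumptions chosen for the example.

```python
from statistics import median


def null_rates(records, fields):
    """Share of records in which each field is missing or None."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for r in records if r.get(field) is None) / total
        for field in fields
    }


def price_outliers(records, field="kaltmiete", k=3.0):
    """Flag records whose value lies more than k * MAD away from the median.

    The MAD rule and the default k are illustrative assumptions.
    """
    values = [r[field] for r in records if r.get(field) is not None]
    if not values:
        return []
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread in the sample, so nothing can be flagged
    return [
        r for r in records
        if r.get(field) is not None and abs(r[field] - med) > k * mad
    ]
```

A median-based rule is used here because a single mistyped price would distort a mean and standard deviation far more than it distorts a median.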
German real estate portals deploy aggressive bot protection and complex frontend architectures. Here is how we maintain reliable data flows.
Immowelt uses advanced bot mitigation that blocks standard data centre IPs immediately. Our crawlers route traffic exclusively through German residential ISP proxies, matching regional expectations and maintaining realistic TLS fingerprints to prevent IP bans.
Key property details, image galleries, and exact map locations are loaded dynamically via JavaScript. We execute full Playwright browser sessions to hydrate the DOM, ensuring we capture data that simple HTTP requests miss entirely.
Immowelt limits search results to a maximum number of pages. To extract entire cities, our orchestration engine automatically subdivides large queries by tightening radius parameters, price brackets, or room counts until all results are accessible.
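The subdivision idea can be sketched as a recursive bisection of a price bracket. This is an illustrative sketch, not the orchestration engine itself: `count_results` is a hypothetical callable standing in for whatever reports how many listings a query matches, and `max_results` is an assumed cap, since the actual pagination limit is not stated here.

```python
def split_price_range(lo, hi, count_results, max_results, min_width=1):
    """Split the bracket [lo, hi) until each piece fits under max_results.

    count_results(lo, hi) is a hypothetical callable returning the number
    of listings priced in [lo, hi). Bounds are integers (e.g. whole euros).
    """
    if count_results(lo, hi) <= max_results or hi - lo <= min_width:
        # Either the bracket fits, or it cannot be split any further.
        return [(lo, hi)]
    mid = (lo + hi) // 2
    return (
        split_price_range(lo, mid, count_results, max_results, min_width)
        + split_price_range(mid, hi, count_results, max_results, min_width)
    )
```

A bracket that still exceeds the cap at `min_width` is returned as-is, so a caller would need to slice it along another dimension, such as radius or room count.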
Listings vary wildly depending on the property type and the broker's input quality. Our parsing logic uses extensive fallback chains and regex normalisation to ensure fields like 'Kaltmiete' and 'Wohnfläche' always output clean numeric types, regardless of the source formatting.
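To illustrate the regex normalisation step, the sketch below converts German-formatted strings such as "1.250,00 €" into floats. It is a simplified example under stated assumptions, not the full fallback chain: the function name is hypothetical, and it assumes German conventions (dot as thousands separator, comma as decimal separator).

```python
import re

# Either a dot-grouped number ("1.250" or "1.250,00") or a plain number
# with an optional decimal comma ("85,5" or "1250").
_GERMAN_NUMBER = re.compile(r"\d{1,3}(?:\.\d{3})+(?:,\d+)?|\d+(?:,\d+)?")


def parse_german_number(raw):
    """Return the first German-formatted number in raw as a float, or None.

    Assumes '.' groups thousands and ',' marks decimals; an English-style
    value such as '1.5' would not be read as one and a half.
    """
    if raw is None:
        return None
    match = _GERMAN_NUMBER.search(raw)
    if match is None:
        return None  # e.g. "auf Anfrage" (price on request)
    return float(match.group(0).replace(".", "").replace(",", "."))
```

Returning `None` for unparseable values, rather than zero, keeps missing prices visible to downstream null-rate checks.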
For ongoing market monitoring, we maintain a hash index of last-seen values per Exposé ID. Subsequent runs only push diffs - such as price drops or status changes to 'rented' - reducing compute cost and downstream processing load.
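The hash-index approach can be sketched as follows. This is a minimal illustration rather than the production implementation: the in-memory dict stands in for a persistent store, and the tracked field list (including a `status` field) is an assumption.

```python
import hashlib
import json

# Assumed set of fields whose changes should trigger a diff.
TRACKED_FIELDS = ("kaltmiete", "kaufpreis", "status")


def fingerprint(record, fields=TRACKED_FIELDS):
    """Stable SHA-256 hash over the tracked fields of one listing."""
    payload = json.dumps({f: record.get(f) for f in fields}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def diff_run(records, last_seen):
    """Return new or changed records and update last_seen in place.

    last_seen maps expose_id -> fingerprint from the previous run.
    """
    changed = []
    for record in records:
        digest = fingerprint(record)
        if last_seen.get(record["expose_id"]) != digest:
            changed.append(record)
            last_seen[record["expose_id"]] = digest
    return changed
```

Hashing only the tracked fields means cosmetic edits, such as a reworded title, do not generate a diff.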
Data science teams feed structured Kaltmiete and Kaufpreis data into Automated Valuation Models to price portfolios accurately.
Institutional investors track rent-to-buy ratios across specific PLZ zones to identify high-yield acquisition targets.
PropTech companies identify active brokers and agencies in specific regions to target their B2B sales efforts.
Analysts extract Energieausweis data to assess the energy efficiency of the housing stock and model renovation costs.
Economists monitor time-on-market metrics and price drop frequencies to gauge regional market heat and housing supply.
Property management firms track competitor listings to optimise their own pricing and amenity offerings in real time.
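As one example of the use cases above, a rent-to-buy comparison per PLZ can be computed from the rent and buy schemas. The sketch below is illustrative only: it expresses the ratio as a gross yield (annual cold rent per m² divided by purchase price per m²), and it assumes buy records also carry a `plz` field, which the sample record does not show.

```python
from collections import defaultdict
from statistics import median


def gross_yield_by_plz(rent_records, buy_records):
    """Median annual Kaltmiete per m² divided by median Kaufpreis per m².

    Only PLZ zones present in both datasets are returned. The 'plz' key
    on buy records is an assumption made for this example.
    """
    rent_per_qm = defaultdict(list)
    for r in rent_records:
        rent_per_qm[r["plz"]].append(r["kaltmiete"] / r["wohnflaeche"])

    price_per_qm = defaultdict(list)
    for b in buy_records:
        price_per_qm[b["plz"]].append(b["kaufpreis_pro_qm"])

    shared = rent_per_qm.keys() & price_per_qm.keys()
    return {
        plz: 12 * median(rent_per_qm[plz]) / median(price_per_qm[plz])
        for plz in shared
    }
```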
"Immowelt contains the critical pricing and energy efficiency signals for the German housing market - data that remains locked behind dynamic interfaces unless extracted systematically."
Extracting real estate data at scale requires bypassing sophisticated anti-bot systems, rendering complex JavaScript map interfaces, and parsing highly variable Exposé layouts. DataFlirt handles the proxy rotation, session management, and schema normalisation so your data science team can focus on yield analysis and market trends.
Everything supported by our immowelt.de scraper — rendered SPA elements, auth walls, rate-limit evasion and beyond.
Open-source tooling on proven cloud infra — no vendor lock-in, full observability.
Scrapy handles crawl orchestration, deduplication, and retry logic. Playwright handles JavaScript rendering, cookie sessions, and interaction flows. Combined via scrapy-playwright middleware.
We maintain pools of residential ISP proxies across DE regions. Rotation happens per-request with sticky sessions where required. IP score monitoring prevents blacklisted pool contamination.
Pipelines run on AWS Lambda (burst) and ECS (sustained). Airflow handles scheduling, dependency management, and SLA alerting. All state stored in managed Postgres.
Data delivered to where your team already works — no new tooling required.
About immowelt.de scraping, legality, and pipeline operations.
Scraping publicly available real estate listings is generally permissible under applicable law, provided it does not extract personal data or breach copyright. DataFlirt targets only public, non-authenticated property and pricing data. We do not extract private user data or circumvent authentication walls. Clients should review Immowelt's ToS and consult legal counsel for specific use cases.
We use German residential ISP proxies, full Playwright browser sessions with realistic TLS fingerprints, and request timing modelled on human behaviour. We monitor for CAPTCHA rate spikes in real time and trigger solver queues automatically.
Immowelt caps pagination for broad searches. Our orchestration engine automatically subdivides large queries by iterating through granular PLZ codes, tightening radius parameters, or slicing by price brackets until every result set fits under the cap and all listings are reachable.
Near-real-time pipelines achieve sub-60-minute latency for specific high-priority regions. Full national catalogue refreshes at daily cadence complete within a 6 to 12 hour window, depending on volume.
Yes. Every pipeline run produces timestamped snapshots. We maintain a time-series table per Exposé ID for price changes and availability status from the date your pipeline starts.
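A per-Exposé time series of this kind makes metrics such as time-on-market and price-drop counts straightforward to derive. The sketch below is an assumption-laden illustration, not the actual table schema: it takes snapshot dicts with `expose_id`, `scraped_at` (ISO 8601, as in the search results sample), and `price`, and the function name is hypothetical.

```python
from datetime import datetime


def _parse_ts(ts):
    """Parse an ISO 8601 timestamp; 'Z' is rewritten for older Pythons."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def summarise_history(snapshots):
    """Per expose_id: first/last seen, days on market, and price drops."""
    summary = {}
    for snap in sorted(snapshots, key=lambda s: _parse_ts(s["scraped_at"])):
        seen = _parse_ts(snap["scraped_at"])
        entry = summary.setdefault(
            snap["expose_id"],
            {"first_seen": seen, "last_seen": seen,
             "last_price": snap["price"], "price_drops": 0},
        )
        if snap["price"] < entry["last_price"]:
            entry["price_drops"] += 1  # price fell since previous snapshot
        entry["last_price"] = snap["price"]
        entry["last_seen"] = seen
    for entry in summary.values():
        delta = entry["last_seen"] - entry["first_seen"]
        entry["days_on_market"] = delta.days
    return summary
```

Because history starts when the pipeline starts, `days_on_market` here measures observed days, which is a lower bound for listings that were already live on the first run.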
Our smallest packages start at a defined regional scope (e.g., top 5 German cities) with weekly delivery. For national coverage or custom schema requirements, we price based on volume and delivery frequency.
Absolutely. We provide a sample run of up to 500 listings as part of the pre-engagement scoping process - so you can validate schema fit, field completeness, and data quality before signing any contract.
20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a one-off regional extraction or a continuous national price-monitoring feed - we scope, build, and operate the pipeline. Tell us what you need.