
German property data,
at warehouse scale.

We extract residential and commercial listings, pricing histories, energy certificates, and broker details from Immowelt.de. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.

Listings extracted
182K /day
Price updates
41K /24h
Broker records
12K /run
Active pipelines
41
Uptime
99.94%
Data Dictionary

Every field we extract from immowelt.de

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Residential Rent objects from immowelt.de. All fields typed and schema-versioned.

expose_id · title · kaltmiete · warmmiete · nebenkosten · heizkosten · kaution · wohnflaeche · zimmer · schlafzimmer · badezimmer · baujahr · plz · city · street · available_from · balcony · built_in_kitchen
residential_rent
● 200 OK
{
  "expose_id": "2d3f4a9",
  "title": "Helle 3-Zimmer Wohnung mit Balkon in Mitte",
  "kaltmiete": 1250.0,
  "warmmiete": 1450.0,
  "nebenkosten": 200.0,
  "wohnflaeche": 85.5,
  "zimmer": 3,
  "plz": "10115",
  "city": "Berlin"
}
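As a rough illustration of how a consumer might load the Residential Rent sample into typed Python, here is a minimal sketch. The `RentListing` class, the `parse_rent_listing` helper, and the field types are assumptions inferred from the sample values, not a published DataFlirt schema.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class RentListing:
    """Hypothetical typed view of a residential_rent record (types inferred from the sample)."""
    expose_id: str
    title: str
    kaltmiete: float
    warmmiete: float
    nebenkosten: float
    wohnflaeche: float
    zimmer: int
    plz: str  # kept as a string so leading zeros survive, e.g. "01067"
    city: str


def parse_rent_listing(raw: str) -> RentListing:
    """Parse one JSON record, ignoring keys this sketch does not model."""
    data = json.loads(raw)
    known = {f.name for f in fields(RentListing)}
    return RentListing(**{k: v for k, v in data.items() if k in known})


sample = """{"expose_id": "2d3f4a9",
 "title": "Helle 3-Zimmer Wohnung mit Balkon in Mitte",
 "kaltmiete": 1250.0, "warmmiete": 1450.0, "nebenkosten": 200.0,
 "wohnflaeche": 85.5, "zimmer": 3, "plz": "10115", "city": "Berlin"}"""
listing = parse_rent_listing(sample)
```

A record that lacks one of the modelled fields raises a `TypeError` here, which is one simple way to surface schema drift early.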

Complete list of extractable fields for Residential Buy objects from immowelt.de. All fields typed and schema-versioned.

expose_id · title · kaufpreis · kaufpreis_pro_qm · hausgeld · provision · wohnflaeche · grundstuecksflaeche · zimmer · baujahr · objektzustand · plz · city · street · garage_price · rented
residential_buy
● 200 OK
{
  "expose_id": "8b7c6d5",
  "title": "Modernes Einfamilienhaus im Grünen",
  "kaufpreis": 850000.0,
  "kaufpreis_pro_qm": 5666.67,
  "provision": "3.57% inkl. MwSt.",
  "wohnflaeche": 150.0,
  "grundstuecksflaeche": 600.0,
  "baujahr": 2018,
  "city": "München"
}

Complete list of extractable fields for Building & Energy objects from immowelt.de. All fields typed and schema-versioned.

expose_id · heizungsart · befeuerungsart · energieausweistyp · endenergiebedarf · energieeffizienzklasse · baujahr · letzte_modernisierung · objektzustand · ausstattung_kategorie · denkmalschutz
building_energy
● 200 OK
{
  "expose_id": "2d3f4a9",
  "heizungsart": "Zentralheizung",
  "befeuerungsart": "Fernwärme",
  "energieausweistyp": "Bedarfsausweis",
  "endenergiebedarf": "65.4 kWh/(m²*a)",
  "energieeffizienzklasse": "B",
  "baujahr": 2015,
  "objektzustand": "Neuwertig"
}

Complete list of extractable fields for Broker & Agency objects from immowelt.de. All fields typed and schema-versioned.

broker_id · makler_name · firmenname · telefonnummer · email · adresse · impressum_url · active_listings_count · bewertungen_score · bewertungen_count · logo_url
broker_agency
● 200 OK
{
  "broker_id": "mkl_99421",
  "firmenname": "Müller Immobilien GmbH",
  "makler_name": "Thomas Müller",
  "telefonnummer": "+49 30 1234567",
  "adresse": "Kurfürstendamm 10, 10719 Berlin",
  "active_listings_count": 42,
  "bewertungen_score": 4.8,
  "bewertungen_count": 156
}

Complete list of extractable fields for Search Results objects from immowelt.de. All fields typed and schema-versioned.

search_id · keyword · plz_input · radius_km · position · expose_id · title · price · wohnflaeche · zimmer · is_top_listing · scraped_at
search_results
● 200 OK
{
  "plz_input": "20457",
  "radius_km": 5,
  "position": 1,
  "expose_id": "5x9y2z1",
  "is_top_listing": true,
  "price": 1800.0,
  "wohnflaeche": 110.0,
  "scraped_at": "2026-05-12T10:15:30Z"
}

Capabilities

Extract the complete German real estate market

Our Immowelt scraper bypasses bot protection and renders dynamic map interfaces to extract highly structured property data, pricing histories, and broker intelligence across all German postal codes.

Full Exposé Extraction

Title, description texts, amenities, floorplans, image arrays, and every metadata field Immowelt surfaces - scraped at the individual listing level.

Rent & Purchase Pricing

Capture Kaltmiete, Warmmiete, Nebenkosten, Kaufpreis, Hausgeld, and broker commission rates, with monetary amounts parsed into numeric fields.

Energy Ratings & Certificates

Extract EPC details including Energieeffizienzklasse, Endenergiebedarf, heating types, and build year for ESG compliance and valuation models.

Broker Intelligence

Identify agency names, contact details, active listing counts, and rating scores to map the competitive broker landscape in any region.

Geospatial Precision

Extract PLZ, city, district, street names, and coordinate data from map layers to power hyper-local market analysis.

Historical Price Tracking

Monitor time-on-market and price drops for individual Exposé IDs over time to gauge market liquidity and seller motivation.

Commercial Real Estate

Extract office spaces, retail locations, and industrial properties with commercial-specific fields like divisible floor space and net rents.

Radius & Polygon Search

Replicate complex user searches using PLZ inputs and radius parameters to ensure comprehensive coverage without hitting pagination limits.

Scheduled + Streaming Modes

Run one-off bulk exports or configure continuous pipelines at hourly, daily, or weekly cadences with change-detection diffing.

// engagement pipeline

From PLZ list to warehouse record

Brief in. Clean data out.

Define Scope
Day 0

Provide postal codes, property types, or specific agency URLs. We design the extraction schema together.

Pipeline Build
Days 2–4

We configure Scrapy / Playwright crawlers, German proxy rotation, session management, and CAPTCHA handling for immowelt.de.

Validation & QA
Days 4–6

Schema validation, null-rate checks, price-outlier detection, and sample listings before full launch.

Delivery
ongoing

JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.
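The null-rate and price-outlier checks named in the Validation & QA step could look roughly like the sketch below. The function names, the Tukey-fence rule, and the factor 1.5 are illustrative assumptions; the actual QA thresholds are not stated on this page.

```python
from statistics import quantiles


def null_rate(records: list[dict], field: str) -> float:
    """Share of records where `field` is missing or None."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(field) is None)
    return missing / len(records)


def price_outliers(prices: list[float], k: float = 1.5) -> list[float]:
    """Return prices outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey fences)."""
    q1, _, q3 = quantiles(prices, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [p for p in prices if p < low or p > high]
```

For example, a batch of rents clustered between 1,000 and 1,400 EUR with a single 15,000 EUR entry would have only that entry flagged for review.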

Under the hood

How our Immowelt pipeline handles the hard parts

German real estate portals deploy aggressive bot protection and complex frontend architectures. Here is how we maintain reliable data flows.

pipeline-monitor · immowelt.de · live ● active
// fingerprinting
Identity rotation
TLS fingerprint: randomised
User-agent: rotated
IP pool: residential
Challenges blocked: 0
// pagination
Page coverage
48,291 pages queued (running)
// observability
Pipeline health
Uptime: 99.9%
p99 latency: 142ms
Null rate: 0.3%
Alerts: 2
Anti-bot layer
German residential proxies + TLS fingerprinting

Immowelt uses advanced bot mitigation that blocks standard data centre IPs immediately. Our crawlers route traffic exclusively through German residential ISP proxies, matching regional expectations and maintaining realistic TLS fingerprints to prevent IP bans.

JavaScript rendering
Playwright for map data and lazy loading

Key property details, image galleries, and exact map locations are loaded dynamically via JavaScript. We execute full Playwright browser sessions to hydrate the DOM, ensuring we capture data that simple HTTP requests miss entirely.

Pagination limits
Algorithmic search subdivision

Immowelt limits search results to a maximum number of pages. To extract entire cities, our orchestration engine automatically subdivides large queries by tightening radius parameters, price brackets, or room counts until all results are accessible.
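The subdivision idea can be sketched as a recursive bisection over price brackets. This is a minimal illustration: `count_results` is a hypothetical callable that would report how many listings a bracket returns, and `max_results` is an assumed cap. Neither reflects the real orchestration engine's interface.

```python
from typing import Callable, Iterator


def subdivide_price_range(
    low: int,
    high: int,
    count_results: Callable[[int, int], int],
    max_results: int,
) -> Iterator[tuple[int, int]]:
    """Yield price brackets [low, high) whose result count fits under max_results.

    Brackets that are still too large are bisected. A bracket of width 1 is
    yielded as-is because price alone cannot split it further.
    """
    if count_results(low, high) <= max_results or high - low <= 1:
        yield (low, high)
        return
    mid = (low + high) // 2
    yield from subdivide_price_range(low, mid, count_results, max_results)
    yield from subdivide_price_range(mid, high, count_results, max_results)


# Stand-in for a real search count: 10,000 listings spread evenly over 0-4000 EUR.
def fake_count(low: int, high: int) -> int:
    return (high - low) * 10_000 // 4000


brackets = list(subdivide_price_range(0, 4000, fake_count, max_results=2_000))
```

In practice a bracket that cannot be split further by price would fall back to another dimension, such as radius or room count, as described above.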

Schema stability
Handling Exposé variations

Listings vary wildly depending on the property type and the broker's input quality. Our parsing logic uses extensive fallback chains and regex normalisation to ensure fields like 'Kaltmiete' and 'Wohnfläche' always output clean numeric types, regardless of the source formatting.

Change detection
Only re-scrape what changes

For ongoing market monitoring, we maintain a hash index of last-seen values per Exposé ID. Subsequent runs only push diffs - such as price drops or status changes to 'rented' - reducing compute cost and downstream processing load.
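Hash-based change detection of this kind can be illustrated with an in-memory index. The tracked field names and the `status` key below are assumptions made for the sketch. A production index would live in a persistent store rather than a Python dict.

```python
import hashlib
import json

# Assumed set of fields worth diffing; other fields are ignored by the hash.
TRACKED_FIELDS = ("kaltmiete", "kaufpreis", "status")


def record_hash(record: dict) -> str:
    """Stable SHA-256 digest over the tracked fields only."""
    subset = {key: record.get(key) for key in TRACKED_FIELDS}
    payload = json.dumps(subset, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def emit_changes(records: list[dict], index: dict[str, str]) -> list[dict]:
    """Return new or changed records and update the hash index in place."""
    changed = []
    for record in records:
        digest = record_hash(record)
        if index.get(record["expose_id"]) != digest:
            index[record["expose_id"]] = digest
            changed.append(record)
    return changed
```

Hashing only the tracked fields means cosmetic edits, such as a reworded title, do not trigger a downstream update.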

Applications

Who uses Immowelt data - and how

Teams across industries use immowelt.de data to build competitive products and smarter operations.

01
Real Estate Valuation (AVM)

Data science teams feed structured Kaltmiete and Kaufpreis data into Automated Valuation Models to price portfolios accurately.

02
Investment Yield Analysis

Institutional investors track rent-to-buy ratios across specific PLZ zones to identify high-yield acquisition targets.

03
Broker Lead Generation

PropTech companies identify active brokers and agencies in specific regions to target their B2B sales efforts.

04
Energy Compliance Tracking

Analysts extract Energieausweis data to assess the energy efficiency of the housing stock and model renovation costs.

05
Market Liquidity Analysis

Economists monitor time-on-market metrics and price drop frequencies to gauge regional market heat and housing supply.

06
Competitor Benchmarking

Property management firms track competitor listings to optimise their own pricing and amenity offerings in real time.
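For the yield analysis use case, the underlying arithmetic is simple enough to sketch. The two helpers below use the common gross-yield and price-to-rent definitions. They are illustrative assumptions about how a client might combine Kaltmiete and Kaufpreis, not a DataFlirt deliverable.

```python
def gross_rental_yield(monthly_kaltmiete: float, kaufpreis: float) -> float:
    """Annual cold rent as a fraction of purchase price, before costs and taxes."""
    if kaufpreis <= 0:
        raise ValueError("kaufpreis must be positive")
    return 12 * monthly_kaltmiete / kaufpreis


def price_to_rent_ratio(monthly_kaltmiete: float, kaufpreis: float) -> float:
    """Purchase price expressed in years of cold rent."""
    if monthly_kaltmiete <= 0:
        raise ValueError("monthly_kaltmiete must be positive")
    return kaufpreis / (12 * monthly_kaltmiete)
```

A flat renting for 1,000 EUR per month and selling for 300,000 EUR has a gross yield of 4% and a price-to-rent ratio of 25.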

Why DataFlirt

"Immowelt contains the critical pricing and energy efficiency signals for the German housing market - data that remains locked behind dynamic interfaces unless extracted systematically."

Extracting real estate data at scale requires bypassing sophisticated anti-bot systems, rendering complex JavaScript map interfaces, and parsing highly variable Exposé layouts. DataFlirt handles the proxy rotation, session management, and schema normalisation so your data science team can focus on yield analysis and market trends.

Technical Spec

Immowelt scraper - technical capabilities

Everything supported by our immowelt.de scraper — rendered SPA elements, auth walls, rate-limit evasion and beyond.

Capability · Details · Status
JavaScript rendering · Full Playwright sessions - required for dynamic maps and image galleries · Supported
CAPTCHA bypass · Automated CapSolver integration for cloud security challenges · Supported
German residential proxies · ISP-grade residential IPs from DE pools to bypass geo-blocking · Supported
Exposé ID tracking · Unique identifier captured for deduplication and historical tracking · Supported
Energy certificate parsing · Extraction of EPC class, consumption metrics, and heating types · Supported
Change detection (diffs) · Hash-based diff: only emit records with changed prices or status · Supported
Webhook delivery · HTTP POST per record - useful for real-time deal alerts · Supported
Schufa credit reports · Gated financial data requiring user authentication and consent · Partial
Direct landlord messaging · Automated messaging via the portal's internal communication system · Partial
User saved searches · Extraction of private user profiles and saved property lists · Partial
Infrastructure

Infrastructure powering the Immowelt pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy · Playwright · Python 3.12 · Redis · PostgreSQL · Apache Airflow · AWS Lambda · S3 · CloudWatch · 2Captcha · CapSolver · Residential Proxies · Docker · Kubernetes · Grafana · Prometheus
Scrapy + Playwright Stack

Scrapy handles crawl orchestration, deduplication, and retry logic. Playwright handles JavaScript rendering, cookie sessions, and interaction flows. Combined via scrapy-playwright middleware.
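For readers unfamiliar with scrapy-playwright, the wiring between the two tools is mostly a settings change. The fragment below follows the settings documented by the scrapy-playwright project. The launch options and the `rendered_request_meta` helper are illustrative assumptions, not DataFlirt's actual configuration.

```python
# settings.py fragment (sketch): route HTTP(S) downloads through Playwright.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright requires the asyncio-based Twisted reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

PLAYWRIGHT_BROWSER_TYPE = "chromium"
PLAYWRIGHT_LAUNCH_OPTIONS = {"headless": True}  # assumed value for this sketch


def rendered_request_meta() -> dict:
    """Meta dict asking scrapy-playwright to render a request in the browser.

    Hypothetical helper; in a spider it would be passed as
    scrapy.Request(url, meta=rendered_request_meta()).
    """
    return {"playwright": True}
```

Requests without the `playwright` meta key still go through Scrapy's regular downloader, so browser rendering can be limited to the pages that need it.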

Residential Proxy Infrastructure

We maintain pools of residential ISP proxies across DE regions. Rotation happens per request, with sticky sessions where required. IP score monitoring keeps blacklisted addresses out of the pool.

Cloud-Native Orchestration

Pipelines run on AWS Lambda (burst) and ECS (sustained). Airflow handles scheduling, dependency management, and SLA alerting. All state stored in managed Postgres.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON
Newline-delimited or nested - schema versioned per run
CSV
Flat file with typed columns - Excel/Sheets compatible
XLS
Excel format for direct analyst consumption
Parquet
Columnar format for BigQuery, Snowflake, Athena
AWS S3
Direct bucket delivery - compatible with any data lake
Webhook
HTTP POST per record for real-time downstream processing
API
REST endpoints to query your extracted datasets on demand
PostgreSQL
Upsert into your existing schema with conflict resolution
BigQuery
Streamed directly into your dataset with schema auto-detect
Snowflake
Stage + COPY INTO workflow - incremental or full-replace
// faq

Common questions.

About immowelt.de scraping, legality, and pipeline operations.

Ask us directly →
Is scraping Immowelt legal?

Scraping publicly available real estate listings is generally permissible under applicable law, provided it does not extract personal data or breach copyright. DataFlirt targets only public, non-authenticated property and pricing data. We do not extract private user data or circumvent authentication walls. Clients should review Immowelt's ToS and consult legal counsel for specific use cases.

How do you handle Immowelt's bot protection?

We use German residential ISP proxies, full Playwright browser sessions with realistic TLS fingerprints, and request timing modelled on human behaviour. We monitor for CAPTCHA rate spikes in real time and trigger solver queues automatically.

How do you overcome the 100-page limit on search results?

Immowelt caps pagination for broad searches. Our orchestration engine automatically subdivides large queries by iterating through granular PLZ codes, tightening radius parameters, or slicing by price brackets, so each sub-query stays under the cap and the full result set remains reachable.

How fresh is the data?

Streaming pipelines achieve sub-60-minute latency for selected high-priority regions. Full national catalogue refreshes run daily and complete within a 6-12 hour window, depending on volume.

Can you track price drops and time on market?

Yes. Every pipeline run produces timestamped snapshots. We maintain a time-series table per Exposé ID for price changes and availability status from the date your pipeline starts.
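To make the time-series idea concrete, here is a minimal sketch that summarises dated price snapshots for a single Exposé ID. The function name and the output keys are assumptions for illustration. The real table layout is not described on this page.

```python
from datetime import date


def price_history_summary(snapshots: list[tuple[date, float]]) -> dict:
    """Summarise dated price snapshots for one Exposé ID.

    days_observed runs from the first to the last snapshot, so it is a lower
    bound on time on market when tracking began after the listing went live.
    """
    if not snapshots:
        raise ValueError("at least one snapshot is required")
    ordered = sorted(snapshots)
    prices = [price for _, price in ordered]
    # Pair each snapshot with the previous price and keep only decreases.
    drops = [
        (day, prev - cur)
        for (day, cur), prev in zip(ordered[1:], prices)
        if cur < prev
    ]
    return {
        "days_observed": (ordered[-1][0] - ordered[0][0]).days,
        "price_drops": drops,
        "total_change": prices[-1] - prices[0],
    }
```

Three weekly snapshots at 1,800, 1,800, and 1,700 EUR would yield 14 observed days and one recorded drop of 100 EUR.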

What is the minimum viable engagement?

Our smallest packages start at a defined regional scope (e.g., top 5 German cities) with weekly delivery. For national coverage or custom schema requirements, we price based on volume and delivery frequency.

Can I request a sample dataset before committing?

Absolutely. We provide a sample run of up to 500 listings as part of the pre-engagement scoping process - so you can validate schema fit, field completeness, and data quality before signing any contract.

$ dataflirt scope --new-project --source=immowelt.de ready

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a one-off regional extraction or a continuous national price-monitoring feed - we scope, build, and operate the pipeline. Tell us what you need.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h