
Swedish property data,
at warehouse scale.

We extract active listings, historical sold prices, bidding histories, and algorithmic valuations from Booli. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.

Active listings
48K /day
Sold records
2.1M /total
Valuations
185K /run
Active pipelines
42
Uptime
99.94%
Data Dictionary

Every field we extract from booli.se

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Active Listings objects from booli.se. All fields typed and schema-versioned.

booli_id, address, municipality, county, property_type, rooms, living_area, plot_area, list_price, rent, construction_year, broker_name, broker_agency, days_on_market
active_listings
● 200 OK
"booli_id": "4829104",
"address": "Sveavägen 42",
"municipality": "Stockholm",
"property_type": "Lägenhet",
"rooms": 3,
"living_area": 84.5,
"list_price": 7500000,
"rent": 4250,
"broker_agency": "Fastighetsbyrån"
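As a hedged illustration of what "typed and schema-versioned" could look like on the consuming side, here is a minimal sketch that coerces the sample Active Listings payload into a typed record. The `ActiveListing` class, the `parse_listing` helper, and the `SCHEMA_VERSION` value are hypothetical names chosen for this example, and the unit comments (square metres, SEK) are assumptions rather than documented guarantees.

```python
from dataclasses import dataclass
from typing import Optional

SCHEMA_VERSION = "1.0"  # hypothetical version label, not taken from the product


@dataclass(frozen=True)
class ActiveListing:
    booli_id: str
    address: str
    municipality: str
    property_type: str
    rooms: Optional[float] = None
    living_area: Optional[float] = None   # assumed to be square metres
    list_price: Optional[int] = None      # assumed to be SEK
    rent: Optional[int] = None            # assumed to be SEK per month
    broker_agency: Optional[str] = None


def parse_listing(raw: dict) -> ActiveListing:
    """Coerce a raw record into typed fields; missing optional values stay None."""
    def opt(key, cast):
        value = raw.get(key)
        return None if value is None else cast(value)

    return ActiveListing(
        booli_id=str(raw["booli_id"]),
        address=str(raw["address"]),
        municipality=str(raw["municipality"]),
        property_type=str(raw["property_type"]),
        rooms=opt("rooms", float),
        living_area=opt("living_area", float),
        list_price=opt("list_price", int),
        rent=opt("rent", int),
        broker_agency=opt("broker_agency", str),
    )


# Values copied from the sample payload shown for active_listings.
sample = {
    "booli_id": "4829104",
    "address": "Sveavägen 42",
    "municipality": "Stockholm",
    "property_type": "Lägenhet",
    "rooms": 3,
    "living_area": 84.5,
    "list_price": 7500000,
    "rent": 4250,
    "broker_agency": "Fastighetsbyrån",
}
listing = parse_listing(sample)
```

Required identifiers raise a `KeyError` when absent, while optional fields fall back to `None`, which keeps null-rate checks downstream straightforward.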

Complete list of extractable fields for Sold Properties objects from booli.se. All fields typed and schema-versioned.

booli_id, address, property_type, sold_price, list_price, price_diff_pct, sold_date, price_per_sqm, rooms, living_area, broker_agency, bidding_participants
sold_properties
● 200 OK
"booli_id": "3910284",
"address": "Linnégatan 12",
"sold_price": 8200000,
"list_price": 7800000,
"price_diff_pct": 5.1,
"sold_date": "2023-10-14",
"price_per_sqm": 97041.4,
"broker_agency": "Bjurfors"

Complete list of extractable fields for Booli Valuations objects from booli.se. All fields typed and schema-versioned.

property_id, address, valuation_estimate, valuation_low, valuation_high, confidence_score, reference_properties, trend_12m, trend_36m, last_updated
booli_valuations
● 200 OK
"property_id": "9481726",
"valuation_estimate": 4500000,
"valuation_low": 4200000,
"valuation_high": 4800000,
"confidence_score": "High",
"trend_12m": -2.4,
"reference_properties": 14,
"last_updated": "2023-11-01T08:14:00Z"

Complete list of extractable fields for BRF Details objects from booli.se. All fields typed and schema-versioned.

brf_name, org_number, registration_year, total_apartments, debt_per_sqm, monthly_fee, energy_class, energy_declaration_id, municipality
brf_details
● 200 OK
"brf_name": "BRF Solrosen 1",
"org_number": "769600-1234",
"registration_year": 1984,
"total_apartments": 42,
"debt_per_sqm": 4500,
"energy_class": "C",
"municipality": "Göteborg"

Complete list of extractable fields for Bidding History objects from booli.se. All fields typed and schema-versioned.

booli_id, address, total_bids, unique_bidders, start_price, final_price, bid_timestamp, bid_amount, bid_increase, bidder_id
bidding_history
● 200 OK
"booli_id": "3910284",
"total_bids": 8,
"unique_bidders": 3,
"bid_timestamp": "2023-10-12T14:32:10Z",
"bid_amount": 8100000,
"bid_increase": 50000,
"bidder_id": "Bidder 2"

Capabilities

Everything you need from Booli — nothing you don't

Our Booli scraper handles every layer of the Swedish property market: active listings, sold prices, BRF details, and algorithmic valuations — with Swedish residential proxies and CAPTCHA handling built in.

Active & Upcoming Listings

Extract properties currently for sale and 'snart till salu' (coming soon) listings. Capture asking price, living area, plot size, and broker details.

Sold Price Archive

Access historical transaction data (slutpriser). Compare final sale prices against original asking prices to calculate regional premiums.
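The comparison described here reduces to two derived fields plus a group-by. The sketch below is illustrative only: the function names are hypothetical, and it assumes each sold record has already been joined to a `municipality` value, which is not part of the sold_properties field list. The two example calls use the sold and list prices from the sample payload; the living area of 84.5 is an assumption, since the sold sample does not show it.

```python
from collections import defaultdict
from statistics import median


def price_diff_pct(sold_price: int, list_price: int) -> float:
    """Percentage by which the final price exceeds (or undercuts) the asking price."""
    return round((sold_price - list_price) / list_price * 100, 1)


def price_per_sqm(sold_price: int, living_area: float) -> float:
    """Final price divided by living area, rounded to one decimal."""
    return round(sold_price / living_area, 1)


def median_premium_by_region(records, region_key="municipality"):
    """Median price_diff_pct per region; region_key is an assumed join column."""
    grouped = defaultdict(list)
    for record in records:
        premium = price_diff_pct(record["sold_price"], record["list_price"])
        grouped[record[region_key]].append(premium)
    return {region: median(values) for region, values in grouped.items()}


print(price_diff_pct(8200000, 7800000))  # 5.1
print(price_per_sqm(8200000, 84.5))      # 97041.4
```

A median is used rather than a mean so that a handful of extreme bidding wars does not dominate a small municipality's figure; that is a design choice for this sketch, not a statement about how the delivered aggregates are computed.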

Booli Valuations

Extract Booli's algorithmic property valuations, including the low and high bounds of each estimate, a confidence score, and trailing 12-month and 36-month market trends.

Bidding History Extraction

Track bid logs, unique bidder counts, and bid increments to gauge market heat and demand velocity per property.
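To make the bid-level metrics concrete, here is a small sketch that derives them from a list of bid rows carrying `bid_timestamp`, `bid_amount`, and `bidder_id`. The `summarise_bids` function is a hypothetical helper, and the rows below are made-up illustrative data (only the 14:32:10 bid mirrors the sample payload). It assumes timestamps are ISO 8601 strings in UTC with a uniform format, so they sort correctly as plain strings.

```python
def summarise_bids(bids):
    """Aggregate one property's bid log into headline demand metrics."""
    # Uniformly formatted ISO 8601 UTC strings sort chronologically as text.
    ordered = sorted(bids, key=lambda b: b["bid_timestamp"])
    increases = [
        later["bid_amount"] - earlier["bid_amount"]
        for earlier, later in zip(ordered, ordered[1:])
    ]
    return {
        "total_bids": len(ordered),
        "unique_bidders": len({b["bidder_id"] for b in ordered}),
        "final_price": ordered[-1]["bid_amount"] if ordered else None,
        "bid_increases": increases,
        "largest_increase": max(increases, default=0),
    }


# Illustrative rows, deliberately out of order to show the sort.
bids = [
    {"bid_timestamp": "2023-10-12T09:05:00Z", "bid_amount": 8000000, "bidder_id": "Bidder 1"},
    {"bid_timestamp": "2023-10-12T14:32:10Z", "bid_amount": 8100000, "bidder_id": "Bidder 2"},
    {"bid_timestamp": "2023-10-12T11:20:00Z", "bid_amount": 8050000, "bidder_id": "Bidder 3"},
]
summary = summarise_bids(bids)
```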

BRF Financial Metrics

Scrape housing association (Bostadsrättsförening) data including debt per square metre, monthly fees, and registration years.

Broker Performance Stats

Aggregate data by broker agency to analyse market share, average days on market, and price realisation rates.

Regional Market Trends

Extract aggregated statistics at the municipality and county levels to track macro real estate shifts across Sweden.

Historical Price Tracking

Monitor price reductions and delistings on active properties to identify motivated sellers and stale inventory.

Scheduled + Streaming Modes

Run one-off bulk exports or configure continuous pipelines at daily or real-time cadences with change-detection diffing.

// engagement pipeline

From target region to warehouse record

Brief in. Clean data out.

Define Scope
Day 0

Provide municipalities, property types, or search filters. We design the extraction schema together.

Pipeline Build
Days 2–4

We configure Scrapy / Playwright crawlers, Swedish proxy rotation, and CAPTCHA handling for booli.se.

Validation & QA
Days 4–6

Schema validation, null-rate checks, valuation outlier detection, and sample payloads before full launch.
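The page does not say which checks or outlier method are used, so the following is only a sketch of one plausible approach: a range-consistency check on valuation records, plus outlier flagging with a modified z-score based on the median absolute deviation (MAD). The function names, the `REQUIRED` tuple, and the 3.5 threshold (a common rule of thumb) are assumptions for illustration.

```python
from statistics import median

REQUIRED = ("property_id", "valuation_estimate", "valuation_low", "valuation_high")


def check_valuation(record: dict) -> list:
    """Return a list of human-readable problems; an empty list means the record passes."""
    problems = [f"missing {name}" for name in REQUIRED if record.get(name) is None]
    if problems:
        return problems
    if not record["valuation_low"] <= record["valuation_estimate"] <= record["valuation_high"]:
        problems.append("estimate outside low/high range")
    return problems


def flag_outliers(values, threshold=3.5):
    """Return values whose modified z-score (MAD-based) exceeds the threshold."""
    if len(values) < 3:
        return []
    centre = median(values)
    mad = median([abs(v - centre) for v in values])
    if mad == 0:
        return []  # no spread to measure against
    return [v for v in values if 0.6745 * abs(v - centre) / mad > threshold]


ok = check_valuation({
    "property_id": "9481726",
    "valuation_estimate": 4500000,
    "valuation_low": 4200000,
    "valuation_high": 4800000,
})
suspicious = flag_outliers([4200000, 4400000, 4500000, 4600000, 4800000, 45000000])
```

MAD is used here instead of a standard deviation because a single mis-parsed price (for example, an extra zero) would inflate the standard deviation enough to hide itself.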

Delivery
ongoing

JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.

Under the hood

How our Booli pipeline handles the hard parts

Swedish real estate portals invest heavily in scraping detection. Here is how we stay resilient — and why teams choose managed infrastructure over DIY.

pipeline-monitor · booli.se · live ● active
// fingerprinting
Identity rotation
TLS fingerprint: randomised
User-agent: rotated
IP pool: residential
Challenges blocked: 0
// pagination
Page coverage
48,291 pages queued, running
// observability
Pipeline health
Uptime: 99.9%
p99 latency: 142ms
Null rate: 0.3%
Alerts: 2
Anti-bot layer
Swedish residential proxy rotation

Booli restricts access from data centre IPs and non-Swedish regions. Our crawlers use residential ISP proxies located in Sweden with realistic browser fingerprints and full cookie session management.

JavaScript rendering
Playwright execution for map clusters

Booli's search results and map interfaces rely heavily on JavaScript rendering. We run full Playwright browser sessions to trigger map pan/zoom events and extract hidden listing IDs that plain HTTP clients miss.

Schema stability
Resilient selectors for dynamic DOMs

Property portals update their layouts frequently. Our selector strategy uses multiple fallback chains per field — CSS selectors, XPath, and Next.js state extraction — so a frontend update does not break your data pipeline.
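A fallback chain of this kind can be sketched as an ordered list of extractor functions, tried until one returns a value. The example below is a simplified stand-in: it uses only the standard library, substitutes a regular expression for the CSS and XPath steps, and reads the `__NEXT_DATA__` script tag that Next.js pages commonly embed. The JSON path `props.pageProps.listing.list_price` and the `data-list-price` attribute are hypothetical; the real page structure is not documented here.

```python
import json
import re

NEXT_DATA_RE = re.compile(
    r'<script id="__NEXT_DATA__" type="application/json">(.*?)</script>',
    re.DOTALL,
)


def from_next_state(html: str):
    """Try the embedded Next.js state first (hypothetical JSON path)."""
    match = NEXT_DATA_RE.search(html)
    if not match:
        return None
    try:
        state = json.loads(match.group(1))
        return int(state["props"]["pageProps"]["listing"]["list_price"])
    except (ValueError, KeyError, TypeError):
        return None


def from_markup(html: str):
    """Fall back to a markup attribute (hypothetical attribute name)."""
    match = re.search(r'data-list-price="(\d+)"', html)
    return int(match.group(1)) if match else None


def first_match(html: str, extractors):
    """Return the first non-None result from the extractor chain."""
    for extract in extractors:
        value = extract(html)
        if value is not None:
            return value
    return None


CHAIN = [from_next_state, from_markup]
```

Because each extractor fails quietly with `None`, a frontend change that breaks one strategy degrades to the next one instead of emitting a null, and the chain order records which source is trusted most.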

Change detection
Only re-scrape what has changed

For active listings, we maintain a hash index of last-seen values per field. Subsequent runs only push diffs — capturing price drops or status changes without redownloading static images and descriptions.
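A per-field hash index can be sketched in a few lines. This is a minimal in-memory illustration under stated assumptions: the `index` is a plain dict keyed by `booli_id` (a production system would presumably persist it, for example in Redis or Postgres), and `field_hashes` and `diff_record` are hypothetical helper names.

```python
import hashlib
import json


def field_hashes(record: dict) -> dict:
    """Hash each field's JSON-serialised value so comparisons are cheap and uniform."""
    return {
        name: hashlib.sha256(
            json.dumps(value, sort_keys=True, ensure_ascii=False).encode("utf-8")
        ).hexdigest()
        for name, value in record.items()
    }


def diff_record(record: dict, index: dict, key: str = "booli_id") -> dict:
    """Return only the fields whose hash differs from the last-seen one, then update the index."""
    record_id = record[key]
    new_hashes = field_hashes(record)
    old_hashes = index.get(record_id, {})
    changed = {
        name: record[name]
        for name, digest in new_hashes.items()
        if old_hashes.get(name) != digest
    }
    index[record_id] = new_hashes
    return changed


index = {}
first = diff_record({"booli_id": "4829104", "list_price": 7500000, "rooms": 3}, index)
second = diff_record({"booli_id": "4829104", "list_price": 7300000, "rooms": 3}, index)
third = diff_record({"booli_id": "4829104", "list_price": 7300000, "rooms": 3}, index)
```

On the first run every field counts as changed, on the second only `list_price` is emitted, and on the third nothing is. Note that this sketch does not report fields that disappear from a record; handling removals would need an extra comparison of key sets.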

Monitoring & alerting
24/7 pipeline health with anomaly detection

Every run emits structured logs to our observability stack. We alert on null-rate spikes in critical fields like sold_price or valuation_estimate, and respond before you notice.
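A null-rate alert of this kind can be approximated by comparing each critical field's share of missing values against a baseline plus a tolerance. The sketch below is illustrative: the function names, the baseline figures, and the 5-point tolerance are assumptions, not the thresholds used in production.

```python
def null_rate(records, field):
    """Fraction of records where the field is absent or None."""
    if not records:
        return 0.0
    missing = sum(1 for record in records if record.get(field) is None)
    return missing / len(records)


def null_rate_alerts(records, baseline, tolerance=0.05):
    """Return (field, rate) pairs whose null rate exceeds baseline + tolerance."""
    alerts = []
    for field, expected in baseline.items():
        rate = null_rate(records, field)
        if rate > expected + tolerance:
            alerts.append((field, rate))
    return alerts


# Hypothetical baselines for two critical fields.
BASELINE = {"sold_price": 0.003, "valuation_estimate": 0.003}

run = [
    {"sold_price": 8200000, "valuation_estimate": 4500000},
    {"sold_price": None, "valuation_estimate": 4600000},
    {"valuation_estimate": 4400000},
    {"sold_price": 7900000, "valuation_estimate": 4300000},
]
alerts = null_rate_alerts(run, BASELINE)
```

Comparing against a per-field baseline rather than a fixed global limit matters because some fields are legitimately sparse, so only a departure from the usual rate signals a broken selector.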

Applications

Who uses Booli data — and how

Teams across industries use booli.se data to build competitive products and smarter operations.

01
Automated Valuation Models (AVM)

PropTech companies train machine learning models on historical transaction data and property characteristics to predict future values.

02
Property Investment Analysis

Real estate funds track yield compression, days on market, and regional price trends to identify undervalued municipalities.

03
Broker Market Share

Agencies monitor competitor performance, tracking list-to-sale price ratios and transaction volumes by broker name.

04
Credit Risk & Mortgage Underwriting

Banks and alternative lenders use BRF financial health indicators and regional price trends to assess mortgage portfolio risk.

05
Urban Development Planning

Municipalities and developers analyse bidding intensity and demographic shifts to plan new construction projects.

06
PropTech App Development

Startups build consumer-facing applications that require real-time alerts for upcoming listings and recent regional sales.

Why DataFlirt

"Booli holds the most comprehensive registry of Swedish property transactions and algorithmic valuations — but extracting it at scale requires dedicated infrastructure."

Most teams underestimate the investment required: reliable Booli scraping requires Swedish residential proxies, full JavaScript rendering for map clusters, CAPTCHA handling, and daily selector maintenance. DataFlirt absorbs that complexity so your engineers can focus on the analysis — not the infrastructure.

Technical Spec

Booli scraper — technical capabilities

Everything supported by our booli.se scraper — rendered SPA elements, auth walls, rate-limit evasion and beyond.

Capability | Description | Status
JavaScript rendering | Full Playwright sessions, required for map clusters and dynamic graphs | Supported
CAPTCHA bypass | Automated CapSolver integration for Cloudflare and custom challenges | Supported
Swedish residential proxies | ISP-grade residential IPs from SE pools to prevent geolocation blocks | Supported
Sold price history (Slutpriser) | Extraction of final sale prices, dates, and price differences | Supported
Bidding logs | Timestamped bid increments and unique bidder counts per property | Supported
BRF extraction | Housing association debt metrics and energy declarations | Supported
Change detection (diffs) | Hash-based diff: only emit records with changed fields since last run | Supported
Webhook delivery | HTTP POST per record or batch; useful for real-time listing alerts | Supported
Authenticated saved searches | Extraction of user-specific saved searches and push notifications | Partial
SBAB mortgage offers | Private, user-specific mortgage calculations and loan promises | Partial
Infrastructure

Infrastructure powering the Booli pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus
Scrapy + Playwright Stack

Scrapy handles crawl orchestration, deduplication, and retry logic. Playwright handles JavaScript rendering, cookie sessions, and interaction flows. The two are combined via the scrapy-playwright download handler.
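For readers unfamiliar with the pairing, the fragment below shows the settings the scrapy-playwright project documents for routing requests through a browser. It is a generic configuration sketch, not the production settings: the retry count and the `rendered_request_meta` helper are assumptions made for illustration.

```python
# settings.py fragment for a Scrapy project using scrapy-playwright.

# Route HTTP and HTTPS downloads through the Playwright-backed handler.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright requires the asyncio-based Twisted reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

PLAYWRIGHT_BROWSER_TYPE = "chromium"
PLAYWRIGHT_LAUNCH_OPTIONS = {"headless": True}

# Standard Scrapy retry settings; the value 3 is an illustrative assumption.
RETRY_ENABLED = True
RETRY_TIMES = 3


def rendered_request_meta(extra=None):
    """Build Request.meta so that a single request is rendered in the browser.

    Only requests carrying meta["playwright"] = True go through Playwright;
    everything else uses the normal, cheaper HTTP path.
    """
    meta = {"playwright": True}
    if extra:
        meta.update(extra)
    return meta
```

In a spider, the helper would be used as `scrapy.Request(url, meta=rendered_request_meta())`, which keeps browser rendering opt-in per request so that static pages do not pay the rendering cost.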

Swedish Proxy Infrastructure

We maintain pools of residential ISP proxies specifically located in Sweden. Rotation happens per-request with sticky sessions where required to bypass regional blocking.
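The combination of per-request rotation and sticky sessions can be sketched as a round-robin pool with an optional session pin. This is a simplified illustration: the `ProxyPool` class is hypothetical, the proxy URLs are placeholders, and a real pool would also handle health checks and bans.

```python
import itertools


class ProxyPool:
    """Round-robin proxy rotation with optional sticky sessions."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._cycle = itertools.cycle(proxies)
        self._sticky = {}

    def next_proxy(self, session_id=None):
        """Rotate per request, unless a session_id pins the caller to one proxy."""
        if session_id is None:
            return next(self._cycle)
        if session_id not in self._sticky:
            self._sticky[session_id] = next(self._cycle)
        return self._sticky[session_id]

    def release(self, session_id):
        """Drop a sticky assignment, for example once a cookie session ends."""
        self._sticky.pop(session_id, None)


# Placeholder endpoints, not real proxy addresses.
pool = ProxyPool([
    "http://se-proxy-1.example:8000",
    "http://se-proxy-2.example:8000",
    "http://se-proxy-3.example:8000",
])
```

A sticky session is useful when a sequence of requests shares cookies, since switching IP mid-session is itself a signal that detection systems can pick up; stateless requests simply omit the `session_id`.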

Cloud-Native Orchestration

Pipelines run on AWS Lambda (burst) and ECS (sustained). Airflow handles scheduling, dependency management, and SLA alerting. All state is stored in managed Postgres.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON
Newline-delimited or nested — schema versioned per run
CSV
Flat file with typed columns — Excel/Sheets compatible
XLS
Standard Excel format for business analysts
Parquet
Columnar format for BigQuery, Snowflake, Athena
AWS S3
Direct bucket delivery — compatible with any data lake
Webhook
HTTP POST per record for real-time downstream processing
API
REST endpoint to query your extracted datasets
PostgreSQL
Upsert into your existing schema with conflict resolution
Snowflake
Stage + COPY INTO workflow — incremental or full-replace
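On the receiving side, webhook delivery amounts to accepting JSON over HTTP POST. The sketch below shows how a batch could be packaged using only the standard library; the payload envelope (`count`, `records`), the batch size, and the endpoint URL are hypothetical, and the actual delivery format should be confirmed during scoping.

```python
import json
import urllib.request


def batches(records, size):
    """Split a list of records into consecutive chunks of at most `size`."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def build_webhook_request(url, records):
    """Prepare (but do not send) a JSON POST carrying one batch of records."""
    body = json.dumps(
        {"count": len(records), "records": records},
        ensure_ascii=False,  # keep characters such as 'ä' readable in the payload
    ).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


records = [
    {"booli_id": "4829104", "list_price": 7500000},
    {"booli_id": "3910284", "list_price": 7800000},
    {"booli_id": "9481726", "list_price": 4500000},
]
requests_to_send = [
    build_webhook_request("https://hooks.example.com/booli", chunk)  # placeholder URL
    for chunk in batches(records, 2)
]
# Sending would be: urllib.request.urlopen(request, timeout=10)
```

Per-record delivery is the special case of a batch size of 1; larger batches trade a little latency for far fewer HTTP round trips.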
// faq

Common questions.

About booli.se scraping, legality, and pipeline operations.

Ask us directly →
Is scraping Booli legal?

Scraping publicly available information from Booli is generally permissible for analytical purposes. DataFlirt targets only public, non-authenticated property, pricing, and valuation data. We do not extract personal data or circumvent authentication walls. Clients should review Booli's ToS and consult legal counsel for specific commercial use cases.

How do you handle Booli's anti-bot systems?

We use Swedish residential ISP proxies, full Playwright browser sessions with realistic fingerprints, and request timing modelled on human behaviour. We monitor for CAPTCHA rate spikes in real time and trigger solver queues automatically.

How far back does the sold price history go?

Booli maintains extensive historical transaction data. We can extract slutpriser dating back multiple years depending on the municipality, allowing you to build comprehensive time-series datasets.

How fresh is the active listing data?

Streaming pipelines deliver new listings and price updates with under 60 minutes of latency for a defined regional set. Full national catalogue refreshes run daily and complete within a 4–8 hour window.

Do you extract BRF (housing association) financials?

Yes. We extract available BRF metrics linked to apartment listings, including total apartments, registration year, debt per square metre, and monthly fee structures.

What is the minimum viable engagement?

Our smallest packages start at a defined regional scope (e.g., Stockholm county) with weekly delivery. For national coverage or custom schema requirements, we price based on volume and delivery frequency.

Can I request a sample dataset before committing?

Absolutely. We provide a sample run of up to 500 properties as part of the pre-engagement scoping process — so you can validate schema fit, field completeness, and data quality before signing any contract.

$ dataflirt scope --new-project --source=booli.se ready

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a one-off historical transaction dump or a continuous listing feed across Sweden — we scope, build, and operate the pipeline. Tell us what you need.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h