
Sahibinden data,
at warehouse scale.

We extract property listings, vehicle specifications, pricing signals, and seller profiles from Sahibinden. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.

Listings extracted: 314K/day
Price updates: 1.2M/24h
Seller profiles: 42K/run
Active pipelines: 114
Uptime: 99.94%

Data Dictionary

Every field we extract from sahibinden.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Real Estate (Emlak) objects from sahibinden.com. All fields typed and schema-versioned.

Fields: listing_id, title, category, price, currency, city, district, neighborhood, gross_sqm, net_sqm, room_count, building_age, floor_location, total_floors, heating_type, bathroom_count, balcony, furnished, status, seller_type

real_estate (emlak) · 200 OK
{
  "listing_id": "1093847192",
  "title": "Kadikoy Moda'da Deniz Manzarali 3+1",
  "price": 12500000.0,
  "currency": "TRY",
  "city": "Istanbul",
  "district": "Kadikoy",
  "room_count": "3+1",
  "gross_sqm": 145,
  "building_age": "5-10"
}
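The sample record keeps room_count ("3+1") and building_age ("5-10") as strings. A consumer who wants numeric columns might split them along these lines. This is a minimal sketch, not part of the delivered schema: the function names are hypothetical, and it assumes "N+M" means rooms plus living rooms and "A-B" is an age range in years.

```python
from typing import Optional, Tuple


def parse_room_count(value: str) -> Tuple[int, int]:
    """Split a room count such as "3+1" into (rooms, living_rooms)."""
    rooms, _, living = value.partition("+")
    return int(rooms), (int(living) if living else 0)


def parse_building_age(value: str) -> Tuple[int, Optional[int]]:
    """Turn an age bucket such as "5-10" into (min_years, max_years).

    A single value such as "4" yields (4, None).
    """
    low, sep, high = value.partition("-")
    return int(low), (int(high) if sep else None)


rooms, living_rooms = parse_room_count("3+1")
age_range = parse_building_age("5-10")
```

Keeping the raw strings alongside the parsed values is usually safer, since buckets that do not fit this pattern would raise a ValueError here.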

Complete list of extractable fields for Vehicles (Vasita) objects from sahibinden.com. All fields typed and schema-versioned.

Fields: listing_id, title, brand, series, model, year, fuel_type, gear_type, km, body_type, engine_power, engine_capacity, traction, color, warranty, plate_nationality, tramer_record, price

vehicles (vasita) · 200 OK
{
  "listing_id": "1098234567",
  "brand": "Volkswagen",
  "series": "Golf",
  "model": "1.5 TSI Impression",
  "year": 2021,
  "km": 45000,
  "fuel_type": "Benzin",
  "price": 1350000.0,
  "color": "Beyaz"
}

Complete list of extractable fields for Pricing & History objects from sahibinden.com. All fields typed and schema-versioned.

Fields: listing_id, current_price, original_price, price_drop_pct, currency, listing_date, last_update_date, days_on_market, view_count, favorite_count, status

pricing & history · 200 OK
{
  "listing_id": "1093847192",
  "current_price": 12500000.0,
  "original_price": 13000000.0,
  "price_drop_pct": 3.8,
  "listing_date": "2023-09-15",
  "days_on_market": 24,
  "status": "Active"
}
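The price_drop_pct value in the sample is consistent with a simple percentage drop from original_price to current_price, rounded to one decimal. The sketch below reproduces that arithmetic; the formula and the function name are assumptions inferred from the sample values, not a documented definition.

```python
def price_drop_pct(original_price: float, current_price: float) -> float:
    """Percentage drop from the original asking price, rounded to one decimal."""
    if original_price <= 0:
        raise ValueError("original_price must be positive")
    return round((original_price - current_price) / original_price * 100, 1)


# Values taken from the sample record above.
sample_drop = price_drop_pct(13_000_000.0, 12_500_000.0)
print(sample_drop)  # 3.8
```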

Complete list of extractable fields for Seller & Agency Data objects from sahibinden.com. All fields typed and schema-versioned.

Fields: seller_id, seller_name, account_type, agency_name, account_creation_date, total_listings, active_listings, phone_number_visible, location_city, location_district, response_rate, verified_account

seller & agency data · 200 OK
{
  "seller_id": "8472910",
  "seller_name": "Ahmet Yilmaz",
  "account_type": "Corporate",
  "agency_name": "Yilmaz Emlak",
  "active_listings": 42,
  "verified_account": true,
  "location_city": "Istanbul"
}

Complete list of extractable fields for Search Results objects from sahibinden.com. All fields typed and schema-versioned.

Fields: keyword, category_path, page_number, position, listing_id, title, price, location, listing_date, thumbnail_url, promoted_badge, urgent_badge

search_results · 200 OK
{
  "keyword": "satilik daire",
  "category_path": "Emlak > Konut > Satilik",
  "position": 4,
  "listing_id": "1093847192",
  "price": 12500000.0,
  "promoted_badge": true,
  "urgent_badge": false
}

Capabilities

Everything you need from Sahibinden

Our Sahibinden scraper handles every layer of the platform: storefront listings, dynamic pricing, agency intelligence, and vehicle specifications, with anti-bot circumvention built in.

Full Property Data Extraction

Title, attributes, location, square meterage, and heating types scraped at the listing level.

Vehicle Specifications

Extract detailed automotive data including brand, model, mileage, transmission, and damage history records.

Real-Time Price Tracking

Capture current price, historical price drops, and listing status changes, each timestamped per crawl.

Seller & Agency Intelligence

Monitor agency portfolios, active listing counts, and account verification status.

Category Rank Intelligence

Track promoted listings and organic search positions across primary and sub-categories.

Geolocation Extraction

Extract precise city, district, and neighborhood data for spatial analysis and mapping.

Anti-Bot Circumvention

Bypass strict Cloudflare protections and custom rate limits using residential ISP proxies and session management.

Historical Data Archiving

Maintain a time-series database of closed and sold listings to calculate true market clearing prices.

Scheduled Pipelines

Configure continuous pipelines at daily or hourly cadences with change-detection diffing.

// engagement pipeline

From target category to warehouse record

Brief in. Clean data out.

Define Scope (day 0)

Provide category URLs, geographic filters, or agency IDs. We design the extraction schema together.

Pipeline Build (days 2–4)

We configure Scrapy crawlers, proxy rotation, session management, and CAPTCHA handling for sahibinden.com.

Validation & QA (days 4–6)

Schema validation, null-rate checks, and price-outlier detection before full launch.
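Two of these checks can be sketched in a few lines. The example below is illustrative only: the function names are hypothetical, and the outlier rule (Tukey's fences at 1.5 times the interquartile range) is an assumed choice, since the page does not say which method is used.

```python
from statistics import quantiles
from typing import Any, Dict, List


def null_rate(records: List[Dict[str, Any]], field: str) -> float:
    """Share of records where `field` is missing or None."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(field) is None)
    return missing / len(records)


def price_outliers(prices: List[float], k: float = 1.5) -> List[float]:
    """Prices outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    q1, _, q3 = quantiles(prices, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [p for p in prices if p < low or p > high]


records = [{"price": 100.0}, {"price": None}, {}, {"price": 120.0}]
rate = null_rate(records, "price")  # 2 of 4 records lack a price

flagged = price_outliers([100, 110, 120, 130, 140, 150, 10000])
```

A launch gate could then compare the null rate against an agreed threshold per field and route flagged prices to manual review rather than dropping them.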

Delivery (ongoing)

JSON, CSV, or Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.

Under the hood

How our Sahibinden pipeline handles the hard parts

Sahibinden employs aggressive anti-scraping measures and strict rate limits. Here is how we maintain pipeline stability.

pipeline-monitor · sahibinden.com · live · active
Identity rotation (fingerprinting): TLS fingerprint randomised · User-agent rotated · IP pool residential · Challenges blocked 0
Page coverage (pagination): 48,291 pages queued · running
Pipeline health (observability): 99.9% uptime · 142ms p99 latency · 0.3% null rate · 2 alerts

Anti-bot layer
Turkish residential proxy rotation

Sahibinden blocks non-Turkish IPs and data center proxies immediately. Our crawlers use localized Turkish residential ISP proxies with realistic browser fingerprints.

Rate limiting
Algorithmic request pacing

The platform enforces strict request quotas per session. We distribute load across thousands of unique sessions with randomized timing delays.
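Randomized timing can be as simple as drawing each wait from a range around a base delay. The sketch below assumes a uniform spread; the function name, the 4-second base, and the 50% jitter are illustrative values, not the pipeline's actual settings.

```python
import random
from typing import Iterator


def jittered_delays(base_seconds: float, jitter: float, rng: random.Random) -> Iterator[float]:
    """Yield an endless stream of delays drawn uniformly from
    base*(1 - jitter) to base*(1 + jitter)."""
    low, high = base_seconds * (1 - jitter), base_seconds * (1 + jitter)
    while True:
        yield rng.uniform(low, high)


# One generator per session keeps the timing of sessions independent.
delays = jittered_delays(base_seconds=4.0, jitter=0.5, rng=random.Random(7))
first_three = [next(delays) for _ in range(3)]  # each between 2.0 and 6.0 seconds
# A crawler would call time.sleep(delay) before each request in that session.
```

Scrapy ships a comparable built-in: with RANDOMIZE_DOWNLOAD_DELAY enabled, it waits between 0.5 and 1.5 times DOWNLOAD_DELAY between requests to the same site.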

JavaScript rendering
Playwright execution for dynamic content

Phone number reveals and dynamic map coordinates require JavaScript execution. We run full browser sessions to interact with these elements.

Schema stability
Resilient selectors for changing layouts

Category attributes vary wildly between real estate and vehicles. Our schema normalisation engine adapts to varying DOM structures automatically.

Change detection
Only re-scrape what has changed

For large city-wide sweeps, we maintain a hash index of last-seen values. Subsequent runs only push diffs to reduce compute cost and storage bloat.
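The hash-index idea can be sketched as follows. This is a minimal illustration under assumptions: records are keyed by listing_id, the whole record is hashed, and the index is an in-memory dict (a production index would live in a store such as Redis or Postgres). The function names are hypothetical.

```python
import hashlib
import json
from typing import Any, Dict, List


def record_hash(record: Dict[str, Any]) -> str:
    """Stable SHA-256 of a record; sort_keys makes key order irrelevant."""
    payload = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def diff_run(index: Dict[str, str], records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return new or changed records and update the last-seen index in place."""
    changed = []
    for record in records:
        key, digest = record["listing_id"], record_hash(record)
        if index.get(key) != digest:
            index[key] = digest
            changed.append(record)
    return changed


index: Dict[str, str] = {}
run_1 = diff_run(index, [{"listing_id": "1093847192", "price": 13000000.0}])
run_2 = diff_run(index, [{"price": 13000000.0, "listing_id": "1093847192"}])  # unchanged
run_3 = diff_run(index, [{"listing_id": "1093847192", "price": 12500000.0}])  # price moved
```

Only run_1 and run_3 emit the record; run_2 emits nothing because the hash matches the last-seen value.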

Applications

Who uses Sahibinden data and how

Teams across industries use sahibinden.com data to build competitive products and smarter operations.

01 Real Estate Valuation

PropTech firms ingest asking prices and time-on-market metrics to build automated valuation models (AVMs).

02 Automotive Pricing

Dealerships and insurance companies track vehicle depreciation curves and market averages by make and model.

03 Market Research

Analysts track housing supply volume across districts to identify macro-economic trends and investment opportunities.

04 Agency Monitoring

Real estate franchises audit competitor portfolios, listing quality, and market share per neighborhood.

05 Lead Generation

Service providers target new homeowners or vehicle buyers based on recently closed listings.

06 Urban Planning

Researchers correlate housing density and price fluctuations with infrastructure developments.

Why DataFlirt

"Sahibinden holds the definitive record of the Turkish property and automotive markets. Accessing it requires navigating some of the strictest bot protections in the region."

Most teams underestimate the investment required to extract data from Sahibinden. Reliable scraping requires localized Turkish residential proxies, full JavaScript rendering, CAPTCHA handling, and daily selector maintenance. DataFlirt absorbs that complexity so your engineers can focus on the analysis.

Technical Spec

Sahibinden scraper technical capabilities

Everything supported by our sahibinden.com scraper — rendered SPA elements, auth walls, rate-limit evasion and beyond.

Capability · Description · Status
JavaScript rendering · Full Playwright sessions required for phone number reveals and map data · Supported
Turkish residential proxies · ISP-grade residential IPs from TR pools rotated per request · Supported
Category attribute parsing · Dynamic extraction of category-specific specs (e.g. Tramer records, heating types) · Supported
Historical price tracking · Price drop detection and historical time-series available from run start · Supported
Agency portfolio extraction · Complete catalog extraction for specific corporate seller IDs · Supported
Change detection (diffs) · Hash-based diff to only emit records with changed fields since last run · Supported
Image URL extraction · Capture high-resolution gallery image URLs · Supported
User message extraction · Access to private buyer-seller messaging inbox · Partial
Exact property address · Full street and door number details (usually hidden by sellers) · Partial

Infrastructure

Infrastructure powering the Sahibinden pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus

Scrapy + Playwright Stack

Scrapy handles crawl orchestration and retry logic. Playwright handles JavaScript rendering and interaction flows.

Localized Proxy Infrastructure

We maintain pools of Turkish residential ISP proxies to avoid geographic blocks. IPs rotate per request by default, with sticky sessions where a flow has to keep the same IP.

Cloud-Native Orchestration

Pipelines run on AWS Lambda and ECS. Airflow handles scheduling and SLA alerting. All state is stored in managed Postgres.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON: Newline-delimited or nested, schema-versioned per run
CSV: Flat file with typed columns for Excel/Sheets
Parquet: Columnar format for BigQuery, Snowflake, Athena
S3: Direct bucket delivery compatible with any data lake
Webhook: HTTP POST per record for real-time downstream processing
API: REST endpoints for on-demand record retrieval
XLS: Formatted Excel spreadsheets for business users
Postgres: Upsert into your existing schema with conflict resolution
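For readers unfamiliar with the newline-delimited JSON option, the sketch below shows the shape of such a file: one complete JSON object per line, which lets warehouses and stream processors read it record by record. The helper names are hypothetical and the records reuse values from the samples above.

```python
import json
from typing import Any, Dict, Iterable, List


def to_ndjson(records: Iterable[Dict[str, Any]]) -> str:
    """Serialise records as newline-delimited JSON: one object per line."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


def from_ndjson(text: str) -> List[Dict[str, Any]]:
    """Parse newline-delimited JSON back into a list of records."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


batch = [
    {"listing_id": "1093847192", "price": 12500000.0, "currency": "TRY"},
    {"listing_id": "1098234567", "price": 1350000.0, "currency": "TRY"},
]
payload = to_ndjson(batch)
round_trip = from_ndjson(payload)
```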
// faq

Common questions.

About sahibinden.com scraping, legality, and pipeline operations.

Is scraping Sahibinden legal?

Scraping publicly available information is generally permissible under applicable law. DataFlirt targets only public, non-authenticated listing data. We do not extract personal data beyond what is publicly listed by sellers. Clients should review Sahibinden's Terms of Service and consult legal counsel for specific use cases.

How do you bypass Sahibinden's Cloudflare protection?

We use Turkish residential ISP proxies, full Playwright browser sessions with realistic fingerprints, and request timing modelled on human behaviour. We monitor for 403 blocks in real time and trigger pool rotation automatically.
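The monitoring half of that answer can be illustrated with a sliding window over recent status codes. This is a sketch under assumptions: the class name, the window size, and the 20% threshold are hypothetical, and the actual rotation step is left to the caller.

```python
from collections import deque
from typing import Deque


class BlockMonitor:
    """Track recent HTTP statuses and signal when the 403 share gets too high."""

    def __init__(self, window: int = 50, threshold: float = 0.2) -> None:
        self.window = window
        self.threshold = threshold
        self.statuses: Deque[int] = deque(maxlen=window)

    def record(self, status: int) -> bool:
        """Store a status code; return True when the proxy pool should rotate."""
        self.statuses.append(status)
        if len(self.statuses) < self.window:
            return False  # not enough samples yet to judge the block rate
        blocked = sum(1 for s in self.statuses if s == 403)
        return blocked / len(self.statuses) >= self.threshold


monitor = BlockMonitor(window=5, threshold=0.4)
signals = [monitor.record(s) for s in [200, 200, 403, 200, 403, 200]]
```

Waiting for a full window before signalling avoids rotating the whole pool on a single early 403.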

Can you extract hidden phone numbers?

Yes. Phone numbers on Sahibinden require a user click to reveal. Our Playwright integration executes the necessary JavaScript interaction to render and extract the contact information.

How fresh is the real estate data?

Full category refreshes at daily cadence complete within a 6-12 hour window. Hourly pipelines can be configured for specific high-velocity districts or vehicle models.

Do you extract vehicle damage (Tramer) records?

Yes. If the seller has included Tramer information in the structured attributes or description, our parsers extract and normalise this data.

What is the minimum viable engagement?

Our packages start at defined category or regional sweeps (typically 10,000-50,000 listings) with weekly delivery. Contact us with your specific data requirements for a scoped quote.

$ dataflirt scope --new-project --source=sahibinden.com ready

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a one-off real estate dataset or a continuous vehicle pricing feed, we scope, build, and operate the pipeline. Tell us what you need.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h