
Angi data,
at warehouse scale.

We extract contractor directories, verified reviews, license credentials, and local project cost guides from Angi. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.

Contractors extracted
1.4M /month
Review records
8.2M /run
Cost guides
450K /week
Active pipelines
112
Uptime
99.94%
Data Dictionary

Every field we extract from angi.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Contractor Profiles objects from angi.com. All fields typed and schema-versioned.

contractor_id, business_name, category, phone_number, website_url, street_address, city, state, zip_code, overall_rating, review_count, years_in_business, angi_certified, background_checked
contractor_profiles
● 200 OK
{
  "contractor_id": "A-12345",
  "business_name": "Apex Roofing",
  "category": "Roofing",
  "phone_number": "+1-555-0198",
  "city": "Austin",
  "state": "TX",
  "overall_rating": 4.8,
  "angi_certified": true
}

Complete list of extractable fields for Reviews & Ratings objects from angi.com. All fields typed and schema-versioned.

review_id, contractor_id, reviewer_name, review_date, rating, project_type, project_cost, review_text, contractor_response, response_date, verified_purchase
reviews_ratings
● 200 OK
{
  "review_id": "R-98765",
  "contractor_id": "A-12345",
  "rating": 5.0,
  "project_type": "Roof Replacement",
  "review_text": "Excellent work and cleanup.",
  "verified_purchase": true,
  "review_date": "2026-03-14"
}

Complete list of extractable fields for Project Cost Guides objects from angi.com. All fields typed and schema-versioned.

guide_id, project_category, zip_code, city, state, average_cost, low_end_cost, high_end_cost, material_cost_estimate, labor_cost_estimate, last_updated
project_cost_guides
● 200 OK
{
  "project_category": "Asphalt Shingle Roof",
  "zip_code": "78701",
  "average_cost": 8500.0,
  "low_end_cost": 5200.0,
  "high_end_cost": 12400.0,
  "last_updated": "2026-01-10T00:00:00Z"
}

Complete list of extractable fields for Credentials & Licenses objects from angi.com. All fields typed and schema-versioned.

contractor_id, credential_type, license_number, issuing_authority, issue_date, expiration_date, status, verification_date, insurance_verified, bond_verified
credentials_licenses
● 200 OK
{
  "contractor_id": "A-12345",
  "credential_type": "State Roofing License",
  "license_number": "TX-R-88291",
  "status": "Active",
  "insurance_verified": true,
  "verification_date": "2026-04-01"
}

Complete list of extractable fields for Search Results objects from angi.com. All fields typed and schema-versioned.

search_term, zip_code, position, contractor_id, business_name, rating, review_count, sponsored_placement, angi_certified_badge, eco_friendly_badge, scraped_at
search_results
● 200 OK
{
  "search_term": "plumber",
  "zip_code": "78701",
  "position": 1,
  "contractor_id": "P-99210",
  "sponsored_placement": true,
  "angi_certified_badge": true,
  "scraped_at": "2026-05-12T10:00:00Z"
}

Capabilities

Everything you need from Angi - nothing you don't

Our Angi scraper handles location-based directory traversal, review pagination, and dynamic contact detail rendering with session management and anti-bot circumvention built in.

Full Directory Extraction

Capture business names, categories, contact details, and ratings across all home service verticals.

Review Pagination

Extract complete review histories, including project types, costs, text bodies, and contractor responses.

Project Cost Aggregation

Scrape local pricing guides for specific project types across 41,000 US zip codes.

License & Credential Verification

Capture license numbers, issuing bodies, and background check statuses for compliance teams.

Service Area Mapping

Extract exact zip codes and municipal boundaries where contractors operate.

Search Ranking Intelligence

Track organic versus sponsored positions for specific trades in target zip codes.

Dynamic Contact Detail Rendering

Execute JavaScript to reveal hidden phone numbers and email addresses protected by DOM manipulation.

Angi Certified Tracking

Monitor badge status changes and certification criteria compliance over time.

Scheduled Pipeline Modes

Run daily, weekly, or monthly diffs to track new market entrants and review velocity.
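
As an illustration of the Search Ranking Intelligence capability above, here is a minimal sketch of how delivered search_results records might be re-ranked with sponsored placements excluded. It assumes the `position`, `contractor_id`, and `sponsored_placement` fields from the data dictionary; the function name `organic_ranks` and the sample values are hypothetical.

```python
from typing import Iterable


def organic_ranks(records: Iterable[dict]) -> dict[str, int]:
    """Map contractor_id to its rank among non-sponsored results only."""
    organic = [r for r in records if not r.get("sponsored_placement", False)]
    organic.sort(key=lambda r: r["position"])
    return {r["contractor_id"]: rank for rank, r in enumerate(organic, start=1)}


# Illustrative records for one (search_term, zip_code) results page; values are made up.
sample = [
    {"position": 1, "contractor_id": "P-99210", "sponsored_placement": True},
    {"position": 2, "contractor_id": "P-10001", "sponsored_placement": False},
    {"position": 3, "contractor_id": "P-10002", "sponsored_placement": True},
    {"position": 4, "contractor_id": "P-10003", "sponsored_placement": False},
]
ranks = organic_ranks(sample)
```

Comparing `position` with the organic rank per contractor shows how much of a listing's visibility comes from paid placement.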

// engagement pipeline

From zip code list to warehouse record

Brief in. Clean data out.

Define Scope
Day 0

Provide zip codes, trade categories, or specific contractor URLs. We design the extraction schema together.

Pipeline Build
Days 2–4

We configure Scrapy / Playwright crawlers, proxy rotation, session management, and CAPTCHA handling for angi.com.

Validation & QA
Days 4–6

Schema validation, null-rate checks, location accuracy validation, and sample reviews before full launch.

Delivery
ongoing

JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.
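
The null-rate check in the Validation & QA step can be pictured roughly as follows. This is a sketch under stated assumptions, not our production validator: the helper names, the sample records, and the 10% threshold are all hypothetical.

```python
def null_rates(records: list[dict], fields: list[str]) -> dict[str, float]:
    """Fraction of records where each field is missing, None, or an empty string."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    rates = {}
    for field in fields:
        missing = sum(1 for r in records if r.get(field) in (None, ""))
        rates[field] = missing / total
    return rates


def failing_fields(rates: dict[str, float], threshold: float) -> list[str]:
    """Fields whose null rate exceeds the agreed threshold."""
    return sorted(field for field, rate in rates.items() if rate > threshold)


# Made-up sample batch; one of four records lacks a phone number.
records = [
    {"contractor_id": "A-1", "business_name": "Apex Roofing", "phone_number": "+1-555-0198"},
    {"contractor_id": "A-2", "business_name": "Blue Plumbing", "phone_number": None},
    {"contractor_id": "A-3", "business_name": "Cedar Electric", "phone_number": "+1-555-0101"},
    {"contractor_id": "A-4", "business_name": "Delta HVAC", "phone_number": "+1-555-0102"},
]
rates = null_rates(records, ["business_name", "phone_number"])
failing = failing_fields(rates, threshold=0.10)  # hypothetical 10% threshold
```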

Under the hood

How our Angi pipeline handles the hard parts

Directory scraping requires systematic location spoofing and bot mitigation. Here is how we stay resilient.

pipeline-monitor · angi.com · live ● active
// fingerprinting
Identity rotation
TLS fingerprint: randomised
User-agent: rotated
IP pool: residential
Challenges blocked: 0
// pagination
Page coverage
48,291 pages queued, running
// observability
Pipeline health
Uptime: 99.9%
p99 latency: 142ms
Null rate: 0.3%
Alerts: 2
Anti-bot layer
Residential proxy rotation + fingerprint spoofing

Angi applies heuristic bot protection. Our crawlers use US residential ISP proxies with realistic browser fingerprints and full cookie session management to get past DataDome and Cloudflare challenges.

Location spoofing
Zip code level session injection

Search results on Angi are strictly geo-fenced. We inject accurate zip code data into request headers and cookies to ensure the returned contractor list matches the targeted local market exactly.
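
A rough sketch of the cookie side of this, assuming a location cookie keyed by zip code. The cookie name `zip_code` is a placeholder, not Angi's actual cookie, and `location_cookies` is a hypothetical helper. The returned dictionaries follow the shape accepted by Playwright's `BrowserContext.add_cookies`.

```python
import re


def location_cookies(zip_code: str, cookie_name: str = "zip_code") -> list[dict]:
    """Build a cookie list that pins a browser session to one US zip code.

    cookie_name is a placeholder; the real name has to be read from the site.
    """
    if not re.fullmatch(r"\d{5}", zip_code):
        raise ValueError(f"expected a 5-digit US zip code, got {zip_code!r}")
    return [
        {
            "name": cookie_name,
            "value": zip_code,
            "domain": ".angi.com",
            "path": "/",
        }
    ]


cookies = location_cookies("78701")
# With Playwright, this list could be applied before navigation:
#     await context.add_cookies(cookies)
```

Validating the zip code before a session starts keeps a malformed input from silently returning results for a default location.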

JavaScript rendering
Full Playwright execution for dynamic content

Phone numbers and deep profile data are often obfuscated until the user interacts with the page. We run full Playwright browser sessions to trigger these events and capture data that plain HTTP clients miss entirely.

Schema stability
Resilient selectors with fallback chains

Directory layouts change frequently. Our selector strategy uses multiple fallback chains per field so a layout change does not break your data pipeline overnight.
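
The fallback idea can be sketched as an ordered chain of extractors tried until one returns a value. To stay self-contained this example uses regular expressions as a stand-in for CSS or XPath selectors, and the markup in both layouts is invented for illustration.

```python
import re
from typing import Callable, Optional

Extractor = Callable[[str], Optional[str]]


def regex_extractor(pattern: str) -> Extractor:
    """Wrap a regex with one capture group as an extractor function."""
    compiled = re.compile(pattern, re.S)

    def extract(html: str) -> Optional[str]:
        match = compiled.search(html)
        return match.group(1).strip() if match else None

    return extract


def first_match(html: str, chain: list[Extractor]) -> Optional[str]:
    """Try each extractor in order and return the first non-empty value."""
    for extract in chain:
        value = extract(html)
        if value:
            return value
    return None


# Hypothetical chain for the overall_rating field: primary pattern, then fallback.
RATING_CHAIN = [
    regex_extractor(r'data-rating="([\d.]+)"'),
    regex_extractor(r'<span class="rating">\s*([\d.]+)\s*</span>'),
]

old_layout = '<div data-rating="4.8"></div>'
new_layout = '<span class="rating"> 4.8 </span>'
```

Both layouts yield the same value, and a page matching neither pattern returns `None`, which can then surface as a null-rate alert rather than a crash.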

Change detection
Only re-scrape what has changed

For massive national directories, we maintain a hash index of last-seen values per field. Subsequent runs only push diffs, reducing compute cost and downstream processing load.
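
A minimal sketch of per-field hash diffing, assuming records keyed by `contractor_id` and an in-memory dictionary as the index (a production index would more likely live in Redis or Postgres). The function names and the `review_count` values are hypothetical.

```python
import hashlib
import json


def field_hash(value) -> str:
    """Stable SHA-256 digest of a JSON-serialisable field value."""
    canonical = json.dumps(value, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_record(index: dict, record: dict, key: str = "contractor_id") -> dict:
    """Return only the fields that changed since the last run, updating the index."""
    seen = index.setdefault(record[key], {})
    changed = {}
    for field, value in record.items():
        if field == key:
            continue
        digest = field_hash(value)
        if seen.get(field) != digest:
            changed[field] = value
            seen[field] = digest
    return changed


index: dict = {}
first = diff_record(index, {"contractor_id": "A-12345", "overall_rating": 4.8, "review_count": 120})
second = diff_record(index, {"contractor_id": "A-12345", "overall_rating": 4.8, "review_count": 121})
third = diff_record(index, {"contractor_id": "A-12345", "overall_rating": 4.8, "review_count": 121})
```

The first run emits every field, the second emits only the changed `review_count`, and the third emits nothing.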

Applications

Who uses Angi data - and how

Teams across industries use angi.com data to build competitive products and smarter operations.

01
Market Research & PropTech

Aggregate local construction costs to build accurate property valuation and renovation models.

02
Competitor Intelligence

Local contractors track competitor pricing, review velocity, and service area expansions.

03
Lead Generation & B2B Sales

Building material suppliers identify high-volume contractors for targeted outreach.

04
Insurance Risk Assessment

Verify contractor credentials, bond status, and license validity for underwriting.

05
Private Equity Due Diligence

Analyse regional market fragmentation and category leaders for roll-up acquisitions.

06
AI Training Data

Train natural language models on home improvement review corpora and project descriptions.

Why DataFlirt

"Angi holds the most comprehensive local contractor directory and pricing index in the US market - but accessing it across 41,000 zip codes requires serious infrastructure."

Most teams underestimate the investment required: reliable Angi scraping requires residential proxies, location spoofing, CAPTCHA handling, and daily selector maintenance. DataFlirt absorbs that complexity so your engineers can focus on the analysis, not the infrastructure.

Technical Spec

Angi scraper - technical capabilities

Everything supported by our angi.com scraper, from rendered SPA elements to rate-limit handling, along with the gated areas where coverage is only partial.

JavaScript rendering · Full Playwright sessions required for dynamic contact details · Supported
CAPTCHA bypass · Automated CapSolver integration for DataDome challenges · Supported
Residential proxy rotation · ISP-grade residential IPs from US pools rotated per request · Supported
Zip code location spoofing · Accurate local SERP generation via cookie injection · Supported
Review pagination · Full review corpus extraction across hundreds of pages · Supported
Sponsored ad detection · Distinguishes organic versus sponsored placements in SERP results · Supported
Change detection (diffs) · Hash-based diff to emit records with changed fields only · Supported
Webhook delivery · HTTP POST per record or batch for real-time workflows · Supported
Private contractor messaging · Direct messages between homeowners and contractors are gated · Partial
User lead submission history · Internal lead routing metrics require backend authentication · Partial
Infrastructure

Infrastructure powering the Angi pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus, Snowflake
Scrapy + Playwright Stack

Scrapy handles crawl orchestration, deduplication, and retry logic. Playwright handles JavaScript rendering, cookie sessions, and interaction flows. Combined via scrapy-playwright middleware.
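
For readers unfamiliar with scrapy-playwright, the wiring looks roughly like the settings fragment below. The setting names come from Scrapy and scrapy-playwright; the specific values (browser type, retry count) are illustrative assumptions rather than our production configuration.

```python
# settings.py fragment for a Scrapy project that renders pages through Playwright.

# Route HTTP and HTTPS downloads through the scrapy-playwright handler.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright requires the asyncio-based Twisted reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

# Illustrative values, not production settings.
PLAYWRIGHT_BROWSER_TYPE = "chromium"
PLAYWRIGHT_LAUNCH_OPTIONS = {"headless": True}
RETRY_TIMES = 3

# Requests opt in to browser rendering through their meta dict, for example:
#     scrapy.Request(url, meta=PLAYWRIGHT_META)
PLAYWRIGHT_META = {"playwright": True}
```

Requests without the `playwright` meta key are still downloaded by Scrapy's normal HTTP path, so only pages that need JavaScript pay the cost of a browser.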

Residential Proxy Infrastructure

We maintain pools of residential ISP proxies across US regions. Rotation happens per-request with sticky sessions where required. IP score monitoring prevents blacklisted pool contamination.
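
Per-request rotation with optional sticky sessions can be sketched as a small round-robin allocator. The class name, proxy URLs, and session ID below are hypothetical, and IP score monitoring is left out for brevity.

```python
import itertools
from typing import Optional


class ProxyRotator:
    """Round-robin proxy selection with optional sticky sessions."""

    def __init__(self, proxies: list[str]):
        if not proxies:
            raise ValueError("proxy pool is empty")
        self._cycle = itertools.cycle(proxies)
        self._sticky: dict[str, str] = {}

    def next_proxy(self, session_id: Optional[str] = None) -> str:
        """Return a fresh proxy per request, or the pinned one for a session."""
        if session_id is None:
            return next(self._cycle)
        if session_id not in self._sticky:
            self._sticky[session_id] = next(self._cycle)
        return self._sticky[session_id]

    def release(self, session_id: str) -> None:
        """Unpin a session so its next request gets a new proxy."""
        self._sticky.pop(session_id, None)


# Placeholder proxy endpoints.
rotator = ProxyRotator([
    "http://proxy-a.example:8000",
    "http://proxy-b.example:8000",
    "http://proxy-c.example:8000",
])
first = rotator.next_proxy()
second = rotator.next_proxy()
sticky_one = rotator.next_proxy(session_id="zip-78701")
sticky_two = rotator.next_proxy(session_id="zip-78701")
```

Sticky sessions matter when a sequence of requests shares cookies, such as paginating reviews within one zip-code session, since changing IP mid-session tends to look suspicious.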

Cloud-Native Orchestration

Pipelines run on AWS Lambda (burst) and ECS (sustained). Airflow handles scheduling, dependency management, and SLA alerting. All state stored in managed Postgres.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON
Newline-delimited or nested - schema versioned per run
CSV
Flat file with typed columns - Excel/Sheets compatible
XLS
Legacy spreadsheet format for direct business user consumption
Parquet
Columnar format for BigQuery, Snowflake, Athena
AWS S3
Direct bucket delivery - compatible with any data lake
Webhook
HTTP POST per record for real-time downstream processing
API
REST endpoints to query historical snapshot data
PostgreSQL
Upsert into your existing schema with conflict resolution
// faq

Common questions.

About angi.com scraping, legality, and pipeline operations.

Is scraping Angi legal?

Scraping publicly available information from directories is generally permissible under applicable law. DataFlirt targets only public, non-authenticated contractor profiles, reviews, and cost guides. We do not extract personal user data or circumvent authentication walls. Clients should review Angi's ToS and consult legal counsel for specific use cases.

How do you handle Angi's bot protection?

We use US residential ISP proxies, full Playwright browser sessions with realistic fingerprints, and request timing modelled on human behaviour to get past DataDome and Cloudflare challenges.

Can you scrape specific zip codes or cities?

Yes. We inject precise location data into our session cookies to extract accurate local search results and service area definitions for any US zip code.

How fresh is the data?

Full directory refreshes for specified zip codes complete within 12-24 hours. We can configure daily diff runs to capture new reviews and profile updates quickly.

Do you extract hidden phone numbers?

Yes. Our Playwright integration executes the necessary JavaScript to trigger contact detail rendering, capturing data that standard HTTP requests miss.

What is the minimum viable engagement?

Our smallest packages start at a defined list of trade categories across up to 1,000 zip codes. For national coverage, we price based on volume and delivery frequency.

$ dataflirt scope --new-project --source=angi.com ready

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a one-off contractor directory dump or continuous review monitoring across 10,000 zip codes, we scope, build, and operate the pipeline. Tell us what you need.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h