
Aetna provider data, normalised at scale.

We extract provider directories, clinic locations, network affiliations, Medicare plan structures, and drug formularies from Aetna. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake.

Providers extracted: 1.2M /month
Facilities mapped: 84K /run
Formulary updates: 12.4K /week
Active pipelines: 42
Uptime: 99.94%
Data Dictionary

Every field we extract from aetna.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Providers objects from aetna.com. All fields typed and schema-versioned.

npi, first_name, last_name, speciality, sub_speciality, gender, languages, accepting_new_patients, telehealth_offered, board_certified, rating, education

providers ● 200 OK
{
  "npi": "1829304958",
  "first_name": "Sarah",
  "last_name": "Chen",
  "speciality": "Cardiology",
  "accepting_new_patients": true,
  "telehealth_offered": true,
  "gender": "Female"
}

Complete list of extractable fields for Facilities objects from aetna.com. All fields typed and schema-versioned.

facility_id, facility_name, facility_type, address_line_1, city, state, zip_code, phone, network_status, accreditation, bed_count, trauma_level

facilities ● 200 OK
{
  "facility_id": "F-93847",
  "facility_name": "Mercy General Hospital",
  "facility_type": "Acute Care Hospital",
  "city": "Austin",
  "state": "TX",
  "zip_code": "78701",
  "network_status": "In-Network"
}

Complete list of extractable fields for Network Plans objects from aetna.com. All fields typed and schema-versioned.

plan_id, plan_name, plan_type, state, county, metal_tier, monthly_premium, deductible_individual, deductible_family, out_of_pocket_max, pcp_required, referral_required

network_plans ● 200 OK
{
  "plan_id": "AET-TX-2026-HMO",
  "plan_name": "Aetna Value Network HMO",
  "plan_type": "HMO",
  "metal_tier": "Silver",
  "deductible_individual": 2500.0,
  "pcp_required": true,
  "referral_required": true
}

Complete list of extractable fields for Formulary Drugs objects from aetna.com. All fields typed and schema-versioned.

ndc_code, drug_name, generic_name, brand_name, tier_level, prior_authorization_required, step_therapy_required, quantity_limit, plan_id, formulary_year

formulary_drugs ● 200 OK
{
  "ndc_code": "00069-1530-68",
  "drug_name": "Lisinopril 10mg",
  "tier_level": "Tier 1",
  "prior_authorization_required": false,
  "step_therapy_required": false,
  "generic_name": "Lisinopril",
  "plan_id": "AET-TX-2026-HMO"
}

Complete list of extractable fields for Clinical Policy objects from aetna.com. All fields typed and schema-versioned.

cpb_number, title, last_review_date, effective_date, status, cpt_codes_covered, icd10_codes_covered, summary, background, references, url

clinical_policy ● 200 OK
{
  "cpb_number": "0016",
  "title": "Back Pain - Invasive Procedures",
  "last_review_date": "2025-11-12",
  "status": "Active",
  "cpt_codes_covered": "['22513', '22514']",
  "icd10_codes_covered": "['M54.50']"
}
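
In the clinical_policy sample above, cpt_codes_covered and icd10_codes_covered appear as strings holding a Python-style list rather than as JSON arrays. If a delivery has the same shape (an assumption drawn only from this one sample), a consumer could decode the values with the standard library. The helper name parse_code_list is hypothetical and not part of any DataFlirt tooling.

```python
import ast


def parse_code_list(raw):
    """Decode a stringified list such as "['22513', '22514']" into a list of str.

    Hypothetical helper: assumes the field is either a real list already
    or a string containing a Python-style list literal, as in the sample.
    """
    if isinstance(raw, list):
        return [str(code) for code in raw]
    value = ast.literal_eval(raw)  # evaluates literals only, never arbitrary code
    if not isinstance(value, (list, tuple)):
        raise ValueError(f"expected a list literal, got {type(value).__name__}")
    return [str(code) for code in value]


# Values copied from the sample record above.
record = {
    "cpb_number": "0016",
    "cpt_codes_covered": "['22513', '22514']",
    "icd10_codes_covered": "['M54.50']",
}
cpt_codes = parse_code_list(record["cpt_codes_covered"])
icd10_codes = parse_code_list(record["icd10_codes_covered"])
```

Using ast.literal_eval rather than eval keeps the decode step safe even if a field contains unexpected text.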

Capabilities

Extract Aetna network data with precision

Our infrastructure navigates Aetna's complex directory state, extracts provider NPIs, and maps network affiliations across millions of records. We handle the session management and pagination limits.

Provider Directory Extraction

Extract NPI, names, speciality, languages spoken, and board certifications across all Aetna networks.

Network Affiliation Mapping

Map which specific HMO, PPO, and Medicare Advantage plans a provider or facility accepts.

Facility & Clinic Geocoding

Capture clinic addresses, phone numbers, and facility types. Normalised into clean location records.

Medicare Plan Structures

Extract plan premiums, deductibles, out-of-pocket maximums, and coverage tiers for state-specific plans.

Formulary & Pharmacy Tiers

Track drug tier placements, prior authorization requirements, and step therapy rules across Aetna formularies.

Clinical Policy Bulletins

Scrape Aetna CPBs to extract covered CPT and ICD-10 codes, effective dates, and policy summaries.

Dynamic Search Handling

Navigate Aetna's complex React-based search forms, handling session tokens and multi-step inputs automatically.

Change Detection

Track when a provider drops out of a network or when a drug shifts tiers. Receive only the diffs.

Multi-State Coverage

Execute parallel extraction pipelines across all 50 states to build a national provider database.
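
The Change Detection capability above delivers only the differences between runs. As a rough illustration of the idea, and not a description of the production pipeline, the sketch below compares two snapshots keyed by NPI and reports added, removed, and changed records. The function name diff_snapshots, the placeholder NPI values, and the record shape are all assumptions.

```python
def diff_snapshots(previous, current, key="npi"):
    """Compare two lists of records and return added, removed, and changed rows."""
    prev_by_key = {row[key]: row for row in previous}
    curr_by_key = {row[key]: row for row in current}

    added = [curr_by_key[k] for k in sorted(curr_by_key.keys() - prev_by_key.keys())]
    removed = [prev_by_key[k] for k in sorted(prev_by_key.keys() - curr_by_key.keys())]

    changed = []
    for k in sorted(prev_by_key.keys() & curr_by_key.keys()):
        old, new = prev_by_key[k], curr_by_key[k]
        fields = sorted(set(old) | set(new))
        # Keep only the fields whose value differs, as (old, new) pairs.
        delta = {f: (old.get(f), new.get(f)) for f in fields if old.get(f) != new.get(f)}
        if delta:
            changed.append({key: k, "changes": delta})
    return {"added": added, "removed": removed, "changed": changed}


# Placeholder records; the NPI values are made up for illustration.
last_week = [
    {"npi": "0000000001", "accepting_new_patients": True},
    {"npi": "0000000002", "accepting_new_patients": True},
]
this_week = [
    {"npi": "0000000001", "accepting_new_patients": False},
    {"npi": "0000000003", "accepting_new_patients": True},
]
diff = diff_snapshots(last_week, this_week)
```

The same function would work for formulary rows by passing a different key, for example key="ndc_code".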

// engagement pipeline

From target networks to warehouse records

Brief in. Clean data out.

Define Scope (day 0)

Provide target zip codes, network names, or NPI lists. We design the extraction schema together.

Pipeline Build (days 2–4)

We configure Playwright crawlers, state management, and geo-targeted proxies for aetna.com.

Validation & QA (days 4–6)

Schema validation, null-rate checks, and NPI checksum validation before full launch.

Delivery (ongoing)

JSON, CSV, or Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.
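
The Validation & QA step above mentions NPI checksum validation. The NPI check digit follows the Luhn algorithm, computed with the prefix 80840 prepended to the 10-digit number. The sketch below shows one way to combine a format check with that checksum; the function names are illustrative and this is not necessarily how the production validator is written.

```python
import re

NPI_PATTERN = re.compile(r"[0-9]{10}")
NPI_PREFIX = "80840"  # prefix prepended before running the Luhn check


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def is_valid_npi(npi: str) -> bool:
    """Check both the 10-digit format and the check digit of an NPI."""
    if not NPI_PATTERN.fullmatch(npi):
        return False
    return luhn_valid(NPI_PREFIX + npi)


ok = is_valid_npi("1234567893")         # widely used example of a well-formed NPI
bad_digit = is_valid_npi("1234567890")  # same number with a wrong check digit
bad_format = is_valid_npi("12345")      # too short to be an NPI
```

A checksum pass only shows the number is well formed; it does not confirm the NPI is assigned to a real provider.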

Under the hood

Overcoming Aetna's directory constraints

Health insurance directories are built to prevent bulk extraction. Here is how we bypass Aetna's limits.

pipeline-monitor · aetna.com · live ● active
// fingerprinting
Identity rotation
TLS fingerprint: randomised
User-agent: rotated
IP pool: residential
Challenges blocked: 0
// pagination
Page coverage: 48,291 pages queued (running)
// observability
Pipeline health
Uptime: 99.9%
p99 latency: 142ms
Null rate: 0.3%
Alerts: 2
Session state
Managing complex search tokens

Aetna's directory requires multi-step session state to view results. We maintain active Playwright contexts that handle cookie negotiation and token generation required to access provider details.

Pagination limits
Bypassing max-result caps

Aetna caps search results to a few hundred providers per query. We implement automated radius chunking, dividing large geographic areas into micro-grids to ensure 100% extraction coverage.
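
To make the radius chunking idea concrete, here is a minimal sketch that tiles a bounding box with overlapping search circles. It is an illustration under stated assumptions, not the production algorithm: the function name grid_centres, the overlap factor, the flat-earth approximation, and the figure of roughly 69 miles per degree of latitude are all choices made for this example.

```python
import math

MILES_PER_DEG_LAT = 69.0  # rough average; adequate for planning search cells


def grid_centres(south, west, north, east, radius_miles, overlap=0.9):
    """Return (lat, lon) centres whose search circles tile the bounding box.

    A circle of radius r fully covers a square of side r * sqrt(2) centred on
    the same point, so stepping by that side length (shrunk by `overlap` for a
    safety margin) leaves no gaps under a flat-earth approximation.
    """
    step_miles = radius_miles * math.sqrt(2) * overlap
    lat_step = step_miles / MILES_PER_DEG_LAT
    rows = max(1, math.ceil((north - south) / lat_step))
    centres = []
    for i in range(rows):
        lat = south + (i + 0.5) * lat_step
        # A degree of longitude covers fewer miles away from the equator,
        # so the step in degrees grows with latitude.
        lon_step = step_miles / (MILES_PER_DEG_LAT * math.cos(math.radians(lat)))
        cols = max(1, math.ceil((east - west) / lon_step))
        for j in range(cols):
            centres.append((lat, west + (j + 0.5) * lon_step))
    return centres


# A half-degree box around Austin, TX, searched with a 5-mile radius.
centres = grid_centres(30.0, -98.0, 30.5, -97.5, radius_miles=5.0)
```

If a single cell still returns as many results as the cap allows, one reasonable follow-up is to re-run that cell with a smaller radius, since dense urban areas need finer grids than rural ones.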

Data normalisation
Standardising messy inputs

Provider data is inherently messy. We clean and normalise addresses, split combined speciality strings, and validate NPI formats before the data reaches your warehouse.
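
As one example of this kind of clean-up, the sketch below splits a combined speciality string into a de-duplicated list. The separators (semicolon, comma, pipe) and the function name split_specialities are assumptions made for illustration; real directory data may need a different rule set.

```python
import re

# Assumed separators. "and", "&" and "/" are deliberately not treated as
# separators because they occur inside single speciality names.
_SEPARATORS = re.compile(r"\s*[;,|]\s*")


def split_specialities(raw: str) -> list[str]:
    """Split a combined speciality string and drop blanks and repeats."""
    seen = set()
    result = []
    for part in _SEPARATORS.split(raw or ""):
        name = " ".join(part.split())  # collapse internal whitespace
        if name and name.lower() not in seen:
            seen.add(name.lower())
            result.append(name)
    return result


specialities = split_specialities("Cardiology;  Interventional Cardiology | cardiology")
```

The first spelling encountered is kept, so repeated values that differ only in letter case collapse into one entry.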

Geo-fencing
Localised proxy rotation

Network results change based on the user's IP location. We map search queries to state-specific residential proxies to ensure we see the correct local network data.

Rate limiting
Controlled request velocity

Aggressive extraction triggers IP bans. We manage request velocity and distribute queries across thousands of residential IPs to maintain pipeline stability.
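
Controlled request velocity can be as simple as spacing requests evenly. The sketch below is a minimal single-process throttle, shown only to illustrate the idea; the class name Throttle and its parameters are hypothetical, and distributing load across a proxy pool is outside its scope.

```python
import time


class Throttle:
    """Allow at most `rate_per_sec` calls per second by spacing them evenly."""

    def __init__(self, rate_per_sec, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_sec
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = clock()

    def wait(self):
        """Block until the next slot is free and return the delay applied."""
        now = self._clock()
        delay = self._next_allowed - now
        if delay > 0:
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
        return max(delay, 0.0)


# Demo with a fake clock and a recording sleep so the example runs instantly.
fake_now = [0.0]
slept = []
throttle = Throttle(2.0, clock=lambda: fake_now[0], sleep=slept.append)
delays = [throttle.wait() for _ in range(3)]
```

In real use the defaults apply, so calling throttle.wait() before each request keeps the process at or below the configured rate.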

Applications

Who uses Aetna data and how

Teams across industries use aetna.com data to build competitive products and smarter operations.

01 Network Adequacy Analysis

Healthcare consultants map Aetna's coverage gaps and provider density to evaluate network adequacy against regulatory standards.

02 Provider Roster Verification

Health systems and credentialing teams check if their doctors are actively listed in-network and verify directory accuracy.

03 Competitive Intelligence

Rival payers compare Aetna's network size, facility affiliations, and Medicare plan structures against their own offerings.

04 Formulary Research

Pharma companies track drug tier placements and prior authorization requirements to optimise market access strategies.

05 Telehealth Routing

Digital health platforms use network data to direct patients to in-network virtual care providers, reducing out-of-pocket costs.

06 Healthcare AI Training

ML teams use Clinical Policy Bulletins and provider metadata to train medical LLMs and claims adjudication models.

Why DataFlirt

"Aetna's provider directory contains the ground truth for millions of patient routing decisions, but extracting it requires navigating heavy session-state applications."

Health insurance directories are notoriously difficult to scrape. They rely on complex JavaScript frameworks, session tokens, and aggressive pagination limits. DataFlirt manages the browser automation and proxy rotation required to extract clean, normalised NPI and facility records at scale.

Technical Spec

Aetna scraper technical capabilities

Everything supported by our aetna.com scraper: rendered SPA elements, session-gated search flows, rate-limit evasion and beyond.

JavaScript rendering: Full Playwright sessions required for Aetna's React-based directory search. Status: Supported
Session state management: Handling multi-step search tokens and cookie negotiation. Status: Supported
NPI validation: Regex and checksum validation for extracted National Provider Identifiers. Status: Supported
ZIP code grid search: Automated radius searches to bypass Aetna's pagination limits. Status: Supported
Incremental diffing: Detecting network drop-outs and new provider additions. Status: Supported
Formulary PDF parsing: Extracting drug tiers and PA rules from published PDF documents. Status: Supported
Residential proxy rotation: State-specific ISP IPs to view localized network coverage. Status: Supported
Member Claims Data: Patient PHI and claims history behind the Aetna member portal. Status: Partial
Negotiated Employer Rates: Custom group plan rates requiring employer login credentials. Status: Partial
Infrastructure

Infrastructure powering the Aetna pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus

Playwright Orchestration

Aetna's directory requires heavy browser automation. We use Playwright to manage the complex state transitions and token generation needed to access provider records.

Geo-Targeted Proxies

Network data varies by location. We route requests through state-specific residential proxies to ensure accurate extraction of local HMO and PPO directories.

Airflow State Sweeps

National directory extraction requires running thousands of parallel zip code queries. Airflow manages the grid search orchestration and dependency tracking.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON: Nested provider records with array fields for specialities
CSV: Flat files suitable for immediate spreadsheet analysis
Parquet: Columnar format optimized for BigQuery and Snowflake
AWS S3: Direct delivery to your AWS environment
Webhook: HTTP POST for real-time provider status updates
API: Queryable endpoints for extracted data
XLS: Excel format for business stakeholders
PostgreSQL: Direct database inserts with conflict resolution
BigQuery: Streamed directly into your GCP project
Snowflake: Automated Stage and COPY INTO workflows
S3: Direct bucket delivery, compatible with any data lake

// faq

Common questions.

About aetna.com scraping, legality, and pipeline operations.

Is scraping Aetna's public directory legal?

Scraping publicly available provider directories is generally permissible. DataFlirt extracts only public, non-authenticated provider and network data. We do not extract PHI or bypass HIPAA-compliant member portals.

How do you handle Aetna's search pagination limits?

Aetna limits search results per query. We bypass this by implementing automated radius chunking, generating overlapping zip code grids to extract the entire provider population without hitting the cap.

Can you extract NPI numbers?

Yes. National Provider Identifiers are exposed in the directory details. We extract and validate NPIs to ensure accurate mapping to your internal provider databases.

Do you track Medicare Advantage plans?

Yes. We extract network affiliations and plan details for Aetna's Medicare Advantage offerings, including Medicare Advantage Prescription Drug (MAPD) plans and state-specific variations.

How frequently can you update the provider list?

We recommend weekly or monthly cadences for full directory refreshes, depending on your scope. Differential updates can be run more frequently for targeted networks.

Can you access member EOBs or claims?

No. Explanation of Benefits, member claims, and negotiated rates tied to specific employer groups require authentication and are out of scope for our public data pipelines.

$ dataflirt scope --new-project --source=aetna.com ready

Tell us what to extract. We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Need Aetna's national directory, state-specific Medicare networks, or formulary updates? We build and maintain the infrastructure. Tell us your scope.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h