
Health insurance data,
at warehouse scale.

We extract Medicare Advantage, Part D, ACA, and short-term plans from eHealth. Carrier pricing, deductibles, and copay structures delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake.

Plans extracted: 184K/day
Premium updates: 420K/24h
Carrier networks: 1,294/run
Active pipelines: 31
Uptime: 99.98%

Data Dictionary

Every field we extract from ehealth.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Medicare Advantage objects from ehealth.com. All fields typed and schema-versioned.

Fields: plan_id, plan_name, carrier, plan_type, monthly_premium, annual_deductible, out_of_pocket_max, primary_care_copay, specialist_copay, star_rating, rx_coverage, zip_code

Sample record (excerpt) · medicare_advantage · 200 OK
{
  "plan_id": "MA-84921",
  "plan_name": "AARP Medicare Advantage Choice",
  "carrier": "UnitedHealthcare",
  "monthly_premium": 0.0,
  "annual_deductible": 0.0,
  "out_of_pocket_max": 4900.0,
  "star_rating": 4.5
}

Complete list of extractable fields for ACA / Individual objects from ehealth.com. All fields typed and schema-versioned.

Fields: plan_id, plan_name, metal_tier, carrier, monthly_premium, deductible, out_of_pocket_max, copay_pcp, copay_er, coinsurance_pct, network_type, hsa_eligible, zip_code

Sample record (excerpt) · aca_individual · 200 OK
{
  "metal_tier": "Silver",
  "carrier": "Blue Cross Blue Shield",
  "monthly_premium": 412.5,
  "deductible": 2500.0,
  "hsa_eligible": true,
  "network_type": "PPO",
  "copay_pcp": 30.0
}

Complete list of extractable fields for Medicare Part D objects from ehealth.com. All fields typed and schema-versioned.

Fields: plan_id, plan_name, carrier, monthly_premium, annual_deductible, initial_coverage_limit, tier_1_copay, tier_2_copay, tier_3_copay, gap_coverage, star_rating, zip_code

Sample record (excerpt) · medicare_part_d · 200 OK
{
  "plan_name": "SilverScript Choice",
  "carrier": "Aetna Medicare",
  "monthly_premium": 34.2,
  "annual_deductible": 545.0,
  "tier_1_copay": 0.0,
  "gap_coverage": false,
  "star_rating": 3.5
}

Complete list of extractable fields for Dental & Vision objects from ehealth.com. All fields typed and schema-versioned.

Fields: plan_id, plan_name, carrier, plan_type, monthly_premium, deductible, annual_max_benefit, preventive_coverage_pct, basic_coverage_pct, major_coverage_pct, waiting_period, zip_code

Sample record (excerpt) · dental_vision · 200 OK
{
  "plan_type": "Dental PPO",
  "carrier": "Delta Dental",
  "monthly_premium": 28.5,
  "deductible": 50.0,
  "annual_max_benefit": 1500.0,
  "preventive_coverage_pct": 100,
  "waiting_period": "None"
}

Complete list of extractable fields for Carrier Networks objects from ehealth.com. All fields typed and schema-versioned.

Fields: carrier_id, carrier_name, state_availability, plan_count, average_star_rating, am_best_rating, customer_service_phone, website_url, network_size, founded_year

Sample record (excerpt) · carrier_networks · 200 OK
{
  "carrier_name": "Kaiser Permanente",
  "state_availability": ["CA", "WA", "OR", "CO", "HI", "MD", "VA", "GA"],
  "plan_count": 42,
  "average_star_rating": 4.8,
  "am_best_rating": "A+",
  "network_size": "Large"
}
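
As an illustration of what "typed and schema-versioned" can mean on the consuming side, here is a minimal sketch of a record check in Python. It assumes the Medicare Advantage field names listed above; the class `MedicareAdvantagePlan` and the function `validate` are hypothetical names for this example, not part of any DataFlirt library, and the rule that every plan carries a star rating is a simplifying assumption.

```python
from dataclasses import dataclass

TEXT_FIELDS = ("plan_id", "plan_name", "carrier")
NUMERIC_FIELDS = ("monthly_premium", "annual_deductible",
                  "out_of_pocket_max", "star_rating")


@dataclass(frozen=True)
class MedicareAdvantagePlan:
    # Subset of the Medicare Advantage fields from the data dictionary.
    plan_id: str
    plan_name: str
    carrier: str
    monthly_premium: float
    annual_deductible: float
    out_of_pocket_max: float
    star_rating: float


def validate(raw: dict) -> MedicareAdvantagePlan:
    """Return a typed record, or raise ValueError on a bad or missing field."""
    for name in TEXT_FIELDS:
        if not isinstance(raw.get(name), str) or not raw[name]:
            raise ValueError(f"{name}: expected a non-empty string")
    for name in NUMERIC_FIELDS:
        value = raw.get(name)
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"{name}: expected a number")
        if value < 0:
            raise ValueError(f"{name}: must not be negative")
    # CMS Star Ratings run from 1 to 5.
    if not 1.0 <= raw["star_rating"] <= 5.0:
        raise ValueError("star_rating: expected a value between 1 and 5")
    return MedicareAdvantagePlan(
        **{name: raw[name] for name in TEXT_FIELDS},
        **{name: float(raw[name]) for name in NUMERIC_FIELDS},
    )
```

Calling `validate` on the Medicare Advantage sample record returns a frozen dataclass instance; a record with a missing carrier or a star rating of 7 raises `ValueError` before it reaches the warehouse.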

Capabilities

Everything you need from eHealth, nothing you do not

Our eHealth scraper handles the complexity of demographic form hydration, session management, and deeply nested benefit structures to extract accurate regional pricing.

Medicare Advantage & Part D

Extract Medicare plan details, prescription drug coverage tiers, premiums, and Star Ratings at the county level.

ACA Individual & Family Plans

Capture metal tiers, subsidy eligibility estimates, and base premiums for Obamacare marketplace plans.

Dental, Vision & Short-Term

Extract ancillary insurance products including annual maximum benefits, waiting periods, and coverage percentages.

ZIP Code Level Granularity

Insurance pricing varies by location. We iterate through target ZIP codes to capture hyper-local premium variations.

Copay & Coinsurance Structures

Deep extraction of tier structures for primary care, specialists, emergency room visits, and pharmacy tiers.

Out-of-Pocket Maximums

Extract complex financial limits including in-network deductibles, out-of-network limits, and family caps.

Star Ratings & Carrier Metrics

Capture CMS Star Ratings for Medicare plans and aggregate carrier quality metrics across regions.

Formulary & Drug Tier Mapping

Extract prescription coverage tiers and associated copays for Part D and Medicare Advantage plans.

Dynamic Form Hydration

Bypass demographic input gates by automatically injecting age, gender, and tobacco status via Playwright.

Real-Time Quoting

Extract live premiums based on specific demographic profiles for accurate competitor benchmarking.

// engagement pipeline

From ZIP code list to warehouse record

Brief in. Clean data out.

Define Scope (day 0)

Provide target ZIP codes, demographic profiles, and plan types. We design the extraction schema together.

Pipeline Build (days 2–4)

We configure Playwright crawlers, demographic form submission logic, session management, and proxy rotation.

Validation & QA (days 4–6)

Schema validation, premium outlier detection, and county-level coverage checks before full launch.

Delivery (ongoing)

JSON, CSV, or Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.
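
The premium outlier detection named in the Validation & QA step could take many forms, and the page does not say which method is used. The sketch below is one common choice, a modified z-score based on the median absolute deviation (MAD), which tolerates skewed premium distributions better than a mean-based score. The function name `premium_outliers` and the 3.5 threshold are assumptions for this example.

```python
from statistics import median


def premium_outliers(premiums: dict[str, float], threshold: float = 3.5) -> list[str]:
    """Return plan IDs whose premium is far from the group median.

    `premiums` maps plan_id to monthly premium for one comparable group
    (for example, one county and one plan type).
    """
    values = list(premiums.values())
    if len(values) < 3:
        return []  # too few points to judge
    centre = median(values)
    mad = median(abs(v - centre) for v in values)
    if mad == 0:
        return []  # at least half the values are identical; the score is undefined
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return sorted(
        plan_id
        for plan_id, value in premiums.items()
        if 0.6745 * abs(value - centre) / mad > threshold
    )
```

A group such as `{"a": 400, "b": 410, "c": 415, "d": 405, "e": 4120}` flags only `"e"`, which is the kind of value a misplaced decimal point would produce. Groups dominated by $0 premiums hit the zero-MAD guard and need a different rule.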

Under the hood

How our eHealth pipeline handles the hard parts

Insurance aggregation sites require complex demographic form submissions and session state management. Here is how we extract accurate quotes.

pipeline-monitor · ehealth.com · live · active

// fingerprinting: identity rotation
TLS fingerprint: randomised
User-agent: rotated
IP pool: residential
Challenges blocked: 0

// pagination: page coverage
48,291 pages queued (running)

// observability: pipeline health
Uptime: 99.9%
p99 latency: 142ms
Null rate: 0.3%
Alerts: 2

Form Hydration
Automated demographic inputs and session state

eHealth requires age, ZIP code, and tobacco status to render accurate pricing. We automate these demographic form submissions and maintain session cookies to ensure the quoted premiums match the target profile.
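
To make the idea concrete, here is a minimal sketch of profile expansion and form submission, assuming Playwright's synchronous Python API. `Profile`, `expand_profiles`, `submit_profile`, and every CSS selector passed in are hypothetical; real selectors have to be read from the live page, and the actual form may ask for more fields than the three named above.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Profile:
    zip_code: str
    age: int
    tobacco: bool


def expand_profiles(zip_codes, ages, tobacco_flags=(False, True)):
    """Build every ZIP x age x tobacco combination to be quoted."""
    return [Profile(z, a, t) for z, a, t in product(zip_codes, ages, tobacco_flags)]


def submit_profile(page, profile: Profile, selectors: dict) -> None:
    """Fill and submit the demographic form on an already-open page.

    `page` is a Playwright Page (sync API). `selectors` maps the keys
    "zip_code", "age", "tobacco", and "submit" to CSS selectors.
    """
    page.fill(selectors["zip_code"], profile.zip_code)
    page.fill(selectors["age"], str(profile.age))
    if profile.tobacco:
        page.check(selectors["tobacco"])
    page.click(selectors["submit"])
```

Opening every page for a profile from the same Playwright browser context keeps its cookies together, so later requests in that context still carry the submitted profile.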

JavaScript rendering
Full Playwright execution for dynamic pricing

Plan details and dynamic pricing load via asynchronous API calls after form submission. Playwright executes full browser sessions to capture network payloads that simple HTTP clients miss.
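
Capturing a network payload rather than parsing rendered HTML might look like the sketch below, which assumes Playwright's synchronous Python API and its `page.expect_response` context manager. The `/plans` path fragment is a placeholder, not eHealth's real endpoint, and both function names are hypothetical.

```python
from urllib.parse import urlparse


def is_plan_payload(url: str, content_type: str) -> bool:
    """True for JSON responses whose path contains the placeholder '/plans'."""
    return "json" in content_type.lower() and "/plans" in urlparse(url).path


def capture_plan_payload(page, trigger) -> dict:
    """Run `trigger` (for example a form submit) and return the first matching JSON body.

    `page` is a Playwright Page (sync API). expect_response blocks until a
    response satisfies the predicate or Playwright's default timeout elapses.
    """
    with page.expect_response(
        lambda r: is_plan_payload(r.url, r.headers.get("content-type", ""))
    ) as response_info:
        trigger()
    return response_info.value.json()
```

A call such as `capture_plan_payload(page, lambda: page.click("button[type=submit]"))` returns the parsed JSON that the page itself would have rendered, which is usually more stable than scraping the resulting DOM.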

Distributed queues
Iterating across 41,000 ZIP codes

Insurance pricing is hyper-local. We manage distributed queues to iterate through national ZIP code lists without triggering rate limits, capturing accurate county-level pricing variations.
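
A simplified, in-memory sketch of the two ideas in that paragraph follows: assigning ZIP codes to workers deterministically, and spacing out requests per worker. A production queue would sit on a shared store such as Redis; the names `shard_for`, `partition`, and `Throttle`, and the choice of SHA-256 for sharding, are assumptions for this example.

```python
import hashlib
import time


def shard_for(zip_code: str, n_workers: int) -> int:
    """Map a ZIP code to a worker index that is stable across runs and machines."""
    digest = hashlib.sha256(zip_code.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_workers


def partition(zip_codes, n_workers: int) -> list[list[str]]:
    """Split a ZIP code list into one work list per worker."""
    shards = [[] for _ in range(n_workers)]
    for zip_code in zip_codes:
        shards[shard_for(zip_code, n_workers)].append(zip_code)
    return shards


class Throttle:
    """Enforce a minimum interval between consecutive requests from one worker."""

    def __init__(self, min_interval_s: float, clock=time.monotonic, sleep=time.sleep):
        self._interval = min_interval_s
        self._clock = clock
        self._sleep = sleep
        self._next_at = 0.0

    def wait(self) -> None:
        now = self._clock()
        if now < self._next_at:
            self._sleep(self._next_at - now)
            now = self._next_at
        self._next_at = now + self._interval
```

Hashing rather than slicing the list means a given ZIP code always lands on the same worker, so retries and incremental runs do not reshuffle work. The injectable `clock` and `sleep` exist only to make the throttle testable without real delays.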

Anti-bot evasion
Residential proxy rotation

eHealth uses commercial bot protection. We rotate ISP-grade residential proxies and randomise request timing to mimic real user behaviour and maintain high success rates.

Schema normalisation
Flattening complex benefit matrices

Insurance plans have deeply nested copay and deductible tiers. We normalise this complex structure into flat, queryable tables ready for immediate analysis in your data warehouse.
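
Flattening of this kind can be sketched as a short recursive function. This is a generic illustration rather than DataFlirt's normaliser: the function name `flatten`, the underscore separator, and the nested sample values in the usage note are all made up for the example.

```python
def flatten(record: dict, parent: str = "", sep: str = "_") -> dict:
    """Collapse nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value
    return flat
```

Given `{"plan_id": "MA-84921", "copay": {"primary_care": 0.0, "specialist": {"in_network": 35.0}}}`, the function returns the columns `plan_id`, `copay_primary_care`, and `copay_specialist_in_network`, each of which maps directly onto a warehouse column. Lists (such as repeated drug tiers) are left untouched here and would normally go to a child table.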

Applications

Who uses eHealth data and how

Teams across industries use ehealth.com data to build competitive products and smarter operations.

01
Competitor Price Monitoring

Carriers track regional pricing and premium adjustments across competing Medicare and ACA plans.

02
Product Development

Actuaries analyse benefit structures, copays, and out-of-pocket maximums to design competitive insurance products.

03
Market Expansion Strategy

Health plans evaluate carrier density and average premiums in new ZIP codes or counties.

04
Broker & Agency Intelligence

Large brokerages aggregate eHealth data to train internal quoting tools and benchmark market offerings.

05
Regulatory & Compliance Auditing

Consultancies track Medicare Star Ratings and plan availability to audit market compliance.

06
Consumer App Data Feeds

Insurtech startups use normalised plan data to power comparison engines and recommendation models.

Why DataFlirt

"Health insurance pricing is fundamentally local and opaque. eHealth aggregates it, but extracting those premiums across 41,291 ZIP codes requires serious infrastructure."

Most teams fail at insurance scraping because they underestimate the complexity of session state. You cannot simply request a URL; you must submit demographic forms, manage cookies, and parse deeply nested benefit matrices. DataFlirt handles the form hydration, proxy rotation, and schema normalisation so you get clean data, not HTTP errors.

Technical Spec

eHealth scraper technical capabilities

Everything supported by our ehealth.com scraper: rendered SPA elements, demographic form gates, rate-limit handling, and more.

JavaScript rendering
Full Playwright sessions required for dynamic pricing and asynchronous plan loading
Supported
Demographic form submission
Automated input of age, ZIP code, and tobacco status for accurate quoting
Supported
ZIP code iteration
Distributed queues for national coverage across all US counties
Supported
Medicare & ACA plans
Extraction of all primary health plan types and ancillary products
Supported
Residential proxy rotation
ISP-grade IPs to bypass rate limits and geographic blocks
Supported
CAPTCHA bypass
Automated CapSolver integration for bot protection challenges
Supported
Change detection
Emit only premium or benefit changes since the last pipeline run
Supported
Webhook delivery
HTTP POST for real-time quote extraction workflows
Supported
Member portal data
Claims history and active policy details requiring user authentication
Not supported
Protected Health Information
Extraction of individual patient records or PII
Not supported
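
Change detection, as listed in the table above, is often implemented by fingerprinting each record and comparing against the previous run. The sketch below shows that general approach under stated assumptions: records are JSON-serialisable dicts keyed by `plan_id`, and the previous run's fingerprints are available as a plain dict. `fingerprint` and `changed_records` are hypothetical names.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Hash a record's canonical JSON form so key order does not matter."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_records(previous: dict[str, str], current: list[dict], key: str = "plan_id") -> list[dict]:
    """Return records that are new or whose content differs from the last run.

    `previous` maps each record key to the fingerprint stored after the last run.
    """
    return [
        record
        for record in current
        if previous.get(record[key]) != fingerprint(record)
    ]
```

Only the returned records need to be emitted downstream; after a successful delivery the stored fingerprints are replaced with those of the current run. Volatile fields such as a scrape timestamp should be excluded before hashing, or every record will appear changed.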
Infrastructure

Infrastructure powering the eHealth pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy · Playwright · Python 3.12 · Redis · PostgreSQL · Apache Airflow · AWS Lambda · S3 · CloudWatch · 2Captcha · CapSolver · Residential Proxies · Docker · Kubernetes · Grafana · Prometheus

Scrapy + Playwright Stack

Scrapy handles crawl orchestration while Playwright manages demographic form submission, cookie sessions, and interaction flows for accurate pricing.

Residential Proxy Network

Geographically targeted residential IPs ensure requests appear as local traffic, preventing rate limits and geographic blocking during ZIP code iteration.

Cloud-Native Orchestration

Airflow and AWS Lambda manage distributed queues across 41,000 ZIP codes, ensuring pipelines complete within required SLAs.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON
Newline-delimited or nested structures
CSV
Flat file with typed columns for actuaries
XLS
Excel compatible format for analyst teams
Parquet
Columnar format for BigQuery and Snowflake
AWS S3
Direct bucket delivery for data lakes
Webhook
HTTP POST per record for real-time processing
API
REST endpoints for on-demand quoting
PostgreSQL
Upsert into your existing database schema
BigQuery
Streamed directly into your dataset
Snowflake
Stage and COPY INTO workflow
// faq

Common questions.

About ehealth.com scraping, legality, and pipeline operations.

Is scraping eHealth legal?

Scraping publicly available plan data and premiums is generally permissible. We do not extract PII, access authenticated member portals, or scrape Protected Health Information (PHI). Clients should consult legal counsel for specific use cases.

How do you handle demographic inputs?

We configure demographic profiles including age, gender, tobacco use, and ZIP code. Playwright automates the form submission process to generate accurate quotes for these profiles.

Can you extract data for all US ZIP codes?

Yes. We use distributed queues to iterate through national ZIP code lists, capturing county-level pricing variations across the entire country.

What plan types do you support?

We extract Medicare Advantage, Medicare Part D, ACA Individual, Short-Term, Dental, and Vision plans.

How frequently can you refresh the data?

During the Open Enrollment Period (OEP), we configure daily or weekly refreshes. Off-season, monthly refreshes are typical.

Do you extract detailed copay and deductible tiers?

Yes. We capture primary care, specialist, emergency room, and pharmacy copays, along with in-network and out-of-network deductibles.

Can I get a sample of my target ZIP codes?

Yes. We provide sample extractions for specific ZIP codes and demographic profiles before contract signature to validate data quality.

$ dataflirt scope --new-project --source=ehealth.com ready

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a national Medicare dataset or regional ACA premium tracking, we scope, build, and operate the pipeline. Tell us what you need.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h