
Provider networks,
mapped at scale.

We extract UHC provider directories, in-network statuses, facility locations, and formulary lists. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.

Providers mapped
1.2M /month
Facilities tracked
84K /run
Network queries
450K /day
Active pipelines
42
Uptime
99.94%
Data Dictionary

Every field we extract from uhc.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Provider Profiles objects from uhc.com. All fields typed and schema-versioned.

npi, first_name, last_name, specialty, degree, gender, accepting_new_patients, telehealth_offered, board_certified, years_experience, languages_spoken, education
provider_profiles
● 200 OK
{
  "npi": "1982736450",
  "first_name": "Sarah",
  "last_name": "Jenkins",
  "specialty": "Cardiology",
  "accepting_new_patients": true,
  "telehealth_offered": false,
  "board_certified": true
}

Complete list of extractable fields for Facility Locations objects from uhc.com. All fields typed and schema-versioned.

facility_id, facility_name, facility_type, address_line1, city, state, zip_code, phone_number, latitude, longitude, handicap_accessible, parking_available, hours_of_operation
facility_locations
● 200 OK
{
  "facility_id": "FAC-92817",
  "facility_name": "Northside Heart Clinic",
  "facility_type": "Specialty Clinic",
  "city": "Atlanta",
  "state": "GA",
  "zip_code": "30342",
  "latitude": 33.908,
  "longitude": -84.351
}

Complete list of extractable fields for Network Coverage objects from uhc.com. All fields typed and schema-versioned.

provider_npi, plan_id, plan_name, plan_type, network_status, tier_status, effective_date, termination_date, pcp_required, referrals_required
network_coverage
● 200 OK
{
  "provider_npi": "1982736450",
  "plan_name": "UHC Choice Plus",
  "plan_type": "PPO",
  "network_status": "In-Network",
  "tier_status": "Tier 1",
  "pcp_required": false,
  "referrals_required": false
}

Complete list of extractable fields for Formulary Data objects from uhc.com. All fields typed and schema-versioned.

drug_id, drug_name, generic_name, drug_tier, prior_authorization_required, step_therapy_required, quantity_limit, plan_id, plan_name, formulation, dosage
formulary_data
● 200 OK
{
  "drug_id": "DRG-4451",
  "drug_name": "Lipitor",
  "generic_name": "Atorvastatin",
  "drug_tier": "Tier 2",
  "prior_authorization_required": false,
  "step_therapy_required": false,
  "quantity_limit": "30 per 30 days"
}

Complete list of extractable fields for Provider Ratings objects from uhc.com. All fields typed and schema-versioned.

npi, overall_rating, rating_count, wait_time_rating, bedside_manner_rating, staff_friendliness_rating, facility_cleanliness_rating, reviews_url, last_updated
provider_ratings
● 200 OK
{
  "npi": "1982736450",
  "overall_rating": 4.8,
  "rating_count": 142,
  "wait_time_rating": 4.5,
  "bedside_manner_rating": 4.9,
  "staff_friendliness_rating": 4.7,
  "last_updated": "2026-10-14T08:30:00Z"
}

Capabilities

Everything you need from UHC directories

Our UHC scraper handles the complex search states, location iteration, and plan selection required to map national provider networks accurately.

Provider Directory Extraction

Extract NPIs, specialties, demographics, education, and board certifications for individual physicians and specialists.

Network Status Mapping

Determine precise in-network and out-of-network statuses across different UHC plans for any given provider.

Location & Facility Data

Capture exact coordinates, addresses, accessibility flags, and contact information for clinics and hospitals.

Formulary & Drug Tier Parsing

Extract drug tier classifications, prior authorization flags, and step therapy requirements from UHC formularies.

Medicare Advantage Intelligence

Track provider participation and facility networks specific to UHC Medicare Advantage and Medicaid plans.

Complex Search Handling

Automated traversal of state, zip code, and plan type permutations to extract comprehensive national datasets.
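
As a rough illustration of what traversing state, zip code, and plan type permutations can look like, here is a minimal sketch. The `SearchTask` type, the `build_search_tasks` helper, and the sample inputs are hypothetical names chosen for this example, not part of any published DataFlirt interface.

```python
from itertools import product
from typing import Iterator, NamedTuple


class SearchTask(NamedTuple):
    """One directory query: a (state, zip code, plan type) combination."""
    state: str
    zip_code: str
    plan_type: str


def build_search_tasks(
    zips_by_state: dict[str, list[str]],
    plan_types: list[str],
) -> Iterator[SearchTask]:
    """Yield every zip code x plan type combination, grouped by state."""
    for state, zip_codes in sorted(zips_by_state.items()):
        for zip_code, plan_type in product(zip_codes, plan_types):
            yield SearchTask(state, zip_code, plan_type)


# Illustrative inputs only; a real run would load these from the agreed scope.
tasks = list(build_search_tasks(
    {"GA": ["30342", "30303"], "TX": ["73301"]},
    ["PPO", "HMO"],
))
```

Generating the task list up front makes the size of a run visible before any request is sent, which helps when agreeing on cadence and batching.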

Rating & Review Aggregation

Collect patient satisfaction scores, wait time ratings, and bedside manner metrics for listed providers.

Telehealth Availability

Identify providers offering virtual visits and integrate telehealth flags into your directory build.

Schedule & Roster Diffs

Track providers joining or leaving networks over time with our hash-based change detection system.

// engagement pipeline

From target regions to warehouse records

Brief in. Clean data out.

Define Scope
Day 0

Provide target zip codes, plan IDs, or specialty lists. We design the extraction schema together.

Pipeline Build
Days 2–4

We configure Scrapy / Playwright crawlers, proxy rotation, session management, and CAPTCHA handling for uhc.com.

Validation & QA
Days 4–6

Schema validation, null-rate checks, location outlier detection, and network coverage verification before full launch.
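
A null-rate check of the kind mentioned above could be sketched as follows. This is an assumption-laden example: the `REQUIRED_FIELDS` tuple, the helper names, the 25% threshold, and the second sample record are all made up for illustration, and only the field names come from the data dictionary.

```python
from collections import Counter

# Hypothetical choice of fields to check, taken from the provider_profiles schema.
REQUIRED_FIELDS = ("npi", "first_name", "last_name", "specialty")


def null_rates(records: list[dict], fields: tuple[str, ...]) -> dict[str, float]:
    """Return the fraction of records where each field is missing or empty."""
    if not records:
        return {field: 0.0 for field in fields}
    missing = Counter()
    for record in records:
        for field in fields:
            if record.get(field) in (None, ""):
                missing[field] += 1
    return {field: missing[field] / len(records) for field in fields}


def failing_fields(rates: dict[str, float], threshold: float) -> list[str]:
    """List fields whose null rate exceeds the threshold, sorted by name."""
    return sorted(field for field, rate in rates.items() if rate > threshold)


# Sample batch; the second record is invented to exercise the check.
batch = [
    {"npi": "1982736450", "first_name": "Sarah", "last_name": "Jenkins",
     "specialty": "Cardiology"},
    {"npi": "1000000001", "first_name": "", "last_name": "Lee",
     "specialty": None},
]
rates = null_rates(batch, REQUIRED_FIELDS)
problems = failing_fields(rates, threshold=0.25)
```

A batch with any entry in `problems` would be held back for review rather than delivered.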

Delivery
ongoing

JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.

Under the hood

How our UHC pipeline handles the hard parts

Healthcare portals use strict rate limiting and complex stateful search forms. Here is how we maintain reliable extraction.

pipeline-monitor · uhc.com · live ● active
// fingerprinting
Identity rotation
TLS fingerprint: randomised
User-agent: rotated
IP pool: residential
Challenges blocked: 0
// pagination
Page coverage
48,291 pages queued · running
// observability
Pipeline health
99.9% uptime
142ms p99 latency
0.3% null rate
2 alerts
Search state management
Navigating complex multi-step forms

UHC directories require sequential state selection (location, then plan, then category). We automate this sequence with Playwright and keep session state intact across paginated results.

Anti-bot layer
Bypassing enterprise WAFs

Healthcare portals use strict rate limiting and bot protection. Our crawlers use residential ISP proxies with realistic browser fingerprints and request timing to avoid IP bans.

Geographic coverage
Zip-code level iteration

To map a national network, we iterate through thousands of zip codes systematically, handling overlapping radii to deduplicate provider records accurately.
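
One way to handle overlapping radii is to merge the per-zip result lists on the provider NPI. The sketch below assumes each record carries an `npi` key as in the data dictionary; the `dedupe_by_npi` helper, the `seen_in_zips` field, and the second sample NPI are hypothetical.

```python
def dedupe_by_npi(results_by_zip: dict[str, list[dict]]) -> list[dict]:
    """Merge per-zip result lists, keeping the first record seen for each NPI."""
    seen: dict[str, dict] = {}
    for zip_code in sorted(results_by_zip):
        for record in results_by_zip[zip_code]:
            # setdefault stores the copy only the first time an NPI appears.
            merged = seen.setdefault(record["npi"], {**record, "seen_in_zips": []})
            merged["seen_in_zips"].append(zip_code)
    return list(seen.values())


# Two neighbouring zip codes returning the same provider.
results = {
    "30342": [{"npi": "1982736450", "last_name": "Jenkins"}],
    "30303": [
        {"npi": "1982736450", "last_name": "Jenkins"},
        {"npi": "1000000001", "last_name": "Lee"},
    ],
}
unique = dedupe_by_npi(results)
```

Recording which zip codes surfaced each provider keeps the geographic signal that plain deduplication would throw away.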

Dynamic rendering
Hydrating SPA components

Provider profiles and network statuses are loaded asynchronously via API calls. We intercept these XHR requests directly or render the full DOM to capture complete datasets.

Change detection
Tracking network roster updates

We maintain a hash index of last-seen values per NPI. Subsequent runs push only the diffs, so you can see exactly when a provider drops out of a network.
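
Hash-based change detection per NPI can be sketched in a few lines. This is an illustrative version only: the `record_hash` and `diff_run` helpers, the choice of SHA-256 over canonical JSON, and the placeholder NPIs are assumptions for the example, not a description of the production index.

```python
import hashlib
import json


def record_hash(record: dict) -> str:
    """Stable SHA-256 of a record; keys are sorted so field order does not matter."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def diff_run(last_seen: dict[str, str], current: list[dict]) -> dict[str, list[str]]:
    """Compare this run's records against the stored hash index, keyed by NPI."""
    current_hashes = {r["npi"]: record_hash(r) for r in current}
    return {
        "added": sorted(current_hashes.keys() - last_seen.keys()),
        "removed": sorted(last_seen.keys() - current_hashes.keys()),
        "changed": sorted(
            npi for npi, digest in current_hashes.items()
            if npi in last_seen and last_seen[npi] != digest
        ),
    }


# Placeholder NPIs "A", "B", "C" stand in for real identifiers.
previous = [
    {"npi": "A", "network_status": "In-Network"},
    {"npi": "B", "network_status": "In-Network"},
]
index = {r["npi"]: record_hash(r) for r in previous}
latest = [
    {"npi": "A", "network_status": "Out-of-Network"},
    {"npi": "C", "network_status": "In-Network"},
]
diff = diff_run(index, latest)
```

Storing only a digest per NPI keeps the index small, at the cost of needing the full record from the current run to say which field changed.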

Applications

Who uses UHC data — and how

Teams across industries use uhc.com data to build competitive products and smarter operations.

01
Network Adequacy Analysis

Payers and regulators analyse geographic coverage and specialty distribution to ensure compliance with network adequacy standards.

02
Provider Data Management

Healthtech platforms enrich their internal directories with verified UHC provider demographics, NPIs, and active locations.

03
Competitive Intelligence

Rival insurance carriers monitor UHC network expansions, Medicare Advantage footprint, and provider overlap to optimise their own contracting.

04
Patient Navigation Tools

Digital health startups integrate real-time network status and UHC provider ratings into their care routing algorithms.

05
Pharma Market Access

Pharmaceutical companies track UHC formulary tier changes, prior authorization requirements, and step therapy rules for their drug portfolios.

06
Claims Denial Mitigation

Revenue cycle management teams verify historical in-network statuses to appeal out-of-network claims denials.

Why DataFlirt

"UHC's provider directory is the largest in the United States, representing a critical map of healthcare access — but it remains locked behind complex search interfaces."

Most teams fail at healthcare scraping because they underestimate the complexity of stateful search forms and enterprise bot protection. DataFlirt manages the UHC extraction pipeline end-to-end, handling location iteration, plan selection, and WAF circumvention so your data team receives clean, warehouse-ready provider records.

Technical Spec

UHC scraper — technical capabilities

Everything supported by our uhc.com scraper — rendered SPA elements, auth walls, rate-limit evasion and beyond.

JavaScript rendering
Full Playwright sessions required for UHC React-based directory
Supported
CAPTCHA bypass
Automated 2Captcha + CapSolver integration for search blocks
Supported
Residential proxy rotation
US-based ISP residential proxies to match expected user geography
Supported
Zip-code iteration
Automated traversal of target zip codes with radius deduplication
Supported
XHR/API interception
Direct capture of backend JSON payloads for provider details
Supported
Network change detection
Hash-based diffing to track providers joining/leaving networks
Supported
Formulary PDF parsing
Extraction of drug tiers from unstructured UHC formulary documents
Supported
Member claims data
Gated personal EOBs and claims history requiring member authentication
Partial
Negotiated rates (myUHC)
Patient-specific pricing estimates behind the myUHC login wall
Partial
Infrastructure

Infrastructure powering the UHC pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus
Scrapy + Playwright Stack

Scrapy handles crawl orchestration, deduplication, and retry logic. Playwright handles JavaScript rendering, cookie sessions, and multi-step form submissions required by UHC directory interfaces. The two are combined via scrapy-playwright, which plugs Playwright into Scrapy as a download handler.
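
For readers unfamiliar with the combination, a Scrapy project typically enables scrapy-playwright through its settings module. The fragment below is a sketch of such a `settings.py`; the handler paths and reactor setting follow the scrapy-playwright documentation, while the concurrency, delay, and retry values are assumed placeholders rather than the values used in this pipeline.

```python
# settings.py (sketch): route Scrapy requests through scrapy-playwright.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}
# scrapy-playwright requires the asyncio-based Twisted reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

# Assumed values for illustration; tune per target and agreed scope.
PLAYWRIGHT_BROWSER_TYPE = "chromium"
CONCURRENT_REQUESTS = 4
DOWNLOAD_DELAY = 1.0
RETRY_TIMES = 3
```

Individual requests then opt in to browser rendering by setting `meta={"playwright": True}`, so plain HTTP fetches can still bypass the browser where rendering is not needed.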

Residential Proxy Infrastructure

We maintain pools of residential ISP proxies across US regions. IPs rotate per request by default, with sticky sessions used to hold state during multi-step provider search flows. IP score monitoring prevents WAF blocks.

Cloud-Native Orchestration

Pipelines run on AWS Lambda (burst) and ECS (sustained). Airflow handles scheduling, zip-code batching, and SLA alerting. All state is stored in managed Postgres.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON
Newline-delimited or nested — schema versioned per run
CSV
Flat file with typed columns — Excel/Sheets compatible
XLS
Excel format for business analyst teams
Parquet
Columnar format for BigQuery, Snowflake, Athena
AWS S3
Direct bucket delivery — compatible with any data lake
Webhook
HTTP POST per record for real-time downstream processing
API
REST endpoint for querying cached provider records
PostgreSQL
Direct database upserts with conflict resolution
// faq

Common questions.

About uhc.com scraping, legality, and pipeline operations.

Ask us directly →
Is scraping UHC provider directories legal?

Scraping publicly available provider directories is generally permissible under US law. DataFlirt extracts only public demographic, network, and facility data. We do not circumvent authentication walls, scrape patient portals, or handle PHI/HIPAA-regulated data.

How do you handle UHC's search interface?

Our Playwright crawlers automate the exact user journey: entering zip codes, selecting specific insurance plans, and paginating through results. We manage the session state and cookies required to keep the search context active.

Can you track when a provider leaves a network?

Yes. By maintaining a baseline database of providers and their associated plans, our change-detection pipelines flag dropped plans, new network additions, and address changes in every run.

How do you avoid duplicate records across zip codes?

When scraping by geographic radius, UHC often returns overlapping provider lists. We use the provider NPI (National Provider Identifier) and internal UHC IDs as primary keys to deduplicate records before delivery.

Do you extract Medicare and Medicaid plan networks?

Yes, we support extraction across all UHC plan categories, including Employer & Individual, Medicare Advantage, Medicaid, and specialized dental/vision networks.

What is the delivery latency for a national directory scrape?

A full national extraction across all major UHC plans involves millions of queries and typically runs on a weekly or monthly cadence, completing over a 48-72 hour window to respect target server load.

Can I get a sample of the provider dataset?

Yes. We offer a sample extraction for a specific zip code and plan type during the scoping phase, allowing your engineering team to validate the schema and data quality.

$ dataflirt scope --new-project --source=uhc.com ready

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a regional provider map or a national network monitoring system — we scope, build, and operate the pipeline. Tell us what you need.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h