
GoodRx pricing data,
at warehouse scale.

We extract pharmacy-level drug prices, coupon values, generic equivalents, and availability from GoodRx. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.

Prices extracted: 1.2M /day
Pharmacy locations: 84K /run
Zip codes tracked: 41,290 /month
Active pipelines: 84
Uptime: 99.94%
Data Dictionary

Every field we extract from goodrx.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Drug Information objects from goodrx.com. All fields typed and schema-versioned.

drug_id, brand_name, generic_name, drug_class, description, rx_required, available_forms, available_dosages, default_quantity, manufacturer, fda_alerts
drug_information
● 200 OK
"drug_id": "d-12948",
"brand_name": "Lipitor",
"generic_name": "Atorvastatin",
"drug_class": "Statins",
"rx_required": true,
"available_forms": ["tablet", "capsule"],
"default_quantity": 30

Complete list of extractable fields for Pharmacy Pricing objects from goodrx.com. All fields typed and schema-versioned.

drug_id, zip_code, pharmacy_name, pharmacy_chain, retail_price, coupon_price, gold_price, discount_pct, distance_miles, last_updated, currency
pharmacy_pricing
● 200 OK
"drug_id": "d-12948",
"zip_code": "90210",
"pharmacy_name": "CVS Pharmacy",
"retail_price": 45.99,
"coupon_price": 9.14,
"discount_pct": 80,
"distance_miles": 1.2,
"last_updated": "2023-10-25T14:30:00Z"
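As an illustration of what a typed pharmacy_pricing record could look like on the consumer side, here is a minimal Python sketch. The `PharmacyPrice` class and `parse_price` helper are hypothetical names, not part of any DataFlirt deliverable, and the sketch assumes that `discount_pct` is the coupon saving relative to the retail price, rounded to a whole percent.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PharmacyPrice:
    """Hypothetical typed view of one pharmacy_pricing record."""
    drug_id: str
    zip_code: str
    pharmacy_name: str
    retail_price: float
    coupon_price: float
    discount_pct: int
    distance_miles: float
    last_updated: datetime


def parse_price(raw: dict) -> PharmacyPrice:
    """Convert one raw pharmacy_pricing dict into a typed record."""
    retail = float(raw["retail_price"])
    coupon = float(raw["coupon_price"])
    if retail <= 0 or coupon < 0:
        raise ValueError("retail_price must be > 0 and coupon_price >= 0")
    # Assumption: discount is measured against the retail price.
    discount = round((retail - coupon) / retail * 100)
    return PharmacyPrice(
        drug_id=raw["drug_id"],
        # Zip codes are identifiers, not numbers: keep leading zeros.
        zip_code=str(raw["zip_code"]).zfill(5),
        pharmacy_name=raw["pharmacy_name"],
        retail_price=retail,
        coupon_price=coupon,
        discount_pct=discount,
        distance_miles=float(raw["distance_miles"]),
        # Replace the trailing "Z" so older Python versions can parse it.
        last_updated=datetime.fromisoformat(
            raw["last_updated"].replace("Z", "+00:00")
        ),
    )
```

Recomputing the percentage on ingest gives a cheap consistency check against the `discount_pct` value delivered in the record.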

Complete list of extractable fields for Coupon Details objects from goodrx.com. All fields typed and schema-versioned.

coupon_id, drug_id, pharmacy_name, bin_number, pcn_number, group_number, member_id, discount_type, terms_conditions, expiration_date
coupon_details
● 200 OK
"coupon_id": "c-99281",
"drug_id": "d-12948",
"pharmacy_name": "Walgreens",
"bin_number": "015995",
"pcn_number": "GDC",
"group_number": "DR33",
"discount_type": "standard_coupon"

Complete list of extractable fields for Generic Equivalents objects from goodrx.com. All fields typed and schema-versioned.

brand_drug_id, generic_drug_id, brand_name, generic_name, price_difference_pct, brand_avg_price, generic_avg_price, availability_status, manufacturer, fda_approval_date
generic_equivalents
● 200 OK
"brand_name": "Lipitor",
"generic_name": "Atorvastatin",
"price_difference_pct": 95,
"brand_avg_price": 245.0,
"generic_avg_price": 12.5,
"availability_status": "widely_available",
"manufacturer": "Pfizer"

Complete list of extractable fields for Pharmacy Locations objects from goodrx.com. All fields typed and schema-versioned.

pharmacy_id, name, chain, address, city, state, zip_code, phone, latitude, longitude, drive_thru, 24_hour_status
pharmacy_locations
● 200 OK
"name": "Walmart Pharmacy",
"chain": "Walmart",
"address": "123 Main St",
"city": "Beverly Hills",
"state": "CA",
"zip_code": "90210",
"latitude": 34.0736,
"longitude": -118.4004
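The latitude and longitude fields lend themselves to simple spatial work. The sketch below computes great-circle distance in miles with the haversine formula; `haversine_miles` is a hypothetical helper, and how GoodRx itself derives a value like `distance_miles` is not known, so treat this as an approximation for your own analysis.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1: float, lon1: float,
                    lat2: float, lon2: float) -> float:
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


if __name__ == "__main__":
    # Sample pharmacy coordinates from the record above, measured against
    # an arbitrary second point chosen only for illustration.
    print(round(haversine_miles(34.0736, -118.4004, 34.0522, -118.2437), 1))
```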

Capabilities

Everything you need from GoodRx — nothing you don't

Our GoodRx scraper handles every layer of the platform: location-specific drug pricing, coupon generation, and pharmacy mapping — with geo-targeted proxies and anti-bot circumvention built in.

Drug Formulation Extraction

Capture all available forms, dosages, and quantities for every drug listed on the platform.

Geo-Targeted Price Tracking

Extract prices specific to thousands of individual zip codes using geo-located residential proxies.

Coupon Code Generation

Retrieve the exact BIN, PCN, and Group numbers required to claim the discounted price at the pharmacy counter.

Brand vs Generic Mapping

Map brand-name drugs to their generic equivalents and calculate the exact price differential across pharmacies.

Pharmacy Directory Scraping

Extract complete pharmacy metadata including coordinates, operating hours, and chain affiliations.

GoodRx Gold Pricing

Capture the standard coupon price alongside the GoodRx Gold membership price for accurate tier comparisons.

Price History Diffing

Track price fluctuations over time. We maintain a hash index and only emit records when a pharmacy changes its price.

Telehealth Provider Data

Extract pricing and availability data for GoodRx Care telehealth consultations and lab test services.

Scheduled + Streaming Modes

Run one-off bulk exports or configure continuous pipelines at daily, weekly, or monthly cadences.

// engagement pipeline

From drug list to warehouse record

Brief in. Clean data out.

Define Scope
Day 0

Provide drug lists, NDC codes, or target zip codes. We design the extraction schema together.

Pipeline Build
Days 2–4

We configure Scrapy / Playwright crawlers, geo-proxy rotation, session management, and bot protection handling for goodrx.com.

Validation & QA
Days 4–6

Schema validation, null-rate checks, price-outlier detection, and sample coupon codes before full launch.

Delivery
ongoing

JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.
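The Validation & QA step above mentions null-rate checks and price-outlier detection. A minimal sketch of both follows. The function names are hypothetical, and the outlier rule (a modified z-score based on the median absolute deviation, with a 3.5 cutoff) is one common choice assumed here for illustration, not necessarily the rule used in production.

```python
from statistics import median


def null_rates(records: list[dict], fields: list[str]) -> dict[str, float]:
    """Fraction of records where each field is missing or None."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(1 for r in records if r.get(field) is None) / total
        for field in fields
    }


def price_outliers(prices: list[float], threshold: float = 3.5) -> list[int]:
    """Indices of prices whose modified z-score exceeds the threshold."""
    if not prices:
        return []
    med = median(prices)
    mad = median(abs(p - med) for p in prices)
    if mad == 0:
        # All prices (or most of them) are identical; nothing to flag.
        return []
    return [
        i for i, p in enumerate(prices)
        if 0.6745 * abs(p - med) / mad > threshold
    ]
```

A median-based rule is used rather than mean and standard deviation because a single mis-scraped price (for example, a retail price captured in place of a coupon price) would otherwise inflate the spread and hide itself.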

Under the hood

How our GoodRx pipeline handles the hard parts

GoodRx protects its pricing widgets heavily. Here is how we stay resilient — and why teams choose managed infrastructure over DIY.

pipeline-monitor · goodrx.com · live (active)
Identity rotation: TLS fingerprint randomised; user-agent rotated; IP pool residential; challenges blocked: 0
Page coverage: 48,291 pages queued (running)
Pipeline health: 99.9% uptime; 142ms p99 latency; 0.3% null rate; 2 alerts
Location spoofing
Geo-targeted residential proxies

GoodRx pricing is hyper-local. We route requests through US residential proxies mapped to specific zip codes, ensuring the prices extracted exactly match what a consumer in that location sees.

JavaScript rendering
Full Playwright execution for pricing widgets

Pricing tables and coupon modals on GoodRx are heavily JavaScript-rendered. We run full Playwright browser sessions to hydrate the DOM and trigger the necessary API calls to reveal the final price.

Anti-bot layer
Bypassing Cloudflare and PerimeterX

GoodRx uses advanced bot protection. Our crawlers spoof TLS fingerprints, manage cookie sessions, and mimic human interaction patterns to maintain high success rates without triggering blocks.

Dynamic tokens
Coupon generation logic

Coupon codes (BIN/PCN) are often generated dynamically per session. Our pipeline handles the entire interaction flow required to generate and extract the valid coupon data.

Change detection
Only emit what's changed

For large drug catalogues across thousands of zip codes, we maintain a hash index of last-seen prices. Subsequent runs only push diffs — reducing downstream processing load.
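The hash-index idea described above can be sketched in a few lines of Python. Everything here is illustrative: the key layout (drug, zip code, pharmacy), the choice of fields that count as a "price change", and the in-memory dict standing in for a persistent index are all assumptions.

```python
import hashlib
import json

# Assumption: a record counts as changed when any of these fields differs.
PRICE_FIELDS = ("retail_price", "coupon_price", "gold_price")


def record_key(record: dict) -> str:
    """Identity of a price point: one drug at one pharmacy in one zip code."""
    return f"{record['drug_id']}|{record['zip_code']}|{record['pharmacy_name']}"


def price_fingerprint(record: dict) -> str:
    """Stable SHA-256 digest of the price-bearing fields only."""
    payload = json.dumps(
        {field: record.get(field) for field in PRICE_FIELDS}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def diff_run(records: list[dict], index: dict[str, str]) -> list[dict]:
    """Return only new or changed records and update the index in place."""
    changed = []
    for record in records:
        key = record_key(record)
        fingerprint = price_fingerprint(record)
        if index.get(key) != fingerprint:
            index[key] = fingerprint
            changed.append(record)
    return changed
```

Hashing only the price fields means that a refreshed `last_updated` timestamp alone does not trigger a new record downstream.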

Applications

Who uses GoodRx data — and how

Teams across industries use goodrx.com data to build competitive products and smarter operations.

01
Pharma Market Access

Pharmaceutical companies monitor out-of-pocket costs and discount card effectiveness across different pharmacy chains.

02
Retail Pharmacy Competitor Pricing

Pharmacy chains track competitor cash prices and GoodRx discount rates in their local catchment areas to adjust their own pricing strategies.

03
Health Insurance & PBMs

PBMs compare their negotiated rates against GoodRx cash prices to ensure competitive formulary design.

04
Telehealth Platforms

Virtual care providers integrate cash price estimates into their prescribing workflows to improve patient medication adherence.

05
Healthcare Market Research

Analysts track generic drug price erosion and pharmacy margin trends to evaluate retail health companies.

06
Consumer Health Apps

Digital health applications ingest pricing data to help their users find the lowest cost medications nearby.

Why DataFlirt

"GoodRx centralises the fragmented US pharmacy pricing market — but extracting that data across 40,000 zip codes requires serious infrastructure."

Most teams underestimate the investment required: reliable GoodRx scraping requires zip-code-specific residential proxies, full JavaScript rendering for pricing widgets, and constant maintenance against bot protection. DataFlirt absorbs that complexity so your engineers can focus on the analysis — not the infrastructure.

Technical Spec

GoodRx scraper — technical capabilities

Everything our goodrx.com scraper supports, from rendered SPA elements to geo-targeted pricing and change detection, along with what it does not cover.

JavaScript rendering
Full Playwright sessions — required for price widgets and coupon generation
Supported
Geo-targeted pricing
Zip-code level accuracy using localized residential proxies
Supported
Coupon code extraction
Capture BIN, PCN, and Group numbers for pharmacy presentation
Supported
Brand/generic mapping
Link brand name drugs to all available generic equivalents
Supported
Pharmacy coordinates
Latitude and longitude for spatial analysis and mapping
Supported
GoodRx Gold member pricing
Extract tiered pricing data for subscription members
Supported
Change detection (diffs)
Hash-based diff: only emit records with changed prices since last run
Supported
Webhook delivery
HTTP POST per record or batch — useful for real-time workflows
Supported
User prescription history
Personal health information (PHI) and individual refill histories
Not supported
GoodRx Gold user profiles
Account-specific details, payment methods, or family member data
Not supported
Infrastructure

Infrastructure powering the GoodRx pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus
Scrapy + Playwright Stack

Scrapy handles crawl orchestration, deduplication, and retry logic. Playwright handles JavaScript rendering, cookie sessions, and interaction flows. The two are combined via the scrapy-playwright download handler.
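For readers unfamiliar with how Scrapy and Playwright are wired together, the fragment below shows the kind of settings the open-source scrapy-playwright package expects. The setting names come from that package; the specific values and the `playwright_request_kwargs` helper are assumptions for illustration and say nothing about DataFlirt's actual configuration.

```python
# Sketch of a Scrapy settings module using scrapy-playwright.

# Route HTTP and HTTPS downloads through the Playwright download handler.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright requires the asyncio-based Twisted reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

# Illustrative values, not recommendations.
PLAYWRIGHT_BROWSER_TYPE = "chromium"
PLAYWRIGHT_LAUNCH_OPTIONS = {"headless": True}
PLAYWRIGHT_DEFAULT_NAVIGATION_TIMEOUT = 30_000  # milliseconds


def playwright_request_kwargs(url: str) -> dict:
    """Keyword arguments for scrapy.Request that opt one URL into rendering.

    Hypothetical helper: only requests carrying meta["playwright"] = True
    are rendered in a browser; all others use Scrapy's normal downloader.
    """
    return {"url": url, "meta": {"playwright": True}}
```

Opting in per request keeps browser sessions, which are far more expensive than plain HTTP fetches, limited to pages that need JavaScript rendering.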

Residential Proxy Infrastructure

We maintain pools of residential ISP proxies across US zip codes. Rotation happens per request, with sticky sessions where required. IP reputation monitoring keeps blocklisted addresses out of the pool.

Cloud-Native Orchestration

Pipelines run on AWS Lambda (burst) and ECS (sustained). Airflow handles scheduling, dependency management, and SLA alerting. All state stored in managed Postgres.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON
Newline-delimited or nested — schema versioned per run
CSV
Flat file with typed columns — Excel/Sheets compatible
XLS
Legacy spreadsheet format for business analysts
Parquet
Columnar format for BigQuery, Snowflake, Athena
AWS S3
Direct bucket delivery — compatible with any data lake
Webhook
HTTP POST per record for real-time downstream processing
API
REST endpoint to query your extracted dataset
BigQuery
Streamed directly into your dataset with schema auto-detect
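To make the newline-delimited JSON option concrete, here is a small sketch of how such a file could be produced. The `write_ndjson` function and the idea of stamping each row with a `schema_version` field are assumptions about one reasonable layout, not a description of the delivered files.

```python
import io
import json
from typing import Iterable, TextIO


def write_ndjson(records: Iterable[dict], stream: TextIO,
                 schema_version: str) -> int:
    """Write one JSON object per line and return the number of rows written."""
    count = 0
    for record in records:
        # Record fields take precedence if they already carry a version.
        row = {"schema_version": schema_version, **record}
        stream.write(json.dumps(row, sort_keys=True, ensure_ascii=False))
        stream.write("\n")
        count += 1
    return count


if __name__ == "__main__":
    buffer = io.StringIO()
    write_ndjson(
        [{"drug_id": "d-12948", "zip_code": "90210", "coupon_price": 9.14}],
        buffer,
        schema_version="v1",  # hypothetical version label
    )
    print(buffer.getvalue(), end="")
```

One object per line lets warehouse loaders and command-line tools process the file as a stream without parsing it whole.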
// faq

Common questions.

About goodrx.com scraping, legality, and pipeline operations.

Is scraping GoodRx legal?

Scraping publicly available pricing information from GoodRx is generally permissible under applicable law. DataFlirt targets only public, non-authenticated drug pricing, coupon, and pharmacy data. We do not extract personal health information (PHI), circumvent authentication walls, or violate HIPAA. Clients should review GoodRx's ToS and consult legal counsel for specific use cases.

How do you handle location-specific pricing?

We use US-based residential proxies that allow us to target specific zip codes. This ensures the pricing data we extract reflects exactly what a consumer in that local market would see.

Can you extract the actual coupon codes?

Yes. Our pipeline interacts with the GoodRx interface to generate and extract the BIN, PCN, and Group numbers required to claim the discount at the pharmacy.

How fresh is the data?

Pipelines can be configured to run daily, weekly, or monthly depending on your requirements. Full catalogue refreshes across multiple zip codes complete within a 12-24 hour window.

Can you track price changes over time?

Yes. Every pipeline run produces timestamped snapshots. We maintain a time-series table per drug and zip code, allowing you to track price volatility over time.
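As a rough picture of a time-series table keyed by drug and zip code, the sketch below uses Python's built-in sqlite3 module as a stand-in for a real warehouse. The table name, column set, and helper functions are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS price_history (
    drug_id       TEXT NOT NULL,
    zip_code      TEXT NOT NULL,
    pharmacy_name TEXT NOT NULL,
    coupon_price  REAL NOT NULL,
    observed_at   TEXT NOT NULL,  -- ISO 8601 UTC timestamp of the run
    PRIMARY KEY (drug_id, zip_code, pharmacy_name, observed_at)
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the snapshot store."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_snapshot(conn: sqlite3.Connection, rows: list[tuple]) -> None:
    """Append one run's rows; re-inserting the same snapshot is a no-op."""
    conn.executemany(
        "INSERT OR IGNORE INTO price_history VALUES (?, ?, ?, ?, ?)", rows
    )
    conn.commit()


def price_series(conn: sqlite3.Connection, drug_id: str, zip_code: str,
                 pharmacy_name: str) -> list[tuple]:
    """Chronological (observed_at, coupon_price) pairs for one price point."""
    cursor = conn.execute(
        "SELECT observed_at, coupon_price FROM price_history "
        "WHERE drug_id = ? AND zip_code = ? AND pharmacy_name = ? "
        "ORDER BY observed_at",
        (drug_id, zip_code, pharmacy_name),
    )
    return cursor.fetchall()
```

Storing timestamps as ISO 8601 UTC strings keeps them sortable as plain text, so ordering by `observed_at` returns the series in chronological order.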

What is the minimum viable engagement?

Our smallest packages start at a defined list of drugs (typically 500-2,000 NDCs) across a set of target zip codes. Contact us with your use case for a scoped quote.

Can I request a sample dataset before committing?

Absolutely. We provide a sample run of up to 50 drugs across 5 zip codes as part of the pre-engagement scoping process — so you can validate schema fit and data quality.

$ dataflirt scope --new-project --source=goodrx.com ready

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a one-off pricing dump or a continuous tracking feed across 40,000 zip codes — we scope, build, and operate the pipeline. Tell us what you need.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h