We extract pharmacy-level drug prices, coupon values, generic equivalents, and availability from GoodRx. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.
Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.
Complete list of extractable fields for Drug Information objects from goodrx.com. All fields typed and schema-versioned.
"drug_id": "d-12948", "brand_name": "Lipitor", "generic_name": "Atorvastatin", "drug_class": "Statins", "rx_required": true, "available_forms": ["tablet", "capsule"], "default_quantity": 30
| # | drug_id | brand_name | generic_name | drug_class | description | rx_required |
|---|---|---|---|---|---|---|
Complete list of extractable fields for Pharmacy Pricing objects from goodrx.com. All fields typed and schema-versioned.
"drug_id": "d-12948", "zip_code": "90210", "pharmacy_name": "CVS Pharmacy", "retail_price": 45.99, "coupon_price": 9.14, "discount_pct": 80, "distance_miles": 1.2, "last_updated": "2023-10-25T14:30:00Z"
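As a rough illustration of how `discount_pct` in the sample record relates to the two price fields, the sketch below derives it from `retail_price` and `coupon_price`. The function name and the round-to-nearest-whole-number rule are assumptions made for illustration, not a documented part of the schema.

```python
def discount_pct(retail_price: float, coupon_price: float) -> int:
    """Percentage saved versus retail, rounded to a whole number (assumed rule)."""
    if retail_price <= 0:
        raise ValueError("retail_price must be positive")
    return round((retail_price - coupon_price) / retail_price * 100)


# Values taken from the sample Pharmacy Pricing record above.
print(discount_pct(45.99, 9.14))  # prints 80
```

A check like this can also flag records where the delivered percentage and the two prices disagree.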
| # | drug_id | zip_code | pharmacy_name | pharmacy_chain | retail_price | coupon_price |
|---|---|---|---|---|---|---|
Complete list of extractable fields for Coupon Details objects from goodrx.com. All fields typed and schema-versioned.
"coupon_id": "c-99281", "drug_id": "d-12948", "pharmacy_name": "Walgreens", "bin_number": "015995", "pcn_number": "GDC", "group_number": "DR33", "discount_type": "standard_coupon"
| # | coupon_id | drug_id | pharmacy_name | bin_number | pcn_number | group_number |
|---|---|---|---|---|---|---|
Complete list of extractable fields for Generic Equivalents objects from goodrx.com. All fields typed and schema-versioned.
"brand_name": "Lipitor", "generic_name": "Atorvastatin", "price_difference_pct": 85, "brand_avg_price": 245.0, "generic_avg_price": 12.5, "availability_status": "widely_available", "manufacturer": "Pfizer"
| # | brand_drug_id | generic_drug_id | brand_name | generic_name | price_difference_pct | brand_avg_price |
|---|---|---|---|---|---|---|
Complete list of extractable fields for Pharmacy Locations objects from goodrx.com. All fields typed and schema-versioned.
"name": "Walmart Pharmacy", "chain": "Walmart", "address": "123 Main St", "city": "Beverly Hills", "state": "CA", "zip_code": "90210", "latitude": 34.0736, "longitude": -118.4004
| # | pharmacy_id | name | chain | address | city | state |
|---|---|---|---|---|---|---|
Our GoodRx scraper handles every layer of the platform: location-specific drug pricing, coupon generation, and pharmacy mapping — with geo-targeted proxies and anti-bot circumvention built in.
Capture all available forms, dosages, and quantities for every drug listed on the platform.
Extract prices specific to thousands of individual zip codes using geo-located residential proxies.
Retrieve the exact BIN, PCN, and Group numbers required to claim the discounted price at the pharmacy counter.
Map brand-name drugs to their generic equivalents and calculate the exact price differential across pharmacies.
Extract complete pharmacy metadata including coordinates, operating hours, and chain affiliations.
Capture the standard coupon price alongside the GoodRx Gold membership price for accurate tier comparisons.
Track price fluctuations over time. We maintain a hash index and only emit records when a pharmacy changes its price.
Extract pricing and availability data for GoodRx Care telehealth consultations and lab test services.
Run one-off bulk exports or configure continuous pipelines at daily or weekly cadences.
Brief in. Clean data out.
Provide drug lists, NDC codes, or target zip codes. We design the extraction schema together.
We configure Scrapy / Playwright crawlers, geo-proxy rotation, session management, and bot protection handling for goodrx.com.
We run schema validation, null-rate checks, price-outlier detection, and spot checks of sample coupon codes before full launch.
JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.
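The null-rate and price-outlier checks named in the QA step could look roughly like the sketch below. It is a minimal illustration under stated assumptions: the function names, the median-absolute-deviation rule, the threshold of 5, and every sample record are hypothetical and are not taken from the actual pipeline.

```python
from statistics import median


def null_rates(records, fields):
    """Fraction of records in which each field is missing or None."""
    total = len(records)
    if total == 0:
        return {f: 0.0 for f in fields}
    return {f: sum(r.get(f) is None for r in records) / total for f in fields}


def price_outliers(records, field="coupon_price", threshold=5.0):
    """Flag records whose price sits more than `threshold` MADs from the median."""
    priced = [r for r in records if r.get(field) is not None]
    if not priced:
        return []
    med = median([r[field] for r in priced])
    mad = median([abs(r[field] - med) for r in priced])
    if mad == 0:
        return []  # every price is (nearly) identical, nothing to flag
    return [r for r in priced if abs(r[field] - med) / mad > threshold]


# Made-up sample batch: one likely decimal-point error and one empty row.
records = [
    {"pharmacy_name": "Pharmacy A", "coupon_price": 9.14},
    {"pharmacy_name": "Pharmacy B", "coupon_price": 10.25},
    {"pharmacy_name": "Pharmacy C", "coupon_price": 8.80},
    {"pharmacy_name": "Pharmacy D", "coupon_price": 9.60},
    {"pharmacy_name": "Pharmacy E", "coupon_price": 914.0},
    {"pharmacy_name": None, "coupon_price": None},
]

print(null_rates(records, ["pharmacy_name", "coupon_price"]))
print([r["pharmacy_name"] for r in price_outliers(records)])
```

A median-based rule is used here because a single extreme price would distort a mean and standard deviation computed over a small batch.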
GoodRx protects its pricing widgets heavily. Here is how we stay resilient — and why teams choose managed infrastructure over DIY.
GoodRx pricing is hyper-local. We route requests through US residential proxies mapped to specific zip codes, ensuring the prices extracted exactly match what a consumer in that location sees.
Pricing tables and coupon modals on GoodRx are heavily JavaScript-rendered. We run full Playwright browser sessions to hydrate the DOM and trigger the necessary API calls to reveal the final price.
GoodRx uses advanced bot protection. Our crawlers spoof TLS fingerprints, manage cookie sessions, and mimic human interaction patterns to maintain high success rates without triggering blocks.
Coupon codes (BIN/PCN) are often generated dynamically per session. Our pipeline handles the entire interaction flow required to generate and extract the valid coupon data.
For large drug catalogues across thousands of zip codes, we maintain a hash index of last-seen prices. Subsequent runs only push diffs — reducing downstream processing load.
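The hash-index idea above can be sketched as follows. This is a minimal illustration rather than the production pipeline: the record key (`drug_id`, `zip_code`, `pharmacy_name`), the use of SHA-256, and the function names are assumptions, and the in-memory dict stands in for whatever persistent store holds the index.

```python
import hashlib
import json


def record_key(rec):
    """Identify one price point: a drug at a pharmacy in a zip code (assumed key)."""
    return (rec["drug_id"], rec["zip_code"], rec["pharmacy_name"])


def price_hash(rec):
    """Stable digest of the price fields only, so unrelated fields never trigger a diff."""
    payload = json.dumps(
        {"retail_price": rec["retail_price"], "coupon_price": rec["coupon_price"]},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def emit_changes(records, index):
    """Return only new or changed records and update the index in place."""
    changed = []
    for rec in records:
        key, digest = record_key(rec), price_hash(rec)
        if index.get(key) != digest:
            index[key] = digest
            changed.append(rec)
    return changed


index = {}
run_1 = [{"drug_id": "d-12948", "zip_code": "90210", "pharmacy_name": "CVS Pharmacy",
          "retail_price": 45.99, "coupon_price": 9.14}]
print(len(emit_changes(run_1, index)))  # first sighting is emitted
print(len(emit_changes(run_1, index)))  # unchanged price is suppressed

# Hypothetical later run in which the coupon price has moved.
run_2 = [dict(run_1[0], coupon_price=8.75)]
print(len(emit_changes(run_2, index)))  # changed price is emitted again
```

Hashing only the price fields keeps volatile metadata such as `last_updated` from producing spurious diffs.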
Pharmaceutical companies monitor out-of-pocket costs and discount card effectiveness across different pharmacy chains.
Pharmacy chains track competitor cash prices and GoodRx discount rates in their local catchment areas to adjust their own pricing strategies.
PBMs compare their negotiated rates against GoodRx cash prices to ensure competitive formulary design.
Virtual care providers integrate cash price estimates into their prescribing workflows to improve patient medication adherence.
Analysts track generic drug price erosion and pharmacy margin trends to evaluate retail health companies.
Digital health applications ingest pricing data to help their users find the lowest cost medications nearby.
"GoodRx centralises the fragmented US pharmacy pricing market — but extracting that data across 40,000 zip codes requires serious infrastructure."
Most teams underestimate the investment required: reliable GoodRx scraping requires zip-code-specific residential proxies, full JavaScript rendering for pricing widgets, and constant maintenance against bot protection. DataFlirt absorbs that complexity so your engineers can focus on the analysis — not the infrastructure.
Everything supported by our goodrx.com scraper: rendered SPA elements, rate-limit evasion, and beyond.
Open-source tooling on proven cloud infra — no vendor lock-in, full observability.
Scrapy handles crawl orchestration, deduplication, and retry logic. Playwright handles JavaScript rendering, cookie sessions, and interaction flows. The two are combined via the scrapy-playwright download handler.
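For readers unfamiliar with the pairing, scrapy-playwright is wired into Scrapy through its download-handler settings. The fragment below is a hedged sketch of what a project's `settings.py` might contain. The setting names are those documented by scrapy-playwright and Scrapy, while the specific values (Chromium, headless mode, three retries) and the `PRICE_PAGE_META` name are illustrative assumptions.

```python
# Hypothetical settings.py fragment for a Scrapy project using scrapy-playwright.

# Route HTTP and HTTPS downloads through the Playwright-backed handler.
DOWNLOAD_HANDLERS = {
    "http": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
    "https": "scrapy_playwright.handler.ScrapyPlaywrightDownloadHandler",
}

# scrapy-playwright requires the asyncio-based Twisted reactor.
TWISTED_REACTOR = "twisted.internet.asyncioreactor.AsyncioSelectorReactor"

# Assumed values, chosen only for illustration.
PLAYWRIGHT_BROWSER_TYPE = "chromium"
PLAYWRIGHT_LAUNCH_OPTIONS = {"headless": True}
RETRY_TIMES = 3  # Scrapy's built-in retry middleware

# In a spider (not in settings.py), a request opts in to browser rendering
# by carrying this key in its meta dict; other requests stay plain HTTP.
PRICE_PAGE_META = {"playwright": True}
```

Because rendering is opted into per request, only pages that need a hydrated DOM pay the cost of a full browser session.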
We maintain pools of residential ISP proxies across US zip codes. Rotation happens per request, with sticky sessions where required. IP-score monitoring keeps blacklisted addresses out of the pool.
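Per-request rotation with optional sticky sessions can be pictured with the small sketch below. Everything in it is hypothetical: the `ZipProxyPool` class, the proxy URLs, and the idea of keying stickiness on a caller-supplied session id are assumptions for illustration, not a description of the actual proxy layer.

```python
import itertools


class ZipProxyPool:
    """Round-robin proxy selection per zip code, with optional sticky sessions."""

    def __init__(self, proxies_by_zip):
        # One endless round-robin iterator per zip code.
        self._cycles = {z: itertools.cycle(p) for z, p in proxies_by_zip.items()}
        self._sticky = {}

    def get(self, zip_code, session_id=None):
        """Return the next proxy for a zip, or the pinned one for a known session."""
        sticky_key = (zip_code, session_id)
        if session_id is not None and sticky_key in self._sticky:
            return self._sticky[sticky_key]
        proxy = next(self._cycles[zip_code])
        if session_id is not None:
            self._sticky[sticky_key] = proxy
        return proxy

    def release(self, zip_code, session_id):
        """Forget a sticky session once its interaction flow is finished."""
        self._sticky.pop((zip_code, session_id), None)


# Placeholder proxy endpoints for one zip code.
pool = ZipProxyPool({"90210": ["http://proxy-a.example:8000",
                               "http://proxy-b.example:8000"]})

print(pool.get("90210"))  # rotates on every call
print(pool.get("90210"))
print(pool.get("90210", session_id="coupon-flow-1"))  # pinned for this session
print(pool.get("90210", session_id="coupon-flow-1"))  # same proxy again
```

Sticky sessions matter for multi-step flows such as coupon generation, where changing IP address mid-flow could invalidate the session.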
Pipelines run on AWS Lambda (burst) and ECS (sustained). Airflow handles scheduling, dependency management, and SLA alerting. All state is stored in managed Postgres.
Data delivered to where your team already works — no new tooling required.
About goodrx.com scraping, legality, and pipeline operations.
Scraping publicly available pricing information from GoodRx is generally permissible under applicable law. DataFlirt targets only public, non-authenticated drug pricing, coupon, and pharmacy data. We do not extract personal health information (PHI), circumvent authentication walls, or violate HIPAA. Clients should review GoodRx's ToS and consult legal counsel for specific use cases.
We use US-based residential proxies that allow us to target specific zip codes. This ensures the pricing data we extract reflects exactly what a consumer in that local market would see.
Yes. Our pipeline interacts with the GoodRx interface to generate and extract the BIN, PCN, and Group numbers required to claim the discount at the pharmacy.
Pipelines can be configured to run daily, weekly, or monthly depending on your requirements. Full catalogue refreshes across multiple zip codes complete within a 12-24 hour window.
Yes. Every pipeline run produces timestamped snapshots. We maintain a time-series table per drug and zip code, allowing you to track price volatility over time.
Our smallest packages start at a defined list of drugs (typically 500-2,000 NDCs) across a set of target zip codes. Contact us with your use case for a scoped quote.
Absolutely. We provide a sample run of up to 50 drugs across 5 zip codes as part of the pre-engagement scoping process — so you can validate schema fit and data quality.
20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a one-off pricing dump or a continuous tracking feed across 40,000 zip codes — we scope, build, and operate the pipeline. Tell us what you need.