
Medplus data,
at warehouse scale.

We extract medicine catalogues, active salt compositions, pricing signals, substitute mappings, and lab test packages from Medplus. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.

Medicines extracted: 142K /day
Price updates: 315K /24h
Lab tests: 4,200 /run
Active pipelines: 64
Uptime: 99.98%
Data Dictionary

Every field we extract from medplus.in

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Medicines & Drugs objects from medplus.in. All fields typed and schema-versioned.

Fields: sku, product_name, manufacturer, composition, packaging, mrp, selling_price, prescription_required, stock_status, url
Sample record (medicines & drugs):
{
  "sku": "AUGM0001",
  "product_name": "Augmentin 625 Duo Tablet",
  "manufacturer": "Glaxo SmithKline Pharmaceuticals Ltd",
  "composition": "Amoxycillin 500 MG + Clavulanic Acid 125 MG",
  "packaging": "10 Tablets in 1 Strip",
  "mrp": 204.5,
  "selling_price": 173.82,
  "prescription_required": true
}

Complete list of extractable fields for Pricing & Offers objects from medplus.in. All fields typed and schema-versioned.

Fields: sku, pin_code, mrp, discount_pct, selling_price, advantage_price, flexi_rewards, coupon_eligible, timestamp
Sample record (pricing & offers):
{
  "sku": "AUGM0001",
  "pin_code": "560001",
  "mrp": 204.5,
  "discount_pct": 15,
  "selling_price": 173.82,
  "advantage_price": 163.6,
  "flexi_rewards": 17,
  "timestamp": "2026-05-12T09:14:00Z"
}

Complete list of extractable fields for Substitutes objects from medplus.in. All fields typed and schema-versioned.

Fields: primary_sku, primary_name, substitute_sku, substitute_name, substitute_mrp, manufacturer, composition_match_pct, price_diff_pct
Sample record (substitutes):
{
  "primary_sku": "AUGM0001",
  "primary_name": "Augmentin 625 Duo Tablet",
  "substitute_sku": "MOXI0045",
  "substitute_name": "Moxikind-CV 625 Tablet",
  "substitute_mrp": 165.0,
  "manufacturer": "Mankind Pharma Ltd",
  "composition_match_pct": 100,
  "price_diff_pct": -19.3
}
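As a worked check on the substitutes sample, the price_diff_pct of -19.3 is consistent with comparing the substitute's MRP (165.0) against the primary's MRP (204.5, taken from the medicines sample): (165.0 - 204.5) / 204.5 ≈ -19.3%. The page does not state the formula, so the definition below is an assumption and the function name is hypothetical.

```python
def price_diff_pct(primary_mrp: float, substitute_mrp: float) -> float:
    """Percentage difference of the substitute's MRP relative to the primary's MRP.

    Assumed definition; a negative value means the substitute is cheaper.
    """
    if primary_mrp <= 0:
        raise ValueError("primary_mrp must be positive")
    return round((substitute_mrp - primary_mrp) / primary_mrp * 100, 1)


# Augmentin 625 Duo (MRP 204.5) vs Moxikind-CV 625 (MRP 165.0)
print(price_diff_pct(204.5, 165.0))  # -19.3
```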

Complete list of extractable fields for Lab Tests objects from medplus.in. All fields typed and schema-versioned.

Fields: test_id, test_name, parameters_count, fasting_required, home_collection, mrp, offer_price, turnaround_time
Sample record (lab tests):
{
  "test_id": "LAB_CBP_01",
  "test_name": "Complete Blood Picture (CBP)",
  "parameters_count": 24,
  "fasting_required": false,
  "home_collection": true,
  "mrp": 450.0,
  "offer_price": 380.0,
  "turnaround_time": "24 Hours"
}

Complete list of extractable fields for OTC & FMCG objects from medplus.in. All fields typed and schema-versioned.

Fields: sku, category, sub_category, brand, product_name, pack_size, mrp, selling_price, rating, image_url
Sample record (OTC & FMCG):
{
  "sku": "DETT0023",
  "category": "Personal Care",
  "sub_category": "Soaps & Body Wash",
  "brand": "Dettol",
  "product_name": "Dettol Original Bathing Soap",
  "pack_size": "125g x 4",
  "mrp": 220.0,
  "selling_price": 198.0
}

Capabilities

Everything you need from Medplus - nothing you don't

Our Medplus scraper handles every layer of the platform: medicine catalogues, dynamic pricing, PIN-code-localized stock logic, and substitute mappings, with JavaScript rendering and session management built in.

Full Medicine Catalogue

Extract product names, manufacturers, active salt compositions, packaging details, and prescription requirements across the entire drug index.

PIN Code Localization

Capture accurate pricing and stock availability by setting specific geographic cookies and session headers per request.

Salt & Composition Mapping

Extract exact active ingredients and their respective milligram dosages to map generic equivalents across brands.

Substitute Discovery

Scrape Medplus substitute recommendations, capturing price differentials and composition match percentages.

Lab Test Extraction

Extract diagnostic packages, individual test parameters, fasting requirements, and home collection availability.

Medplus Advantage Pricing

Capture standard MRP, standard discount prices, and gated Medplus Advantage tier pricing simultaneously.

Stock Availability Tracking

Monitor out-of-stock flags and inventory depth indicators across different regional fulfilment centres.

OTC & FMCG Coverage

Extract consumer healthcare products, personal care items, and nutritional supplements with category hierarchies.

Scheduled + Streaming Modes

Run one-off bulk exports or configure continuous pipelines at daily cadences with change-detection diffing.
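To illustrate the Salt & Composition Mapping capability above, here is a minimal sketch of turning a composition string into comparable salt/strength pairs. It assumes compositions follow the "<salt> <number> <unit>" pattern joined by "+", as in the sample record; the names `Salt`, `parse_composition`, and `composition_key` are hypothetical, and real catalogue strings will likely need more patterns.

```python
import re
from typing import NamedTuple


class Salt(NamedTuple):
    name: str
    strength: float
    unit: str


# Assumed pattern: "<salt name> <number> <unit>", with components joined by "+".
_COMPONENT = re.compile(
    r"^\s*(?P<name>.+?)\s+(?P<strength>\d+(?:\.\d+)?)\s*(?P<unit>MG|MCG|G|ML|IU)\s*$",
    re.IGNORECASE,
)


def parse_composition(composition: str) -> list[Salt]:
    """Split a composition string into its active salts and dosages."""
    salts = []
    for part in composition.split("+"):
        match = _COMPONENT.match(part)
        if match is None:
            raise ValueError(f"unrecognised component: {part!r}")
        salts.append(
            Salt(match["name"].strip(), float(match["strength"]), match["unit"].upper())
        )
    return salts


def composition_key(composition: str) -> frozenset:
    """Order- and case-insensitive key for grouping generic equivalents."""
    return frozenset(
        (salt.name.lower(), salt.strength, salt.unit)
        for salt in parse_composition(composition)
    )
```

Two products whose `composition_key` values are equal contain the same salts at the same strengths, which is one simple way to group generic equivalents across brands.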

// engagement pipeline

From SKU list to warehouse record

Brief in. Clean data out.

Define Scope
Day 0

Provide categories, salt names, or PIN codes. We design the extraction schema together.

Pipeline Build
Days 2–4

We configure Scrapy / Playwright crawlers, proxy rotation, and location session management for medplus.in.

Validation & QA
Days 4–6

Schema validation, null-rate checks, and price-outlier detection before full launch.

Delivery
ongoing

JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.
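The Validation & QA step above could look roughly like the sketch below: a per-record type check followed by a simple price sanity rule. The `SCHEMA` subset, the `MAX_DISCOUNT` threshold, and the function name are assumptions for illustration, not the production rules.

```python
# Assumed expected types for a subset of the Medicines & Drugs fields.
SCHEMA = {
    "sku": str,
    "product_name": str,
    "mrp": (int, float),
    "selling_price": (int, float),
    "prescription_required": bool,
}
MAX_DISCOUNT = 0.90  # assumed sanity threshold, not a documented value


def validate_record(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record passed."""
    problems = []
    for field, expected in SCHEMA.items():
        value = record.get(field)
        if value is None:
            problems.append(f"{field}: missing or null")
        elif not isinstance(value, expected) or (
            isinstance(value, bool) and expected is not bool
        ):
            # bool is a subclass of int, so reject it explicitly for numeric fields.
            problems.append(f"{field}: unexpected type {type(value).__name__}")
    if problems:
        return problems
    mrp, price = record["mrp"], record["selling_price"]
    if price > mrp:
        problems.append("selling_price exceeds mrp")
    elif mrp > 0 and (mrp - price) / mrp > MAX_DISCOUNT:
        problems.append("discount exceeds sanity threshold")
    return problems
```

Records that return a non-empty list would be held back for review rather than delivered.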

Under the hood

How our Medplus pipeline handles the hard parts

Healthcare e-commerce platforms use dynamic sessions for localized pricing. Here is how we maintain data accuracy at scale.

pipeline-monitor · medplus.in · live
Identity rotation: TLS fingerprint randomised, user-agent rotated, residential IP pool, 0 challenges blocked
Page coverage: 48,291 pages queued
Pipeline health: 99.9% uptime, 142ms p99 latency, 0.3% null rate, 2 alerts
Session management
PIN code specific cookie injection

Medplus requires a valid location session to display accurate stock and pricing. Our crawlers inject and maintain specific PIN code cookies across request chains, ensuring the data reflects exactly what a local user sees.
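A minimal sketch of PIN code cookie injection with Playwright's sync API is shown below. The cookie name and domain are placeholders (the real values used by the site are not stated here), and `build_pin_cookie` and `new_localized_context` are hypothetical helper names. The `browser` argument is expected to be a Playwright `Browser` object, whose `new_context()` and `add_cookies()` calls are part of the public Playwright API.

```python
PIN_COOKIE_NAME = "pin_code"   # hypothetical cookie name, for illustration only
COOKIE_DOMAIN = ".medplus.in"  # assumed domain, taken from the source named on this page


def build_pin_cookie(pin_code: str) -> dict:
    """Build a cookie dict in the shape accepted by BrowserContext.add_cookies()."""
    # Indian PIN codes are six digits and never start with 0.
    valid = (
        len(pin_code) == 6
        and pin_code.isascii()
        and pin_code.isdigit()
        and pin_code[0] != "0"
    )
    if not valid:
        raise ValueError(f"not a valid PIN code: {pin_code!r}")
    return {
        "name": PIN_COOKIE_NAME,
        "value": pin_code,
        "domain": COOKIE_DOMAIN,
        "path": "/",
    }


def new_localized_context(browser, pin_code: str):
    """Create an isolated browser context pinned to a single PIN code."""
    context = browser.new_context()
    context.add_cookies([build_pin_cookie(pin_code)])
    return context
```

Using one context per PIN code keeps cookies from different locations from leaking into each other across a request chain.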

JavaScript rendering
Playwright execution for dynamic pricing

Medplus Advantage prices and specific promotional discounts load asynchronously. We run full Playwright browser sessions to trigger lazy-loads and hydrate dynamic price widgets before extraction.

Schema stability
Resilient selectors for medical data

Medical compositions and substitute tables have complex DOM structures. Our selector strategy uses structured data extraction and text-pattern matching so minor layout updates do not break your pipeline.
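One way to picture this two-layer strategy is sketched below: prefer embedded structured data when a page exposes it, and fall back to a text pattern that does not depend on CSS classes. Whether a given page carries JSON-LD with a schema.org-style `offers.price` is an assumption, and `extract_listed_price` is a hypothetical name.

```python
import json
import re
from typing import Optional

_JSON_LD = re.compile(
    r'<script[^>]*type="application/ld\+json"[^>]*>(.*?)</script>',
    re.DOTALL | re.IGNORECASE,
)
# First rupee amount in the markup, e.g. "₹ 173.82" or "Rs. 204.50".
_RUPEE_AMOUNT = re.compile(r"(?:₹|Rs\.?)\s*(\d+(?:\.\d+)?)")


def extract_listed_price(html: str) -> Optional[float]:
    """Return a price from structured data if present, else from a text pattern."""
    # 1) Structured data tends to survive layout changes.
    for block in _JSON_LD.findall(html):
        try:
            data = json.loads(block)
            offers = data.get("offers") if isinstance(data, dict) else None
            if isinstance(offers, dict) and "price" in offers:
                return float(offers["price"])
        except (TypeError, ValueError):
            continue  # malformed block: try the next one, then the fallback
    # 2) Fallback: match on visible text rather than on element classes.
    match = _RUPEE_AMOUNT.search(html)
    return float(match.group(1)) if match else None
```

Because neither layer references class names or element positions, a renamed wrapper element would not by itself change the result.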

Change detection
Only re-scrape what changed

For massive drug catalogues, we maintain a hash index of last-seen values per field. Subsequent runs only push diffs, reducing compute cost and downstream processing load.
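The hash-index idea can be sketched as follows, assuming an in-memory dict stands in for the persistent index and SHA-256 over a canonical JSON encoding stands in for the hash; both choices, and the function names, are illustrative assumptions.

```python
import hashlib
import json


def field_hashes(record: dict) -> dict[str, str]:
    """Hash each field value separately so changes can be reported per field."""
    return {
        field: hashlib.sha256(
            json.dumps(value, sort_keys=True, separators=(",", ":")).encode("utf-8")
        ).hexdigest()
        for field, value in record.items()
    }


def diff_record(key: str, record: dict, index: dict[str, dict[str, str]]) -> dict:
    """Return only fields whose hash differs from the last-seen value, then update the index."""
    new_hashes = field_hashes(record)
    old_hashes = index.get(key, {})
    changed = {
        field: record[field]
        for field, digest in new_hashes.items()
        if old_hashes.get(field) != digest
    }
    index[key] = new_hashes
    return changed


index = {}
diff_record("AUGM0001", {"mrp": 204.5, "selling_price": 173.82}, index)  # first run: every field
diff_record("AUGM0001", {"mrp": 204.5, "selling_price": 169.0}, index)   # only selling_price
```

An empty result means nothing changed and the record can be skipped downstream. This sketch does not report fields that disappear between runs; that would need a separate check.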

Monitoring & alerting
24/7 pipeline health checks

Every run emits structured logs to our observability stack. We alert on null-rate spikes, missing MRP fields, and schema drift, responding before you notice.
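A null-rate check of the kind described above might be sketched like this. The 1% default threshold and the function names are assumptions, and treating an empty batch as fully null is a deliberate design choice so that an empty run raises an alert instead of passing silently.

```python
def null_rates(records: list[dict], fields: list[str]) -> dict[str, float]:
    """Fraction of records in which each field is missing, None, or an empty string."""
    total = len(records)
    if total == 0:
        # An empty batch is treated as 100% null for every monitored field.
        return {field: 1.0 for field in fields}
    return {
        field: sum(1 for record in records if record.get(field) in (None, "")) / total
        for field in fields
    }


def fields_over_threshold(rates: dict[str, float], threshold: float = 0.01) -> list[str]:
    """Names of fields whose null rate exceeds the alert threshold."""
    return sorted(field for field, rate in rates.items() if rate > threshold)
```

Comparing each run's rates against the previous run, rather than against a fixed threshold, is one way to catch spikes on fields that are legitimately sparse.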

Applications

Who uses Medplus data - and how

Teams across industries use medplus.in data to build competitive products and smarter operations.

01
Competitor Price Monitoring

e-Pharmacies track Medplus discount structures and Advantage pricing to adjust their own promotional strategies.

02
Market Share Analysis

Pharma manufacturers monitor product visibility, stock availability, and category placement across regional Medplus hubs.

03
Substitute & Generic Mapping

Healthcare platforms extract active salt compositions to build generic medicine recommendation engines.

04
Hyperlocal Stock Intelligence

Supply chain analysts track out-of-stock indicators across specific PIN codes to identify regional distribution gaps.

05
Healthcare AI Training

ML teams use structured medicine and composition datasets to train clinical NLP models and dosage classifiers.

06
Insurance Claim Validation

Insurtech firms ingest standard MRP data to validate pharmacy claims and detect overbilling automatically.

Why DataFlirt

"Medplus holds India's most structured digital pharmacy catalogue, but extracting hyperlocal pricing and exact salt compositions requires dedicated infrastructure."

Most teams underestimate the investment: reliable Medplus scraping means handling PIN-code-specific session cookies, rendering dynamic Advantage pricing, and mapping complex active ingredients. DataFlirt absorbs that complexity so your engineers can focus on the analysis, not the infrastructure.

Technical Spec

Medplus scraper - technical capabilities

Everything supported by our medplus.in scraper — rendered SPA elements, auth walls, rate-limit evasion and beyond.

Capability | Details | Status
JavaScript rendering | Full Playwright sessions required for Medplus Advantage price widgets | Supported
PIN code localized pricing | Accurate MRP and stock status via injected location cookies | Supported
Substitute mapping | Extraction of primary to substitute relationships and price differences | Supported
Lab test parameter extraction | Full diagnostic package breakdown including fasting rules | Supported
Change detection (diffs) | Hash-based diff: only emit records with changed fields since last run | Supported
FlexiRewards calculation | Extraction of reward point accumulation per product | Supported
Historical MRP tracking | Time-series storage of price fluctuations over time | Supported
Webhook delivery | HTTP POST per record or batch for real-time workflows | Supported
User prescription records | Extraction of uploaded patient prescriptions and medical history | Partial
Patient order history | Accessing past purchases requires authenticated user sessions | Partial
Infrastructure

Infrastructure powering the Medplus pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus
Scrapy + Playwright Stack

Scrapy handles crawl orchestration and retry logic. Playwright handles JavaScript rendering, location cookie injection, and interaction flows.

Residential Proxy Infrastructure

We maintain pools of residential ISP proxies across India. Rotation happens per-request with sticky sessions for consistent localized pricing.

Cloud-Native Orchestration

Pipelines run on AWS Lambda and ECS. Airflow handles scheduling, dependency management, and SLA alerting. All state is stored in managed Postgres.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON: Newline-delimited or nested, schema versioned per run
CSV: Flat file with typed columns for Excel/Sheets
XLS: Legacy spreadsheet format for business analysts
Parquet: Columnar format for BigQuery, Snowflake, Athena
AWS S3: Direct bucket delivery compatible with any data lake
Webhook: HTTP POST per record for downstream processing
API: Queryable REST endpoints for on-demand extraction
PostgreSQL: Upsert into your existing schema with conflict resolution
BigQuery: Streamed directly into your dataset with schema auto-detect
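For the newline-delimited JSON option listed above, a delivery file could be produced along the lines of this sketch. The `schema_version` and `run_at` envelope fields and the function name are assumptions for illustration; the actual file layout is agreed during scoping.

```python
import json
from datetime import datetime, timezone


def to_ndjson(records: list[dict], schema_version: str, run_at: datetime) -> str:
    """Serialize records as newline-delimited JSON, one object per line."""
    lines = []
    for record in records:
        # Stamp every line so consumers can tell which schema and run produced it.
        envelope = {
            "schema_version": schema_version,
            "run_at": run_at.isoformat(),
            **record,
        }
        lines.append(json.dumps(envelope, ensure_ascii=False, separators=(",", ":")))
    return ("\n".join(lines) + "\n") if lines else ""


payload = to_ndjson(
    [{"sku": "AUGM0001", "mrp": 204.5}, {"sku": "DETT0023", "mrp": 220.0}],
    schema_version="1.0.0",
    run_at=datetime(2026, 5, 12, 9, 14, tzinfo=timezone.utc),
)
```

Each line parses independently, so a consumer can stream the file without loading it whole.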
// faq

Common questions.

About medplus.in scraping, legality, and pipeline operations.

Ask us directly →
Is scraping Medplus legal?

Scraping publicly available information from Medplus is generally permissible. DataFlirt targets only public, non-authenticated medicine catalogues, pricing, and lab test data. We do not extract personal patient data or prescriptions, and we do not violate privacy regulations. Clients should review platform terms and consult legal counsel for specific use cases.

How do you handle PIN code specific pricing?

Medplus alters stock and pricing based on the user's location. We inject specific PIN code cookies and location headers into our Playwright sessions, so the extracted data matches what a user in that geography sees.

Can you extract medicine substitutes?

Yes. We map primary medicines to their recommended substitutes, capturing the exact composition match percentage and the price differential between the brands.

How fresh is the data?

Full catalogue refreshes run at a daily cadence and complete within a 6-12 hour window. For specific high-priority SKUs, we can configure streaming pipelines that check for price and stock updates hourly.

Do you scrape Medplus lab tests?

Yes. We extract diagnostic packages, individual test parameters, fasting requirements, pricing, and home collection availability.

What is the minimum viable engagement?

Our packages start at a defined category list or SKU set with weekly delivery. For full catalogue extraction across multiple PIN codes, we price based on volume and delivery frequency.

Can I request a sample dataset before committing?

Absolutely. We provide a sample run of up to 500 SKUs or specific categories as part of the pre-engagement scoping process so you can validate schema fit and data quality.

$ dataflirt scope --new-project --source=medplus.in ready

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a one-off medicine catalogue dump or a continuous price-monitoring feed across multiple PIN codes - we scope, build, and operate the pipeline. Tell us what you need.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h