
Shopee data,
SEA's marketplace, at scale.

We extract product listings, pricing signals, flash deal windows, Shopee Mall shop intelligence, reviews, sold counts, and live commerce data from Shopee. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.

Products extracted: 2.1M /day
Price updates: 9.4M /24h
Review records: 730K /run
Active pipelines: 134
Uptime: 99.96%

Data Dictionary

Every field we extract from shopee.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Product Listings objects from shopee.com. All fields typed and schema-versioned.

item_id, shop_id, title, brand, category, sub_category, price, price_min, price_max, currency, discount_pct, in_stock, stock_quantity, sold_count, liked_count, shopee_mall_badge, preferred_seller_badge, shopee_choice_badge, cb_option, rating, rating_count, rating_star, shipping_fee_min, estimated_days, image_urls, variation_count, page_url

product_listings · 200 OK
{
  "item_id": "24718291930",
  "title": "SOMETHINC Niacinamide + Moisture Barrier Serum 20ml",
  "brand": "SOMETHINC",
  "price": 89000,
  "currency": "IDR",
  "discount_pct": 30,
  "shopee_mall_badge": true,
  "sold_count": 82400,
  "rating": 4.9,
  "rating_count": 14820,
  "in_stock": true
}

Complete list of extractable fields for Pricing & Flash Deals objects from shopee.com. All fields typed and schema-versioned.

item_id, price, price_before_discount, discount_pct, flash_sale_price, flash_sale_start, flash_sale_end, flash_sale_stock, flash_sale_sold, shopee_coins_cashback, coins_cashback_pct, voucher_code, bundle_deal_price, free_shipping_eligible, shopee_xpress_eligible, price_timestamp, currency

pricing_and_flash_deals · 200 OK
{
  "item_id": "24718291930",
  "price": 89000,
  "price_before_discount": 127000,
  "discount_pct": 30,
  "flash_sale_price": 71200,
  "flash_sale_end": "2026-05-12T22:00:00+07:00",
  "flash_sale_stock": 200,
  "flash_sale_sold": 183,
  "shopee_coins_cashback": 89,
  "price_timestamp": "2026-05-12T14:00:00Z"
}

Complete list of extractable fields for Reviews & Ratings objects from shopee.com. All fields typed and schema-versioned.

review_id, item_id, shop_id, reviewer_name, reviewer_level, star_rating, comment, review_date, helpful_votes, sku_reviewed, variation_reviewed, image_urls, video_url, shop_reply, shop_reply_date, country

reviews_and_ratings · 200 OK
{
  "review_id": "spe_rv_77391040",
  "item_id": "24718291930",
  "star_rating": 5,
  "comment": "Pemakaian 2 minggu sudah terlihat hasilnya, kulit lebih cerah!",
  "sku_reviewed": "20ml",
  "helpful_votes": 221,
  "shop_reply": true,
  "review_date": "2026-04-28"
}

Complete list of extractable fields for Shop Intelligence objects from shopee.com. All fields typed and schema-versioned.

shop_id, shop_name, shop_url, shopee_mall_official, preferred_seller, shop_rating, response_rate, ship_on_time_rate, follower_count, active_listings_count, joined_since, chat_performance, country, primary_categories

shop_intelligence · 200 OK
{
  "shop_id": "somethinc-id-official",
  "shop_name": "SOMETHINC Official Store",
  "shopee_mall_official": true,
  "shop_rating": 4.94,
  "response_rate": 99,
  "ship_on_time_rate": 98,
  "follower_count": 1840000,
  "active_listings_count": 287
}

Complete list of extractable fields for Search & Rankings objects from shopee.com. All fields typed and schema-versioned.

keyword, country, position, item_id, title, shop_id, price, discount_pct, sold_count, shopee_mall_badge, shopee_choice_badge, flash_sale_badge, free_shipping_badge, sponsored, thumbnail_url, scraped_at

search_and_rankings · 200 OK
{
  "keyword": "niacinamide serum",
  "country": "ID",
  "position": 1,
  "item_id": "24718291930",
  "sponsored": false,
  "shopee_mall_badge": true,
  "flash_sale_badge": true,
  "scraped_at": "2026-05-12T14:00:18Z"
}
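
On the receiving side, records can be checked against the documented field names before they are loaded. The following is a minimal sketch in Python, assuming the field types visible in the product_listings sample; the `ProductListing` class and `parse_listing` helper are hypothetical names used for illustration and are not part of any delivered tooling.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ProductListing:
    """Subset of product_listings fields; types are inferred from the sample record."""
    item_id: str
    title: str
    brand: str
    price: int
    currency: str
    discount_pct: int
    shopee_mall_badge: bool
    sold_count: int
    rating: float
    rating_count: int
    in_stock: bool


def parse_listing(raw: dict) -> ProductListing:
    """Build a ProductListing, rejecting missing fields and mismatched types."""
    values = {}
    for f in fields(ProductListing):
        if f.name not in raw:
            raise ValueError(f"missing field: {f.name}")
        value = raw[f.name]
        if f.type is float and type(value) is int:
            value = float(value)  # JSON may encode 5.0 as 5
        # bool is a subclass of int in Python, so compare exact types.
        if type(value) is not f.type:
            raise TypeError(
                f"{f.name}: expected {f.type.__name__}, got {type(value).__name__}"
            )
        values[f.name] = value
    return ProductListing(**values)


# The sample record from the product_listings preview.
sample = {
    "item_id": "24718291930",
    "title": "SOMETHINC Niacinamide + Moisture Barrier Serum 20ml",
    "brand": "SOMETHINC",
    "price": 89000,
    "currency": "IDR",
    "discount_pct": 30,
    "shopee_mall_badge": True,
    "sold_count": 82400,
    "rating": 4.9,
    "rating_count": 14820,
    "in_stock": True,
}
listing = parse_listing(sample)
```

Extra keys in the incoming record are ignored here; a stricter loader could reject them instead.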

Capabilities

Everything you need from Shopee — nothing you don't

Shopee operates across six Southeast Asian markets plus Taiwan, with flash deals, Coins cashback, live commerce, and Shopee Mall creating pricing complexity that raw scrapes miss. Our pipeline handles all of it, from flash sale stock depth to sold count velocity.

Full Product Data Extraction

Title, brand, category, images, SKU variants, sold count, liked count, and every metadata field Shopee surfaces — at item level with full variation mapping across all markets.

Flash Deal Monitoring

Capture flash sale price, countdown window, total stock, remaining stock, and units sold during flash period — updated at elevated cadence during active deal windows.

Coins Cashback & Voucher Tracking

Shopee Coins cashback percentage, voucher codes, bundle deal pricing, and free-shipping eligibility — the promotion stack that makes Shopee's effective price materially different from the listed price.

Sold Count as Demand Signal

Shopee displays cumulative sold counts prominently. We capture sold_count and liked_count per item — making Shopee one of the few platforms where demand velocity is a directly observable metric.

Shopee Mall Shop Intelligence

Shop rating, response rate, on-time shipping rate, follower count, Shopee Mall official status, Preferred Seller badge, and Shopee Choice flag — per shop across all markets.

Review Mining with Media

Full review text, star ratings, SKU reviewed, helpful votes, shop replies, and review images and video thumbnails — paginated across all review pages in local language.

Seven-Market Coverage

shopee.co.id, shopee.co.th, shopee.com.my, shopee.com.ph, shopee.sg, shopee.vn, and shopee.tw — all from a unified pipeline with market-level tagging and currency normalisation.

Live Commerce & Shopee Video Signals

Capture live stream view counts, product links featured in streams, and Shopee Video engagement metrics — emerging demand signals for social commerce intelligence.

Scheduled + Flash-Triggered Modes

Run daily pipelines or elevate cadence during 9.9, 10.10, 11.11, 12.12, and mid-month Shopee campaigns — with pre/during/post snapshots for price and rank analysis.
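
The switch between daily and elevated cadence can be pictured as a small scheduling rule. The sketch below is illustrative only: the one-day pre/post window and the hourly and six-hourly intervals are assumed values, the `crawl_interval` name is hypothetical, and mid-month campaigns are left out because their dates are not given here.

```python
from datetime import date, timedelta

# Double-date campaigns named in the text: 9.9, 10.10, 11.11, 12.12.
CAMPAIGN_DAYS = {(9, 9), (10, 10), (11, 11), (12, 12)}


def crawl_interval(day: date, window_days: int = 1) -> timedelta:
    """Return an elevated cadence near a campaign day, daily otherwise."""
    for offset in range(-window_days, window_days + 1):
        nearby = day + timedelta(days=offset)
        if (nearby.month, nearby.day) in CAMPAIGN_DAYS:
            # Assumed values: hourly on the campaign day itself,
            # six-hourly for the pre and post snapshots around it.
            return timedelta(hours=1) if offset == 0 else timedelta(hours=6)
    return timedelta(days=1)


on_campaign = crawl_interval(date(2026, 11, 11))
day_before = crawl_interval(date(2026, 11, 10))
ordinary_day = crawl_interval(date(2026, 5, 12))
```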

// engagement pipeline

From item list to warehouse record

Brief in. Clean data out.

Define Scope (day 0)

Provide item ID lists, category URLs, keyword sets, shop IDs, or country-market filters. We design the extraction schema together — including which markets, price fields, and deal-tracking cadences you need.

Pipeline Build (days 2–4)

We configure Scrapy / Playwright crawlers with per-country residential proxies, sold count capture, flash deal monitoring logic, and Coins cashback calculation for Shopee.

Validation & QA (days 4–6)

Schema validation, sold count cross-verification, price-outlier checks, flash deal logic testing, and sample reviews before full launch.

Delivery (ongoing)

JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.

Under the hood

How our Shopee pipeline handles the hard parts

Shopee's seven-market footprint, API-driven architecture, and flash-deal complexity require a pipeline built specifically for it — not adapted from a single-market B2C scraper.

pipeline-monitor · shopee.com · live · active
// fingerprinting
Identity rotation
TLS fingerprint: randomised
User-agent: rotated
IP pool: residential
Challenges blocked: 0
// pagination
Page coverage
48,291 pages queued, running
// observability
Pipeline health
Uptime: 99.9%
p99 latency: 142ms
Null rate: 0.3%
Alerts: 2

API intercept
Shopee's internal API — structured data at the source

Shopee's web frontend communicates with an internal JSON API that returns product, pricing, and review data in structured form. Our Playwright sessions intercept these API calls — giving us reliable, schema-stable data that doesn't break when Shopee updates its frontend CSS or React component tree.
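
To make the idea concrete, here is a minimal sketch of a response handler that keeps JSON bodies from API-looking URLs and ignores everything else. It assumes Playwright's synchronous Python API, where a handler can be registered with `page.on("response", handler)` and each response exposes `url`, `headers`, and `json()`. The `/api/` path marker, the `make_collector` name, and the `FakeResponse` stand-in are hypothetical and exist only so the example runs without a browser.

```python
import json
from typing import Any, Callable

API_MARKER = "/api/"  # hypothetical path fragment; the real endpoints are not listed here


def make_collector(sink: list) -> Callable[[Any], None]:
    """Return a response handler that stores JSON bodies from matching URLs."""
    def on_response(response: Any) -> None:
        if API_MARKER not in response.url:
            return  # static assets, HTML, images
        if "application/json" not in response.headers.get("content-type", ""):
            return
        try:
            sink.append({"url": response.url, "payload": response.json()})
        except ValueError:
            pass  # malformed body; skip rather than break the crawl
    return on_response


class FakeResponse:
    """Offline stand-in exposing the same attributes the handler reads."""
    def __init__(self, url: str, headers: dict, body: str) -> None:
        self.url, self.headers, self._body = url, headers, body

    def json(self) -> Any:
        return json.loads(self._body)


records: list = []
collector = make_collector(records)
collector(FakeResponse("https://example.test/api/item?id=1",
                       {"content-type": "application/json"}, '{"item_id": "1"}'))
collector(FakeResponse("https://example.test/static/app.css",
                       {"content-type": "text/css"}, "body{}"))
# With a real browser session the same handler would be attached as:
#   page.on("response", make_collector(records))
```

Because the handler reads the JSON payload rather than the rendered DOM, a change to CSS classes or component structure would not affect what it captures.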

Flash deal depth
Stock depth and sell-through rate during flash windows

Flash deals on Shopee move fast. We capture flash_sale_stock (total allocated) and flash_sale_sold (units sold during the deal) at each crawl — giving you sell-through rate over time, not just a snapshot of the final cleared state.
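
Sell-through at each crawl is simply flash_sale_sold divided by flash_sale_stock. The sketch below shows that calculation over a series of snapshots; the `sell_through_series` name is hypothetical, and the two earlier snapshots are invented for illustration (only the 14:00 values come from the sample record).

```python
from typing import Iterable


def sell_through_series(snapshots: Iterable[dict]) -> list:
    """Return (price_timestamp, sold / stock) pairs in time order."""
    series = []
    # ISO-8601 strings sort chronologically when they share one format and offset.
    for snap in sorted(snapshots, key=lambda s: s["price_timestamp"]):
        stock = snap["flash_sale_stock"]
        if stock <= 0:
            continue  # no allocation recorded, so the rate is undefined
        series.append((snap["price_timestamp"], snap["flash_sale_sold"] / stock))
    return series


snapshots = [
    {"price_timestamp": "2026-05-12T14:00:00Z", "flash_sale_stock": 200, "flash_sale_sold": 183},
    {"price_timestamp": "2026-05-12T12:00:00Z", "flash_sale_stock": 200, "flash_sale_sold": 40},
    {"price_timestamp": "2026-05-12T13:00:00Z", "flash_sale_stock": 200, "flash_sale_sold": 120},
]
series = sell_through_series(snapshots)
```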

Multi-market proxy routing
Seven residential proxy pools — one for each market

Shopee's anti-bot detection is country-specific. We maintain dedicated residential ISP proxy pools for ID, TH, MY, PH, SG, VN, and TW — routing each request through a country-appropriate proxy with matching browser locale and timezone settings.
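
Country-matched routing amounts to a lookup from market code to browser settings. The sketch below builds keyword arguments in the shape accepted by Playwright's `browser.new_context()` (`locale`, `timezone_id`, `proxy`). The locale and timezone choices are reasonable defaults we are assuming for each market, and the proxy address is a placeholder, not a real endpoint.

```python
# Assumed locale and IANA timezone per market; adjust to the target audience.
MARKETS = {
    "ID": {"locale": "id-ID", "timezone_id": "Asia/Jakarta"},
    "TH": {"locale": "th-TH", "timezone_id": "Asia/Bangkok"},
    "MY": {"locale": "ms-MY", "timezone_id": "Asia/Kuala_Lumpur"},
    "PH": {"locale": "en-PH", "timezone_id": "Asia/Manila"},
    "SG": {"locale": "en-SG", "timezone_id": "Asia/Singapore"},
    "VN": {"locale": "vi-VN", "timezone_id": "Asia/Ho_Chi_Minh"},
    "TW": {"locale": "zh-TW", "timezone_id": "Asia/Taipei"},
}


def context_options(country: str, proxy_server: str) -> dict:
    """Return browser-context settings for one market."""
    market = MARKETS[country.upper()]  # KeyError for unsupported markets
    return {**market, "proxy": {"server": proxy_server}}


# Placeholder proxy address for illustration only.
options = context_options("id", "http://proxy-id.example.test:8000")
# With Playwright this would be used as: browser.new_context(**options)
```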

Sold count tracking
Demand velocity from a directly observable signal

Shopee surfaces cumulative sold counts publicly on product pages. We capture sold_count per item per run — and because we run on a consistent schedule, the delta between runs gives you a sold velocity time-series: a demand signal unavailable on most other platforms.
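
The velocity series is the difference between consecutive sold_count readings divided by the time between them. A minimal sketch follows; the `sold_velocity` name is hypothetical, and the earlier reading of 82,150 is invented for illustration (82,400 is the value in the sample record).

```python
from datetime import datetime


def parse_ts(ts: str) -> datetime:
    """Parse an ISO-8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def sold_velocity(runs: list) -> list:
    """Turn (timestamp, sold_count) readings into per-period deltas."""
    ordered = sorted(runs, key=lambda r: parse_ts(r[0]))
    periods = []
    for (t0, c0), (t1, c1) in zip(ordered, ordered[1:]):
        hours = (parse_ts(t1) - parse_ts(t0)).total_seconds() / 3600
        delta = c1 - c0
        periods.append({
            "period_end": t1,
            "units_sold": delta,
            "units_per_hour": delta / hours if hours > 0 else None,
            # A cumulative count should not fall; flag it for review if it does.
            "regression": delta < 0,
        })
    return periods


velocity = sold_velocity([
    ("2026-05-12T14:00:00Z", 82400),
    ("2026-05-11T14:00:00Z", 82150),
])
```

The result is only as even as the crawl schedule, which is why a consistent run cadence matters for this signal.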

Monitoring & alerting
24/7 pipeline health with anomaly detection

Every run emits structured logs to our observability stack. We alert on sold count regressions, flash deal logic failures, null-rate spikes, and schema drift — and respond before you notice.
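
Three of these checks are simple enough to sketch: null rate across expected fields, unexpected fields as a sign of schema drift, and a sold count lower than the previous run. The `run_alerts` name and the 1% null-rate threshold are assumptions for illustration, not the thresholds used in production.

```python
def run_alerts(records, expected_fields, previous_sold, max_null_rate=0.01):
    """Return human-readable alerts for one run's records."""
    alerts = []

    # Null rate: share of expected values that are missing or None.
    total = len(records) * len(expected_fields)
    nulls = sum(1 for r in records for f in expected_fields if r.get(f) is None)
    if total and nulls / total > max_null_rate:
        alerts.append(f"null-rate spike: {nulls / total:.1%}")

    # Schema drift: fields that appear but are not in the agreed schema.
    seen = set().union(*(r.keys() for r in records))
    unexpected = seen - set(expected_fields)
    if unexpected:
        alerts.append(f"schema drift: unexpected fields {sorted(unexpected)}")

    # Sold count regression: a cumulative count that went down.
    for r in records:
        prev = previous_sold.get(r.get("item_id"))
        cur = r.get("sold_count")
        if prev is not None and cur is not None and cur < prev:
            alerts.append(f"sold count regression: item {r['item_id']} {prev} -> {cur}")
    return alerts


expected = ["item_id", "price", "sold_count"]
batch = [
    {"item_id": "a", "price": 100, "sold_count": 50},
    {"item_id": "b", "price": None, "sold_count": 9, "promo_tag": "x"},
]
alerts = run_alerts(batch, expected, previous_sold={"a": 40, "b": 10})
```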

Applications

Who uses Shopee data — and how

Teams across industries use shopee.com data to build competitive products and smarter operations.

01. Sold Count Demand Intelligence

Brands and analysts track sold count velocity per item over time — using Shopee's publicly visible demand signal as a near-real-time proxy for consumer purchase behaviour across SEA.

02. Flash Deal & Campaign Pricing

Competitive intelligence teams monitor flash deal prices, sell-through rates, and stock allocation during 9.9, 11.11, and 12.12 campaigns — to benchmark and respond to competitor promotional strategies.

03. Brand & MAP Monitoring

Brands monitor Shopee Mall and third-party seller listings for minimum advertised price (MAP) violations, counterfeit products, and unauthorised resellers across all seven Shopee markets simultaneously.

04. Multi-Market Category Research

Consumer goods companies track category pricing, brand ranking, and new product launch velocity across all SEA markets — to inform pan-regional distribution and pricing strategy.

05. AI Training Data

ML teams use Shopee's multilingual review corpus — spanning Bahasa Indonesia, Thai, Vietnamese, Filipino, and Malay — to train sentiment models and regional NLP classifiers.

06. Investor & Analyst Research

Analysts track Shopee's category growth, Shopee Mall penetration, and live commerce adoption across markets as indicators of Sea Limited's eCommerce strategy execution.

Why DataFlirt

"Shopee publicly shows sold counts — making it one of the only major platforms where demand velocity is directly observable. That data is only valuable if you're capturing it consistently over time."

Reliable Shopee intelligence requires per-country proxy pools, API intercept for stable data extraction, flash deal stock-depth monitoring, and multi-market schema normalisation. DataFlirt operates unified Shopee pipelines across all seven markets — delivering sold count time-series, flash deal analytics, and campaign snapshots on your cadence.

Technical Spec

Shopee scraper — technical capabilities

Everything supported by our shopee.com scraper, from rendered SPA elements to CAPTCHA and rate-limit handling, with credentialed data marked where support is only partial.

JavaScript rendering (Supported): Full Playwright sessions, required for flash deal overlays and dynamic content
Internal API intercept (Supported): Shopee's JSON API intercepted for stable, schema-reliable product and pricing data
Flash deal monitoring (Supported): flash_sale_price, stock, and sold units captured at elevated cadence during deal windows
Sold count time-series (Supported): sold_count captured per run; the delta between runs gives a demand velocity time-series
CAPTCHA bypass (Supported): Automated 2Captcha + CapSolver integration with fallback to manual queue
Per-country proxy pools (Supported): Residential ISP IPs for ID / TH / MY / PH / SG / VN / TW, country-matched per request
Seven-market coverage (Supported): shopee.co.id, .co.th, .com.my, .com.ph, .sg, .vn, .tw from a unified pipeline
Coins cashback capture (Supported): shopee_coins_cashback (absolute value) and coins_cashback_pct (percentage) captured per item
Review pagination (Supported): Full review corpus with SKU, variation, and media metadata per review
Sponsored ad detection (Supported): Distinguishes organic vs sponsored placements in SERP and category results
Change detection (diffs) (Supported): Hash-based diff that emits only records with fields changed since the last run
Buyer account / order data (Partial): Purchase history, private vouchers, and account-specific pricing require credentials

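
The hash-based change detection listed above can be sketched as follows: hash a canonical form of each record, compare it with the hash stored from the previous run, and emit only records whose hash differs. Excluding timestamp fields from the hash and keying on item_id are assumptions made for this example, and the function names are hypothetical.

```python
import hashlib
import json

# Assumption: capture timestamps change every run and should not count as a diff.
IGNORED_FIELDS = ("scraped_at", "price_timestamp")


def record_hash(record: dict) -> str:
    """Hash a record's content in a key-order-independent way."""
    payload = {k: v for k, v in record.items() if k not in IGNORED_FIELDS}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_records(records: list, previous_hashes: dict, key: str = "item_id"):
    """Return (records that changed, hashes to store for the next run)."""
    changed, current = [], {}
    for record in records:
        digest = record_hash(record)
        current[record[key]] = digest
        if previous_hashes.get(record[key]) != digest:
            changed.append(record)
    return changed, current


run1 = [{"item_id": "1", "price": 89000, "scraped_at": "2026-05-11T14:00:00Z"},
        {"item_id": "2", "price": 5000, "scraped_at": "2026-05-11T14:00:00Z"}]
run2 = [{"item_id": "1", "price": 71200, "scraped_at": "2026-05-12T14:00:00Z"},
        {"item_id": "2", "price": 5000, "scraped_at": "2026-05-12T14:00:00Z"}]

first_emit, hashes = changed_records(run1, {})        # first run: everything is new
second_emit, hashes = changed_records(run2, hashes)   # only item 1 changed price
```
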
Infrastructure

Infrastructure powering the Shopee pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies (SEA × 7), Docker, Kubernetes, Grafana, Prometheus

Scrapy + Playwright Stack with API Intercept

Scrapy handles crawl orchestration and retry logic. Playwright drives JavaScript rendering and Shopee's internal API intercept. The API intercept layer provides structured, schema-stable product and pricing data regardless of frontend updates.

Seven-Country Residential Proxy Infrastructure

We maintain dedicated residential ISP proxy pools for all seven Shopee markets. Each request is routed through a country-matched proxy with the appropriate locale, timezone, and browser fingerprint settings.

Cloud-Native Orchestration

Pipelines run on AWS Lambda (burst) and ECS (sustained). Airflow handles scheduling, campaign-calendar alignment, and SLA alerting. All state stored in managed Postgres.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON: Newline-delimited or nested, schema versioned per run
CSV: Flat file with typed columns, Excel/Sheets compatible
Parquet: Columnar format for BigQuery, Snowflake, Athena
S3: Direct bucket delivery, compatible with any data lake
BigQuery: Streamed directly into your dataset with schema auto-detect
Webhook: HTTP POST per record for real-time downstream processing
Postgres: Upsert into your existing schema with conflict resolution
Snowflake: Stage + COPY INTO workflow, incremental or full-replace

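
As an illustration of the newline-delimited JSON option, the sketch below serialises records one per line and stamps each with a schema version. The `schema_version` field name, the `to_ndjson` helper, and the version string are hypothetical; the actual delivered layout is agreed during scoping.

```python
import json


def to_ndjson(records: list, schema_version: str) -> str:
    """Serialise records as newline-delimited JSON, one object per line."""
    lines = (
        # ensure_ascii=False keeps local-language text readable in the file.
        json.dumps({"schema_version": schema_version, **record}, ensure_ascii=False)
        for record in records
    )
    return "".join(line + "\n" for line in lines)


output = to_ndjson(
    [
        {"item_id": "24718291930", "price": 89000, "currency": "IDR"},
        {"review_id": "spe_rv_77391040", "comment": "kulit lebih cerah!"},
    ],
    schema_version="1.0",
)
```
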
// faq

Common questions.

About shopee.com scraping, legality, and pipeline operations.

Is scraping Shopee legal?

Scraping publicly available information from Shopee is generally permissible under applicable law across Southeast Asia and Taiwan. US precedents such as hiQ v. LinkedIn point in the same direction, though they are not binding in these markets. DataFlirt targets only public, non-authenticated product, pricing, review, and sold count data. We do not extract personal data or circumvent authentication walls. We recommend clients review Shopee's ToS independently and consult legal counsel for specific use cases.

Which Shopee markets do you support?

We support all seven Shopee storefronts: shopee.co.id (Indonesia), shopee.co.th (Thailand), shopee.com.my (Malaysia), shopee.com.ph (Philippines), shopee.sg (Singapore), shopee.vn (Vietnam), and shopee.tw (Taiwan) — delivered via a unified, market-normalised schema with a country tag per record.

Can you track sold counts over time to get demand velocity?

Yes. We capture sold_count per item on every pipeline run. Because runs happen at a consistent cadence, the delta between consecutive runs gives you a demand velocity time-series — showing how many units sold in each period. This is one of Shopee's most distinctive data signals.

Can you monitor flash deal stock depth during campaigns?

Yes. During flash deal windows, we capture flash_sale_price, flash_sale_stock (total allocated), and flash_sale_sold (units sold so far) at elevated cadence — giving you sell-through rate over time, not just the final state after the deal clears.

How do you handle Shopee's anti-bot protection?

We use per-country residential ISP proxies, full Playwright browser sessions with country-appropriate locale and fingerprint settings, and Shopee's internal API intercept layer as a stable data source that is less sensitive to bot detection than rendered HTML scraping. We monitor block rates in real time and rotate pools automatically.

Can I request a sample dataset before committing?

Yes. We provide a sample run of up to 500 items per market — including pricing, sold count, flash deal, and shop data — as part of pre-engagement scoping, so you can validate schema fit and data quality before signing any contract.

$ dataflirt scope --new-project --source=shopee.com ready

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need sold count time-series across seven SEA markets, a flash deal monitoring feed, or a multilingual review corpus — we scope, build, and operate the pipeline. Tell us what you need.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h