
Macy's data,
at warehouse scale.

We extract apparel and beauty product listings, sale and clearance pricing, Star Rewards member pricing, size and colour variant data, brand intelligence, and review corpus from Macy's. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.

Products extracted: 1.3M/day
Price updates: 6.1M/24h
Review records: 580K/run
Active pipelines: 119
Uptime: 99.95%
Data Dictionary

Every field we extract from macys.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Product Listings objects from macys.com. All fields typed and schema-versioned.

product_id, title, brand, department, category, sub_category, regular_price, sale_price, currency, discount_pct, clearance_flag, star_rewards_extra_pct, size_options, colour_options, fit_type, material, care_instructions, rating, review_count, image_urls, variant_count, product_url, scraped_at
Sample product_listings record:

{
  "product_id": "MCY-16284039",
  "title": "Lauren Ralph Lauren Ponte Blazer — Navy",
  "brand": "Lauren Ralph Lauren",
  "department": "Women",
  "regular_price": 189.00,
  "sale_price": 113.40,
  "currency": "USD",
  "discount_pct": 40,
  "clearance_flag": false,
  "rating": 4.4,
  "review_count": 847
}

Complete list of extractable fields for Pricing & Sales Events objects from macys.com. All fields typed and schema-versioned.

product_id, regular_price, sale_price, discount_pct, discount_abs, clearance_flag, clearance_depth, sale_event_name, sale_end_date, star_rewards_extra_pct, gift_with_purchase, price_timestamp, currency
Sample pricing_sales_events record:

{
  "product_id": "MCY-16284039",
  "regular_price": 189.00,
  "sale_price": 113.40,
  "discount_pct": 40,
  "sale_event_name": "Friends & Family",
  "sale_end_date": "2026-05-19",
  "star_rewards_extra_pct": 5,
  "gift_with_purchase": false,
  "price_timestamp": "2026-05-12T09:00:00Z"
}
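The derived pricing fields can be recomputed from the two price fields as a consistency check. The sketch below is illustrative only: it assumes discount_abs is the currency difference and discount_pct is a whole-number percentage rounded half up, neither of which is specified above, and the function name derive_discount is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def derive_discount(regular_price: str, sale_price: str) -> dict:
    """Recompute discount_abs and discount_pct from the two price fields."""
    regular = Decimal(regular_price)
    sale = Decimal(sale_price)
    if regular <= 0:
        raise ValueError("regular_price must be positive")
    discount_abs = regular - sale
    # Assumption: discount_pct is stored as a whole-number percentage.
    discount_pct = (discount_abs / regular * 100).quantize(
        Decimal("1"), rounding=ROUND_HALF_UP
    )
    return {"discount_abs": discount_abs, "discount_pct": int(discount_pct)}


# Values taken from the sample record: 189.00 regular, 113.40 on sale.
check = derive_discount("189.00", "113.40")
```

Prices are passed as strings and handled as Decimal so that currency arithmetic does not pick up binary floating-point error.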

Complete list of extractable fields for Size & Colour Variants objects from macys.com. All fields typed and schema-versioned.

product_id, variant_id, colour_name, colour_hex, size_label, size_type, fit_type, in_stock, variant_price, variant_sale_price, variant_image_url
Sample size_colour_variants record:

{
  "product_id": "MCY-16284039",
  "colour_name": "Lauren Navy",
  "colour_hex": "#1B2A4A",
  "size_label": "10",
  "size_type": "Missy",
  "fit_type": "Regular",
  "in_stock": true,
  "variant_sale_price": 113.40
}

Complete list of extractable fields for Reviews & Ratings objects from macys.com. All fields typed and schema-versioned.

review_id, product_id, reviewer_name, verified_purchase, star_rating, review_title, review_body, review_date, helpful_votes, size_purchased, fit_feedback, colour_purchased, image_urls
Sample reviews_ratings record:

{
  "review_id": "mcy_rv_7284193",
  "product_id": "MCY-16284039",
  "star_rating": 5,
  "verified_purchase": true,
  "review_title": "Perfect office blazer — runs true to size",
  "fit_feedback": "True to size",
  "size_purchased": "10 Regular",
  "helpful_votes": 84,
  "review_date": "2026-04-15"
}

Capabilities

Everything you need from Macy's — nothing you don't

Macy's is a department store where product intelligence requires fashion-specific dimensions: full size and colour variant mapping, fit feedback from reviews, sale event classification, clearance depth tracking, and brand-level pricing across hundreds of national and house brands.

Full Apparel & Beauty Product Extraction

Title, brand, department, category, material, care instructions, fit type, and every metadata field Macy's surfaces — scraped at product and variant level.

Size & Colour Variant Mapping

All available colour options (with hex codes), size labels, size types (Petite, Regular, Plus, Tall), fit types, and per-variant stock status and pricing — fully mapped.

Sale Event & Clearance Tracking

Capture regular price, sale price, discount percentage, sale event name (Friends & Family, One Day Sale, etc.), and sale end date — with clearance depth and flag per product.

Fit-Contextualised Review Mining

Reviews on Macy's include structured fit feedback (Runs Small, True to Size, Runs Large) and size purchased — transforming reviews into fit-intelligence data for sizing algorithms.

Brand Intelligence Across 500+ Labels

Macy's carries 500+ brands across apparel, beauty, and home. We extract brand-level pricing, discount intensity, and promotional treatment — enabling brand-by-brand competitive analysis.

Beauty & Fragrance Data

Full beauty product data: shade options, formula type, size variants, gift set structures, and GWP (gift with purchase) offers — across cosmetics, skincare, and fragrance.

Gift Set & GWP Detection

Identify gift set products, their component items, gift-with-purchase offers, and promotional bundle pricing — critical for beauty category seasonal analysis.

Search Rank & Department Rankings

Track product position for any keyword or department-level browse on Macy's — capturing sale badge, clearance flag, Star Rewards extra, and variant count in every result.

Scheduled + Streaming Modes

One-off catalogue exports or continuous pipelines at daily or real-time cadences — aligned to Macy's frequent sale event calendar.

// engagement pipeline

From product ID to warehouse record

Brief in. Clean data out.

Define Scope
Day 0

Specify departments, brand lists, category paths, or product IDs. We design the extraction schema including variant mapping, sale events, and fit feedback fields.

Pipeline Build
Days 2–4

We configure Scrapy / Playwright crawlers, proxy rotation, session management, and colour/size variant navigation for macys.com.

Validation & QA
Days 4–6

Variant completeness audits, sale event classification validation, fit feedback null-rate checks, and sample records before full launch.

Delivery
ongoing

JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.

Under the hood

How our Macy's pipeline handles the hard parts

Fashion e-commerce data complexity peaks at Macy's: dozens of sale event types, hundreds of brand pricing tiers, and thousands of colour-size variant combinations per product page.

pipeline-monitor · macys.com · live
Identity rotation: TLS fingerprint randomised; user-agent rotated; IP pool residential; challenges blocked 0
Page coverage: 48,291 pages queued
Pipeline health: 99.9% uptime; 142ms p99 latency; 0.3% null rate; 2 alerts

Colour-size variant navigation
Full variant matrix extraction per product

Macy's fashion products can have 20+ colour options and 15+ size options, creating variant matrices of 100+ combinations. Our Playwright pipeline systematically navigates colour and size selectors on each product page — capturing per-variant stock status, price, and image URL rather than just the default-display variant.
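The enumeration step can be sketched independently of the browser layer. This is a minimal illustration, not the production crawler: read_variant is a hypothetical callback standing in for the Playwright code that selects a colour and size on the page, and the stub reader and the "Black" colour are made-up test data.

```python
from itertools import product
from typing import Callable, Iterable, List, Optional


def capture_variant_matrix(
    product_id: str,
    colours: Iterable[str],
    sizes: Iterable[str],
    read_variant: Callable[[str, str], Optional[dict]],
) -> List[dict]:
    """Visit every colour/size pair and collect one record per variant.

    read_variant (hypothetical) selects the pair on the page and returns its
    fields, or None when that combination is not offered.
    """
    records = []
    for colour, size in product(colours, sizes):
        fields = read_variant(colour, size)
        if fields is None:
            continue  # combination not sold; skip rather than invent a row
        records.append(
            {"product_id": product_id, "colour_name": colour,
             "size_label": size, **fields}
        )
    return records


def stub_reader(colour: str, size: str) -> Optional[dict]:
    """Stand-in for the browser step, returning made-up values."""
    if (colour, size) == ("Black", "16"):
        return None
    return {"in_stock": size != "14", "variant_sale_price": 113.40}


rows = capture_variant_matrix(
    "MCY-16284039", ["Lauren Navy", "Black"], ["10", "14", "16"], stub_reader
)
```

Keeping the page interaction behind a callback means the matrix logic can be tested without launching a browser.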

Sale event classification
Named event tagging with end-date capture

Macy's runs dozens of named sale events annually — Friends & Family, One Day Sale, VIP Sale, Lowest Prices of the Season. Our parser identifies and tags the sale event name per product alongside the sale price and end date — building an event-annotated pricing history that enables promotional intensity analysis by brand and category.
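One simple way to tag event names is pattern matching on the promotional badge text. The sketch below assumes the event name appears somewhere in a text string scraped from the product page; the regular expressions are illustrative guesses at wording variants, not Macy's actual markup, and classify_sale_event is a hypothetical name.

```python
import re
from typing import Optional

# Event names come from the text above; the patterns are assumptions.
SALE_EVENT_PATTERNS = {
    "Friends & Family": re.compile(r"friends\s*(&|and)\s*family", re.I),
    "One Day Sale": re.compile(r"one[\s-]*day\s+sale", re.I),
    "VIP Sale": re.compile(r"\bvip\s+sale\b", re.I),
    "Lowest Prices of the Season": re.compile(
        r"lowest\s+prices\s+of\s+the\s+season", re.I
    ),
}


def classify_sale_event(badge_text: str) -> Optional[str]:
    """Return the first known event name found in badge_text, else None."""
    for name, pattern in SALE_EVENT_PATTERNS.items():
        if pattern.search(badge_text):
            return name
    return None
```

Returning None for unrecognised text, rather than a default label, keeps unknown events visible as classification gaps.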

Fit feedback extraction
Structured size intelligence from review data

Macy's reviews include structured fit feedback fields — Runs Small, True to Size, Runs Large — and the size purchased by the reviewer. These fields transform reviews into sizing intelligence data, directly informing size curve decisions, fit model training, and return rate reduction analysis.
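As an illustration of how these two fields become sizing signal, the sketch below computes the share of each fit label per size purchased. It assumes review records shaped like the reviews sample in the data dictionary; the function name and the choice to skip reviews missing either field are assumptions.

```python
from collections import Counter, defaultdict

FIT_LABELS = ("Runs Small", "True to Size", "Runs Large")


def fit_distribution(reviews):
    """Share of each fit label per size_purchased.

    Reviews without a size or with an unrecognised fit label are skipped.
    """
    counts = defaultdict(Counter)
    for review in reviews:
        size = review.get("size_purchased")
        fit = review.get("fit_feedback")
        if size and fit in FIT_LABELS:
            counts[size][fit] += 1
    return {
        size: {label: c[label] / sum(c.values()) for label in FIT_LABELS}
        for size, c in counts.items()
    }
```

A size whose "Runs Small" share is well above the others is a candidate for a size-up recommendation.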

Clearance depth tracking
Markdown depth monitoring across departments

Clearance products on Macy's often carry multiple markdown layers. Our pipeline captures clearance flag status and the full discount depth per product — enabling markdown progression analysis across seasons and departments.
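Markdown progression can be derived from successive pricing snapshots of the same product. In this sketch, clearance_depth is assumed to mean percent off regular_price (the field is not defined above), and the snapshot shape follows the pricing sample in the data dictionary.

```python
def markdown_progression(snapshots):
    """Return one step per distinct clearance price, in time order.

    snapshots: dicts with price_timestamp (ISO 8601, same format throughout),
    regular_price, sale_price and clearance_flag.
    """
    steps = []
    last_price = None
    for snap in sorted(snapshots, key=lambda s: s["price_timestamp"]):
        if not snap["clearance_flag"]:
            last_price = None  # left clearance; a later re-entry starts fresh
            continue
        price = snap["sale_price"]
        if price != last_price:
            # Assumption: depth is the percentage off the regular price.
            depth = round((1 - price / snap["regular_price"]) * 100, 1)
            steps.append(
                {
                    "price_timestamp": snap["price_timestamp"],
                    "sale_price": price,
                    "clearance_depth": depth,
                }
            )
            last_price = price
    return steps
```

Runs that repeat an unchanged clearance price add no step, so the output lists only the actual markdown events.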

Monitoring & alerting
24/7 pipeline health with anomaly detection

Every run emits structured logs to our observability stack. We alert on variant coverage drops, sale event classification failures, fit feedback null-rates, and schema drift — and respond before you notice.
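A null-rate alert of the kind described can be as simple as comparing per-field missing rates against thresholds. The sketch below is a generic illustration; the threshold values and function names are assumptions, not the actual alerting configuration.

```python
def null_rate(records, field):
    """Fraction of records where field is absent, None, or empty."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(field) in (None, "", []))
    return missing / len(records)


def null_rate_alerts(records, thresholds):
    """Return {field: observed_rate} for every field above its threshold."""
    alerts = {}
    for field, limit in thresholds.items():
        rate = null_rate(records, field)
        if rate > limit:
            alerts[field] = rate
    return alerts
```

For example, calling null_rate_alerts(batch, {"fit_feedback": 0.3}) on a batch where half the reviews lack fit feedback would flag that field with a rate of 0.5.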

Applications

Who uses Macy's data — and how

Teams across industries use macys.com data to build competitive products and smarter operations.

01
Fashion Brand Pricing & MAP Monitoring

Apparel brands track their own and competitor product pricing, sale event treatment, clearance depth, and discount frequency on Macy's — monitoring MAP compliance and brand equity signals.

02
Size Curve & Fit Intelligence

Fashion brands and sizing platforms use Macy's fit-feedback review data — Runs Small / True to Size / Runs Large, with size purchased — to train sizing algorithms and calibrate size curve decisions.

03
Beauty Market Research

Beauty brands use Macy's product data across cosmetics, skincare, and fragrance to map pricing, shade range, gift set structures, and GWP offer prevalence across competing brands.

04
AI Training Data

ML teams use Macy's product datasets — apparel descriptions, colour names, material fields, fit tags, and review corpora — to train fashion AI for style classification, fit prediction, and recommendation.

05
Retail Analyst & Investor Research

Analysts evaluating Macy's competitive position use brand-level pricing data, promotional intensity signals, and clearance depth trends as indicators of merchandising health and inventory management.

06
Promotional Effectiveness Research

Trade marketing and retail media teams correlate Macy's sale event types, discount depths, and event frequency with review velocity — assessing which promotional structures drive sustained demand versus one-time spike volume.

Why DataFlirt

"Macy's carries over 500 brands across fashion, beauty, and home — and its sale event calendar, variant pricing, and fit-contextualised reviews make it one of the richest and most complex department store datasets in US retail."

Reliable Macy's scraping requires full colour-size variant matrix navigation, named sale event classification, fit feedback extraction from reviews, and clearance depth tracking across departments. DataFlirt builds the fashion-domain-specific pipeline so your brand, research, and sizing teams get complete, structured data — not just the headline price.

Technical Spec

Macy's scraper — technical capabilities

Everything supported by our macys.com scraper — rendered SPA elements, auth walls, rate-limit evasion and beyond.

Capability | Details | Status
JavaScript rendering | Full Playwright sessions; required for variant selectors, pricing widgets, and review tabs | Supported
CAPTCHA bypass | Automated 2Captcha + CapSolver integration with fallback to manual queue | Supported
Residential proxy rotation | ISP-grade US residential IPs, rotated per request | Supported
Full variant matrix capture | All colour and size combinations mapped with individual stock status and pricing | Supported
Sale event classification | Named sale event tagging: Friends & Family, One Day Sale, VIP Sale, and more | Supported
Clearance flag & depth | Clearance status flag and full markdown depth captured per product | Supported
Fit feedback extraction | Structured fit feedback and size purchased extracted from review records | Supported
Beauty gift set detection | Gift set product identification, component listing, and GWP offer capture | Supported
Star Rewards extra capture | Star Rewards additional earn percentage per product where surfaced | Supported
Review pagination | Full review corpus including all star-filter pages and fit feedback fields | Supported
Change detection (diffs) | Hash-based diff: only emit records with changed fields since last run | Supported
Star Rewards account pricing | Personalised Star Rewards tier pricing requires authenticated loyalty account sessions | Partial
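Hash-based change detection, as listed in the capabilities above, can be sketched as follows. This is an illustration under stated assumptions: timestamps such as scraped_at are excluded from the hash so that an unchanged product does not register as a change, SHA-256 over canonical JSON is one reasonable choice rather than the confirmed implementation, and the function names are hypothetical.

```python
import hashlib
import json

# Assumption: capture timestamps change every run and must not affect the hash.
VOLATILE_FIELDS = {"scraped_at", "price_timestamp"}


def record_hash(record):
    """Stable SHA-256 digest of a record, ignoring volatile fields."""
    stable = {k: v for k, v in record.items() if k not in VOLATILE_FIELDS}
    payload = json.dumps(stable, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def changed_records(records, previous_hashes, key="product_id"):
    """Return (records that differ from the last run, hash map for this run)."""
    current = {}
    changed = []
    for record in records:
        digest = record_hash(record)
        current[record[key]] = digest
        if previous_hashes.get(record[key]) != digest:
            changed.append(record)
    return changed, current
```

The returned hash map is what would be persisted between runs; sort_keys=True makes the digest independent of field order.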
Infrastructure

Infrastructure powering the Macy's pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus
Scrapy + Playwright Stack

Scrapy handles crawl orchestration, deduplication, and retry logic. Playwright handles Macy's JavaScript-rendered colour selectors, size variant navigation, and review tab pagination.

Residential Proxy Infrastructure

We maintain pools of US ISP residential proxies. Rotation happens per-request with sticky sessions where required. IP score monitoring prevents blacklisted pool contamination.
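Per-request rotation with optional sticky sessions can be illustrated with a small selector class. This sketch assumes simple round-robin selection and an in-memory session map; the class name, method names, and selection policy are all hypothetical, and IP score monitoring is out of scope here.

```python
import itertools


class ProxyRotator:
    """Round-robin proxy selection with optional sticky sessions."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._cycle = itertools.cycle(list(proxies))
        self._sticky = {}

    def get(self, session_id=None):
        """Return the next proxy, or the pinned proxy for session_id."""
        if session_id is None:
            return next(self._cycle)
        if session_id not in self._sticky:
            self._sticky[session_id] = next(self._cycle)
        return self._sticky[session_id]

    def release(self, session_id):
        """Forget a sticky session so its next request rotates again."""
        self._sticky.pop(session_id, None)
```

A sticky session suits flows where several requests must appear to come from one visitor, such as stepping through the colour and size selectors of a single product page.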

Cloud-Native Orchestration

Pipelines run on AWS Lambda (burst) and ECS (sustained). Airflow handles scheduling, dependency management, and SLA alerting. All state stored in managed Postgres.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON: Newline-delimited or nested; schema versioned per run
CSV: Flat file with typed columns; Excel/Sheets compatible
Parquet: Columnar format for BigQuery, Snowflake, Athena
S3: Direct bucket delivery; compatible with any data lake
BigQuery: Streamed directly into your dataset with schema auto-detect
Webhook: HTTP POST per record for real-time downstream processing
Postgres: Upsert into your existing schema with conflict resolution
Snowflake: Stage + COPY INTO workflow; incremental or full-replace
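For the newline-delimited JSON option, each record is written as one JSON object per line. The sketch below is a minimal illustration; attaching the version as a schema_version key on every record is an assumption about how per-run schema versioning might look, and write_ndjson is a hypothetical name.

```python
import io
import json


def write_ndjson(records, stream, schema_version):
    """Write records to stream as newline-delimited JSON; return the count."""
    count = 0
    for record in records:
        # Assumption: the schema version travels with every record.
        line = json.dumps(
            {"schema_version": schema_version, **record}, ensure_ascii=False
        )
        stream.write(line + "\n")
        count += 1
    return count


buffer = io.StringIO()
written = write_ndjson(
    [{"product_id": "MCY-16284039", "sale_price": 113.40}], buffer, "1"
)
```

ensure_ascii=False keeps non-ASCII characters in titles and colour names readable in the output file.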
// faq

Common questions.

About macys.com scraping, legality, and pipeline operations.

Is scraping Macy's legal?

Scraping publicly available product, pricing, and review data from Macy's is generally treated as permissible under US law. In hiQ v. LinkedIn, the Ninth Circuit held that scraping publicly accessible pages likely does not violate the Computer Fraud and Abuse Act; that ruling does not resolve terms-of-service or contract claims. DataFlirt targets only public, non-authenticated data. We do not extract personal purchase history or loyalty account data. We recommend clients review Macy's ToS independently and consult legal counsel for specific use cases.

Can you capture all colour and size variants per product?

Yes. Our Playwright pipeline systematically navigates the colour and size selector on each product page — capturing every available combination with its individual stock status, price, and variant image URL. This full variant matrix is delivered as a nested structure within each product record.

Can you identify which sale event applies to each product?

Yes. Macy's applies named sale event labels — Friends & Family, One Day Sale, VIP Sale, etc. — per product during promotional periods. Our parser captures the sale event name alongside the sale price and end date, building an event-annotated pricing history across your defined product set.

Can you extract the fit feedback from Macy's reviews?

Yes. Macy's review forms include structured fit feedback fields — Runs Small, True to Size, Runs Large — and the size purchased by the reviewer. These are extracted as separate structured fields per review record, enabling sizing algorithm training and fit model development.

Can you track clearance progression over time?

Yes. Clearance flag and discount depth are captured per product per run. For products that enter clearance, we track markdown progression — from initial clearance price through subsequent markdowns — building a season-end clearance cadence dataset.

Can you scrape beauty and fragrance data including shade variants?

Yes. Beauty and fragrance products have distinct variant structures — shade names, formula types, and size options. Our pipeline handles beauty variant navigation separately from apparel, capturing shade name, shade hex code, size, and per-variant availability.

What's the minimum viable engagement?

Our smallest packages start at a defined product set or department (typically 2,000–20,000 products) with weekly delivery. For brand-level pricing monitoring, seasonal sale event coverage, or fit intelligence programmes, we price based on volume and cadence.

Can I request a sample dataset before committing?

Absolutely. We provide a sample run of up to 300 products with full variant mapping, sale event classification, and review fit feedback as part of the pre-engagement scoping process.

$ dataflirt scope --new-project --source=macys.com ready

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a brand pricing monitor, a full variant catalogue, a sale event history, or a fit-intelligence review corpus — we scope, build, and operate the pipeline.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h