
Macy's data,
at warehouse scale.

We extract apparel and beauty product listings, sale and clearance pricing, Star Rewards member pricing, size and colour variant data, brand intelligence, and the review corpus from Macy's. Data is delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.


Products extracted

1.3M /day

Price updates

6.1M /24h

Review records

580K /run

Active pipelines

119

Uptime

99.95%

Data Dictionary

Every field we extract from macys.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Product Listings objects from macys.com. All fields typed and schema-versioned.

product_id, title, brand, department, category, sub_category, regular_price, sale_price, currency, discount_pct, clearance_flag, star_rewards_extra_pct, size_options, colour_options, fit_type, material, care_instructions, rating, review_count, image_urls, variant_count, product_url, scraped_at

{
  "product_id": "MCY-16284039",
  "title": "Lauren Ralph Lauren Ponte Blazer — Navy",
  "brand": "Lauren Ralph Lauren",
  "department": "Women",
  "regular_price": 189.00,
  "sale_price": 113.40,
  "currency": "USD",
  "discount_pct": 40,
  "clearance_flag": false,
  "rating": 4.4,
  "review_count": 847
}

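To make "typed" concrete, here is a minimal sketch of a type check against the sample record above. It is an illustration under stated assumptions, not our production schema tooling: `PRODUCT_SCHEMA` and `validate_record` are hypothetical names, and the types are inferred from the sample values only.

```python
# Hypothetical type map inferred from the sample record; not an official schema.
PRODUCT_SCHEMA = {
    "product_id": str,
    "title": str,
    "brand": str,
    "department": str,
    "regular_price": float,
    "sale_price": float,
    "currency": str,
    "discount_pct": int,
    "clearance_flag": bool,
    "rating": float,
    "review_count": int,
}


def validate_record(record: dict, schema: dict) -> list:
    """Return a list of readable type errors (empty if the record is valid)."""
    errors = []
    for field, expected in schema.items():
        if field not in record:
            errors.append(f"{field}: missing")
            continue
        value = record[field]
        if expected is float:
            # Accept ints for float fields, but never bools (bool subclasses int).
            ok = isinstance(value, (int, float)) and not isinstance(value, bool)
        elif expected is int:
            ok = isinstance(value, int) and not isinstance(value, bool)
        else:
            ok = isinstance(value, expected)
        if not ok:
            errors.append(
                f"{field}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return errors


# Values mirror the sample record above (title shortened for brevity).
sample = {
    "product_id": "MCY-16284039",
    "title": "Lauren Ralph Lauren Ponte Blazer",
    "brand": "Lauren Ralph Lauren",
    "department": "Women",
    "regular_price": 189.00,
    "sale_price": 113.40,
    "currency": "USD",
    "discount_pct": 40,
    "clearance_flag": False,
    "rating": 4.4,
    "review_count": 847,
}

print(validate_record(sample, PRODUCT_SCHEMA))  # []
```

A check like this would typically run before delivery, so a string where a number is expected is caught before it reaches your warehouse.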

Complete list of extractable fields for Pricing & Sales Events objects from macys.com. All fields typed and schema-versioned.

product_id, regular_price, sale_price, discount_pct, discount_abs, clearance_flag, clearance_depth, sale_event_name, sale_end_date, star_rewards_extra_pct, gift_with_purchase, price_timestamp, currency

{
  "product_id": "MCY-16284039",
  "regular_price": 189.00,
  "sale_price": 113.40,
  "discount_pct": 40,
  "sale_event_name": "Friends & Family",
  "sale_end_date": "2026-05-19",
  "star_rewards_extra_pct": 5,
  "gift_with_purchase": false,
  "price_timestamp": "2026-05-12T09:00:00Z"
}

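The pricing sample above is internally consistent: 113.40 is exactly 40% off 189.00. The sketch below shows one way `discount_abs` and `discount_pct` could be derived from the two price fields. It assumes prices arrive as decimal strings and that the percentage is rounded to a whole number; the function name `derive_discount` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def derive_discount(regular_price: str, sale_price: str) -> dict:
    """Derive discount_abs (currency units) and discount_pct (whole percent)."""
    regular = Decimal(regular_price)
    sale = Decimal(sale_price)
    if regular <= 0:
        raise ValueError("regular_price must be positive")
    discount_abs = (regular - sale).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    pct = (discount_abs / regular * 100).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return {"discount_abs": discount_abs, "discount_pct": int(pct)}


result = derive_discount("189.00", "113.40")
print(result["discount_abs"], result["discount_pct"])  # 75.60 40
```

Using `Decimal` rather than floats avoids binary rounding noise in currency arithmetic.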

Complete list of extractable fields for Size & Colour Variants objects from macys.com. All fields typed and schema-versioned.

product_id, variant_id, colour_name, colour_hex, size_label, size_type, fit_type, in_stock, variant_price, variant_sale_price, variant_image_url

{
  "product_id": "MCY-16284039",
  "colour_name": "Lauren Navy",
  "colour_hex": "#1B2A4A",
  "size_label": "10",
  "size_type": "Missy",
  "fit_type": "Regular",
  "in_stock": true,
  "variant_sale_price": 113.40
}


Complete list of extractable fields for Reviews & Ratings objects from macys.com. All fields typed and schema-versioned.

review_id, product_id, reviewer_name, verified_purchase, star_rating, review_title, review_body, review_date, helpful_votes, size_purchased, fit_feedback, colour_purchased, image_urls

{
  "review_id": "mcy_rv_7284193",
  "product_id": "MCY-16284039",
  "star_rating": 5,
  "verified_purchase": true,
  "review_title": "Perfect office blazer — runs true to size",
  "fit_feedback": "True to size",
  "size_purchased": "10 Regular",
  "helpful_votes": 84,
  "review_date": "2026-04-15"
}


Capabilities

Everything you need from Macy's — nothing you don't

Macy's is a department store where product intelligence requires fashion-specific dimensions: full size and colour variant mapping, fit feedback from reviews, sale event classification, clearance depth tracking, and brand-level pricing across hundreds of national and house brands.

Full Apparel & Beauty Product Extraction

Title, brand, department, category, material, care instructions, fit type, and every metadata field Macy's surfaces — scraped at product and variant level.

Size & Colour Variant Mapping

All available colour options (with hex codes), size labels, size types (Petite, Regular, Plus, Tall), fit types, and per-variant stock status and pricing — fully mapped.

Sale Event & Clearance Tracking

Capture regular price, sale price, discount percentage, sale event name (Friends & Family, One Day Sale, etc.), and sale end date — with clearance depth and flag per product.

Fit-Contextualised Review Mining

Reviews on Macy's include structured fit feedback (Runs Small, True to Size, Runs Large) and size purchased — transforming reviews into fit-intelligence data for sizing algorithms.

Brand Intelligence Across 500+ Labels

Macy's carries 500+ brands across apparel, beauty, and home. We extract brand-level pricing, discount intensity, and promotional treatment — enabling brand-by-brand competitive analysis.

Beauty & Fragrance Data

Full beauty product data: shade options, formula type, size variants, gift set structures, and GWP (gift with purchase) offers — across cosmetics, skincare, and fragrance.

Gift Set & GWP Detection

Identify gift set products, their component items, gift-with-purchase offers, and promotional bundle pricing — critical for beauty category seasonal analysis.

Search Rank & Department Rankings

Track product position for any keyword or department-level browse on Macy's — capturing sale badge, clearance flag, Star Rewards extra, and variant count in every result.

Scheduled + Streaming Modes

One-off catalogue exports or continuous pipelines at daily or real-time cadences — aligned to Macy's frequent sale event calendar.

// engagement pipeline

From product ID to warehouse record

Brief in. Clean data out.

Define Scope

Day 0

Specify departments, brand lists, category paths, or product IDs. We design the extraction schema including variant mapping, sale events, and fit feedback fields.

Pipeline Build

Days 2–4

We configure Scrapy / Playwright crawlers, proxy rotation, session management, and colour/size variant navigation for macys.com.

Validation & QA

Days 4–6

Variant completeness audits, sale event classification validation, fit feedback null-rate checks, and sample records before full launch.

Delivery

ongoing

JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.
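As a rough illustration of the delivery step, the sketch below serialises records to JSON Lines and CSV using only the Python standard library. It is an assumption about the output shape, not a description of our delivery code; `to_jsonl` and `to_csv` are hypothetical helpers. Parquet output and the upload to S3, BigQuery, or Snowflake would need third-party libraries and are left out.

```python
import csv
import io
import json


def to_jsonl(records):
    """One JSON object per line, with keys sorted for stable diffs."""
    return "".join(json.dumps(r, sort_keys=True) + "\n" for r in records)


def to_csv(records, fieldnames):
    """CSV with a header row; fields not listed in fieldnames are dropped."""
    buf = io.StringIO()
    writer = csv.DictWriter(
        buf, fieldnames=fieldnames, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


records = [
    {"product_id": "MCY-16284039", "sale_price": 113.40, "currency": "USD"},
]

jsonl_text = to_jsonl(records)
csv_text = to_csv(records, ["product_id", "sale_price"])
print(csv_text)
```

Sorting keys in the JSON Lines output keeps successive deliveries byte-comparable when the data has not changed.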

Under the hood

How our Macy's pipeline handles the hard parts

Fashion e-commerce data complexity peaks at Macy's: dozens of sale event types, hundreds of brand pricing tiers, and thousands of colour-size variant combinations per product page.

// fingerprinting

Identity rotation

TLS fingerprint: randomised

User-agent: rotated

IP pool: residential

Challenges blocked: 0

// pagination

Page coverage

48,291 pages queued (status: running)

// observability

Pipeline health

Uptime: 99.9%

p99 latency: 142ms

Null rate: 0.3%

Colour-size variant navigation

Full variant matrix extraction per product

Macy's fashion products can have 20+ colour options and 15+ size options, creating variant matrices of 100+ combinations. Our Playwright pipeline systematically navigates colour and size selectors on each product page — capturing per-variant stock status, price, and image URL rather than just the default-display variant.
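A variant completeness check can be sketched as the cross product of colour and size options compared against what was actually scraped. This is a simplified illustration: the helper names are hypothetical, and it assumes every colour is offered in every size, which will not hold for all products.

```python
from itertools import product


def expected_matrix(colours, sizes):
    """Every (colour, size) pair a product page could expose."""
    return list(product(colours, sizes))


def missing_variants(colours, sizes, scraped):
    """Pairs in the expected matrix that have no scraped variant record."""
    seen = {(v["colour_name"], v["size_label"]) for v in scraped}
    return [pair for pair in expected_matrix(colours, sizes) if pair not in seen]


colours = ["Lauren Navy", "Black"]
sizes = ["8", "10", "12"]
scraped = [
    {"colour_name": "Lauren Navy", "size_label": "8", "in_stock": True},
    {"colour_name": "Lauren Navy", "size_label": "10", "in_stock": True},
    {"colour_name": "Lauren Navy", "size_label": "12", "in_stock": False},
    {"colour_name": "Black", "size_label": "8", "in_stock": True},
    {"colour_name": "Black", "size_label": "10", "in_stock": True},
]

print(missing_variants(colours, sizes, scraped))  # [('Black', '12')]
```

A missing pair is not necessarily an error; it may simply be a combination the retailer does not sell, so a gap like this is a prompt for review rather than an automatic failure.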

Sale event classification

Named event tagging with end-date capture

Macy's runs dozens of named sale events annually — Friends & Family, One Day Sale, VIP Sale, Lowest Prices of the Season. Our parser identifies and tags the sale event name per product alongside the sale price and end date — building an event-annotated pricing history that enables promotional intensity analysis by brand and category.
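One simple way to tag a named event is to normalise the promotional badge text and match it against a list of known event names. The sketch below assumes that approach; the badge strings are invented for illustration, `tag_sale_event` is a hypothetical name, and a real parser would likely need a longer, maintained event list.

```python
import re
from typing import Optional

# Event names taken from the text above; the list is illustrative, not exhaustive.
KNOWN_EVENTS = (
    "Friends & Family",
    "One Day Sale",
    "VIP Sale",
    "Lowest Prices of the Season",
)


def tag_sale_event(badge_text: str) -> Optional[str]:
    """Return the canonical event name found in badge_text, or None."""
    normalised = re.sub(r"\s+", " ", badge_text).strip().lower()
    normalised = normalised.replace(" and ", " & ")
    for name in KNOWN_EVENTS:
        if name.lower() in normalised:
            return name
    return None


print(tag_sale_event("FRIENDS  &  FAMILY: extra 30% off"))  # Friends & Family
print(tag_sale_event("Free shipping at $49"))  # None
```

Returning `None` for unrecognised badges, rather than guessing, makes classification failures visible to monitoring.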

Fit feedback extraction

Structured size intelligence from review data

Macy's reviews include structured fit feedback fields — Runs Small, True to Size, Runs Large — and the size purchased by the reviewer. These fields transform reviews into sizing intelligence data, directly informing size curve decisions, fit model training, and return rate reduction analysis.
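To show how these two fields become sizing intelligence, here is a minimal aggregation sketch: the share of each fit verdict per size purchased. The review values are invented, and `fit_distribution` is a hypothetical helper, not part of the delivered dataset.

```python
from collections import Counter, defaultdict


def fit_distribution(reviews):
    """Share of each fit_feedback value per size_purchased.

    Reviews missing either field are skipped.
    """
    counts = defaultdict(Counter)
    for review in reviews:
        fit = review.get("fit_feedback")
        size = review.get("size_purchased")
        if not fit or not size:
            continue
        counts[size][fit] += 1
    return {
        size: {fit: n / sum(c.values()) for fit, n in c.items()}
        for size, c in counts.items()
    }


reviews = [
    {"size_purchased": "10 Regular", "fit_feedback": "True to size"},
    {"size_purchased": "10 Regular", "fit_feedback": "True to size"},
    {"size_purchased": "10 Regular", "fit_feedback": "True to size"},
    {"size_purchased": "10 Regular", "fit_feedback": "Runs Small"},
    {"size_purchased": "8 Regular", "fit_feedback": None},
]

print(fit_distribution(reviews))
# {'10 Regular': {'True to size': 0.75, 'Runs Small': 0.25}}
```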

Clearance depth tracking

Markdown depth monitoring across departments

Clearance products on Macy's often carry multiple markdown layers. Our pipeline captures clearance flag status and the full discount depth per product — enabling markdown progression analysis across seasons and departments.
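Markdown progression can be sketched as a pass over a product's price observations that keeps only the points where the clearance price changes. The example assumes `clearance_depth` means percent off `regular_price` (the source does not define it) and that timestamps are ISO 8601 strings in UTC, which sort correctly as text. All dates and prices are invented.

```python
def markdown_steps(observations):
    """Return one entry per distinct clearance price, in time order."""
    steps = []
    last_price = None
    for obs in sorted(observations, key=lambda o: o["price_timestamp"]):
        if not obs["clearance_flag"]:
            continue  # ordinary sale pricing is not a clearance markdown
        if obs["sale_price"] == last_price:
            continue  # price unchanged since the previous run
        depth = round((1 - obs["sale_price"] / obs["regular_price"]) * 100)
        steps.append(
            {
                "price_timestamp": obs["price_timestamp"],
                "sale_price": obs["sale_price"],
                "clearance_depth": depth,
            }
        )
        last_price = obs["sale_price"]
    return steps


history = [
    {"price_timestamp": "2026-02-02T09:00:00Z", "regular_price": 200.0,
     "sale_price": 100.0, "clearance_flag": True},
    {"price_timestamp": "2026-01-05T09:00:00Z", "regular_price": 200.0,
     "sale_price": 160.0, "clearance_flag": False},
    {"price_timestamp": "2026-01-12T09:00:00Z", "regular_price": 200.0,
     "sale_price": 140.0, "clearance_flag": True},
    {"price_timestamp": "2026-01-19T09:00:00Z", "regular_price": 200.0,
     "sale_price": 140.0, "clearance_flag": True},
    {"price_timestamp": "2026-02-16T09:00:00Z", "regular_price": 200.0,
     "sale_price": 60.0, "clearance_flag": True},
]

print([s["clearance_depth"] for s in markdown_steps(history)])  # [30, 50, 70]
```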

Monitoring & alerting

24/7 pipeline health with anomaly detection

Every run emits structured logs to our observability stack. We alert on variant coverage drops, sale event classification failures, fit feedback null-rates, and schema drift — and respond before you notice.
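A null-rate alert of the kind described above can be sketched in a few lines. The thresholds and function names here are hypothetical and chosen only for illustration; they are not our production limits.

```python
def null_rate(records, field):
    """Fraction of records where field is absent, None, or an empty string."""
    if not records:
        return 0.0
    missing = sum(1 for r in records if r.get(field) in (None, ""))
    return missing / len(records)


def check_null_rates(records, thresholds):
    """Return (field, rate) pairs for every field whose null rate exceeds its limit."""
    alerts = []
    for field, limit in thresholds.items():
        rate = null_rate(records, field)
        if rate > limit:
            alerts.append((field, rate))
    return alerts


run = [
    {"fit_feedback": "True to size", "star_rating": 5},
    {"fit_feedback": None, "star_rating": 4},
    {"star_rating": 5},
    {"fit_feedback": "Runs Small", "star_rating": 3},
]

# Hypothetical limits: tolerate up to 25% missing fit feedback, 1% missing ratings.
print(check_null_rates(run, {"fit_feedback": 0.25, "star_rating": 0.01}))
# [('fit_feedback', 0.5)]
```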

Applications

Who uses Macy's data — and how

Teams across industries use macys.com data to build competitive products and smarter operations.

Fashion Brand Pricing & MAP Monitoring

Apparel brands track their own and competitor product pricing, sale event treatment, clearance depth, and discount frequency on Macy's — monitoring MAP compliance and brand equity signals.

Size Curve & Fit Intelligence

Fashion brands and sizing platforms use Macy's fit-feedback review data — Runs Small / True to Size / Runs Large, with size purchased — to train sizing algorithms and calibrate size curve decisions.

Beauty Market Research

Beauty brands use Macy's product data across cosmetics, skincare, and fragrance to map pricing, shade range, gift set structures, and GWP offer prevalence across competing brands.

AI Training Data

ML teams use Macy's product datasets — apparel descriptions, colour names, material fields, fit tags, and review corpora — to train fashion AI for style classification, fit prediction, and recommendation.

Retail Analyst & Investor Research

Analysts evaluating Macy's competitive position use brand-level pricing data, promotional intensity signals, and clearance depth trends as indicators of merchandising health and inventory management.

Promotional Effectiveness Research

Trade marketing and retail media teams correlate Macy's sale event types, discount depths, and event frequency with review velocity — assessing which promotional structures drive sustained demand versus one-time spike volume.

Technical Spec

Macy's scraper — technical capabilities

Everything supported by our macys.com scraper — rendered SPA elements, auth walls, rate-limit evasion and beyond.

JavaScript rendering

Full Playwright sessions — required for variant selectors, pricing widgets, and review tabs

Supported

CAPTCHA bypass

Automated 2Captcha + CapSolver integration with fallback to manual queue

Supported

Residential proxy rotation

ISP-grade US residential IPs — rotated per request

Supported

Full variant matrix capture

All colour and size combinations mapped with individual stock status and pricing

Supported

Sale event classification

Named sale event tagging: Friends & Family, One Day Sale, VIP Sale, and more

Supported

Clearance flag & depth

Clearance status flag and full markdown depth captured per product

Supported

Fit feedback extraction

Structured fit feedback and size purchased extracted from review records

Supported

Beauty gift set detection

Gift set product identification, component listing, and GWP offer capture

Supported

Star Rewards extra capture

Star Rewards additional earn percentage per product where surfaced

Supported

Review pagination

Full review corpus including all star-filter pages and fit feedback fields

Supported

Change detection (diffs)

Hash-based diff: only emit records with changed fields since last run

Supported

Star Rewards account pricing

Personalised Star Rewards tier pricing requires authenticated loyalty account sessions

Partial
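The hash-based change detection listed in the spec above could work roughly as follows: hash a canonical form of each record, excluding volatile fields, and emit only records whose hash differs from the previous run. This is a sketch under assumptions, not our exact implementation; the function names and the choice to ignore `scraped_at` are illustrative.

```python
import hashlib
import json


def record_hash(record, ignore=("scraped_at",)):
    """SHA-256 of the record's canonical JSON form, excluding volatile fields."""
    payload = {k: v for k, v in record.items() if k not in ignore}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_records(records, previous_hashes, key="product_id"):
    """Return (records that changed since last run, hashes to store for next run)."""
    changed = []
    new_hashes = {}
    for record in records:
        digest = record_hash(record)
        new_hashes[record[key]] = digest
        if previous_hashes.get(record[key]) != digest:
            changed.append(record)
    return changed, new_hashes


run_1 = [{"product_id": "MCY-16284039", "sale_price": 113.40,
          "scraped_at": "2026-05-12T09:00:00Z"}]
run_2 = [{"product_id": "MCY-16284039", "sale_price": 113.40,
          "scraped_at": "2026-05-13T09:00:00Z"}]
run_3 = [{"product_id": "MCY-16284039", "sale_price": 99.99,
          "scraped_at": "2026-05-14T09:00:00Z"}]

first, hashes = changed_records(run_1, {})
second, hashes = changed_records(run_2, hashes)
third, hashes = changed_records(run_3, hashes)
print(len(first), len(second), len(third))  # 1 0 1
```

Excluding the scrape timestamp from the hash is what keeps an unchanged product from being re-emitted on every run.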

Infrastructure

Infrastructure powering the Macy's pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus

Scrapy + Playwright Stack

Scrapy handles crawl orchestration, deduplication, and retry logic. Playwright handles Macy's JavaScript-rendered colour selectors, size variant navigation, and review tab pagination.

Residential Proxy Infrastructure

We maintain pools of US ISP residential proxies. Rotation happens per-request with sticky sessions where required. IP score monitoring prevents blacklisted pool contamination.
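Per-request rotation with optional sticky sessions can be illustrated with a small round-robin picker. The class below is a simplified sketch, not our proxy manager: `ProxyRotator` is a hypothetical name, the proxy addresses are placeholders, and IP score monitoring is not modelled.

```python
import itertools


class ProxyRotator:
    """Round-robin proxy selection with optional sticky sessions."""

    def __init__(self, proxies):
        self._cycle = itertools.cycle(proxies)
        self._sticky = {}

    def pick(self, session_id=None):
        """Return the next proxy, or the pinned proxy for a sticky session."""
        if session_id is None:
            return next(self._cycle)
        if session_id not in self._sticky:
            self._sticky[session_id] = next(self._cycle)
        return self._sticky[session_id]

    def release(self, session_id):
        """Unpin a session so its next request rotates again."""
        self._sticky.pop(session_id, None)


# Placeholder addresses for illustration only.
rotator = ProxyRotator(["proxy-1.example:8000", "proxy-2.example:8000", "proxy-3.example:8000"])

picks = [
    rotator.pick(),            # rotates
    rotator.pick(),            # rotates
    rotator.pick("variant-nav"),  # pins the next proxy to this session
    rotator.pick("variant-nav"),  # same proxy again
    rotator.pick(),            # rotation continues
]
print(picks)
```

A sticky session is useful when a multi-step interaction, such as stepping through colour and size selectors on one product page, should appear to come from a single address.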

Cloud-Native Orchestration

Pipelines run on AWS Lambda (burst) and ECS (sustained). Airflow handles scheduling, dependency management, and SLA alerting. All state stored in managed Postgres.

// faq

Common questions.

About macys.com scraping, legality, and pipeline operations.

Ask us directly →

Is scraping Macy's legal?

Scraping publicly available product, pricing, and review data from Macy's is generally permissible under applicable US law — reinforced by the hiQ v. LinkedIn ruling and similar precedents. DataFlirt targets only public, non-authenticated data. We do not extract personal purchase history or loyalty account data. We recommend clients review Macy's ToS independently and consult legal counsel for specific use cases.

Can you capture all colour and size variants per product?

Yes. Our Playwright pipeline systematically navigates the colour and size selector on each product page — capturing every available combination with its individual stock status, price, and variant image URL. This full variant matrix is delivered as a nested structure within each product record.

Can you identify which sale event applies to each product?

Yes. Macy's applies named sale event labels — Friends & Family, One Day Sale, VIP Sale, etc. — per product during promotional periods. Our parser captures the sale event name alongside the sale price and end date, building an event-annotated pricing history across your defined product set.

Can you extract the fit feedback from Macy's reviews?

Yes. Macy's review forms include structured fit feedback fields — Runs Small, True to Size, Runs Large — and the size purchased by the reviewer. These are extracted as separate structured fields per review record, enabling sizing algorithm training and fit model development.

Can you track clearance progression over time?

Yes. Clearance flag and discount depth are captured per product per run. For products that enter clearance, we track markdown progression — from initial clearance price through subsequent markdowns — building a season-end clearance cadence dataset.

Can you scrape beauty and fragrance data including shade variants?

Yes. Beauty and fragrance products have distinct variant structures — shade names, formula types, and size options. Our pipeline handles beauty variant navigation separately from apparel, capturing shade name, shade hex code, size, and per-variant availability.

What's the minimum viable engagement?

Our smallest packages start at a defined product set or department (typically 2,000–20,000 products) with weekly delivery. For brand-level pricing monitoring, seasonal sale event coverage, or fit intelligence programmes, we price based on volume and cadence.

Can I request a sample dataset before committing?

Absolutely. We provide a sample run of up to 300 products with full variant mapping, sale event classification, and review fit feedback as part of the pre-engagement scoping process.


Tell us what
to extract.
We do the rest.
