
Yahoo Finance data,
markets wired to your warehouse.

We extract stock quotes, historical price series, fundamentals, analyst estimates, earnings data, insider filings, and financial news from Yahoo Finance. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.

Tickers monitored: 94K/day
Quote updates: 18.4M/24h
Filings processed: 12K/run
Active pipelines: 116
Uptime: 99.97%

Data Dictionary

Every field we extract from finance.yahoo.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Stock Quotes objects from finance.yahoo.com. All fields typed and schema-versioned.

Fields: ticker, exchange, company_name, currency, price, open, high, low, previous_close, change, change_pct, volume, avg_volume_3m, market_cap, beta, pe_ratio, eps_ttm, dividend_yield, ex_dividend_date, 52w_high, 52w_low, 50d_avg, 200d_avg, shares_outstanding, float_shares, short_interest, short_pct_float, quote_timestamp

stock_quotes
● 200 OK
{
  "ticker": "AAPL",
  "exchange": "NASDAQ",
  "company_name": "Apple Inc.",
  "price": 213.42,
  "change_pct": 1.24,
  "volume": 58291044,
  "market_cap": 3280000000000,
  "pe_ratio": 34.21,
  "52w_high": 237.23,
  "52w_low": 164.08,
  "dividend_yield": 0.51,
  "quote_timestamp": "2026-05-12T20:00:00Z"
}
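
As a rough illustration of what "typed" can mean for a stock_quotes record, the sketch below coerces a raw record into a frozen dataclass. The class name `StockQuote` and the helper `parse_quote` are hypothetical and only cover a subset of the fields listed above; this is not DataFlirt's actual delivery code.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class StockQuote:
    """Hypothetical typed view over a subset of stock_quotes fields."""
    ticker: str
    exchange: str
    price: float
    change_pct: float
    volume: int
    market_cap: int
    quote_timestamp: datetime


def parse_quote(raw: dict) -> StockQuote:
    """Coerce a raw record into typed fields; raises KeyError on missing keys."""
    return StockQuote(
        ticker=str(raw["ticker"]),
        exchange=str(raw["exchange"]),
        price=float(raw["price"]),
        change_pct=float(raw["change_pct"]),
        volume=int(raw["volume"]),
        market_cap=int(raw["market_cap"]),
        # Swap the trailing "Z" for an explicit offset so older Pythons parse it.
        quote_timestamp=datetime.fromisoformat(
            raw["quote_timestamp"].replace("Z", "+00:00")
        ),
    )
```

Failing loudly on a missing key is one possible design choice; a production loader might instead collect errors per record.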

Complete list of extractable fields for Historical OHLCV objects from finance.yahoo.com. All fields typed and schema-versioned.

Fields: ticker, date, open, high, low, close, adj_close, volume, split_coefficient, dividend_amount, currency, exchange, interval

historical_ohlcv
● 200 OK
{
  "ticker": "AAPL",
  "date": "2026-05-12",
  "open": 210.88,
  "high": 214.91,
  "low": 209.42,
  "close": 213.42,
  "adj_close": 213.42,
  "volume": 58291044,
  "split_coefficient": 1.0,
  "interval": "1d"
}

Complete list of extractable fields for Financials objects from finance.yahoo.com. All fields typed and schema-versioned.

Fields: ticker, period, period_type, currency, total_revenue, gross_profit, operating_income, ebitda, net_income, eps_basic, eps_diluted, total_assets, total_liabilities, total_equity, cash_and_equivalents, total_debt, net_debt, operating_cash_flow, capex, free_cash_flow, shares_outstanding, report_date, filing_date

financials
● 200 OK
{
  "ticker": "AAPL",
  "period": "Q1 2026",
  "period_type": "quarterly",
  "total_revenue": 124300000000,
  "net_income": 36330000000,
  "eps_diluted": 2.40,
  "free_cash_flow": 29820000000,
  "total_debt": 101200000000,
  "report_date": "2026-01-30"
}

Complete list of extractable fields for Analyst Estimates objects from finance.yahoo.com. All fields typed and schema-versioned.

Fields: ticker, analyst_count, recommendation_mean, recommendation_key, target_price_mean, target_price_high, target_price_low, target_price_median, eps_estimate_current_qtr, eps_estimate_next_qtr, eps_estimate_current_year, eps_estimate_next_year, revenue_estimate_current_year, revenue_estimate_next_year, earnings_growth_estimate, revenue_growth_estimate, last_updated

analyst_estimates
● 200 OK
{
  "ticker": "AAPL",
  "analyst_count": 42,
  "recommendation_key": "buy",
  "recommendation_mean": 2.1,
  "target_price_mean": 238.50,
  "target_price_high": 300.00,
  "target_price_low": 185.00,
  "eps_estimate_next_year": 8.14,
  "earnings_growth_estimate": 11.2
}

Complete list of extractable fields for Earnings & Calendar objects from finance.yahoo.com. All fields typed and schema-versioned.

Fields: ticker, earnings_date, earnings_time, eps_estimate, eps_actual, eps_surprise, eps_surprise_pct, revenue_estimate, revenue_actual, revenue_surprise_pct, guidance_revenue_low, guidance_revenue_high, guidance_eps_low, guidance_eps_high, call_transcript_available, fiscal_quarter, fiscal_year

earnings_calendar
● 200 OK
{
  "ticker": "AAPL",
  "earnings_date": "2026-05-01",
  "earnings_time": "AMC",
  "eps_estimate": 1.57,
  "eps_actual": 1.65,
  "eps_surprise_pct": 5.10,
  "revenue_actual": 95360000000,
  "revenue_surprise_pct": 2.40,
  "fiscal_quarter": "Q2 FY2026"
}
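
The sample values above are consistent with the common convention of expressing surprise as the gap between actual and estimate, relative to the estimate. The sketch below assumes that convention; the function name `eps_surprise_pct` is borrowed from the field name, and the exact formula used in the pipeline is not stated in the source.

```python
def eps_surprise_pct(estimate: float, actual: float) -> float:
    """Surprise as a percentage of the estimate, rounded to two decimals.

    Assumes the convention (actual - estimate) / |estimate| * 100.
    """
    if estimate == 0:
        raise ValueError("surprise is undefined for a zero estimate")
    return round((actual - estimate) / abs(estimate) * 100, 2)


# Using the sample record: estimate 1.57, actual 1.65.
print(eps_surprise_pct(1.57, 1.65))  # 5.1
```

Dividing by the absolute value keeps the sign meaningful when the estimate is a loss per share.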

Capabilities

Every layer of Yahoo Finance — structured and delivered

Yahoo Finance is one of the world's most-visited financial data platforms. Our scraper covers the object types analysts and quant teams need: real-time quotes, historical OHLCV, full financial statements, analyst consensus, insider filings, options chains, and news, all in a single schema-consistent pipeline.

Real-Time & Delayed Quotes

Price, change, volume, market cap, P/E, EPS, 52-week range, moving averages, short interest, and float — captured per ticker at your chosen cadence from pre-market open to post-market close.

Historical OHLCV — Any Interval

Adjusted and unadjusted daily, weekly, and monthly OHLCV going back decades, plus intraday bars (1m, 5m, 15m, 60m) over the shorter lookback window Yahoo Finance exposes, with split coefficients and dividend amounts embedded per row.

Income Statement, Balance Sheet & Cash Flow

Annual and quarterly financial statements: revenue, gross profit, EBITDA, net income, EPS, total assets, total debt, free cash flow — normalised across all reporting currencies.

Analyst Estimates & Price Targets

Consensus recommendation, mean/high/low price targets, analyst count, EPS and revenue estimates for current and next fiscal year, and earnings and revenue growth forecasts.

Earnings Calendar & Surprise History

Upcoming earnings dates, EPS and revenue estimates vs actuals, surprise percentage, guidance ranges, and AMC/BMO timing flags — across any ticker universe.

Insider & Institutional Holdings

Insider transactions (buy/sell/option exercise), Form 4 filing details, institutional 13F holdings by fund, ownership percentage, and period-over-period changes.

Options Chain Data

Full options chain extraction: strike, expiry, bid/ask, last price, implied volatility, open interest, volume, delta, gamma, and theta — for any ticker and expiry date.

Financial News & Sentiment

News articles and press releases indexed per ticker: headline, source, publication timestamp, article URL, and sentiment score (positive/negative/neutral) via NLP post-processing.

ESG Scores & Sustainability Data

Yahoo Finance ESG risk score, environmental, social, and governance sub-scores, controversy level, and peer comparison percentile — per ticker where available.

// engagement pipeline

From ticker list to warehouse record

Brief in. Clean data out.

Define Scope
Day 0

Provide ticker lists, indices, screener criteria, or sector filters. We design the extraction schema — including which financial statement periods, estimate horizons, and market data fields you need.

Pipeline Build
Days 2–4

We configure Scrapy / Playwright crawlers with US residential proxies, financial data parsers, and OHLCV normalisation logic — calibrated for Yahoo Finance's rate limits and page structure.

Validation & QA
Days 4–6

Schema validation, price sanity checks, financial statement cross-balancing, OHLCV continuity testing, and split-adjustment verification before full launch.
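
Two of the checks named above, price sanity and OHLCV continuity, can be sketched roughly as follows. The function names `bar_errors` and `missing_weekdays` are hypothetical, and the continuity check only knows about weekends; exchange holidays would need a trading calendar that is not shown here.

```python
from datetime import date, timedelta


def bar_errors(bar: dict) -> list[str]:
    """Return sanity-check failures for one OHLCV bar (empty list if clean)."""
    errors = []
    if bar["low"] > min(bar["open"], bar["close"]):
        errors.append("low above open/close")
    if bar["high"] < max(bar["open"], bar["close"]):
        errors.append("high below open/close")
    if bar["volume"] < 0:
        errors.append("negative volume")
    return errors


def missing_weekdays(dates: list[date]) -> list[date]:
    """Weekdays absent between the first and last date of a daily series."""
    if not dates:
        return []
    present = set(dates)
    missing = []
    day, last = min(dates), max(dates)
    while day <= last:
        if day.weekday() < 5 and day not in present:  # Mon-Fri only
            missing.append(day)
        day += timedelta(days=1)
    return missing
```

Run against the historical_ohlcv sample record, `bar_errors` returns an empty list, since 209.42 ≤ 210.88 and 213.42 ≤ 214.91.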

Delivery
ongoing

JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on your cadence — market-hours-aware scheduling available.

Under the hood

How our Yahoo Finance pipeline handles the hard parts

Yahoo Finance has aggressive rate limiting, JavaScript-rendered financials, and frequent schema changes. Here's how we maintain reliable extraction at scale.

pipeline-monitor · finance.yahoo.com · live ● active
// fingerprinting
Identity rotation
TLS fingerprint: randomised
User-agent: rotated
IP pool: residential
Challenges blocked: 0
// pagination
Page coverage
48,291 pages queued (running)
// observability
Pipeline health
Uptime: 99.9%
p99 latency: 142ms
Null rate: 0.3%
Alerts: 2

Rate limit management
Respectful pacing across thousands of tickers

Yahoo Finance aggressively throttles bulk requests. Our pipeline manages request pacing at the ticker level — distributing load across US residential proxies, staggering concurrency during market hours, and implementing exponential backoff on rate-limit responses. Large ticker universes (5,000–50,000 symbols) are batched and distributed across multi-hour windows without gaps.
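
Exponential backoff on rate-limit responses could look something like the sketch below. It assumes the rate-limit signal is an HTTP 429 status, which the source does not specify, and it uses "full jitter" (a random wait up to the exponential bound) as one common variant. `fetch_with_backoff` and `backoff_delays` are hypothetical names, and `fetch` stands in for whatever performs the request.

```python
import random
import time
from typing import Callable


def backoff_delays(max_retries: int, base: float = 1.0, cap: float = 60.0) -> list[float]:
    """Upper bound for each retry wait: base * 2**attempt, capped at `cap`."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def fetch_with_backoff(
    fetch: Callable[[], int],
    max_retries: int = 5,
    base: float = 1.0,
    cap: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call fetch() until it returns a non-429 status or retries run out."""
    status = fetch()
    for bound in backoff_delays(max_retries, base, cap):
        if status != 429:  # assumption: 429 marks a rate-limit response
            break
        sleep(random.uniform(0, bound))  # full jitter spreads out retries
        status = fetch()
    return status
```

Passing `sleep` as a parameter keeps the function easy to exercise in tests without real waiting.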

Split & dividend adjustment
Historically consistent adj_close — automatically

Stock splits and dividend events require backward adjustment of the entire historical series. Our pipeline detects split and dividend events from Yahoo Finance's event feed and recomputes adj_close backward across every earlier record, so your time series stays consistent over its full history.
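
The split half of that backward adjustment can be sketched as below. The sketch assumes `split_coefficient` is 1.0 on ordinary days and carries the split ratio (for example 4.0 for a 4-for-1 split) on the first day that trades at the post-split price. Dividend adjustment is deliberately left out, and `split_adj_close` is a hypothetical field name used only for this illustration.

```python
def split_adjust(rows: list[dict]) -> list[dict]:
    """Backward split adjustment for rows sorted oldest to newest.

    Each close is divided by the product of the coefficients of all
    splits that took effect after that row's date.
    """
    factor = 1.0
    adjusted = []
    for row in reversed(rows):  # walk from newest to oldest
        adjusted.append({**row, "split_adj_close": round(row["close"] / factor, 4)})
        factor *= row["split_coefficient"]  # applies to all earlier rows
    adjusted.reverse()
    return adjusted


# Made-up series with a 4-for-1 split taking effect on the third day.
rows = [
    {"close": 400.0, "split_coefficient": 1.0},
    {"close": 404.0, "split_coefficient": 1.0},
    {"close": 101.0, "split_coefficient": 4.0},
    {"close": 102.0, "split_coefficient": 1.0},
]
print([r["split_adj_close"] for r in split_adjust(rows)])
# [100.0, 101.0, 101.0, 102.0]
```

The raw close is kept on each row, so downstream users can still apply their own adjustment logic.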

Financial statement parsing
Structured financials from JavaScript-rendered tables

Yahoo Finance renders income statements, balance sheets, and cash flow statements dynamically via React. We run full Playwright sessions to capture the rendered financial tables, then apply a normalisation layer that maps Yahoo's variable label names to a stable, cross-ticker schema — regardless of how Yahoo labels line items for different companies or sectors.

Market-hours scheduling
Cadence aligned to market open, close, and earnings

Financial data has a temporal logic. Our scheduler aligns pipeline runs to market events: pre-market (4:00 AM ET), market open, market close (4:00 PM ET), and after-hours. Earnings calendar integration triggers elevated-cadence runs around reporting dates — capturing pre/post earnings quote, estimate, and surprise data in the right sequence.
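
A scheduler of this kind needs some way to tell which session a timestamp falls in. The sketch below is one minimal way to do that with the standard library. The 4:00 AM and 4:00 PM ET boundaries come from the text above; the 9:30 AM open and 8:00 PM after-hours end are assumptions based on standard NYSE/NASDAQ hours, and market holidays are not handled. `market_session` is a hypothetical name.

```python
from datetime import datetime, time
from zoneinfo import ZoneInfo

ET = ZoneInfo("America/New_York")

PRE_MARKET_OPEN = time(4, 0)    # stated in the text above
REGULAR_OPEN = time(9, 30)      # assumption: standard NYSE/NASDAQ open
REGULAR_CLOSE = time(16, 0)     # stated in the text above
AFTER_HOURS_END = time(20, 0)   # assumption: end of extended trading


def market_session(ts: datetime) -> str:
    """Classify a timezone-aware timestamp into a US equity session."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    local = ts.astimezone(ET)
    if local.weekday() >= 5:  # Saturday or Sunday; holidays not handled
        return "closed"
    t = local.time()
    if PRE_MARKET_OPEN <= t < REGULAR_OPEN:
        return "pre_market"
    if REGULAR_OPEN <= t < REGULAR_CLOSE:
        return "regular"
    if REGULAR_CLOSE <= t < AFTER_HOURS_END:
        return "after_hours"
    return "closed"
```

Converting to America/New_York before comparing lets the same boundaries work across daylight-saving changes. A timestamp of exactly 4:00 PM ET is classified as after_hours here, because each session includes its start and excludes its end.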

Monitoring & alerting
24/7 pipeline health with financial anomaly detection

Every run emits structured logs to our observability stack. We alert on price outliers beyond expected daily move ranges, null-rate spikes in financial statement fields, ticker delisting events, and schema changes on Yahoo's side — and respond before your downstream models notice.
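
Two of the alert conditions above, null-rate spikes and price outliers, reduce to simple computations. The sketch below shows one possible form; the 20% move threshold is a made-up default, not a documented setting, and `null_rate` and `price_outliers` are hypothetical names.

```python
def null_rate(records: list[dict], field: str) -> float:
    """Fraction of records where `field` is missing or None."""
    if not records:
        return 0.0
    missing = sum(1 for record in records if record.get(field) is None)
    return missing / len(records)


def price_outliers(closes: list[float], max_move_pct: float = 20.0) -> list[int]:
    """Indices whose close moved more than max_move_pct from the prior close."""
    flagged = []
    for i in range(1, len(closes)):
        prev = closes[i - 1]
        if prev and abs(closes[i] - prev) / prev * 100 > max_move_pct:
            flagged.append(i)
    return flagged
```

In practice a flagged move would likely be cross-checked against split events before alerting, since an unadjusted split looks like a large one-day drop.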

Applications

Who uses Yahoo Finance data — and how

Teams across industries use finance.yahoo.com data to build competitive products and smarter operations.

01
Quantitative Strategy & Backtesting

Quant teams use historical OHLCV, fundamentals, and analyst estimate time-series to build, test, and refine systematic trading strategies across equity universes.

02
Fundamental Analysis & Stock Screening

Analysts and portfolio managers screen the investable universe using financial statement metrics, valuation ratios, analyst consensus, and earnings surprise history — sourced at scale.

03
Earnings Intelligence

Event-driven funds and research teams monitor earnings calendars, consensus estimates vs actuals, guidance changes, and surprise patterns across hundreds of tickers per reporting season.

04
Alternative Data Enrichment

Data vendors and hedge funds use Yahoo Finance fundamentals and estimates as a baseline layer to enrich with satellite imagery, credit card data, and other alternative signals.

05
Financial News & Sentiment Monitoring

Trading desks and FinTech products monitor Yahoo Finance news feeds per ticker for sentiment shifts, merger rumours, and macro event coverage — feeding NLP-driven alert systems.

06
FinTech Product Data Feeds

FinTech companies building portfolio trackers, robo-advisors, and wealth management tools use Yahoo Finance as a data source for quotes, fundamentals, and news without building exchange connections.

Why DataFlirt

"Yahoo Finance has more financial data objects per ticker than any other free source on the internet — quotes, financials, estimates, insider filings, options, news, and ESG in one place."

The challenge isn't access — it's reliability at scale. Yahoo Finance rate-limits aggressively, restructures its React frontend without warning, and changes financial label names across sectors. DataFlirt runs a production-grade pipeline that absorbs all of that: paced extraction across US residential proxies, Playwright-rendered financials, split-adjusted OHLCV, and market-hours-aware scheduling — delivered to your warehouse on the cadence your models need.

Technical Spec

Yahoo Finance scraper — technical capabilities

Everything supported by our finance.yahoo.com scraper — rendered SPA elements, auth walls, rate-limit evasion and beyond.

Capability | Details | Status
Real-time & delayed quotes | Price, volume, market cap, P/E, EPS, beta, short interest; per ticker per run | Supported
Historical OHLCV | 1m to monthly intervals, split and dividend adjusted, going back decades per ticker | Supported
Income statement extraction | Annual + quarterly, Playwright-rendered, normalised to cross-ticker schema | Supported
Balance sheet extraction | Annual + quarterly, all line items, normalised and typed | Supported
Cash flow statement | Operating, investing, financing flows; annual and quarterly | Supported
Analyst estimates & targets | Consensus recommendation, mean/high/low targets, EPS and revenue estimates | Supported
Earnings calendar & surprise | Upcoming dates, EPS/revenue estimates vs actuals, surprise %, guidance ranges | Supported
Insider transactions | Form 4 buy/sell/exercise records: ticker, insider name, share count, price, date | Supported
Institutional holdings | 13F holder name, shares held, % ownership, period-over-period change | Supported
Options chain | Strike, expiry, bid/ask, IV, OI, volume, Greeks; full chain per ticker per expiry | Supported
Financial news feed | Headlines, source, timestamp, article URL per ticker; NLP sentiment scoring optional | Supported
ESG scores | Total risk score, E/S/G sub-scores, controversy level, peer percentile where available | Supported
Market-hours-aware scheduling | Runs timed to pre-market, close, after-hours, and earnings event triggers | Supported
Yahoo Finance Premium data | Premium-gated research reports and institutional-grade feeds require subscriber credentials | Partial

Infrastructure

Infrastructure powering the Yahoo Finance pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Pandas, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, Residential Proxies (US), TA-Lib, Docker, Kubernetes, Grafana, Prometheus

Scrapy + Playwright with Financial Normalisation

Scrapy handles crawl orchestration, ticker batching, and retry logic. Playwright renders Yahoo Finance's React-based financial tables. A normalisation layer maps Yahoo's variable label names to a stable, typed schema — consistent across sectors, geographies, and reporting periods.

US Residential Proxy Infrastructure with Pacing

We maintain pools of US residential ISP proxies. Request pacing is managed at the ticker level with configurable concurrency caps — distributing extraction load across multi-hour windows to stay within Yahoo Finance's rate tolerance without triggering throttling.
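
Spreading a large ticker universe across a multi-hour window comes down to slicing it into batches and spacing their start times. The sketch below shows that arithmetic only; `plan_batches` is a hypothetical helper, and real pacing would also account for proxy health and live block rates.

```python
def plan_batches(
    tickers: list[str], batch_size: int, window_seconds: float
) -> list[tuple[float, list[str]]]:
    """Return (start_offset_seconds, batch) pairs spread evenly over the window."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batches = [tickers[i:i + batch_size] for i in range(0, len(tickers), batch_size)]
    if not batches:
        return []
    spacing = window_seconds / len(batches)  # gap between batch start times
    return [(index * spacing, batch) for index, batch in enumerate(batches)]


# Ten hypothetical symbols, batches of four, spread over one hour.
plan = plan_batches([f"T{i}" for i in range(10)], batch_size=4, window_seconds=3600)
print([offset for offset, _ in plan])  # [0.0, 1200.0, 2400.0]
```

Here `batch_size` plays the role of the concurrency cap mentioned above: no more than that many tickers are in flight for a given start offset.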

Market-Hours-Aware Orchestration

Airflow DAGs are scheduled against NYSE/NASDAQ market hours: pre-market, open, close, and after-hours windows. Earnings calendar integration triggers elevated-cadence runs around reporting dates. All state stored in managed Postgres with ticker-level run history.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON: Newline-delimited or nested, schema versioned per run
CSV: Flat file with typed columns, Excel/Sheets/R compatible
Parquet: Columnar format for BigQuery, Snowflake, Athena, Spark
S3: Direct bucket delivery, compatible with any data lake
BigQuery: Streamed directly into your dataset with schema auto-detect
Webhook: HTTP POST per record, useful for real-time quote alerts
Postgres: Upsert into your existing schema with conflict resolution
Snowflake: Stage + COPY INTO workflow, incremental or full-replace

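
For the newline-delimited JSON option, each record becomes one line of JSON. The sketch below shows a minimal serialiser; stamping every line with a `schema_version` key is an assumption about how per-run versioning might be carried, and `to_ndjson` is a hypothetical name.

```python
import json


def to_ndjson(records: list[dict], schema_version: str) -> str:
    """Serialise records as newline-delimited JSON, one object per line."""
    lines = [
        # sort_keys gives a stable key order, which keeps diffs between runs small
        json.dumps({"schema_version": schema_version, **record}, sort_keys=True)
        for record in records
    ]
    return "\n".join(lines) + "\n" if lines else ""


print(to_ndjson([{"ticker": "AAPL", "price": 213.42}], "v1"), end="")
# {"price": 213.42, "schema_version": "v1", "ticker": "AAPL"}
```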
// faq

Common questions.

About finance.yahoo.com scraping, legality, and pipeline operations.

Is scraping Yahoo Finance legal?

Scraping publicly available information from Yahoo Finance is generally treated as permissible under US law. In hiQ v. LinkedIn, the Ninth Circuit held that accessing publicly available pages likely does not violate the Computer Fraud and Abuse Act, although terms-of-service and contract claims can still apply. DataFlirt extracts only public, non-authenticated market data, financial statements, and news content. We do not access Premium-gated content, personal account data, or real-time exchange feeds that require licensed redistribution agreements. We recommend clients review Yahoo Finance's ToS independently and consult legal counsel, particularly for commercial redistribution use cases.

Can you deliver split-adjusted historical OHLCV?

Yes. Our pipeline detects split and dividend events from Yahoo Finance's event feed and applies backward split-adjustment to all historical close prices — producing a continuous adj_close series. The raw (unadjusted) OHLCV and the split coefficient are also delivered as separate fields so clients can apply their own adjustment logic if preferred.

How do you handle Yahoo Finance's rate limiting?

We distribute extraction across US residential ISP proxies with ticker-level request pacing and configurable concurrency caps. Large ticker universes are batched across multi-hour windows. We implement exponential backoff on rate-limit responses and monitor block rates in real time — triggering pool rotation automatically when needed.

Can you schedule runs around market hours and earnings events?

Yes. Our Airflow DAGs are built with market-hours awareness: runs can be scheduled to pre-market open, market close, after-hours, or any combination. We maintain an earnings calendar feed — so when a ticker in your universe reports, an elevated-cadence run fires automatically to capture pre/post earnings quote, estimate, and surprise data in the correct sequence.

Do you normalise financial statement line items across companies?

Yes. Yahoo Finance labels financial line items inconsistently across sectors and geographies (e.g. 'Total Revenue' vs 'Net Revenue' vs 'Revenue'). Our normalisation layer maps Yahoo's variable labels to a stable, cross-ticker schema — so your downstream models and dashboards don't need to handle per-ticker label variation.
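
A label normalisation layer of this kind can be pictured as a lookup from observed labels to schema field names. The sketch below uses the three revenue labels quoted above; the `net income` entry, the `LABEL_MAP` name, and the decision to return unmapped labels for review are illustrative assumptions rather than the actual mapping.

```python
# Hypothetical mapping from lower-cased source labels to schema fields.
LABEL_MAP = {
    "total revenue": "total_revenue",
    "net revenue": "total_revenue",
    "revenue": "total_revenue",
    "net income": "net_income",
}


def normalise_statement(raw: dict[str, float]) -> tuple[dict[str, float], list[str]]:
    """Map raw line-item labels to schema fields; return unmapped labels too."""
    normalised: dict[str, float] = {}
    unmapped: list[str] = []
    for label, value in raw.items():
        field = LABEL_MAP.get(label.strip().lower())
        if field is None:
            unmapped.append(label)  # surfaced for review rather than dropped silently
        else:
            normalised[field] = value
    return normalised, unmapped
```

Returning the unmapped labels gives a natural hook for the schema-change alerts described elsewhere on this page.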

Can you extract options chain data across all expiries?

Yes. We extract full options chains per ticker across all available expiry dates — including strike, bid/ask, last price, implied volatility, open interest, volume, and available Greeks (delta, gamma, theta). Options data can be delivered on a daily end-of-day schedule or more frequently if intraday options analytics are required.

Do you support international tickers — not just US equities?

Yes. Yahoo Finance covers equities, ETFs, mutual funds, indices, currencies, and cryptocurrencies across exchanges globally. We support any ticker Yahoo Finance covers — including LSE, TSE, NSE, BSE, ASX, Euronext, and others — with currency-normalised output per record.

Can I request a sample dataset before committing?

Yes. We provide a sample run of up to 200 tickers — including quotes, one year of daily OHLCV, the most recent quarterly financials, and analyst estimates — as part of pre-engagement scoping, so you can validate schema fit and data quality before signing any contract.

$ dataflirt scope --new-project --source=finance.yahoo.com ready

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need daily fundamentals across 10,000 tickers, split-adjusted OHLCV going back 20 years, or an earnings calendar feed wired to your models — we scope, build, and operate the pipeline. Tell us what you need.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h