
Startup talent data,
at warehouse scale.

We extract job listings, equity ranges, founder profiles, tech stacks, and company funding signals from Wellfound. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.


Jobs extracted: 145K/day
Salary updates: 42K/24h
Company profiles: 89K/run
Active pipelines: 142
Uptime: 99.98%

◆ Startup Profiles ◆ Job Listings ◆ Salary Data ◆ Equity Ranges ◆ Founder Intelligence ◆ Tech Stack Mapping ◆ Funding History ◆ Employee Counts ◆ Remote Roles ◆ Applicant Requirements ◆ Managed Pipeline ◆ S3 Delivery ◆ Bengaluru HQ

Data Dictionary

Every field we extract from wellfound.com

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Job Postings objects from wellfound.com. All fields typed and schema-versioned.

job_id, title, company_id, location, remote_policy, salary_min, salary_max, equity_min, equity_max, job_type, experience_required, skills, visa_sponsorship, posted_at

"job_id": "1492834",
"title": "Senior Backend Engineer",
"company_id": "83921",
"location": "San Francisco, CA",
"remote_policy": "Remote within US",
"salary_min": 150000,
"salary_max": 180000,
"equity_min": 0.1,
"equity_max": 0.5,
"visa_sponsorship": false

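As an illustration of how a consumer might type the sample Job Postings record above, here is a minimal sketch. The `JobPosting` class and `parse_job` helper are hypothetical names chosen for this example, not part of any delivered schema, and only the fields present in the sample are modelled.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class JobPosting:
    """Hypothetical consumer-side model of the sample Job Postings record."""
    job_id: str
    title: str
    company_id: str
    location: str
    remote_policy: str
    salary_min: int
    salary_max: int
    equity_min: float
    equity_max: float
    visa_sponsorship: bool


def parse_job(raw: dict[str, Any]) -> JobPosting:
    """Coerce a raw record into typed fields and reject inverted ranges."""
    job = JobPosting(
        job_id=str(raw["job_id"]),
        title=str(raw["title"]),
        company_id=str(raw["company_id"]),
        location=str(raw["location"]),
        remote_policy=str(raw["remote_policy"]),
        salary_min=int(raw["salary_min"]),
        salary_max=int(raw["salary_max"]),
        equity_min=float(raw["equity_min"]),
        equity_max=float(raw["equity_max"]),
        visa_sponsorship=bool(raw["visa_sponsorship"]),
    )
    if job.salary_min > job.salary_max or job.equity_min > job.equity_max:
        raise ValueError(f"inverted range in job {job.job_id}")
    return job
```

Keeping the range check next to the type coercion means a malformed record fails at load time rather than later in a warehouse query.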

Complete list of extractable fields for Company Profiles objects from wellfound.com. All fields typed and schema-versioned.

company_id, name, website, industry, size, location, pitch, description, funding_total, funding_stage, tech_stack, founders, employee_count

"company_id": "83921",
"name": "FinScale",
"industry": "Fintech",
"size": "51-200",
"funding_total": 24000000,
"funding_stage": "Series A",
"tech_stack": ["Python", "React", "PostgreSQL", "AWS"],
"employee_count": 84


Complete list of extractable fields for Salary & Equity objects from wellfound.com. All fields typed and schema-versioned.

job_id, title, currency, base_salary_min, base_salary_max, equity_min, equity_max, role_type, market_rate_comparison, updated_at

"job_id": "1492834",
"title": "Senior Backend Engineer",
"currency": "USD",
"base_salary_min": 150000,
"base_salary_max": 180000,
"equity_min": 0.1,
"equity_max": 0.5,
"updated_at": "2026-03-14T10:00:00Z"


Complete list of extractable fields for Founders & Team objects from wellfound.com. All fields typed and schema-versioned.

person_id, name, current_role, company_id, linkedin_url, twitter_url, bio, past_experience, education, joined_date

"person_id": "92831",
"name": "Sarah Jenkins",
"current_role": "Co-Founder & CEO",
"company_id": "83921",
"linkedin_url": "linkedin.com/in/sarahjenkins",
"twitter_url": "twitter.com/sarahj",
"bio": "Former VP Product at Stripe.",
"joined_date": "2022-01-15"


Complete list of extractable fields for Search Results objects from wellfound.com. All fields typed and schema-versioned.

keyword, location, position, company_name, job_title, salary_range, equity_range, remote_badge, act_fast_badge, scraped_at

"keyword": "machine learning",
"location": "Remote",
"position": 3,
"company_name": "AI Dynamics",
"job_title": "ML Engineer",
"salary_range": "$140k - $190k",
"remote_badge": true,
"act_fast_badge": false,
"scraped_at": "2026-03-14T10:15:00Z"


Capabilities

Extract hiring signals and startup intelligence

Our Wellfound scraper navigates Cloudflare protections and dynamic React hydration to extract accurate compensation data, funding signals, and tech stacks at scale.

Startup Profile Extraction

Company names, pitches, descriptions, funding stages, total capital raised, and employee count brackets mapped to unique company IDs.

Job Listing Parsing

Extract job titles, locations, remote policies, required experience levels, and visa sponsorship availability for every active role.

Compensation & Equity Data

Capture base salary ranges, equity percentages, and currency types. Wellfound holds the most accurate early-stage compensation data.

Tech Stack Mapping

Extract programming languages, frameworks, and infrastructure tools listed on company profiles and job descriptions.

Founder Intelligence

Scrape founder bios, past experience, education, and social links to build comprehensive talent intelligence graphs.

Remote Work Signals

Identify timezone overlap requirements, remote-first policies, and geographical hiring constraints.

Recruiter Activity Tracking

Monitor 'Actively Hiring' badges, recent activity timestamps, and response rate indicators to gauge hiring urgency.

Market Categorisation

Extract industry tags like Fintech, SaaS, Web3, and AI to classify companies into specific market segments.

Scheduled Diffs

Run continuous pipelines to detect newly posted jobs, closed roles, and updated salary bands without downloading the entire catalogue.

// engagement pipeline

From company list to warehouse record

Brief in. Clean data out.

Define Scope

Day 0

Provide company URLs, search keywords, or industry tags. We design the extraction schema together.

Pipeline Build

Days 2–4

We configure Scrapy crawlers, intercept GraphQL queries, and manage residential proxy rotation for wellfound.com.

Validation & QA

Days 4–6

Schema validation, null-rate checks, and salary outlier detection before full launch.

Delivery

Ongoing

JSON, CSV, or Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.
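The null-rate checks and salary outlier detection named in the Validation & QA step could look roughly like the sketch below. The function names, the median-absolute-deviation test, and the 3.5 threshold are illustrative assumptions; the source does not say which outlier method is actually used.

```python
from statistics import median


def null_rates(records, fields):
    """Return the share of records where each field is missing or None."""
    total = len(records)
    if total == 0:
        return {f: 0.0 for f in fields}
    return {
        f: sum(1 for r in records if r.get(f) is None) / total
        for f in fields
    }


def salary_outliers(records, field="salary_max", threshold=3.5):
    """Flag records whose salary has a modified z-score above the threshold.

    Assumption: the median absolute deviation (MAD) is used because it is
    less distorted by the extreme values it is trying to find than a
    mean/standard-deviation test.
    """
    values = [r[field] for r in records if r.get(field) is not None]
    if len(values) < 3:
        return []  # too few points to judge
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # all values (nearly) identical
    return [
        r for r in records
        if r.get(field) is not None
        and 0.6745 * abs(r[field] - med) / mad > threshold
    ]
```

A typical failure this catches is a salary scraped with two extra zeros, which a plain min/max sanity range would also catch but only if someone maintains the range per currency.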

Under the hood

How our Wellfound pipeline handles the hard parts

Wellfound protects its data with strict rate limits and dynamic front-end architectures. Here is how we maintain stable extraction.

// fingerprinting

Identity rotation

TLS fingerprint: randomised

User-agent: rotated

IP pool: residential

Challenges blocked: 0

// pagination

Page coverage

48,291 pages queued (running)

// observability

Pipeline health

Uptime: 99.9%

p99 latency: 142ms

Null rate: 0.3%


Anti-bot layer

Cloudflare bypass and proxy rotation

Wellfound relies heavily on Cloudflare for bot mitigation. Our infrastructure uses residential proxies combined with TLS fingerprint spoofing and automated challenge solvers to maintain access without triggering blocks.

API Interception

Undocumented GraphQL queries

Instead of parsing complex React DOM structures, our Playwright sessions intercept the underlying GraphQL network requests. This provides cleaner, more structured data directly from Wellfound's backend.
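A minimal sketch of response interception with Playwright's sync API follows. The `/graphql` URL test, the function names, and the `networkidle` wait are assumptions made for illustration; the actual endpoint paths and query shapes are undocumented and are not reproduced here.

```python
import json


def graphql_payload(url: str, content_type: str, body: str):
    """Return the `data` object of a GraphQL JSON response, else None.

    Assumption: GraphQL traffic is recognisable by "/graphql" in the URL.
    """
    if "/graphql" not in url or "application/json" not in content_type:
        return None
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        return None
    return parsed.get("data") if isinstance(parsed, dict) else None


def capture(page_url: str) -> list[dict]:
    """Load a page in Playwright and collect the GraphQL payloads it triggers."""
    # Imported lazily so graphql_payload can be used without Playwright installed.
    from playwright.sync_api import sync_playwright

    captured: list[dict] = []

    def on_response(response):
        try:
            body = response.text()
        except Exception:
            return  # some responses (e.g. redirects) have no readable body
        payload = graphql_payload(
            response.url, response.headers.get("content-type", ""), body
        )
        if payload is not None:
            captured.append(payload)

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.on("response", on_response)  # fires for every network response
        page.goto(page_url, wait_until="networkidle")
        browser.close()
    return captured
```

Separating the pure filter (`graphql_payload`) from the browser session keeps the parsing logic testable without launching a browser.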

Pagination

Handling infinite scroll and limits

Wellfound limits search results to a specific number of pages. We bypass this by programmatically slicing search queries by granular locations, salary brackets, and tech stacks to extract the full dataset.
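The slicing idea can be sketched as a Cartesian product over the facets named above, followed by de-duplication, since one job can match several slices. The facet values and function names here are hypothetical.

```python
from itertools import product


def slice_queries(keyword, locations, salary_bands, tech_stacks):
    """Yield one narrow search filter per combination of facets."""
    for location, (low, high), tech in product(locations, salary_bands, tech_stacks):
        yield {
            "keyword": keyword,
            "location": location,
            "salary_min": low,
            "salary_max": high,
            "tech": tech,
        }


def merge_results(batches):
    """Merge per-slice results, keeping the first record seen per job_id."""
    merged = {}
    for batch in batches:
        for job in batch:
            merged.setdefault(job["job_id"], job)
    return list(merged.values())
```

For example, two locations, two salary bands, and one tech stack produce four narrow queries, each of which is more likely to fit under the result cap than the single broad search.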

Change detection

Tracking job lifecycle

We maintain a state index of all active jobs. Subsequent runs only push diffs, allowing you to accurately track exactly when a role is opened, updated, or closed.
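A state index of this kind can be approximated with a mapping from `job_id` to a content hash. The sketch below assumes SHA-256 over canonical JSON; the hash function, the function names, and the in-memory dictionary are illustrative choices, not details confirmed by the source.

```python
import hashlib
import json


def fingerprint(job: dict) -> str:
    """Stable hash of a job record; key order does not affect the result."""
    canonical = json.dumps(job, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_jobs(previous: dict[str, str], current_jobs: list[dict]):
    """Compare the stored {job_id: hash} index with the latest crawl.

    Returns the changes and the new index to persist for the next run.
    """
    current = {job["job_id"]: fingerprint(job) for job in current_jobs}
    opened = sorted(current.keys() - previous.keys())
    closed = sorted(previous.keys() - current.keys())
    updated = sorted(
        job_id
        for job_id in current.keys() & previous.keys()
        if current[job_id] != previous[job_id]
    )
    return {"opened": opened, "updated": updated, "closed": closed}, current
```

Volatile fields such as a scrape timestamp would need to be dropped before hashing, otherwise every record would register as updated on every run.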

Monitoring

Schema drift detection

Wellfound updates its GraphQL schema frequently. Our observability stack detects missing fields or type changes immediately, automatically pausing delivery and alerting our engineers to patch the selectors.
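Drift detection of this sort amounts to comparing each incoming record against an expected field-to-type map. The following sketch uses a hypothetical `EXPECTED` map built from a few Job Postings fields; how alerts are raised and delivery is paused is outside its scope.

```python
# Hypothetical expected schema for a handful of Job Postings fields.
EXPECTED = {
    "job_id": str,
    "title": str,
    "salary_min": int,
    "salary_max": int,
    "visa_sponsorship": bool,
}


def detect_drift(record: dict, expected: dict = EXPECTED) -> list[str]:
    """Return human-readable drift findings; an empty list means no drift."""
    problems = []
    for field, expected_type in expected.items():
        if field not in record:
            problems.append(f"missing field: {field}")
            continue
        value = record[field]
        # Exact type match, so a bool is not silently accepted as an int.
        if value is not None and type(value) is not expected_type:
            problems.append(
                f"type change: {field} expected {expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
    for field in sorted(record.keys() - expected.keys()):
        problems.append(f"unexpected field: {field}")
    return problems
```

A pipeline could pause delivery whenever the list is non-empty for more than a small share of records in a run; that threshold is a policy decision rather than something this check determines.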

Applications

Who uses Wellfound data - and how

Teams across industries use wellfound.com data to build competitive products and smarter operations.

Talent Intelligence

Recruiting agencies and internal talent teams map tech stacks and salary ranges to optimise their sourcing strategies.

Venture Capital Deal Flow

VC firms monitor hiring velocity, key executive appointments, and tech stack choices as leading indicators of startup growth.

Compensation Benchmarking

HR platforms aggregate Wellfound salary and equity data to build accurate compensation models for early-stage companies.

B2B Lead Generation

SaaS companies target startups based on their funding stage, employee count, and specific technologies listed in job descriptions.

Market Research

Analysts track the rise of new programming languages and frameworks by analysing occurrence rates in startup job postings.

Job Board Aggregation

Niche job boards syndicate remote and startup-specific roles to expand their catalogue and drive candidate traffic.

Why DataFlirt

"Wellfound holds the most accurate equity and compensation signals for early-stage startups on the internet, but it is locked behind heavy rate limits and dynamic endpoints."

Extracting startup data requires navigating strict Cloudflare protections, complex React hydration states, and undocumented GraphQL queries. DataFlirt handles the proxy rotation, session management, and schema maintenance so your data science team can focus on identifying hiring signals and market trends.

Technical Spec

Wellfound scraper - technical capabilities

Everything supported by our wellfound.com scraper — rendered SPA elements, auth walls, rate-limit evasion and beyond.

JavaScript rendering

Full Playwright sessions to execute React hydration and trigger API calls

Supported

GraphQL interception

Direct extraction of structured JSON payloads from network traffic

Supported

Residential proxy rotation

ISP-grade IPs to bypass Cloudflare rate limits and IP bans

Supported

Change detection (diffs)

Hash-based diffing to track job openings and closures accurately

Supported

Webhook delivery

HTTP POST per record for real-time downstream processing

Supported

Historical funding data

Extraction of past funding rounds and investor lists where public

Supported

Candidate profiles & resumes

Private applicant data and resumes are strictly protected by Wellfound

Not supported

Direct messaging to founders

Requires authenticated recruiter accounts and violates platform terms

Not supported

Infrastructure

Infrastructure powering the Wellfound pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus, GraphQL, Snowflake

GraphQL Extraction Stack

Playwright intercepts Wellfound's internal GraphQL queries, bypassing the need to parse complex React DOM structures and ensuring cleaner data extraction.

Cloudflare Bypass Infrastructure

We maintain custom TLS fingerprints and residential proxy pools specifically tuned to navigate Wellfound's strict bot mitigation layers without detection.

Cloud-Native Orchestration

Pipelines run on AWS ECS with Airflow managing dependency graphs and SLA alerting. State is maintained in Postgres for accurate change detection.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON

Newline-delimited or nested arrays containing full job details

CSV

Flat file with typed columns for easy spreadsheet import

XLS

Excel format for immediate business user analysis

Parquet

Columnar format optimised for BigQuery and Snowflake

AWS S3

Direct delivery to your AWS environment on completion

Webhook

HTTP POST payloads sent immediately upon job discovery

API

Queryable REST endpoints to access your extracted datasets

PostgreSQL

Direct database inserts with conflict resolution for existing roles
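Inserts with conflict resolution are commonly written as an upsert keyed on the record ID. The sketch below uses Python's built-in `sqlite3` as a stand-in so it runs anywhere; the `ON CONFLICT ... DO UPDATE` clause has the same shape in PostgreSQL, although a PostgreSQL driver would use its own placeholder style. The table layout and function names are assumptions.

```python
import sqlite3

UPSERT = """
INSERT INTO jobs (job_id, title, salary_min, salary_max)
VALUES (:job_id, :title, :salary_min, :salary_max)
ON CONFLICT (job_id) DO UPDATE SET
    title = excluded.title,
    salary_min = excluded.salary_min,
    salary_max = excluded.salary_max
"""


def create_table(conn: sqlite3.Connection) -> None:
    """Create a hypothetical jobs table keyed on job_id."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "job_id TEXT PRIMARY KEY, title TEXT, "
        "salary_min INTEGER, salary_max INTEGER)"
    )


def upsert_jobs(conn: sqlite3.Connection, jobs: list[dict]) -> None:
    """Insert new roles and overwrite existing ones in a single transaction."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(UPSERT, jobs)
```

Running the same batch twice leaves one row per `job_id`, so a retried delivery does not create duplicates.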

Direct bucket delivery — compatible with any data lake

// faq

Common questions.

About wellfound.com scraping, legality, and pipeline operations.


Can you extract salary and equity data from all jobs?

We extract all compensation data that is publicly visible on the platform. Wellfound is unique because it requires startups to post salary and equity ranges for most roles, making this data highly available and accurate.

How do you handle Wellfound's search pagination limits?

Wellfound caps the number of visible results for broad searches. Our orchestration engine automatically slices broad queries into hundreds of granular micro-searches based on specific locations, salary bands, and tech stacks to ensure 100% coverage.

Do you scrape candidate profiles?

No. DataFlirt focuses exclusively on public company profiles, job listings, and founder information. We do not extract private candidate data, resumes, or bypass authentication walls intended to protect user privacy.

How fresh is the job data?

Pipelines can be configured to run daily or hourly. Our change detection system ensures that closed jobs are flagged and new postings are delivered within minutes of the pipeline completing its run.

Can you track changes in startup funding?

Yes. We extract the funding stage and total capital raised from the company profile. By running continuous pipelines, we can log when a company updates its profile to reflect a new funding round.

What is the delivery format for tech stacks?

Tech stacks and required skills are extracted as structured JSON arrays, making it simple to query for specific languages or frameworks in your data warehouse.
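To illustrate why arrays are convenient here, this short sketch filters newline-delimited JSON records by a skill. It assumes each record carries a `skills` array as listed in the Job Postings fields; the function name is hypothetical.

```python
import json


def jobs_with_skill(ndjson_lines, skill):
    """Yield records whose `skills` array contains the skill (case-insensitive)."""
    wanted = skill.casefold()
    for line in ndjson_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        skills = record.get("skills") or []
        if any(s.casefold() == wanted for s in skills):
            yield record
```

Because it accepts any iterable of lines, the same function works on an open file handle or on a list of strings held in memory.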

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a complete export of startup profiles or a continuous feed of new engineering roles - we build and operate the infrastructure. Tell us your requirements.


hello@dataflirt.com · Bengaluru · IST · typical reply < 4h

