
UK job market data,
at warehouse scale.

We extract job postings, salary bands, contract types, agency details, and course listings from reed.co.uk. Delivered as clean JSON, CSV, or Parquet to S3, BigQuery, or Snowflake on your cadence.

Jobs extracted: 312K/day
Salary updates: 84K/24h
Course records: 45K/run
Active pipelines: 142
Uptime: 99.98%
Data Dictionary

Every field we extract from reed.co.uk

Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.

Complete list of extractable fields for Job Postings objects from reed.co.uk. All fields typed and schema-versioned.

job_id, title, employer_name, employer_type, location, salary_min, salary_max, salary_type, contract_type, working_pattern, description, posted_date, closing_date, url, applications_count
Sample job_postings record:
{
  "job_id": "51239482",
  "title": "Senior Python Developer",
  "employer_name": "TechCorp Ltd",
  "location": "London (Hybrid)",
  "salary_min": 75000,
  "salary_max": 90000,
  "contract_type": "Permanent",
  "posted_date": "2026-05-10"
}
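As a rough illustration of what "typed" could mean for a consumer of this record, the sketch below loads the sample into a Python dataclass. The class name `JobPosting`, the `from_raw` helper, and the choice to treat the salary fields as optional are assumptions made for this example, not the published schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class JobPosting:
    """Hypothetical typed view of a job_postings record (subset of fields)."""

    job_id: str
    title: str
    employer_name: str
    location: str
    contract_type: str
    posted_date: date
    salary_min: Optional[int] = None  # assumed nullable: not every ad lists pay
    salary_max: Optional[int] = None

    @classmethod
    def from_raw(cls, raw: dict) -> "JobPosting":
        # Parse the ISO date string; the other fields pass through unchanged.
        return cls(
            job_id=str(raw["job_id"]),
            title=raw["title"],
            employer_name=raw["employer_name"],
            location=raw["location"],
            contract_type=raw["contract_type"],
            posted_date=date.fromisoformat(raw["posted_date"]),
            salary_min=raw.get("salary_min"),
            salary_max=raw.get("salary_max"),
        )


sample = {
    "job_id": "51239482",
    "title": "Senior Python Developer",
    "employer_name": "TechCorp Ltd",
    "location": "London (Hybrid)",
    "salary_min": 75000,
    "salary_max": 90000,
    "contract_type": "Permanent",
    "posted_date": "2026-05-10",
}
posting = JobPosting.from_raw(sample)
```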

Complete list of extractable fields for Employer Profiles objects from reed.co.uk. All fields typed and schema-versioned.

employer_id, name, logo_url, industry, company_size, website, description, active_jobs_count, average_salary, rating, review_count, headquarters
Sample employer_profiles record:
{
  "employer_id": "EMP9921",
  "name": "TechCorp Ltd",
  "industry": "IT & Telecoms",
  "active_jobs_count": 14,
  "average_salary": 65000,
  "rating": 4.2,
  "review_count": 128,
  "headquarters": "London"
}

Complete list of extractable fields for Reed Courses objects from reed.co.uk. All fields typed and schema-versioned.

course_id, title, provider_name, price, original_price, discount_pct, study_method, duration, qualification, cpd_points, description, url
Sample reed_courses record:
{
  "course_id": "C88321",
  "title": "AWS Certified Solutions Architect",
  "provider_name": "Cloud Academy",
  "price": 199.0,
  "study_method": "Online",
  "duration": "Self-paced",
  "cpd_points": 20,
  "discount_pct": 15
}

Complete list of extractable fields for Salary Insights objects from reed.co.uk. All fields typed and schema-versioned.

job_title, region, average_salary, lowest_salary, highest_salary, sample_size, yoy_change, industry, data_timestamp, currency
Sample salary_insights record:
{
  "job_title": "Python Developer",
  "region": "London",
  "average_salary": 72500,
  "lowest_salary": 45000,
  "highest_salary": 110000,
  "sample_size": 1240,
  "currency": "GBP",
  "data_timestamp": "2026-05-12T00:00:00Z"
}

Complete list of extractable fields for Search Results objects from reed.co.uk. All fields typed and schema-versioned.

keyword, location_query, page_number, position, job_id, title, employer, salary_text, is_promoted, is_easy_apply, posted_time_ago, scraped_at
Sample search_results record:
{
  "keyword": "Data Engineer",
  "location_query": "Manchester",
  "position": 3,
  "job_id": "51239999",
  "is_promoted": true,
  "is_easy_apply": false,
  "salary_text": "£50,000 to £65,000 per annum",
  "scraped_at": "2026-05-12T10:15:00Z"
}

Capabilities

Everything you need from Reed

Our Reed scraper handles every layer of the platform: job listings, salary normalisation, course catalogues, and employer profiles, all with anti-bot circumvention built in.

Full Job Listing Extraction

Title, description, salary bands, contract type, and location scraped at the individual job page level.

Salary Band Parsing

Normalise raw salary strings into integer minimum and maximum values plus a currency code for direct warehouse ingestion.

Agency vs Direct Employer Tagging

Distinguish between recruitment agency postings and direct employer listings using Reed metadata flags.

Reed Courses Scraping

Extract course catalogues, pricing, provider details, and CPD point allocations across all educational categories.

Remote & Hybrid Work Filtering

Capture exact working patterns, remote flexibility, and office location requirements.

Historical Posting Tracking

Monitor job duration on the platform by tracking posted dates against closing dates or delisting events.
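A minimal sketch of that duration calculation, assuming ISO-formatted date strings like the `posted_date` in the data dictionary. The `delisted_on` argument is a hypothetical name for the date a listing was observed to disappear; it is not a documented field.

```python
from datetime import date
from typing import Optional


def days_live(
    posted_date: str,
    closing_date: Optional[str],
    delisted_on: Optional[str] = None,
) -> Optional[int]:
    """Days a posting stayed live; an observed delisting wins over the closing date."""
    end = delisted_on or closing_date
    if end is None:
        return None  # still live and no closing date advertised
    return (date.fromisoformat(end) - date.fromisoformat(posted_date)).days
```

For example, `days_live("2026-05-10", "2026-06-09")` gives 30, while passing `delisted_on="2026-05-17"` shortens that to 7.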

Recruiter Directory Intelligence

Extract agency profiles, contact details, and aggregate job volumes per recruitment firm.

Promoted Listing Detection

Identify sponsored job slots in search results to track competitor advertising spend and strategy.
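One way to turn the `is_promoted` flag into a comparable metric is the share of promoted slots per search keyword. The sketch below assumes rows shaped like the search_results records in the data dictionary; the function name is illustrative.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping


def promoted_share(rows: Iterable[Mapping]) -> Dict[str, float]:
    """Fraction of search result slots flagged as promoted, per keyword."""
    totals: Dict[str, int] = defaultdict(int)
    promoted: Dict[str, int] = defaultdict(int)
    for row in rows:
        totals[row["keyword"]] += 1
        if row["is_promoted"]:
            promoted[row["keyword"]] += 1
    return {keyword: promoted[keyword] / totals[keyword] for keyword in totals}


rows = [
    {"keyword": "Data Engineer", "position": 1, "is_promoted": True},
    {"keyword": "Data Engineer", "position": 2, "is_promoted": False},
    {"keyword": "Data Engineer", "position": 3, "is_promoted": False},
    {"keyword": "Data Engineer", "position": 4, "is_promoted": False},
]
shares = promoted_share(rows)
```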

Scheduled + Streaming Modes

Run daily market snapshots or configure continuous pipelines with change-detection diffing.

// engagement pipeline

From search query to warehouse record

Brief in. Clean data out.

Define Scope (day 0)

Provide job keywords, regions, or employer names. We design the extraction schema together.

Pipeline Build (days 2–4)

We configure Scrapy / Playwright crawlers, proxy rotation, and Cloudflare bypass for reed.co.uk.

Validation & QA (days 4–6)

Schema validation, null-rate checks, salary-outlier detection, and sample records before full launch.

Delivery (ongoing)

JSON / CSV / Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.
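The null-rate and salary-outlier checks named in the Validation & QA step could look roughly like the sketch below. The interquartile-range rule and the `k=1.5` multiplier are assumptions chosen for illustration; the page does not say which outlier method is actually used.

```python
from statistics import quantiles
from typing import List, Mapping, Sequence


def null_rate(records: Sequence[Mapping], field: str) -> float:
    """Share of records where `field` is missing or None."""
    if not records:
        return 0.0
    missing = sum(1 for record in records if record.get(field) is None)
    return missing / len(records)


def salary_outliers(
    records: Sequence[Mapping], field: str = "salary_max", k: float = 1.5
) -> List[Mapping]:
    """Records whose salary falls outside the IQR fences (assumed rule)."""
    values = [r[field] for r in records if r.get(field) is not None]
    if len(values) < 4:
        return []  # too few points for meaningful quartiles
    q1, _, q3 = quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [
        r for r in records
        if r.get(field) is not None and not low <= r[field] <= high
    ]


salaries = [40000, 42000, 45000, 48000, 50000, 52000, 55000, 60000, 900000, None]
batch = [{"job_id": str(i), "salary_max": s} for i, s in enumerate(salaries)]
flagged = salary_outliers(batch)
```

In this made-up batch, one record in ten has no salary, and the 900000 entry is the only value outside the fences.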

Under the hood

How our Reed pipeline handles the hard parts

Job boards rely on aggressive rate limits and pagination caps. Here is how we extract full market data reliably.

pipeline-monitor · reed.co.uk · live ● active
// fingerprinting
Identity rotation
TLS fingerprint: randomised
User-agent: rotated
IP pool: residential
Challenges blocked: 0
// pagination
Page coverage
48,291 pages queued (running)
// observability
Pipeline health
Uptime: 99.9%
p99 latency: 142ms
Null rate: 0.3%
Alerts: 2
Anti-bot layer
Residential proxy rotation

Reed uses aggressive rate limiting and Cloudflare. We route requests through UK residential IPs with randomised delays and valid TLS fingerprints.

Salary normalisation
Regex pipeline

Reed salary formats vary wildly. We parse strings like '£40,000 to £50,000 per annum + bonus' into structured min/max numeric fields for immediate analysis.
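A minimal sketch of such a parser, assuming sterling amounts written with a £ sign. The function name and the single pattern are illustrative; a production pipeline would need more patterns (hourly and daily rates, "k" suffixes, bonus amounts quoted in pounds) than this handles.

```python
import re
from typing import Optional, TypedDict


class SalaryBand(TypedDict):
    salary_min: int
    salary_max: int
    currency: str


# A pound sign followed by digits with optional thousands separators;
# any pence after a decimal point are matched but discarded.
_AMOUNT = re.compile(r"£\s*(\d[\d,]*)(?:\.\d+)?")


def parse_salary(text: str) -> Optional[SalaryBand]:
    """Extract min/max from a salary string; None if no amount is found."""
    amounts = [int(m.group(1).replace(",", "")) for m in _AMOUNT.finditer(text)]
    if not amounts:
        return None
    first_two = amounts[:2]  # assume the first two amounts are the band bounds
    return {
        "salary_min": min(first_two),
        "salary_max": max(first_two),
        "currency": "GBP",
    }


band = parse_salary("£40,000 to £50,000 per annum + bonus")
```

A string with a single figure yields equal minimum and maximum, and text such as "Competitive salary" returns `None` rather than a guessed value.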

Pagination handling
Deep crawl execution

Search results cap at 1,000 jobs. We slice search queries by granular geographic radii and salary bands to extract the full corpus without hitting limits.
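The slicing idea can be sketched as a recursive bisection over salary bands. Here `count_results` is a hypothetical callable that reports how many jobs a band would return; how that count is obtained is outside this sketch, and `fake_count` is only a stand-in so the example runs.

```python
from typing import Callable, List, Tuple

RESULT_CAP = 1000  # per-search cap described above

Band = Tuple[int, int]  # half-open salary band [low, high) in GBP


def slice_salary_bands(
    low: int,
    high: int,
    count_results: Callable[[int, int], int],
    cap: int = RESULT_CAP,
    min_width: int = 1000,
) -> List[Band]:
    """Bisect [low, high) until every band reports at most `cap` results."""
    if count_results(low, high) <= cap or high - low <= min_width:
        # Either the band fits, or it is too narrow to split further and
        # needs a second dimension (e.g. location) to get under the cap.
        return [(low, high)]
    mid = (low + high) // 2
    return (
        slice_salary_bands(low, mid, count_results, cap, min_width)
        + slice_salary_bands(mid, high, count_results, cap, min_width)
    )


def fake_count(low: int, high: int) -> int:
    # Stand-in for a real result-count lookup: one job per £20 of band width.
    return (high - low) // 20


bands = slice_salary_bands(0, 80_000, fake_count)
```

With the uniform stand-in, the 0 to 80,000 range splits into four bands of 20,000, each reporting exactly 1,000 results.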

Duplicate detection
Agency deduplication

Recruiters cross-post identical roles. We hash job descriptions and normalise titles to flag probable duplicates in the final dataset.
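A simplified sketch of the hashing step: it normalises title and description, then groups postings with an identical SHA-256 key. This only catches copies that match exactly after normalisation; flagging near-duplicates with edited wording would need a similarity technique such as MinHash or SimHash, which is not shown here. Function names are illustrative.

```python
import hashlib
import re
from collections import defaultdict
from typing import Dict, Iterable, List, Mapping


def _normalise(text: str) -> str:
    # Lowercase and collapse every run of non-alphanumerics to one space.
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def dedup_key(title: str, description: str) -> str:
    payload = _normalise(title) + "\n" + _normalise(description)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def group_duplicates(postings: Iterable[Mapping]) -> List[List[str]]:
    """Return lists of job_ids that share the same normalised content."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for posting in postings:
        key = dedup_key(posting["title"], posting["description"])
        groups[key].append(posting["job_id"])
    return [ids for ids in groups.values() if len(ids) > 1]


postings = [
    {"job_id": "1", "title": "Senior Python Developer!",
     "description": "Build APIs.  Hybrid, London."},
    {"job_id": "2", "title": "senior python developer",
     "description": "build apis hybrid london"},
    {"job_id": "3", "title": "Data Engineer",
     "description": "Own the warehouse pipelines."},
]
duplicates = group_duplicates(postings)
```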

Change detection
Only re-scrape what changed

For continuous market monitoring, we maintain a hash index of active jobs and only push new, updated, or deleted records.
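A hash index of this kind could be sketched as follows, assuming each record carries a `job_id` and that the previous run's index is a plain mapping from `job_id` to content hash. The function names and the in-memory dictionary are illustrative choices, not a description of the production store.

```python
import hashlib
import json
from typing import Dict, List, Mapping, Sequence, Tuple


def record_hash(record: Mapping) -> str:
    # Canonical JSON (sorted keys, fixed separators) so key order never matters.
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def diff_snapshot(
    index: Mapping[str, str], current: Sequence[Mapping]
) -> Tuple[Dict[str, List[str]], Dict[str, str]]:
    """Compare the previous hash index with the current crawl.

    Returns the changes to push and the new index to store for the next run.
    """
    seen = {record["job_id"]: record_hash(record) for record in current}
    changes = {
        "new": sorted(j for j in seen if j not in index),
        "updated": sorted(j for j in seen if j in index and index[j] != seen[j]),
        "deleted": sorted(j for j in index if j not in seen),
    }
    return changes, seen


previous = [
    {"job_id": "a", "salary_max": 50000},
    {"job_id": "b", "salary_max": 60000},
    {"job_id": "c", "salary_max": 70000},
]
_, index = diff_snapshot({}, previous)

today = [
    {"job_id": "a", "salary_max": 50000},   # unchanged
    {"job_id": "b", "salary_max": 65000},   # salary edited
    {"job_id": "d", "salary_max": 40000},   # newly posted
]
changes, index = diff_snapshot(index, today)
```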

Applications

Who uses Reed data

Teams across industries use reed.co.uk data to build competitive products and smarter operations.

01
Labour Market Analytics

Economic researchers track hiring volumes, salary inflation, and regional skill demand across the UK.

02
Competitor Intelligence

HR teams monitor competitor hiring velocity, role seniority, and salary band positioning.

03
Recruitment Agency Lead Gen

B2B sales teams identify direct employers who are actively expanding headcount.

04
Course Provider Pricing

EdTech companies scrape Reed Courses to benchmark certification pricing and discount strategies.

05
Job Board Aggregation

Niche industry job boards syndicate specific roles via continuous API pipelines.

06
AI Skill Mapping

ML teams parse job descriptions to build taxonomies of emerging software and engineering skills.

Why DataFlirt

"Reed holds the most comprehensive map of the UK labour market, but extracting structured salary bands and deduplicating agency spam requires serious engineering."

Most teams underestimate the investment required: reliable Reed scraping requires UK residential proxies, Cloudflare bypass, deep pagination slicing, and complex regex parsing for salary strings. DataFlirt absorbs that complexity so your engineers can focus on the analysis, not the infrastructure.

Technical Spec

Reed scraper technical capabilities

Everything supported by our reed.co.uk scraper: rendered SPA elements, Cloudflare challenges, rate-limit handling, and more. Data behind authentication is marked Partial.

Capability | Details | Status
JavaScript rendering | Playwright sessions for dynamic elements and asynchronous loading | Supported
Cloudflare bypass | Automated solver integration for bot protection walls | Supported
UK Residential proxies | ISP-grade IPs rotated per request to prevent geographic blocking | Supported
Salary normalisation | String to integer conversion for minimum and maximum bounds | Supported
Agency deduplication | Hash-based similarity scoring to group duplicate recruiter posts | Supported
Change detection | Hash-based diff for active and closed jobs | Supported
Deep pagination slicing | Bypass 1,000 result limit via geographic and salary sub-queries | Supported
Webhook delivery | HTTP POST per record for real-time downstream processing | Supported
Applicant CV downloads | Gated recruiter dashboard data requires authentication | Partial
User application history | Requires individual candidate login credentials | Partial
Infrastructure

Infrastructure powering the Reed pipeline

Open-source tooling on proven cloud infra — no vendor lock-in, full observability.

Scrapy, Playwright, Python 3.12, Redis, PostgreSQL, Apache Airflow, AWS Lambda, S3, CloudWatch, 2Captcha, CapSolver, Residential Proxies, Docker, Kubernetes, Grafana, Prometheus
Scrapy + Playwright Stack

Scrapy handles crawl orchestration, deduplication, and retry logic. Playwright handles JavaScript rendering and interaction flows.

Residential Proxy Infrastructure

We maintain pools of residential ISP proxies across UK regions. Rotation happens per-request with sticky sessions where required.
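The difference between per-request rotation and sticky sessions can be shown with a small in-memory selector. The `ProxyRotator` class and the placeholder proxy URLs are hypothetical; real pools are typically managed by a proxy provider or crawler middleware rather than by code like this.

```python
import itertools
from typing import Dict, Iterable, Optional


class ProxyRotator:
    """Round-robin proxy choice, with optional pinning per session id."""

    def __init__(self, proxies: Iterable[str]) -> None:
        pool = list(proxies)
        if not pool:
            raise ValueError("proxy pool must not be empty")
        self._cycle = itertools.cycle(pool)
        self._sticky: Dict[str, str] = {}

    def pick(self, session_id: Optional[str] = None) -> str:
        if session_id is None:
            return next(self._cycle)  # per-request rotation
        if session_id not in self._sticky:
            self._sticky[session_id] = next(self._cycle)
        return self._sticky[session_id]  # same exit IP for the whole session

    def release(self, session_id: str) -> None:
        self._sticky.pop(session_id, None)


# Placeholder endpoints; .invalid is a reserved TLD that never resolves.
rotator = ProxyRotator([
    "http://proxy-1.example.invalid:8000",
    "http://proxy-2.example.invalid:8000",
    "http://proxy-3.example.invalid:8000",
])
first = rotator.pick()
second = rotator.pick()
pinned = rotator.pick("search-session")
pinned_again = rotator.pick("search-session")
```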

Cloud-Native Orchestration

Pipelines run on AWS Lambda and ECS. Airflow handles scheduling, dependency management, and SLA alerting. All state is stored in managed Postgres.

Output & Delivery

Your data, your destination

Data delivered to where your team already works — no new tooling required.

JSON: newline-delimited or nested payload
CSV: flat file with typed columns
Parquet: columnar format for BigQuery, Snowflake, Athena
AWS S3: direct bucket delivery, compatible with any data lake
Webhook: HTTP POST per record
API: REST endpoints for on-demand querying
XLS: Excel-compatible exports for business teams
BigQuery: streamed directly into your dataset
Snowflake: stage and COPY INTO workflow
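For the newline-delimited JSON option listed above, each record is one JSON object on its own line, which is the shape most warehouse loaders expect. The sketch below uses only the standard library; `write_ndjson` is an illustrative name, and the in-memory buffer stands in for a real file or upload stream.

```python
import io
import json
from typing import IO, Iterable, Mapping


def write_ndjson(records: Iterable[Mapping], fp: IO[str]) -> int:
    """Write one JSON object per line; returns the number of records written."""
    count = 0
    for record in records:
        # ensure_ascii=False keeps characters such as the pound sign readable.
        fp.write(json.dumps(record, ensure_ascii=False) + "\n")
        count += 1
    return count


buffer = io.StringIO()
written = write_ndjson(
    [
        {"job_id": "51239482", "salary_min": 75000, "salary_max": 90000},
        {"job_id": "51239999", "salary_text": "£50,000 to £65,000 per annum"},
    ],
    buffer,
)
lines = buffer.getvalue().splitlines()
```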
// faq

Common questions.

About reed.co.uk scraping, legality, and pipeline operations.

Ask us directly →
Is scraping Reed legal?

Scraping publicly available job postings is generally permissible under UK law. We target only public, non-authenticated data. We do not extract personal data or circumvent authentication walls.

How do you bypass Reed pagination limits?

Reed caps search results at 1,000 jobs. We programmatically slice searches by micro-locations and narrow salary bands to ensure complete extraction coverage.

Can you parse unstructured salary data?

Yes. Reed listings often contain unstructured salary text. Our pipeline uses regex patterns to extract minimum, maximum, and currency values into structured fields.

How fresh is the data?

Daily pipeline runs capture all new listings and status changes within a 4-hour window. Real-time streaming is available for specific keyword alerts.

Do you extract Reed Courses data?

Yes, we scrape the entire Reed Courses catalogue, including provider details, pricing, discounts, and CPD points.

How do you handle recruitment agency duplicates?

We generate similarity hashes based on job descriptions and titles to flag probable duplicates cross-posted by different recruitment agencies.

Can you scrape candidate profiles or CVs?

No. We do not scrape authenticated candidate profiles, CV databases, or internal recruiter dashboards. We only extract public job and course listings.

$ dataflirt scope --new-project --source=reed.co.uk ready

Tell us what
to extract.
We do the rest.

20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need a daily snapshot of the UK tech market or a continuous feed of remote roles, we scope, build, and operate the pipeline. Tell us what you need.

hello@dataflirt.com · Bengaluru · IST · typical reply < 4h