We extract course listings, membership pricing, curriculum structures, and instructor profiles across LearnWorlds schools. Delivered as clean JSON, CSV, or Parquet to S3 or BigQuery on your cadence.
Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.
Complete list of extractable fields for Course Catalogue objects from learnworlds.com. All fields typed and schema-versioned.
{"course_id": "lw-crs-8921", "title": "Advanced Python for Data Science", "price": 199.0, "currency": "USD", "instructor_id": "inst-402", "language": "English", "rating": 4.8, "review_count": 342}
| # | course_id | title | slug | category | description | price |
|---|---|---|---|---|---|---|
Complete list of extractable fields for Pricing & Memberships objects from learnworlds.com. All fields typed and schema-versioned.
{"plan_id": "plan-9012", "plan_name": "Pro Access", "plan_type": "subscription", "price": 49.0, "interval": "monthly", "currency": "USD", "trial_days": 14, "is_active": true}
| # | plan_id | course_id | plan_name | plan_type | price | interval |
|---|---|---|---|---|---|---|
Complete list of extractable fields for Curriculum Outlines objects from learnworlds.com. All fields typed and schema-versioned.
{"course_id": "lw-crs-8921", "section_title": "Module 1: Data Structures", "item_title": "Lists and Dictionaries", "item_type": "video", "is_free": true, "duration_minutes": 18, "drip_feed_days": 0}
| # | course_id | section_id | section_title | section_order | item_id | item_title |
|---|---|---|---|---|---|---|
Complete list of extractable fields for Instructor Profiles objects from learnworlds.com. All fields typed and schema-versioned.
{"instructor_id": "inst-402", "name": "Sarah Jenkins", "bio": "Data Scientist and Python educator.", "total_courses": 4, "total_students": 12500, "average_rating": 4.9, "website": "https://sarahjenkins.dev"}
| # | instructor_id | name | bio | avatar_url | social_links | total_courses |
|---|---|---|---|---|---|---|
Complete list of extractable fields for School Metadata objects from learnworlds.com. All fields typed and schema-versioned.
{"school_id": "sch-104", "domain": "academy.example.com", "school_name": "Tech Academy", "active_courses": 45, "total_instructors": 12, "currency_default": "USD", "supported_languages": ["English", "Spanish"]}
| # | school_id | domain | school_name | theme_settings | logo_url | active_courses |
|---|---|---|---|---|---|---|
Our scraper handles LearnWorlds school domains, parsing custom themes, dynamic pricing widgets, and nested curriculum structures with full JavaScript rendering.
Title, description, categories, language, and difficulty level across diverse school themes.
Capture one-off payments, subscriptions, payment plans, and bundle pricing.
Extract nested course structures, section titles, module types, and free preview flags.
Scrape instructor bios, social links, course portfolios, and student counts.
Extract student testimonials, star ratings, and review text from course landing pages.
Monitor multiple LearnWorlds subdomains or custom domains simultaneously.
Identify active coupon codes, discounted pricing, and limited-time offers.
Map relationships between individual courses and overarching subscription bundles.
Identify and track LearnWorlds instances operating on white-labelled custom domains.
Brief in. Clean data out.
Provide target LearnWorlds domains, specific course URLs, or instructor profiles. We design the extraction schema.
We configure Scrapy and Playwright crawlers, handle theme variations, and bypass rate limits.
Schema validation, null-rate checks, and nested curriculum structure verification before launch.
JSON, CSV, or Parquet pushed to your S3 bucket, BigQuery dataset, or Snowflake stage on agreed cadence.
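The null-rate check from the validation step can be sketched as follows. This is a minimal illustration, not our production validator: the function names `null_rates` and `failing_fields`, the 5% threshold, and the second sample record are assumptions made for the example.

```python
from typing import Any


def null_rates(records: list[dict[str, Any]], fields: list[str]) -> dict[str, float]:
    """Return, per field, the fraction of records where it is missing, None, or empty."""
    total = len(records)
    rates: dict[str, float] = {}
    for field in fields:
        missing = sum(1 for record in records if record.get(field) in (None, ""))
        rates[field] = missing / total if total else 0.0
    return rates


def failing_fields(rates: dict[str, float], max_null_rate: float = 0.05) -> list[str]:
    """List the fields whose null rate exceeds the (assumed) threshold."""
    return sorted(field for field, rate in rates.items() if rate > max_null_rate)


# Sample records: the first mirrors the Course Catalogue example, the second is made up.
sample = [
    {"course_id": "lw-crs-8921", "title": "Advanced Python for Data Science", "price": 199.0},
    {"course_id": "lw-crs-8922", "title": "", "price": None},
]

rates = null_rates(sample, ["course_id", "title", "price"])
blocked = failing_fields(rates)
```

A run would be held back from delivery while `blocked` is non-empty, so a theme change that silently empties a field is caught before it reaches your warehouse.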
LearnWorlds schools use highly customisable themes, dynamic SPAs, and varying DOM structures. We standardise the output.
LearnWorlds allows extensive CSS and HTML customisation. Our selectors target underlying data attributes and JSON state rather than fragile visual classes.
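As a rough sketch of the idea, the snippet below reads an embedded JSON payload instead of matching theme-specific CSS classes. The markup is hypothetical: the script id `school-state` and the page fragment are assumptions for illustration, not a documented LearnWorlds structure.

```python
import json
import re

# Hypothetical selector: the id "school-state" is an assumption for this example.
STATE_RE = re.compile(
    r'<script[^>]*id="school-state"[^>]*>(.*?)</script>',
    re.DOTALL,
)


def extract_state(html: str) -> dict:
    """Return the embedded JSON state, or an empty dict if the page has none."""
    match = STATE_RE.search(html)
    if match is None:
        return {}
    return json.loads(match.group(1))


# The visual classes ("theme-x9", "hero-card") can change freely; the payload does not.
html = """
<div class="theme-x9 hero-card"><h1>Advanced Python for Data Science</h1></div>
<script id="school-state" type="application/json">
{"course_id": "lw-crs-8921", "price": 199.0, "currency": "USD"}
</script>
"""

state = extract_state(html)
```

Because the extractor never touches the visual classes, a school can restyle its landing page without changing the output record.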
Course catalogues and pricing widgets rely on client-side rendering. We use Playwright to hydrate the DOM before extraction.
Many schools use custom domains. Our pipeline identifies LearnWorlds infrastructure via headers and injects standard extractors.
Curriculums are deeply nested. We flatten sections and modules into relational tables for easy warehouse ingestion.
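A minimal sketch of that flattening step, assuming a nested input with `sections` and `items` keys and a positional id scheme (`-s1`, `-i2`). Both are illustrative choices, and the second lesson in the sample is made up; the output column names follow the Curriculum Outlines fields listed above.

```python
def flatten_curriculum(course: dict) -> tuple[list[dict], list[dict]]:
    """Split one nested course into section rows and item rows."""
    course_id = course["course_id"]
    section_rows: list[dict] = []
    item_rows: list[dict] = []
    for section_order, section in enumerate(course.get("sections", []), start=1):
        # Assumed id scheme: position within the parent, so rows can be joined later.
        section_id = f"{course_id}-s{section_order}"
        section_rows.append({
            "course_id": course_id,
            "section_id": section_id,
            "section_title": section["section_title"],
            "section_order": section_order,
        })
        for item_order, item in enumerate(section.get("items", []), start=1):
            item_rows.append({
                "section_id": section_id,
                "item_id": f"{section_id}-i{item_order}",
                "item_title": item["item_title"],
                "item_type": item.get("item_type"),
                "is_free": item.get("is_free", False),
            })
    return section_rows, item_rows


course = {
    "course_id": "lw-crs-8921",
    "sections": [
        {
            "section_title": "Module 1: Data Structures",
            "items": [
                {"item_title": "Lists and Dictionaries", "item_type": "video", "is_free": True},
                {"item_title": "Sets and Tuples", "item_type": "video"},
            ],
        },
    ],
}

sections, items = flatten_curriculum(course)
```

The two row lists map onto two warehouse tables joined on `section_id`, which keeps lesson-level queries free of nested JSON parsing.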
Aggressive scraping triggers WAF blocks. We distribute requests across residential proxies with human-like timing.
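The pacing side of this can be illustrated with a small planner that rotates proxies round-robin and assigns a randomised delay to each request. It is a sketch under stated assumptions: the proxy addresses, the URLs, and the 2 to 6 second delay window are placeholders, and the function only builds a schedule without sending any traffic.

```python
import itertools
import random


def plan_requests(urls, proxies, min_delay=2.0, max_delay=6.0, seed=None):
    """Pair each URL with a proxy (round-robin) and a jittered delay in seconds."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    rng = random.Random(seed)  # seeded only to make the example reproducible
    proxy_cycle = itertools.cycle(proxies)
    plan = []
    for url in urls:
        plan.append({
            "url": url,
            "proxy": next(proxy_cycle),
            "delay_s": round(rng.uniform(min_delay, max_delay), 2),
        })
    return plan


# Placeholder targets and proxies, for illustration only.
urls = [f"https://academy.example.com/course/{n}" for n in range(1, 5)]
proxies = ["http://proxy-a.example:8080", "http://proxy-b.example:8080"]

plan = plan_requests(urls, proxies, seed=7)
```

Uneven gaps between requests, rather than a fixed interval, are what keep the load on a school's site low and irregular.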
EdTech analysts track course pricing trends, popular categories, and curriculum depth across specific niches.
Course creators monitor competing schools for new curriculum additions, pricing changes, and bundle strategies.
B2B service providers extract instructor profiles and school metadata to build targeted outreach lists.
eLearning directories consolidate course listings, ratings, and pricing from multiple LearnWorlds instances.
Schools analyse subscription versus one-off payment models across top-performing competitors.
ML teams extract curriculum structures and course descriptions to train educational content generation models.
"The creator economy runs on platforms like LearnWorlds, generating a massive, fragmented dataset of educational content and pricing models."
Extracting data from LearnWorlds requires navigating thousands of distinct, highly customised school themes. Relying on basic HTTP requests fails due to client-side rendering and aggressive caching. DataFlirt manages the JavaScript execution and normalises the output across every school, delivering clean, structured curriculum and pricing data directly to your warehouse.
Everything supported by our learnworlds.com scraper — rendered SPA elements, auth walls, rate-limit evasion and beyond.
Open-source tooling on proven cloud infra — no vendor lock-in, full observability.
Handles client-side rendering for LearnWorlds Site Builder themes while maintaining high concurrency.
Distributes requests across diverse IP pools to avoid WAF blocks on custom domains.
Pipelines run on AWS Lambda and ECS, scheduled via Airflow for reliable daily or weekly extraction.
Data delivered to where your team already works — no new tooling required.
About learnworlds.com scraping, legality, and pipeline operations.
Scraping publicly available course catalogues, pricing, and instructor profiles is generally permissible. DataFlirt extracts only public metadata and does not bypass authentication walls or extract private student information.
Yes. We can target any domain running LearnWorlds infrastructure by identifying the underlying platform fingerprints and applying our standard extraction schema.
Our selectors target underlying data structures and JSON payloads rather than visual CSS classes, ensuring consistent output regardless of the school's active theme.
No. We focus exclusively on public metadata. We do not bypass DRM or extract proprietary video files.
Pipelines can be configured for daily runs to capture limited-time discounts and subscription changes.
Yes. We extract the full hierarchy of sections, modules, and lessons, delivering it as a relational dataset.
We typically start with a defined list of target schools or a specific category of courses. Contact us for scoping.
20-minute scoping call. Pilot dataset within the week. Production within two. Whether you need course catalogues or pricing structures, we build and operate the extraction infrastructure. Tell us your target domains.