
What are Database Rights (EU)?

Database Rights (EU) refer to the sui generis legal protection granted to makers of databases within the European Union, safeguarding the substantial investment made in obtaining, verifying, or presenting data. For scraping pipelines targeting EU entities, this means that even if individual facts (like sports scores or job listings) aren't copyrightable, systematically extracting them can trigger infringement claims and pipeline shutdowns.

Legal · Sui Generis · EU Compliance · Substantial Extraction · IP Law
// 02 — definitions

Protecting the
investment.

Why scraping public facts in Europe carries a unique legal risk that doesn't exist in the US or India.


TL;DR

The EU Database Directive (96/9/EC) created a specific right to stop unauthorized extraction or re-utilization of a database's contents. If a publisher made a substantial investment in building the dataset, scraping a 'substantial part' of it can violate their rights, and so can repeatedly and systematically scraping small parts. It is a legal tool frequently used against scrapers in European jurisdictions.

01 Definition & structure
The EU Database Directive (96/9/EC) establishes a two-tier protection system for databases. First, standard copyright protects the "selection or arrangement" of the contents if it constitutes the author's own intellectual creation (Article 3). Second, and more importantly for scrapers, it creates a sui generis (of its own kind) right (Article 7). This right protects the maker of a database if there has been "qualitatively and/or quantitatively a substantial investment in either the obtaining, verification or presentation of the contents."
02 Substantial vs. Insubstantial extraction
Infringement occurs when a scraper extracts or re-utilizes a "substantial part" of the database. This is measured in two ways:
  • Quantitative: Taking a large volume of records relative to the total size of the database.
  • Qualitative: Taking a small number of records that represent a massive portion of the publisher's investment (e.g., scraping only the verified, premium listings).
Extracting an insubstantial part is generally lawful, provided it isn't done systematically to reconstruct the whole.
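The difference between the two measures can be illustrated with a minimal sketch. It assumes every record can be given a numeric "investment weight", which is a hypothetical modelling device (the Directive defines no such weight), and the `premium` field and the weights 20 and 1 are invented for illustration.

```python
def quantitative_share(extracted, database):
    """Share of records taken, by count."""
    return len(extracted) / len(database)


def qualitative_share(extracted, database, weight):
    """Share of the total (hypothetical) investment weight carried by the extracted records."""
    total = sum(weight(r) for r in database)
    return sum(weight(r) for r in extracted) / total


# Hypothetical database: 100 listings, 5 of which are verified premium listings.
database = [{"id": i, "premium": i < 5} for i in range(100)]


def weight(record):
    # Assumption: a verified premium listing cost 20x more to produce.
    return 20 if record["premium"] else 1


extracted = [r for r in database if r["premium"]]

quant = quantitative_share(extracted, database)
qual = qualitative_share(extracted, database, weight)
print(f"by count: {quant:.1%}, by investment weight: {qual:.1%}")
```

Under these assumed weights, 5% of the records by count carries roughly half of the modelled investment, which is the situation the qualitative test is meant to catch.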
03 The "Repeated and Systematic" trap
Scraping engineers often try to bypass the "substantial part" rule by rate-limiting their crawlers to extract tiny fractions of the database over a long period. The Directive explicitly closes this loophole. Article 7(5) prohibits the repeated and systematic extraction of insubstantial parts if it conflicts with the normal exploitation of the database. If your daily cron job eventually downloads the entire catalog, you are infringing.
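A minimal sketch of why rate-limiting alone does not help: what matters is the cumulative share of distinct records taken across all runs. The class name `CoverageTracker`, the stable record IDs, and the 10% alert level are illustrative assumptions; the alert level is an internal caution figure, not a legal threshold.

```python
class CoverageTracker:
    """Tracks the share of distinct records extracted across all runs."""

    def __init__(self, estimated_total, alert_ratio=0.10):
        if estimated_total <= 0:
            raise ValueError("estimated_total must be positive")
        self.estimated_total = estimated_total
        self.alert_ratio = alert_ratio  # hypothetical internal limit
        self.seen_ids = set()

    def record_run(self, record_ids):
        """Add one run's record IDs and return cumulative coverage (0.0 to 1.0)."""
        self.seen_ids.update(record_ids)
        return self.coverage()

    def coverage(self):
        return len(self.seen_ids) / self.estimated_total

    def over_limit(self):
        return self.coverage() >= self.alert_ratio


# Hypothetical drip-scrape: 1% of a 1,000-record database per day for 12 days.
tracker = CoverageTracker(estimated_total=1000)
for day in range(12):
    ids = [f"rec-{day * 10 + i}" for i in range(10)]
    tracker.record_run(ids)

print(tracker.coverage(), tracker.over_limit())
```

Each daily run looks insubstantial in isolation, but the tracker shows the distinct-record coverage climbing past the assumed alert level within two weeks.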
04 How DataFlirt handles EU targets
We treat EU targets with a distinct compliance protocol. Before a pipeline goes live, we assess the target's investment threshold and TDM opt-out status. We configure our schedulers with hard extraction caps to ensure we only pull the specific, insubstantial subsets our clients actually need for analytics, rather than mirroring the entire infrastructure. We also actively strip structural metadata to ensure we are delivering raw facts, not a replicated database.
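The mechanics of a hard cap and of dropping structural fields could look like the sketch below. This is not DataFlirt's implementation; the function names and the `FACT_FIELDS` whitelist are hypothetical, and whether removing such fields changes the legal analysis is a question for counsel, not something the code decides.

```python
from itertools import islice

# Hypothetical whitelist of plain fact fields the client needs.
FACT_FIELDS = ("price", "area_sqm", "postcode")


def capped(records, max_records):
    """Yield at most max_records items from any iterable of records."""
    return islice(records, max_records)


def strip_structure(record, fact_fields=FACT_FIELDS):
    """Keep only whitelisted fact fields; drop IDs, category paths, ordering keys, etc."""
    return {key: record[key] for key in fact_fields if key in record}


def run_once(source, max_records):
    """One pipeline run: enforce the cap first, then reduce each record to facts."""
    return [strip_structure(r) for r in capped(source, max_records)]


# Hypothetical source with structural metadata mixed in.
source = (
    {"id": i, "category_path": "rent/berlin", "price": 1000 + i, "postcode": "10115"}
    for i in range(10)
)
batch = run_once(source, max_records=3)
print(batch)
```

Applying the cap before any per-record work means the run cannot exceed its budget even if the source iterator is unbounded.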
05 Notable CJEU rulings
The Court of Justice of the European Union (CJEU) has shaped how this right is applied to web scraping. In Innoweb v Wegener (C-202/12), the court ruled that a dedicated meta-search engine querying car listings was infringing because it acted as a direct substitute for the original site. Conversely, in CV-Online Latvia v Melons (C-762/19), the court clarified that search engines indexing job portals might not infringe if they don't cause significant economic harm to the database maker's investment. In both cases, the economic impact on the maker's investment carried the most weight.
// 03 — the infringement threshold

When does extraction
become unlawful?

EU courts evaluate infringement based on the proportion of the database extracted and the economic impact on the original creator. DataFlirt's legal compliance framework models these thresholds for EU-based targets.

Quantitative Substantiality: E_quant = records_scraped / total_database_size
Extracting a large percentage of the total records. The Directive sets no fixed percentage; treat the 10–15% range as a rough caution level rather than a legal threshold. Source: EU Directive 96/9/EC
Systematic Extraction Risk: R_sys = Σ (insubstantial_extractions) over time
Repeatedly scraping 1% daily to eventually reconstruct the whole database is explicitly prohibited. Source: Article 7(5) of the Directive
Economic Substitution Test: H = scraper_revenue vs. creator_lost_revenue
Does the scraped dataset act as a direct commercial substitute for the original database? Source: CJEU Jurisprudence (e.g., Innoweb)
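The quantitative ratio and the systematic-extraction sum can be written as a short worked example. The function names and the sample figures are assumptions for illustration, and the sum treats every run as pulling new, non-overlapping records, so it is an upper bound on cumulative coverage.

```python
def quantitative_ratio(records_scraped, total_database_size):
    """Share of the database taken in one extraction."""
    if total_database_size <= 0:
        raise ValueError("total_database_size must be positive")
    return records_scraped / total_database_size


def systematic_risk(run_sizes, total_database_size):
    """Cumulative share across runs, assuming no overlap between runs (upper bound)."""
    return quantitative_ratio(sum(run_sizes), total_database_size)


# Hypothetical figures: a 100,000-record database scraped 1,000 records a day for 30 days.
single_run = quantitative_ratio(1_000, 100_000)
thirty_days = systematic_risk([1_000] * 30, 100_000)
print(f"single run: {single_run:.1%}, after 30 runs: {thirty_days:.1%}")
```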
// 04 — compliance audit trace

Evaluating an EU
target for extraction.

A pre-flight legal and technical audit for a pipeline targeting a German real estate portal, assessing database right exposure before the first request is sent.

Directive 96/9/EC · Risk Assessment · Compliance
// target analysis
target.domain: "immo-example.de"
target.jurisdiction: "DE (European Union)"
database_right.applicable: true

// investment threshold check
data.type: "user-submitted real estate listings"
verification_effort: high // publisher verifies broker IDs
sui_generis_protection: likely active

// extraction parameters
client.use_case: "internal market analytics"
commercial_substitute: false
extraction.target_volume: "~2,500 records/week"
database.estimated_size: "1.2M records"
extraction.quantitative_ratio: 0.2% // insubstantial

// pipeline constraints applied
rule.max_records_per_run: 5000
rule.prevent_structural_copy: enforced
status: CLEARED FOR PILOT
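The arithmetic behind the trace can be checked with a short sketch. The `ExtractionPlan` class and the 10% cumulative limit are hypothetical; the weekly volume and database size are the figures from the trace, and the projection assumes each week pulls new, non-overlapping records.

```python
from dataclasses import dataclass


@dataclass
class ExtractionPlan:
    records_per_week: int
    estimated_size: int

    def weekly_ratio(self):
        """Share of the database pulled in one week."""
        return self.records_per_week / self.estimated_size

    def weeks_until(self, cumulative_limit):
        """Weeks until cumulative volume reaches the given share of the database."""
        limit_records = round(cumulative_limit * self.estimated_size)
        # Ceiling division on integers avoids floating-point edge cases.
        return -(-limit_records // self.records_per_week)


plan = ExtractionPlan(records_per_week=2_500, estimated_size=1_200_000)
print(f"weekly ratio: {plan.weekly_ratio():.2%}")
print(f"weeks to reach 10% cumulative: {plan.weeks_until(0.10)}")
```

With these figures the weekly ratio is about 0.21%, matching the trace, but continuous fresh extraction would reach a 10% cumulative share in 48 weeks. That is why a per-run ratio is usually reviewed alongside a cumulative projection.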
// 05 — risk factors

Triggers for sui generis
infringement.

The factors that elevate a scraping operation from benign data gathering to a violation of EU Database Rights, ranked by frequency of citation in cease-and-desist letters.

JURISDICTION: EU / EEA
LEGAL BASIS: Directive 96/9/EC
UPDATED: 2026-05-19
01 Direct commercial substitution
highest risk · Building a competing product using the target's data
02 High quantitative extraction
volume based · Scraping a massive percentage of the total database
03 Systematic insubstantial extraction
frequency based · Drip-scraping to reconstruct the database over time
04 Qualitative extraction
value based · Taking only the most valuable/expensive-to-verify records
05 Bypassing technical measures
aggravating factor · Evading CAPTCHAs to extract the protected data
// 06 — operational compliance

Scrape the facts,

respect the investment.

DataFlirt navigates EU Database Rights by strictly separating the extraction of uncopyrightable facts from the wholesale replication of a protected database structure. For high-risk EU targets, we implement hard concurrency and volume caps, ensuring our pipelines never cross the 'substantial part' threshold in a single run. Furthermore, we require clients to verify that their use case is not a direct commercial substitute for the original publisher's database. We extract the data you need to run your analytics, without cloning the asset the publisher spent millions to build.

EU Target Compliance Profile

Pre-flight clearance parameters for an EU-based job board pipeline.

target.jurisdiction: France (EU)
database_right: applicable
extraction.cap: 10k records/day
systematic_risk: mitigated
commercial_conflict: none · internal use
tdm_opt_out: not present in robots.txt
clearance.status: approved


// 07 — FAQ

Common
questions.

Common questions about EU Database Rights, the sui generis framework, and how to legally scrape European targets.

Does this apply to US companies scraping EU websites?
Yes. If the target database was created by an EU citizen or company, it is protected by the Directive. While enforcing an EU judgment against a purely US-based entity can be complex, if you have any European operations, subsidiaries, or customers, you are exposed to significant legal risk. Jurisdiction in internet law often follows the location of the harm.
Are the individual facts themselves protected?
No. The sui generis right does not protect the raw data (e.g., a football score, a company's address). It protects the investment made in obtaining, verifying, or presenting that collection of facts. You can extract individual facts, but you cannot extract the database itself.
What exactly constitutes a 'substantial' part of a database?
The Directive doesn't define a strict percentage. Courts evaluate it quantitatively (the volume of data taken relative to the whole) and qualitatively (the value of the data taken). Extracting 5% of a database might be deemed substantial if that 5% represents the core commercial value or required the most investment to compile.
Can we just scrape 1% a day to avoid the 'substantial' rule?
No. Article 7(5) of the Directive explicitly prohibits the "repeated and systematic extraction and/or re-utilization of insubstantial parts" if it conflicts with the normal exploitation of the database or unreasonably prejudices the legitimate interests of the maker. Drip-scraping to reconstruct the database is illegal.
Does the Text and Data Mining (TDM) exception help scrapers?
The Directive on Copyright in the Digital Single Market (DSM Directive, (EU) 2019/790) introduced a TDM exception in Article 4. It allows reproduction and extraction of lawfully accessible content for text and data mining, including for commercial purposes. However, rightholders can expressly reserve this use ("opt out") through machine-readable means (like robots.txt or specific HTTP headers). If they opt out, the TDM exception no longer protects your scraping.
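A minimal sketch of checking for machine-readable opt-out signals, under two stated assumptions. First, it looks for a `tdm-reservation: 1` response header, the convention proposed by the W3C community TDM Reservation Protocol (TDMRep); a given target may not use it. Second, it conservatively treats a robots.txt disallow rule for the crawler as a reservation, which is an engineering choice and not a legal determination. The function names, user agent, and URLs are hypothetical, and no network request is made.

```python
from urllib.robotparser import RobotFileParser


def tdm_reserved_by_header(headers):
    """True if a TDMRep-style `tdm-reservation: 1` response header is present."""
    normalized = {key.lower(): str(value).strip() for key, value in headers.items()}
    return normalized.get("tdm-reservation") == "1"


def blocked_by_robots(robots_txt, user_agent, url):
    """True if the given robots.txt text disallows this user agent for this URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


def opt_out_detected(headers, robots_txt, user_agent, url):
    """Conservative check: any one signal counts as an opt-out."""
    return tdm_reserved_by_header(headers) or blocked_by_robots(robots_txt, user_agent, url)


# Hypothetical inputs, as they might be captured during a pre-flight audit.
robots_txt = "User-agent: *\nDisallow: /jobs/\n"
headers = {"Content-Type": "text/html", "TDM-Reservation": "1"}

print(opt_out_detected(headers, robots_txt, "example-bot", "https://example.com/about"))
```

Keeping the two checks separate makes it easy to log which signal triggered the opt-out when the audit result is reviewed.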
How does DataFlirt ensure compliance with EU Database Rights?
We conduct a legal and technical review of EU targets before deployment. We implement hard extraction caps to stay below quantitative thresholds, avoid structural replication of the target's schema, and monitor robots.txt for DSM Article 4 opt-outs. We also require clients to confirm their downstream use case does not commercially substitute the target.