What is Referential Integrity Check?
Referential integrity check is a validation process in data engineering that ensures relationships between tables remain consistent after an extraction run. In scraping pipelines, it verifies that every child record (like a product review) maps to a valid parent record (like a product ID) that actually exists in the dataset. Without it, partial scrapes and pagination failures silently create orphaned data, breaking downstream joins and corrupting analytics dashboards.