What is the Data Minimization Principle?
The data minimization principle is the legal and operational mandate to collect, process, and store only the data strictly necessary for a specific business purpose. In scraping pipelines, it means extracting the price and SKU while deliberately dropping the author's name, user reviews containing PII, and tracking tokens. Over-collection isn't just a storage cost issue: it can transform a low-risk catalog scrape into a high-liability compliance breach under frameworks like GDPR and CCPA.
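One way to apply this in a pipeline is an allowlist filter that runs before anything is written to storage. The sketch below is illustrative only: the field names (`sku`, `price`, `currency`, `reviewer_name`, `review_text`, `tracking_token`) and the `minimize` helper are hypothetical, and it assumes each scraped item has already been parsed into a flat dictionary.

```python
from typing import Any, Mapping

# Hypothetical allowlist: the only fields the stated purpose (price tracking) needs.
ALLOWED_FIELDS = frozenset({"sku", "price", "currency"})


def minimize(
    record: Mapping[str, Any],
    allowed: frozenset = ALLOWED_FIELDS,
) -> dict:
    """Return a copy of `record` that keeps only allowlisted fields.

    Anything not explicitly allowed is dropped, so a new field that
    appears on the source page is excluded by default.
    """
    return {key: value for key, value in record.items() if key in allowed}


# Example scraped item (made-up values for illustration).
raw = {
    "sku": "A-1001",
    "price": "19.99",
    "currency": "USD",
    "reviewer_name": "Jane Doe",        # PII, not needed for price tracking
    "review_text": "Call me at ...",    # free text that may contain PII
    "tracking_token": "abc123",         # session or tracking identifier
}

clean = minimize(raw)
print(clean)
```

An allowlist is used here rather than a blocklist because it fails closed: a field nobody anticipated is dropped instead of being stored by accident.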