We extract architectural projects, design reviews, designer profiles, and material specifications from Domus. Delivered as clean JSON, CSV, or Parquet to your data warehouse.
Structured, schema-consistent data across all major object types — delivered clean, typed, and ready to query.
Complete list of extractable fields for Architectural Projects objects from domus.it. All fields typed and schema-versioned.
"project_id": "PRJ-84921", "title": "Bosco Verticale", "architect_name": "Stefano Boeri", "location_city": "Milan", "completion_year": 2014, "area_sqm": 40000, "primary_materials": "['Concrete', 'Glass', 'Vegetation']"
| # | project_id | title | architect_name | studio_name | location_city | location_country |
|---|---|---|---|---|---|---|
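The sample record for architectural projects shows list-valued fields such as `primary_materials` serialised as a Python-style string. A minimal sketch of how a consumer might decode that back into a list, assuming the same encoding appears across delivered records (the helper name `decode_list_field` is hypothetical and not part of any delivered tooling):

```python
import ast


def decode_list_field(value):
    """Turn a stringified list such as "['Concrete', 'Glass']" into a real list."""
    if isinstance(value, list):
        return value  # already decoded, e.g. when read from Parquet
    if value is None or value == "":
        return []
    # literal_eval only parses Python literals; it does not execute code.
    parsed = ast.literal_eval(value)
    if not isinstance(parsed, list):
        raise ValueError(f"expected a list literal, got {type(parsed).__name__}")
    return [str(item) for item in parsed]


# Field names and values taken from the sample record shown for projects.
record = {
    "project_id": "PRJ-84921",
    "title": "Bosco Verticale",
    "primary_materials": "['Concrete', 'Glass', 'Vegetation']",
}
record["primary_materials"] = decode_list_field(record["primary_materials"])
print(record["primary_materials"])
```

The same helper would apply to `awards_won`, `tags`, and `material_composition` if they use the same encoding.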
Complete list of extractable fields for Designer Profiles objects from domus.it. All fields typed and schema-versioned.
"designer_id": "DSG-1042", "full_name": "Zaha Hadid", "nationality": "British-Iraqi", "birth_year": 1950, "awards_won": "['Pritzker Architecture Prize', 'Stirling Prize']", "discipline_tags": "['Architecture', 'Product Design']"
| # | designer_id | full_name | studio_affiliation | nationality | birth_year | notable_projects |
|---|---|---|---|---|---|---|
Complete list of extractable fields for Design Articles objects from domus.it. All fields typed and schema-versioned.
"article_id": "ART-59211", "headline": "The Evolution of Brutalist Architecture in London", "author_name": "Elena Sommariva", "publication_date": "2023-11-14", "category": "Architecture", "language_code": "en", "tags": "['Brutalism', 'London', 'Urban Planning']"
| # | article_id | headline | author_name | publication_date | category | tags |
|---|---|---|---|---|---|---|
Complete list of extractable fields for Exhibitions & Events objects from domus.it. All fields typed and schema-versioned.
"event_id": "EVT-3301", "event_name": "Venice Architecture Biennale", "venue_name": "Giardini della Biennale", "city": "Venice", "start_date": "2023-05-20", "end_date": "2023-11-26", "theme": "The Laboratory of the Future"
| # | event_id | event_name | venue_name | city | country | start_date |
|---|---|---|---|---|---|---|
Complete list of extractable fields for Product Design objects from domus.it. All fields typed and schema-versioned.
"product_id": "PRD-7721", "product_name": "Arco Lamp", "designer_name": "Achille Castiglioni", "manufacturer": "Flos", "launch_year": 1962, "material_composition": "['Carrara Marble', 'Stainless Steel', 'Aluminum']", "category": "Lighting"
| # | product_id | product_name | designer_name | manufacturer | launch_year | material_composition |
|---|---|---|---|---|---|---|
Our Domus scraper handles editorial layouts, bilingual content toggles, high-resolution media galleries, and nested project metadata — converting unstructured articles into queryable datasets.
Extract architect names, completion dates, floor area, materials, and structural engineering credits from editorial project features.
Capture direct URLs to architectural photography, floor plans, and conceptual sketches embedded within article galleries.
Domus publishes in Italian and English. We map IT/EN content pairs to ensure consistent data delivery regardless of the source language.
Normalise architect names and studio affiliations across decades of articles to build comprehensive designer directories.
Identify and classify building materials, furniture brands, and lighting fixtures mentioned in product design reviews.
Monitor upcoming design weeks, biennales, and gallery exhibitions with dates, venues, and curator information.
Scrape historical articles and retrospective features to build longitudinal datasets of design trends.
Parse location data from project profiles to map architectural developments by city, region, or country.
Run daily or weekly pipelines to capture newly published articles, project profiles, and event announcements.
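Name normalisation, one of the capabilities listed above, usually starts by reducing each spelling variant to a comparison key. The sketch below is one possible approach, not a description of the production logic: the functions `name_key` and `group_variants` are hypothetical, and the key only handles accents, case, punctuation, and spacing.

```python
import re
import unicodedata


def name_key(name):
    """Build a comparison key: no accents, no punctuation, lower case, single spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^\w\s]", " ", without_accents.casefold())
    return " ".join(cleaned.split())


def group_variants(names):
    """Group raw spellings under their shared comparison key."""
    groups = {}
    for name in names:
        groups.setdefault(name_key(name), []).append(name)
    return groups


# Illustrative spellings only; real variants depend on the source articles.
variants = ["Gio Ponti", "Giò Ponti", "GIO  PONTI", "Zaha Hadid"]
groups = group_variants(variants)
print(groups)
```

Reordered names, initials, and studio renames would need additional rules or manual review on top of a key like this.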
Brief in. Clean data out.
Specify target categories, date ranges, or specific architectural disciplines. We design the extraction schema.
We configure Scrapy crawlers to handle Domus's editorial DOM structures, pagination, and media galleries.
Schema validation, bilingual alignment checks, and null-rate monitoring before full pipeline execution.
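As a rough illustration of what a type check and null-rate check could look like on a pilot batch, here is a minimal sketch. The expected types, the 20% threshold, and the function names are assumptions made for this example.

```python
# Assumed field types for the example; a real schema would cover every field.
EXPECTED_TYPES = {"project_id": str, "title": str, "completion_year": int}


def null_rates(records, fields):
    """Return the share of records in which each field is missing or empty."""
    total = len(records)
    if total == 0:
        return {field: 0.0 for field in fields}
    rates = {}
    for field in fields:
        missing = sum(1 for r in records if r.get(field) in (None, "", []))
        rates[field] = missing / total
    return rates


def type_errors(records, expected_types):
    """List (row index, field, actual type) for every non-null value of the wrong type."""
    errors = []
    for index, record in enumerate(records):
        for field, expected in expected_types.items():
            value = record.get(field)
            if value is not None and not isinstance(value, expected):
                errors.append((index, field, type(value).__name__))
    return errors


sample = [
    {"project_id": "PRJ-1", "title": "A", "completion_year": 2014},
    {"project_id": "PRJ-2", "title": "B", "completion_year": None},
    {"project_id": "PRJ-3", "title": "C", "completion_year": "2019"},
    {"project_id": "PRJ-4", "title": "D"},
]
rates = null_rates(sample, EXPECTED_TYPES)
flagged = sorted(f for f, rate in rates.items() if rate > 0.2)  # assumed threshold
errors = type_errors(sample, EXPECTED_TYPES)
print(rates, flagged, errors)
```

Running checks like these on a small batch first makes it cheaper to catch a broken selector before a full crawl.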
JSON / CSV / Parquet pushed to your S3 bucket or data warehouse on your defined schedule.
Editorial platforms like Domus present unique extraction hurdles. Here is how we normalise unstructured design content.
Editorial articles often bury project specifications within narrative paragraphs. We use custom parsing logic to extract structured entities — like area, materials, and completion year — from unstructured body text.
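To make the idea concrete, the sketch below pulls a floor area and a completion year out of a sentence with regular expressions. The patterns, the function name `extract_specs`, and the sample sentence are assumptions written for this example; real articles phrase these details in many more ways, in both Italian and English.

```python
import re

# Whole numbers, optionally with thousands separators, followed by an area unit.
AREA_RE = re.compile(
    r"(?<![\d.,])(\d{1,3}(?:[.,]\d{3})+|\d+)\s*(?:sqm|m2|m²|square met(?:re|er)s)",
    re.IGNORECASE,
)
# Only the phrasing "completed in YYYY" is recognised in this sketch.
YEAR_RE = re.compile(r"\bcompleted in ((?:19|20)\d{2})\b", re.IGNORECASE)


def extract_specs(body):
    """Return area_sqm and completion_year found in free text, or None for each."""
    specs = {"area_sqm": None, "completion_year": None}
    area = AREA_RE.search(body)
    if area:
        specs["area_sqm"] = int(re.sub(r"[.,]", "", area.group(1)))
    year = YEAR_RE.search(body)
    if year:
        specs["completion_year"] = int(year.group(1))
    return specs


# Invented sentence that reuses the values from the sample project record.
text = "Completed in 2014, the two towers offer 40,000 square metres of floor space."
specs = extract_specs(text)
print(specs)
```

Decimal areas such as "12.5 sqm" are deliberately left unmatched here rather than misread as a whole number.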
Domus uses lazy-loaded image carousels. Our Playwright integration triggers gallery interactions to expose and capture the underlying high-resolution image URLs and floor plan PDFs.
Articles frequently exist in both Italian and English under different URL structures. We map these variants to prevent duplicate records and ensure consistent language delivery.
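One way to pair language variants is to follow the alternate-language link that each page declares and group pages that point at each other. The sketch below assumes every scraped page record carries a `url`, a `language_code`, and an `alternate_url` field. That input shape, the example.com URLs, and the function name are all hypothetical.

```python
def pair_language_variants(pages):
    """Collapse IT/EN pages that reference each other into one record per article."""
    groups = {}
    for page in pages:
        # The key is the set of URLs involved, so both halves of a pair share it.
        key = frozenset(u for u in (page["url"], page.get("alternate_url")) if u)
        groups.setdefault(key, {})[page["language_code"]] = page["url"]
    return list(groups.values())


# Placeholder URLs; real paths and slugs will differ.
pages = [
    {
        "url": "https://example.com/it/architettura/torre",
        "language_code": "it",
        "alternate_url": "https://example.com/en/architecture/tower",
    },
    {
        "url": "https://example.com/en/architecture/tower",
        "language_code": "en",
        "alternate_url": "https://example.com/it/architettura/torre",
    },
    {
        "url": "https://example.com/en/design/lamp",
        "language_code": "en",
        "alternate_url": None,
    },
]
articles = pair_language_variants(pages)
print(articles)
```

A page whose partner does not link back would stay unpaired under this scheme, so a production mapping would likely need a second matching signal.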
Magazine layouts change frequently for special features. We use multiple fallback selectors to ensure data continuity even when Domus publishes custom-designed editorial pieces.
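A fallback chain can be as simple as trying selectors in priority order and recording which one matched. The sketch below uses the standard library's `xml.etree.ElementTree` on well-formed snippets to stay self-contained; real pages would need an HTML-tolerant parser, and the selectors and class names shown are hypothetical.

```python
import xml.etree.ElementTree as ET

# Ordered from most specific to most generic; all three are illustrative.
TITLE_SELECTORS = [
    ".//h1[@class='article-title']",
    ".//h1",
    ".//meta[@property='og:title']",
]


def first_match(root, selectors):
    """Return (value, selector) for the first selector that yields non-empty text."""
    for selector in selectors:
        node = root.find(selector)
        if node is None:
            continue
        # Elements carry text; meta tags carry their value in "content".
        value = (node.text or node.get("content") or "").strip()
        if value:
            return value, selector
    return None, None


standard = "<html><body><h1 class='article-title'>Bosco Verticale</h1></body></html>"
special = (
    "<html><head><meta property='og:title' content='Bosco Verticale'/></head>"
    "<body><div class='hero'/></body></html>"
)
standard_result = first_match(ET.fromstring(standard), TITLE_SELECTORS)
special_result = first_match(ET.fromstring(special), TITLE_SELECTORS)
print(standard_result, special_result)
```

Logging which selector matched makes it easier to notice when a layout change has pushed most pages onto a fallback.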
Category pages use JavaScript-driven infinite scroll. We execute headless browser sessions to simulate user scrolling, ensuring complete historical archive capture without missing records.
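The scrolling loop can be sketched as follows. It relies only on the `evaluate` and `wait_for_timeout` methods of a Playwright `Page`, so the example runs against a small stand-in object instead of a real browser. The function name, default limits, and stopping rule are assumptions for illustration.

```python
def scroll_until_stable(page, max_rounds=50, pause_ms=1000, patience=2):
    """Scroll to the bottom until the page height stops growing; return rounds used."""
    last_height = page.evaluate("document.body.scrollHeight")
    stable_rounds = 0
    rounds = 0
    while rounds < max_rounds and stable_rounds < patience:
        page.evaluate("window.scrollTo(0, document.body.scrollHeight)")
        page.wait_for_timeout(pause_ms)  # give lazy-loaded items time to arrive
        height = page.evaluate("document.body.scrollHeight")
        if height == last_height:
            stable_rounds += 1
        else:
            stable_rounds = 0
            last_height = height
        rounds += 1
    return rounds


class FakePage:
    """Stand-in for a Playwright page: grows by 1000px on each of the first three scrolls."""

    def __init__(self):
        self.height = 1000
        self.loads_left = 3

    def evaluate(self, expression):
        if "scrollTo" in expression:
            if self.loads_left > 0:
                self.height += 1000
                self.loads_left -= 1
            return None
        return self.height

    def wait_for_timeout(self, ms):
        pass  # no real waiting in the stand-in


page = FakePage()
rounds = scroll_until_stable(page)
print(rounds, page.height)
```

The `max_rounds` cap keeps a feed that never stops growing from running indefinitely, and `patience` avoids stopping early after a single slow response.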
Design agencies analyse material usage and stylistic keywords across thousands of projects to forecast upcoming architectural trends.
Procurement teams identify frequently specified materials and manufacturers in high-end commercial and residential projects.
Universities build longitudinal datasets of urban development and architectural evolution across specific cities or decades.
Architecture studios track publication frequency, client types, and project scales of competing firms.
Machine learning teams use paired datasets of architectural imagery and descriptive text to train generative design models.
Industry professionals aggregate global design events, curatorial themes, and participating artists for market research.
"Domus contains a century of architectural history and design evolution, but extracting structured metadata from editorial layouts requires precision."
Editorial platforms like Domus present unique scraping challenges: unstructured text, embedded media galleries, and bilingual content layers. DataFlirt normalises this editorial sprawl into strictly typed schemas so your research teams can focus on spatial analysis rather than DOM parsing.
Everything supported by our domus.it scraper: rendered SPA elements, rate-limit evasion, and beyond.
Open-source tooling on proven cloud infra — no vendor lock-in, full observability.
We combine Scrapy's high-concurrency crawling with Playwright's JavaScript rendering to handle Domus's infinite scroll feeds and dynamic media galleries.
Dedicated infrastructure for extracting, validating, and optionally downloading high-resolution architectural photography and technical floor plans.
Pipelines run on Kubernetes clusters with Airflow orchestration, ensuring reliable delivery to your S3 buckets or PostgreSQL databases on schedule.
Data delivered to where your team already works — no new tooling required.
About domus.it scraping, legality, and pipeline operations.
Yes. We can configure the pipeline to target a specific language preference or extract both, mapping equivalent articles to prevent duplication in your dataset.
While Domus presents data editorially, our parsers use pattern matching and custom selectors to extract specific entities like architect names, completion years, and materials into structured fields.
Standard delivery includes direct URLs to the highest-resolution images available on the page. If required, we can configure a media pipeline to download and transfer the actual image files to your S3 bucket.
We only extract publicly available editorial content. To respect access restrictions, we do not extract articles or digital magazine PDFs gated behind the Domus+ subscription wall.
We can extract any article or project currently indexed and publicly accessible on the domus.it website, which includes extensive digitised historical archives.
For editorial monitoring, weekly or monthly cadences are standard. We also perform one-off historical bulk extractions covering specific decades or architectural categories.
20-minute scoping call. Pilot dataset within the week. Production within two weeks. Whether you need a complete historical archive of design projects or a weekly feed of new exhibitions, we build and manage the pipeline. Contact our engineering team to define your schema.