A benchmark dataset for evaluating BIM (Building Information Modeling) comprehension and reasoning capabilities in AI systems. Provides curated IFC models with question-answer pairs for testing BIM-related AI implementations.
Dataset snapshot:
| | question | answer | ifc_model | project |
|---|---|---|---|---|
| 0 | What is the total gross floor area of the buil... | The total gross floor area of the building is ... | arc | duplex |
| 1 | What is the height of the ceiling in room A203? | The height of the ceiling in room A203 is 2.58 m | arc | duplex |
| 2 | Give me the name of all the rooms in the build... | The list of all the rooms in the building is: ... | arc | duplex |
| 3 | How many windows are there on the north facade? | I cannot calculate the number of window on th... | arc | duplex |
| 4 | What is the width of the door 1hOSvn6df7F8_7Gc... | The width of the door is 1.25 m | arc | duplex |
- Features
- Dataset Structure
- Getting Started
- Models Overview
- Contributing
- License
- Citation
- Acknowledgments
- Versioned datasets: Currently at V1 with 2 BIM models and 105 QA pairs
- Diverse question types:
  - Spatial reasoning
  - Element properties
  - System relationships
  - Construction sequencing
- Rich contextual data:
  - Original IFC files
  - Model snapshots
  - Architectural descriptions
  - License documentation
- Machine-readable format: CSV dataset with clear column structure
ifc-bench/
├── projects/                 # Directory for all projects
│   ├── duplex/               # First project
│   │   ├── arc.ifc           # Architecture model
│   │   ├── mep.ifc           # MEP model
│   │   ├── license.txt       # Project license
│   │   ├── model_card.csv    # Project metadata
│   │   └── snapshot.png      # Visual snapshot
│   └── dental_clinic/        # Second project
│       ├── arc.ifc           # Architecture model
│       ├── str.ifc           # Structural model
│       ├── mep.ifc           # MEP model
│       └── ...               # Other project files
├── questions/                # Question-answer pairs
│   └── ifc-bench-v1.csv      # Primary dataset
└── docs/                     # Supplementary materials
    └── CONTRIBUTING.md       # Contribution guidelines
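Given this layout, each row of the question file can be mapped to the IFC file it refers to. The helper below is a minimal sketch, assuming the `project` and `ifc_model` column values match the folder and file names shown in the tree. `ifc_path_for` is a hypothetical name and is not part of the repository.

```python
from pathlib import Path


def ifc_path_for(project: str, ifc_model: str, root: str = ".") -> Path:
    """Build the expected path of an IFC file inside the repository.

    Assumes the layout <root>/projects/<project>/<ifc_model>.ifc,
    which is inferred from the directory tree and not guaranteed.
    """
    return Path(root) / "projects" / project / f"{ifc_model}.ifc"


# Example: the architecture model of the duplex project
print(ifc_path_for("duplex", "arc").as_posix())  # projects/duplex/arc.ifc
```

Pass `root` when the script does not run from the repository root.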
Duplex:
- Disciplines: Architectural, MEP
- License: CC-BY-4.0
- Complexity: Simple
- Source: buildingSMART Sample Files

Dental clinic:
- Disciplines: Architectural, Structural, MEP
- License: CC-BY-4.0
- Complexity: Intermediate
- Source: buildingSMART Sample Files
- Python 3.8+
- pandas (for data analysis)
- ifcopenshell (optional, for working with IFC files)
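Because ifcopenshell is optional, a script that inspects IFC files can detect it at run time and skip that step when it is missing. The following is a minimal sketch. `ifcopenshell_available` and `count_elements` are hypothetical helpers, and the example path assumes the repository root is the working directory.

```python
import importlib.util
from pathlib import Path


def ifcopenshell_available() -> bool:
    """Return True if the optional ifcopenshell package can be imported."""
    return importlib.util.find_spec("ifcopenshell") is not None


def count_elements(ifc_path: str, ifc_class: str = "IfcDoor") -> int:
    """Count the entities of one IFC class (e.g. IfcDoor, IfcWindow) in a file."""
    import ifcopenshell  # imported lazily because the dependency is optional

    model = ifcopenshell.open(ifc_path)
    return len(model.by_type(ifc_class))


if __name__ == "__main__":
    path = "projects/duplex/arc.ifc"  # assumed location, see the directory tree
    if ifcopenshell_available() and Path(path).exists():
        print(count_elements(path, "IfcDoor"))
    else:
        print("ifcopenshell or the IFC file is missing; skipping IFC inspection")
```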
Install requirements:
pip install pandas ifcopenshell
git clone https://github.com/sylvainHellin/ifc-bench.git
cd ifc-bench
import pandas as pd
# Load dataset
df = pd.read_csv('questions/ifc-bench-v1.csv')
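# Explore questions by project (the 'project' column holds names such as 'duplex';
# 'ifc_model' holds the discipline file, e.g. 'arc')
duplex_questions = df[df['project'] == 'duplex']
print(f"Duplex project has {len(duplex_questions)} questions")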
# Sample question format
sample_q = df.iloc[0]
print(f"""
Question: {sample_q.question}
Answer: {sample_q.answer}
Model: {sample_q.ifc_model}
Project: {sample_q.project}
""")
| Column | Description | Example |
|---|---|---|
| question | Natural language question | "What is the total gross floor area of the building?" |
| answer | Ground truth answer | "The total gross floor area of the building is 354.67 sqm" |
| ifc_model | Model identifier (IFC file within the project) | "arc" |
| project | Project the model belongs to | "duplex" |
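The column layout can be checked before running an evaluation. The sketch below uses only the standard library and assumes a comma-separated file with a header row. `validate_rows` is a hypothetical helper, and the inline sample is illustrative: its first row mirrors the dataset snapshot, while the second row is made up to trigger a warning.

```python
import csv
import io

EXPECTED_COLUMNS = ["question", "answer", "ifc_model", "project"]


def validate_rows(handle) -> list:
    """Return a list of problems found in a question file (empty if none)."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing columns: {missing}"]
    problems = []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        for column in EXPECTED_COLUMNS:
            if not (row[column] or "").strip():
                problems.append(f"line {line_no}: empty '{column}'")
    return problems


sample = io.StringIO(
    "question,answer,ifc_model,project\n"
    "What is the height of the ceiling in room A203?,"
    "The height of the ceiling in room A203 is 2.58 m,arc,duplex\n"
    "What is the width of the door?,,arc,duplex\n"  # made-up row with no answer
)
print(validate_rows(sample))  # ["line 3: empty 'answer'"]
```

To check the real file, pass `open('questions/ifc-bench-v1.csv', newline='', encoding='utf-8')` instead of the sample.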
Verify dataset integrity using SHA-256 checksum:
shasum -a 256 questions/ifc-bench-v1.csv
# Expected output: f67a48770d74b6e0ff0868c923c3e1d976110350b2c439564d7ceccc16a46f35
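On systems without `shasum`, the same check can be done in Python with `hashlib`. This is a minimal sketch. `sha256_of` and `verify` are hypothetical helper names, and the expected digest is copied from the line above.

```python
import hashlib

EXPECTED_SHA256 = "f67a48770d74b6e0ff0868c923c3e1d976110350b2c439564d7ceccc16a46f35"


def sha256_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path: str, expected: str = EXPECTED_SHA256) -> bool:
    """Compare a file's digest with the expected value (case-insensitive)."""
    return sha256_of(path) == expected.lower()


# Usage, from the repository root:
# verify("questions/ifc-bench-v1.csv")
```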
We welcome contributions through:
- 🆕 New IFC models (with permissive licensing)
- ➕ Additional QA pairs for existing models
- ✏️ Documentation improvements
- 🐛 Error corrections in existing answers
Please see our Contribution Guidelines (docs/CONTRIBUTING.md) for details.
- Dataset: Licensed under CC BY 4.0
- Models: Inherit their original licenses (see individual model folders)
If using in research, please cite:
@misc{ifc-bench,
title = {{ifc-bench}: {BIM} Comprehension \& Reasoning Benchmark Dataset},
author = {Sylvain Hellin},
year = {2024},
url = {https://github.com/sylvainHellin/ifc-bench},
note = {Version 1.0}
}
Special thanks to:
- buildingSMART International for providing sample files
- The openBIM community for quality assurance
- Early adopters for feedback and validation
📌 Maintainer: Sylvain Hellin | 🐛 Issue Tracker: GitHub Issues