ifc-bench 🏗️💡

A benchmark dataset for evaluating BIM (Building Information Modeling) comprehension and reasoning capabilities in AI systems. It provides curated IFC models with question-answer pairs for testing BIM-related AI implementations.

Dataset snapshot:

	question	answer	ifc_model	project
0	What is the total gross floor area of the buil...	The total gross floor area of the building is ...	arc	duplex
1	What is the height of the ceiling in room A203?	The height of the ceiling in room A203 is 2.58 m	arc	duplex
2	Give me the name of all the rooms in the build...	The list of all the rooms in the building is: ...	arc	duplex
3	How many windows are there on the north facade?	I cannot calculate the number of window on th...	arc	duplex
4	What is the width of the door 1hOSvn6df7F8_7Gc...	The width of the door is 1.25 m	arc	duplex

Features

Versioned datasets: Currently at V1 with 2 BIM projects (5 IFC models in total) and 105 QA pairs
Diverse question types:
- Spatial reasoning
- Element properties
- System relationships
- Construction sequencing
Rich contextual data:
- Original IFC files
- Model snapshots
- Architectural descriptions
- License documentation
Machine-readable format: CSV dataset with clear column structure

Dataset Structure

ifc-bench/
├── projects/                  # Directory for all projects
│   ├── duplex/                # First project
│   │   ├── arc.ifc            # Architecture model
│   │   ├── mep.ifc            # MEP model
│   │   ├── license.txt        # Project license
│   │   ├── model_card.csv     # Project metadata
│   │   └── snapshot.png       # Visual snapshot
│   └── dental_clinic/         # Second project
│       ├── arc.ifc            # Architecture model
│       ├── str.ifc            # Structural model
│       ├── mep.ifc            # MEP model
│       └── ...                # Other project files
├── questions/                 # Question-answer pairs
│   └── ifc-bench-v1.csv       # Primary dataset
└── docs/                      # Supplementary materials
    └── CONTRIBUTING.md        # Contribution guidelines
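
The layout above suggests that a dataset row can be mapped back to its IFC file. The sketch below assumes files follow the pattern projects/<project>/<ifc_model>.ifc, as the tree shows for duplex and dental_clinic; the helper name `ifc_path` is hypothetical and not part of the repository.

```python
from pathlib import Path


def ifc_path(project: str, ifc_model: str, root: Path = Path(".")) -> Path:
    """Build the expected location of an IFC file from a dataset row.

    Assumption: files are stored as projects/<project>/<ifc_model>.ifc,
    matching the directory tree shown in this README.
    """
    return Path(root) / "projects" / project / f"{ifc_model}.ifc"


if __name__ == "__main__":
    path = ifc_path("duplex", "arc")
    print(path.as_posix())  # projects/duplex/arc.ifc
    # If ifcopenshell is installed, the file could then be opened with:
    #   import ifcopenshell
    #   model = ifcopenshell.open(str(path))
```

Passing the repository root as `root` keeps the helper usable from any working directory.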

Models Overview

🏠 Duplex Model

Disciplines: Architectural, MEP
License: CC-BY-4.0
Complexity: Simple
Source: buildingSMART Sample Files

🏥 Dental Clinic

Disciplines: Architectural, Structural, MEP
License: CC-BY-4.0
Complexity: Intermediate
Source: buildingSMART Sample Files

Getting Started

Prerequisites

Python 3.8+
pandas (for data analysis)
ifcopenshell (optional, for working with IFC files)

Install requirements:

pip install pandas ifcopenshell

Quick Start

git clone https://github.com/sylvainHellin/ifc-bench.git
cd ifc-bench

Using the Dataset

import pandas as pd

# Load dataset
df = pd.read_csv('questions/ifc-bench-v1.csv')

# Explore questions by project (the 'ifc_model' column holds the discipline, e.g. 'arc')
duplex_questions = df[df['project'] == 'duplex']
print(f"Duplex project has {len(duplex_questions)} questions")

# Sample question format
sample_q = df.iloc[0]
print(f"""
Question: {sample_q.question}
Answer: {sample_q.answer}
Model: {sample_q.ifc_model}
Project: {sample_q.project}
""")

Dataset Columns

Column	Description	Example
`question`	Natural language question	"What is the total gross floor area of the building?"
`answer`	Ground truth answer	"The total gross floor area of the building is 354.67 sqm"
`ifc_model`	IFC model the question refers to (discipline, e.g. arc, mep, str)	"arc"
`project`	Project the model belongs to	"duplex"
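
The `question` and `answer` columns can be used to score a system's responses against the ground truth. The sketch below is one possible approach, not the benchmark's official metric: it assumes exact matching after lowercasing and whitespace normalization, and `answer_fn` is a hypothetical stand-in for the system under test. The rows in the demo are toy data, not taken from the dataset.

```python
from typing import Callable, Dict, Iterable


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(str(text).lower().split())


def exact_match_rate(
    rows: Iterable[Dict[str, str]],
    answer_fn: Callable[[Dict[str, str]], str],
) -> float:
    """Return the fraction of rows where answer_fn matches the ground truth.

    Each row is a dict with the keys question, answer, ifc_model, project.
    answer_fn is a hypothetical callable wrapping the system being evaluated.
    """
    total = 0
    hits = 0
    for row in rows:
        total += 1
        if normalize(answer_fn(row)) == normalize(row["answer"]):
            hits += 1
    return hits / total if total else 0.0


if __name__ == "__main__":
    # Toy rows for illustration only
    toy_rows = [
        {"question": "Q1", "answer": "The width is 1.25 m", "ifc_model": "arc", "project": "duplex"},
        {"question": "Q2", "answer": "Two", "ifc_model": "arc", "project": "duplex"},
    ]
    # A dummy system that always returns the same string
    print(exact_match_rate(toy_rows, lambda row: "the width is  1.25 m"))  # 0.5
```

With pandas, the rows can be obtained from the loaded frame via `df.to_dict("records")`. Exact matching is strict for free-text answers, so a softer comparison (for example numeric tolerance on extracted values) may be more appropriate in practice.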

Dataset Integrity

Verify the dataset's integrity by comparing its SHA-256 checksum with the expected value:

shasum -a 256 questions/ifc-bench-v1.csv
# Expected output: f67a48770d74b6e0ff0868c923c3e1d976110350b2c439564d7ceccc16a46f35
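
On systems without `shasum`, the same check can be done in Python with the standard library. This is a minimal sketch that assumes it is run from the repository root; the function name `sha256_of_file` is illustrative.

```python
import hashlib
from pathlib import Path

# Expected checksum as stated in this README
EXPECTED_SHA256 = "f67a48770d74b6e0ff0868c923c3e1d976110350b2c439564d7ceccc16a46f35"


def sha256_of_file(path, chunk_size: int = 65536) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    csv_path = Path("questions/ifc-bench-v1.csv")
    if csv_path.exists():
        actual = sha256_of_file(csv_path)
        print("OK" if actual == EXPECTED_SHA256 else f"Mismatch: {actual}")
    else:
        print(f"{csv_path} not found; run this from the repository root")
```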

Contributing

We welcome contributions through:

🆕 New IFC models (with permissive licensing)
➕ Additional QA pairs for existing models
✏️ Documentation improvements
🐛 Error corrections in existing answers

Please see our Contribution Guidelines (docs/CONTRIBUTING.md) for details.

License

Dataset: Licensed under CC BY 4.0
Models: Inherit their original licenses (see license.txt in each project folder)

Citation

If you use ifc-bench in research, please cite:

@misc{ifc-bench,
  title = {{ifc-bench}: {BIM} Comprehension \& Reasoning Benchmark Dataset},
  author = {Sylvain Hellin},
  year = {2024},
  url = {https://github.com/sylvainHellin/ifc-bench},
  note = {Version 1.0}
}

Acknowledgments

Special thanks to:

buildingSMART International for providing sample files
The openBIM community for quality assurance
Early adopters for feedback and validation

📌 Maintainer: Sylvain Hellin | 📧 Contact: sylvain.hellin@tum.de | 🐛 Issue Tracker: GitHub Issues
