
Vision-Language Guidance for LiDAR-based Unsupervised 3D Object Detection

Christian Fruhwirth-Reisinger 1,2, Wei Lin 3, Dušan Malić 1,2, Horst Bischof 1, Horst Possegger 1,2

1Graz University of Technology, 2Christian Doppler Laboratory for Embedded Machine Learning, 3Johannes Kepler University Linz

arXiv paper

ViLGOD

Overview

🚩News

[2024-11-20]: Code released.
[2024-09-10]: ViLGOD has been accepted for BMVC 2024 as an oral presentation. See you in Glasgow!
[2024-08-07]: ViLGOD arXiv paper released.

📝 TODO List

  • Initial release.
  • Add installation details.
  • Add visual code run config for zero-shot detection.
  • Update arXiv paper.
  • Add additional run & evaluation instructions.
  • Update run scripts for multi-CPU/GPU inference.

🚀 Quick Start

Tested environment

  • Ubuntu 22.04
  • Python 3.8
  • CUDA 11.7

Environment setup

Create a virtual environment and install the required packages

  1. Create virtual environment
virtualenv vilgod -p python3.8
source <path/to/virtualenv>/bin/activate
  2. Install required packages
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
pip install spconv-cu117
pip install numpy==1.21.5 \
            llvmlite==0.39.0 \
            numba==0.56.4 \
            tensorboardX==2.4.1 \
            easydict==1.9 \
            pyyaml==6.0 \
            scikit-image==0.20.0 \
            tqdm==4.64.0 \
            SharedArray==3.1.0 \
            protobuf==3.19.6 \
            open3d==0.15.2 \
            gpustat==1.0.0 \
            av2==0.2.0 \
            kornia==0.5.8 \
            waymo-open-dataset-tf-2-11-0
pip install hdbscan \
            hydra-core \
            ftfy \
            regex \
            pyransac3d \
            fvcore \
            torch_scatter \
            filterpy
pip install --no-index --no-cache-dir pytorch3d -f https://dl.fbaipublicfiles.com/pytorch3d/packaging/wheels/py38_cu117_pyt1131/download.html
pip install numpy==1.23.5
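
Optionally, run a quick sanity check before continuing. This is a minimal sketch assuming the installs above succeeded and a CUDA-capable GPU is visible; it only verifies that the core packages import and that PyTorch sees CUDA:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
python -c "import spconv, pytorch3d; print('spconv and pytorch3d OK')"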

Clone and install required repositories

  1. Clone the repository and create the folder structure
git clone [email protected]:chreisinger/ViLGOD.git
cd ViLGOD
mkdir models
mkdir data
cd models
mkdir clip
cd ..
python setup.py develop
  2. Install the adapted Patchwork++
cd third_party/patchwork-plusplus
python setup.py install
  3. Download the CLIP model to ViLGOD/models/clip (see the download sketch after this list)

  4. Install OpenPCDet (outside of the ViLGOD folder)

git clone https://github.com/open-mmlab/OpenPCDet.git
cd OpenPCDet
python setup.py develop
  5. Extract the data following the OpenPCDet tutorial. No ground-truth database is needed!

  6. Create symbolic links of your extracted data inside ViLGOD (we support Waymo Open Dataset v1.2 and Argoverse 2)

ln -s <path/to/extracted/data> ViLGOD/data/
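
For step 3, the checkpoint can be fetched directly into ViLGOD/models/clip. A hedged sketch, assuming the standard ViT-B/32 weights published in OpenAI's CLIP repository are the intended model (the exact checkpoint ViLGOD expects may differ):

cd ViLGOD/models/clip
wget https://openaipublic.azureedge.net/clip/models/40d365715913c9da98579312b702a82c18be219cc2a73407c4526f58eba950af/ViT-B-32.pt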

Run ViLGOD

Run unsupervised 3D object detection

Make sure the CLIP folder is part of your PYTHONPATH:

export PYTHONPATH=${PYTHONPATH}:<path/to/ViLGOD>/third_party/CLIP
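
To confirm the path is set correctly, a quick check; this assumes third_party/CLIP is OpenAI's CLIP repository, which exposes the clip module:

python -c "import clip; print(clip.available_models())"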

For the Waymo Open Dataset:

cd tools
python preprocess_data.py preprocessor=waymo

For the Argoverse 2 dataset:

cd tools
python preprocess_data.py preprocessor=argoverse
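
Both commands use Hydra-style overrides (hydra-core is installed above), so you can likely inspect the composed configuration without running the full pipeline via Hydra's standard --cfg flag; a sketch, assuming preprocess_data.py is a regular Hydra entry point:

python preprocess_data.py preprocessor=waymo --cfg job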

📖 Citation

If you find our code or paper helpful, please leave a ⭐ and cite us:

@inproceedings{fruhwirth2024vilgod,
    title={Vision-Language Guidance for LiDAR-based Unsupervised 3D Object Detection}, 
    author={Christian Fruhwirth-Reisinger and Wei Lin and Dušan Malić and Horst Bischof and Horst Possegger},
    year={2024},
    booktitle={British Machine Vision Conference}
}

🙌 Acknowledgments

Many thanks to Patchwork++, OpenPCDet, MODEST, CLIP, and PointCLIPv2 for their code and models.