Revisiting CLIP: Efficient Alignment of 3D MRI and Tabular Data using Domain-Specific Foundation Models
Official PyTorch implementation of the paper *Revisiting CLIP: Efficient Alignment of 3D MRI and Tabular Data using Domain-Specific Foundation Models*, ISBI 2025.
Jakob Krogh Petersen*, Valdemar Licht*, Mads Nielsen, Asbjørn Munk
Pioneer Centre for AI & University of Copenhagen
* Equal Contribution
Paper link: [arXiv:2501.14051](https://arxiv.org/abs/2501.14051).
- Install [Poetry](https://python-poetry.org/).
- Create the environment by calling `poetry install`.
- Set up environment variables. Run `touch .env`, then add the following to the new `.env` file (a sanity-check sketch for this setup follows the list below):

  ```
  export YUCCA_SOURCE=link/to/datasets
  export YUCCA_RAW_DATA=link/to/raw_data
  export YUCCA_PREPROCESSED_DATA=link/to/preprocessed_data
  export YUCCA_MODELS=link/to/models
  export YUCCA_RESULTS=link/to/results
  export STATES=src/models/states
  ```
- Ensure the states folder exists by running `mkdir -p src/models/states`. This folder contains weights and vocabularies for required models.
- Create a `GammaKnife` folder at the chosen `YUCCA_SOURCE` path. Add the `Brain-TR-GammaKnife-processed` folder that you download from here to the `GammaKnife` folder.
- Run task conversion. Given the setup above, run the task conversion step with the corresponding script: `bash run_task_conversion.sh`.
- Run preprocessing. Run the preprocessing step using the corresponding script: `bash run_preprocess.sh`.
- Run training. Run training with the script `bash run_train.sh`. Use the arguments `-e` and `-c` to add experiment and configuration settings, and `-f` to train from scratch. For example, to train from scratch with an experiment locally, use:

  ```
  bash run_train.sh -e 16x_swinT_k0 -c local -f
  ```
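Before running the scripts, it can help to verify that the environment is wired up correctly. Below is a minimal, hypothetical sanity-check sketch (not part of the repository): it assumes you have run `source .env`, checks the variables listed above, and creates the expected `GammaKnife` folder at `YUCCA_SOURCE`.

```python
import os
from pathlib import Path

# Variables expected by the setup steps above (from the .env file).
REQUIRED = [
    "YUCCA_SOURCE",
    "YUCCA_RAW_DATA",
    "YUCCA_PREPROCESSED_DATA",
    "YUCCA_MODELS",
    "YUCCA_RESULTS",
    "STATES",
]

# Fail early if anything is missing from the shell environment.
missing = [name for name in REQUIRED if not os.environ.get(name)]
if missing:
    raise SystemExit(f"Missing environment variables: {', '.join(missing)}")

# Create the dataset layout described above:
#   <YUCCA_SOURCE>/GammaKnife/Brain-TR-GammaKnife-processed
gamma_knife = Path(os.environ["YUCCA_SOURCE"]) / "GammaKnife"
gamma_knife.mkdir(parents=True, exist_ok=True)

processed = gamma_knife / "Brain-TR-GammaKnife-processed"
print(f"Place the downloaded dataset at: {processed}")
print(f"Dataset present: {processed.is_dir()}")
```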
We release checkpoints for the best-performing CLIP-trained models for each of the studied vision architectures, as well as the pre-trained models on which we perform the CLIP training.

**CLIP-trained models:**
Vision Model | Parameters (M) | Checkpoint
---|---|---
Swin-T | 8 | Download
MedNeXt | 4 | Download
ResNet | 57 | Download
**Pre-trained models:**

Model | Parameters (M) | Checkpoint
---|---|---
BERT | 110 | Download*
Swin-T | 8 | Download
MedNeXt | 4 | Download**
ResNet | 57 | Download**
*The official model weights can be extracted from HuggingFace. See here for the vocabulary.
**For the MedNeXt and ResNet models, we refer to AMAES.
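To inspect a downloaded checkpoint, a minimal sketch along these lines should work, assuming the released files are standard PyTorch (or PyTorch Lightning) checkpoints; the file name here is hypothetical.

```python
import torch

# Hypothetical file name; point this at the checkpoint you downloaded.
ckpt_path = "src/models/states/swin_t_clip.ckpt"

# map_location="cpu" lets you inspect the weights without a GPU.
checkpoint = torch.load(ckpt_path, map_location="cpu")

# Lightning-style checkpoints nest the weights under "state_dict";
# a plain PyTorch checkpoint may be the state dict itself.
state_dict = checkpoint.get("state_dict", checkpoint)

print(f"{len(state_dict)} tensors")
for name, tensor in list(state_dict.items())[:5]:
    print(name, tuple(tensor.shape))
```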
If you use this work, please cite:

```
@article{krogh2025clip3d,
  title={Efficient Alignment of 3D MRI and Tabular Data using Domain-Specific Foundation Models},
  author={Petersen, Jakob Krogh and Licht, Johan Valdemar and Nielsen, Mads and Munk, Asbjørn},
  journal={arXiv preprint arXiv:2501.14051},
  year={2025}
}
```