We propose DALR (Dual-level Alignment Learning for multimodal sentence Representation learning). To achieve fine-grained cross-modal alignment, we introduce a cross-modal alignment method that mitigates the cross-modal misalignment bias (CMB) issue. To alleviate the intra-modal semantic divergence (ISD) issue, we integrate ranking distillation with global alignment learning to effectively align intra-modal representations. The figure below illustrates our model.
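To make the two objectives concrete, here is a minimal sketch of what the dual-level losses could look like. The InfoNCE-style contrastive form, the KL-based distillation, the temperature value, and all function names are illustrative assumptions, not the exact formulation from our paper:

```python
import torch
import torch.nn.functional as F

def cross_modal_alignment_loss(text_emb, image_emb, temperature=0.05):
    """InfoNCE-style contrastive loss between paired text and image embeddings (a sketch)."""
    text_emb = F.normalize(text_emb, dim=-1)
    image_emb = F.normalize(image_emb, dim=-1)
    logits = text_emb @ image_emb.t() / temperature              # (B, B) similarity matrix
    labels = torch.arange(logits.size(0), device=logits.device)  # i-th text pairs with i-th image
    return F.cross_entropy(logits, labels)

def ranking_distillation_loss(student_sim, teacher_sim, temperature=0.05):
    """Match the student's in-batch similarity distribution to the teacher's ranking (a sketch)."""
    teacher_prob = F.softmax(teacher_sim / temperature, dim=-1)
    student_logp = F.log_softmax(student_sim / temperature, dim=-1)
    return F.kl_div(student_logp, teacher_prob, reduction="batchmean")
```

In this sketch, `student_sim` and `teacher_sim` would be in-batch sentence similarity matrices from the model being trained and a frozen teacher, respectively.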
To prepare the environment, run:

```bash
pip install -r requirements.txt
```
First, download the Flickr and MSCOCO datasets from their official websites and arrange them in the following layout:
```
REPO ROOT
|
|--data
|  |--Flickr
|  |--MSCOCO
|  |--wiki1m_for_simcse.txt
```
The Wiki1M corpus can be downloaded with:

```bash
wget https://huggingface.co/datasets/princeton-nlp/datasets-for-simcse/resolve/main/wiki1m_for_simcse.txt
```
Use the script from the SimCSE repo to download the datasets for SentEval evaluation:

```bash
cd SentEval/data/downstream/
bash download_dataset.sh
```
You can download the backbone models (SimCSE, DiffCSE, etc.) from Hugging Face and put them in the `Model` folder.
The checkpoints of both flickr-bert-base and coco-bert-base are available on Google Drive.
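For example, a backbone can be cached under `Model` like this (a sketch; the SimCSE model id and target folder name below are just one public option):

```python
from transformers import AutoModel, AutoTokenizer

# Hypothetical example: cache a public SimCSE backbone under Model/ so it
# can later be loaded from a local path.
name = "princeton-nlp/unsup-simcse-bert-base-uncased"
AutoTokenizer.from_pretrained(name).save_pretrained("Model/unsup-simcse-bert-base-uncased")
AutoModel.from_pretrained(name).save_pretrained("Model/unsup-simcse-bert-base-uncased")
```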
You can then load a checkpoint and compute sentence embeddings as follows:

```python
import torch
from scipy.spatial.distance import cosine
from transformers import AutoModel, AutoTokenizer

# Load our model from the local Model folder
# (download the checkpoint from Google Drive first)
tokenizer = AutoTokenizer.from_pretrained("Model/DALR")
model = AutoModel.from_pretrained("Model/DALR")

# Tokenize input texts
texts = [
    "There's a kid on a skateboard.",
    "A kid is skateboarding.",
    "A kid is inside the house."
]
inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

# Get the embeddings
with torch.no_grad():
    embeddings = model(**inputs, output_hidden_states=True, return_dict=True).pooler_output

# Calculate cosine similarities
# Cosine similarities are in [-1, 1]. Higher means more similar
cosine_sim_0_1 = 1 - cosine(embeddings[0], embeddings[1])
cosine_sim_0_2 = 1 - cosine(embeddings[0], embeddings[2])

print("Cosine similarity between \"%s\" and \"%s\" is: %.3f" % (texts[0], texts[1], cosine_sim_0_1))
print("Cosine similarity between \"%s\" and \"%s\" is: %.3f" % (texts[0], texts[2], cosine_sim_0_2))
```
To evaluate the model on the STS tasks, run:

```bash
python eval_senteval.py \
    --model_name_or_path Model/OurModel \
    --task_set sts \
    --mode test
```
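As in SimCSE's evaluation script, `--task_set` presumably also accepts options for transfer-task evaluation; check `eval_senteval.py` for the exact options supported here.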
In the following section, we describe how to train a DALR model using our code.
First, install PyTorch. For CUDA 11, run:

```bash
pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html
```

If you instead use CUDA <11 or CPU, install PyTorch with the following command:

```bash
pip install torch==1.8.1
```
Then run the following to install the remaining dependencies:

```bash
pip install -r requirements.txt
```
For the unsupervised mixed training settings of wiki+flickr and wiki+coco, you can run the following commands to train your own models and try out different hyperparameters as you like:

```bash
bash scripts/run_wiki_flickr.sh
bash scripts/run_wiki_coco.sh
```
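The scripts bundle the full training configuration. If you prefer to call the trainer directly, the invocation inside them presumably follows the SimCSE/MCSE trainer interface; the sketch below is hypothetical, and its flag names and values are assumptions that may differ from this repo's actual scripts:

```bash
# Hypothetical direct invocation; flag names follow the SimCSE-style trainer
# and are assumptions, not this repo's verified interface.
python train.py \
    --model_name_or_path bert-base-uncased \
    --train_file data/wiki1m_for_simcse.txt \
    --per_device_train_batch_size 64 \
    --learning_rate 3e-5 \
    --output_dir Model/DALR \
    --do_train
```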
- We use the SentEval toolkit for evaluation, adopting the modified version of SentEval from the SimCSE repo.
- Part of our code is adapted from MCSE and KDMCSE.