cross_modal_compression

Official repository for the ACM MM 2021 paper "Cross Modal Compression: Towards Human-comprehensible Semantic Compression".

Prerequisites

Linux
Python 3.5 (not tested on other versions)
PyTorch 1.3+
torchaudio 0.3
librosa, pysoundfile
json, tqdm, logging
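
Before training, it can help to confirm that the packages above are importable. The following is a minimal sketch, not part of this repository. It assumes the usual import names (`torch` for PyTorch, `soundfile` for pysoundfile); `missing_modules` is a hypothetical helper. `json` and `logging` ship with Python and are not checked.

```python
import importlib.util
import sys

# Import names for the packages listed above. These are assumptions about
# how each package is imported: PyTorch as `torch`, pysoundfile as `soundfile`.
REQUIRED_MODULES = ["torch", "torchaudio", "librosa", "soundfile", "tqdm"]


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python version:", ".".join(map(str, sys.version_info[:3])))
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages were found.")
```

The check only reports whether a module can be located; it does not verify version numbers.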

data preparation

download the datasets and pretrained models

You can download the CUB-200-2011 and MS COCO 2014 datasets from their official sites.
Download our JSON file for MS COCO from here (Google Drive, or Baidu Netdisk with extraction code c31g).
Download our pretrained models from here (Google Drive, or Baidu Netdisk with extraction code i0un).

training the CMC

export PYTHONPATH=path_for_this_project
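
If the scripts fail with import errors, the project path is a likely cause. The sketch below is a hypothetical sanity check, not part of this repository. It assumes the project root contains the `ImageText`, `TextImage`, and `cfg` folders used by the commands in this README, and shows an in-process equivalent of the `export` line.

```python
import os
import sys

# Folder names taken from the commands in this README.
EXPECTED_DIRS = ("ImageText", "TextImage", "cfg")


def missing_project_dirs(root):
    """Return the expected sub-folders that are absent under `root`."""
    return [d for d in EXPECTED_DIRS if not os.path.isdir(os.path.join(root, d))]


def add_to_sys_path(root):
    """Put `root` at the front of sys.path, similar to exporting PYTHONPATH."""
    root = os.path.abspath(root)
    if root not in sys.path:
        sys.path.insert(0, root)
    return root
```

Calling `missing_project_dirs(".")` from the directory you intend to export should return an empty list.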

training the DAMSM model

# for CUB dataset
python ./TextImage/pretrain_DAMSM.py --cfg ./cfg/bird_DAMSM.yml --data_dir ./data/birds --dataset bird --output_dir ./output/TextImage --no_dist
# for COCO
python ./TextImage/pretrain_DAMSM.py --cfg ./cfg/coco_DAMSM.yml --data_dir ./data/coco --dataset coco --output_dir ./output/TextImage --no_dist

training the ImageText model

# for CUB
python ./ImageText/train.py --cfg ./cfg/bird_train.yml --data_dir ./data/birds --dataset bird --output_dir ./output/ImageText
# for COCO
python ./ImageText/train.py --cfg ./cfg/coco_train.yml --data_dir ./data/coco --dataset coco --output_dir ./output/ImageText

training the TextImage model

# for CUB
# first set the text encoder path (TRAIN.NET_E) in ./cfg/bird_train.yml
python ./TextImage/train.py --cfg ./cfg/bird_train.yml --data_dir ./data/birds --dataset bird --output_dir ./output/TextImage
# for COCO
# first set the text encoder path (TRAIN.NET_E) in ./cfg/coco_train.yml
python ./TextImage/train.py --cfg ./cfg/coco_train.yml --data_dir ./data/coco --dataset coco --output_dir ./output/TextImage
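
The three training commands differ only in the script, the config name, and the dataset, so they can be generated rather than typed by hand. This is a minimal sketch under stated assumptions: `build_command` and `training_commands` are hypothetical helpers that are not part of this repository, and all config files are assumed to live under `./cfg/`. The commands are only assembled, not executed.

```python
# Dataset name -> data directory, as used in the commands above.
DATA_DIRS = {"bird": "./data/birds", "coco": "./data/coco"}


def build_command(script, cfg_suffix, dataset, output_dir, extra_args=()):
    """Assemble the argument list for one training script (hypothetical helper)."""
    if dataset not in DATA_DIRS:
        raise ValueError("unknown dataset: %r" % (dataset,))
    return [
        "python", script,
        "--cfg", "./cfg/%s_%s.yml" % (dataset, cfg_suffix),  # assumes configs are in ./cfg/
        "--data_dir", DATA_DIRS[dataset],
        "--dataset", dataset,
        "--output_dir", output_dir,
    ] + list(extra_args)


def training_commands(dataset):
    """Return the three commands in the order they appear in this README."""
    return [
        build_command("./TextImage/pretrain_DAMSM.py", "DAMSM", dataset,
                      "./output/TextImage", ["--no_dist"]),
        build_command("./ImageText/train.py", "train", dataset, "./output/ImageText"),
        # TRAIN.NET_E must be set in the config file before this step is run.
        build_command("./TextImage/train.py", "train", dataset, "./output/TextImage"),
    ]


if __name__ == "__main__":
    for cmd in training_commands("bird"):
        print(" ".join(cmd))
```

Each returned list could be passed to `subprocess.run` once the text encoder path has been set in the config file.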

evaluating the CMC

Write the paths of the pretrained models into cfg/bird_eval.yml (for the CUB-200-2011 dataset) or cfg/coco_eval.yml (for the MS COCO dataset), then run

python ./ImageText/end_to_end_test.py --cfg cfg/coco_eval.yml --data_dir COCO_PATH --output_dir ./output/end_to_end_coco_test

for MS COCO, or

python ./ImageText/end_to_end_test.py --cfg cfg/bird_eval.yml --data_dir CUB_PATH --output_dir ./output/end_to_end_bird_test

for CUB-200-2011.

project home page

https://smallflyingpig.github.io/cross_modal_compression_mainpage/main.html

Feel free to email me at [email protected]/[email protected] if you have any questions about this project.

Acknowledgement

Thanks to Junlong Gao for the valuable discussions. Thanks also to the open-source projects COCO API, AttnGAN, and a-PyTorch-Tutorial-to-Image-Captioning.

Note that this work is only for research. Please do not use it for illegal purposes.
