Siamese and Triplet networks for image classification

The original framework is available at https://github.com/RocketFlash/EmbeddingNet.

This repository contains the modifications and materials used in the experiments described in Metric Embedding Learning on Multi-Directional Projections, published in MDPI Algorithms as an Open Access research article. Result logs are available at https://github.com/kerteszg/EmbeddingNet/tree/master/experiment-results.

If you have found this work useful, please consider citing the paper as

@article{Kertesz2020Metric, 
  title={Metric Embedding Learning on Multi-Directional Projections}, 
  author={Kertész, Gábor}, 
  journal={Algorithms}, 
  volume={13}, 
  number={6}, 
  ISSN={1999-4893}, 
  url={http://dx.doi.org/10.3390/a13060133}, 
  DOI={10.3390/a13060133}, 
  publisher={MDPI AG}, 
  year={2020}, 
  month={May}, 
  pages={133}
}

Results

The method presented in the paper is based on the MDIPFL transformation, which is applied as a dimensionality reduction technique to compress input images efficiently. The transformed images are fed to Siamese and Triplet networks, and performance was measured for metric learning with and without multiclass classification pretraining, and for different triplet mining methods. The results are compared with approaches based on raw image input.

Scripts for preprocessing and result collection are available in the notebooks directory.
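
For intuition, the snippet below is a rough, simplified sketch of a fixed-resolution multi-directional projection in the spirit of MDIPFL [2]. It is not the exact transformation used in the experiments (that lives in the preprocessing notebooks and the original paper); the number of directions, the fixed resolution and the function name are illustrative assumptions.

import cv2
import numpy as np

def projection_map(image, n_directions=16, fixed_resolution=64):
    # Illustrative sketch only: rotate the grayscale image around its center for
    # a set of angles, sum pixel intensities column-wise to obtain a 1D projection,
    # and resample each projection to a fixed length. Stacking the projections
    # yields a small 2D map that can replace the raw image as network input.
    h, w = image.shape[:2]
    center = (w / 2, h / 2)
    rows = []
    for angle in np.linspace(0, 180, n_directions, endpoint=False):
        M = cv2.getRotationMatrix2D(center, float(angle), 1.0)
        rotated = cv2.warpAffine(image, M, (w, h))
        projection = rotated.sum(axis=0).astype(np.float32)
        resampled = cv2.resize(projection.reshape(1, -1), (fixed_resolution, 1))
        rows.append(resampled.ravel())
    return np.stack(rows)  # shape: (n_directions, fixed_resolution)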

NIST SD19

To demonstrate the discriminative ability of the proposed method experimentally, the NIST SD19 dataset was used. After highlighting and transformation, different experiments were performed using a similar backbone architecture.

Results showed that the MDIPFL-based approach achieves similar performance, despite the significantly lower number of parameters.

ATS-CVPR2016

To apply the method to a real-life object re-identification problem, the dataset published at the International Workshop on Automatic Traffic Surveillance of CVPR 2016 was processed similarly. ResNet-18 was used as the backbone architecture.

On 10-way one-shot classification, the model trained with triplet loss on semi-hard negatives alone achieved 75.7% accuracy on the validation dataset.
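
For reference, N-way one-shot accuracy can be estimated with a nearest-neighbour rule in embedding space, roughly as sketched below. The actual evaluation behind these numbers is implemented in test_allconfigs.py; the embed callable and the array-based interface are assumptions made for illustration.

import numpy as np

def one_shot_accuracy(embed, support_images, support_labels, query_images, query_labels):
    # embed(...) is assumed to map a batch of images to an (n, d) array of embeddings.
    support = embed(support_images)   # one labelled example per class, shape (N, d)
    queries = embed(query_images)     # shape (M, d)
    # Euclidean distance between every query and every support embedding
    dists = np.linalg.norm(queries[:, None, :] - support[None, :, :], axis=-1)
    # each query takes the label of its nearest support example (1-NN)
    predictions = np.asarray(support_labels)[np.argmin(dists, axis=1)]
    return float(np.mean(predictions == np.asarray(query_labels)))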

Installation

Below are the instructions to set up the environment for this repository. To set up the original version of the framework, please refer to the installation notes in the original repository.

Clone the repository using:

git clone git@github.com:kerteszg/EmbeddingNet.git

Install dependencies

Creating a virtual environment is recommended, but not necessary:

python3 -m venv env
source env/bin/activate
pip install --upgrade pip

The dependencies are more or less the same as for the original EmbeddingNet.

  • keras
  • tensorflow==1.14.0
  • tensorflow-gpu==1.14.0 - if applicable, strongly advised
  • scikit-learn
  • opencv
  • matplotlib
  • plotly - for interactive t-SNE plot visualization
  • albumentations - for online augmentation during training
  • image-classifiers - for different backbone models
  • keras-rectified-adam - for cool state-of-the-art optimization

To install all of the dependencies at once:

pip install -r requirements.txt

Train

Before training, the dataset should be prepared. The framework expects two directories, train and valid, where every subdirectory below these represents a category, as illustrated below.
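
An illustrative layout (class and file names below are placeholders):

dataset/
├── train/
│   ├── class_a/
│   │   ├── 0001.png
│   │   └── ...
│   └── class_b/
│       └── ...
└── valid/
    ├── class_a/
    │   └── ...
    └── class_b/
        └── ...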

In the paper Metric Embedding Learning on Multi-Directional Projections, the NIST SD19 and the ATS-CVPR2016 datasets were used. To create the necessary directory structure and for preprocessing, refer to the Jupyter Notebooks.

The prepared configuration files can be found in the configs folder.

$ python train.py [path to configuration_file]

Test one-shot classification accuracy on all available configs

After training with all configs in the configs directory, performance can be measured using:

$ python test_allconfigs.py

To filter out some of the configs (e.g. by name), refer to the source code.
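
As an illustration only, such a name-based filter could look like the snippet below; the actual logic in test_allconfigs.py may differ, and the .yml extension and the substring are assumptions.

import glob
import os

# collect every configuration file and keep only those matching a name pattern
configs = sorted(glob.glob(os.path.join('configs', '*.yml')))
selected = [c for c in configs if 'sd19' in os.path.basename(c).lower()]
for config_path in selected:
    print(config_path)  # or hand it over to the evaluation routine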

Once again, kudos to the authors of the original EmbeddingNet framework [1].

References

[1] Rauf Yagfarov, Vladislav Ostankovich, Aydar Akhmetzyanov. "Traffic Sign Classification Using Embedding Learning Approach for Self-driving Cars." IHIET-AI 2020.

[2] Gábor Kertész, Sándor Szénási, Zoltán Vámossy. "Multi-directional image projections with fixed resolution for object matching." Acta Polytechnica Hungarica 15.2 (2018): 211-229.
