Siamese and Triplet networks for image classification

The original framework is available at https://github.com/RocketFlash/EmbeddingNet.

This repository contains the modifications and materials used in the experiments described in Metric Embedding Learning on Multi-Directional Projections, published in MDPI Algorithms as an Open Access research article. Result logs are available at https://github.com/kerteszg/EmbeddingNet/tree/master/experiment-results.

If you have found this work useful, please consider citing the paper as

@article{Kertesz2020Metric, 
  title={Metric Embedding Learning on Multi-Directional Projections}, 
  author={Kertész, Gábor}, 
  journal={Algorithms}, 
  volume={13}, 
  number={6}, 
  ISSN={1999-4893}, 
  url={http://dx.doi.org/10.3390/a13060133}, 
  DOI={10.3390/a13060133}, 
  publisher={MDPI AG}, 
  year={2020}, 
  month={May}, 
  pages={133}
}

Results

The method presented in the paper is based on the MDIPFL transformation, which is applied as a dimensionality reduction technique to compress input images efficiently. The transformed images are fed to Siamese and Triplet networks, and performance was measured for metric learning with and without multiclass classification pretraining, and for different triplet mining methods. The results are compared with approaches based on raw image input.

Scripts for preprocessing and result collection are available in the notebooks directory.
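
For intuition, the snippet below is a rough, simplified sketch of a fixed-resolution multi-directional projection in the spirit of MDIPFL [2]. It is not the exact transformation used in the experiments (that lives in the preprocessing notebooks and the original paper); the number of directions, the fixed resolution and the function name are illustrative assumptions.

import cv2
import numpy as np

def projection_map(image, n_directions=16, fixed_resolution=64):
    # Illustrative sketch only: rotate the grayscale image around its center for
    # a set of angles, sum pixel intensities column-wise to obtain a 1D projection,
    # and resample each projection to a fixed length. Stacking the projections
    # yields a small 2D map that can replace the raw image as network input.
    h, w = image.shape[:2]
    center = (w / 2, h / 2)
    rows = []
    for angle in np.linspace(0, 180, n_directions, endpoint=False):
        M = cv2.getRotationMatrix2D(center, float(angle), 1.0)
        rotated = cv2.warpAffine(image, M, (w, h))
        projection = rotated.sum(axis=0).astype(np.float32)
        resampled = cv2.resize(projection.reshape(1, -1), (fixed_resolution, 1))
        rows.append(resampled.ravel())
    return np.stack(rows)  # shape: (n_directions, fixed_resolution)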

NIST SD19

To demonstrate the discriminative ability of the proposed method experimentally, the NIST SD19 dataset was used. After highlighting and transformation, different experiments were performed using a similar backbone architecture.

Results showed that the MDIPFL-based approach achieves similar performance, despite the significantly lower number of parameters.

ATS-CVPR2016

To apply the method to a real-life object re-identification problem, the dataset published at the International Workshop on Automatic Traffic Surveillance of CVPR 2016 was processed similarly. ResNet-18 was used as the backbone architecture.

On 10-way one-shot classification, the model trained with triplet loss on semi-hard negatives alone achieved 75.7% accuracy on the validation dataset.
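
For reference, N-way one-shot accuracy can be estimated with a nearest-neighbour rule in embedding space, roughly as sketched below. The actual evaluation behind these numbers is implemented in test_allconfigs.py; the embed callable and the array-based interface are assumptions made for illustration.

import numpy as np

def one_shot_accuracy(embed, support_images, support_labels, query_images, query_labels):
    # embed(...) is assumed to map a batch of images to an (n, d) array of embeddings.
    support = embed(support_images)   # one labelled example per class, shape (N, d)
    queries = embed(query_images)     # shape (M, d)
    # Euclidean distance between every query and every support embedding
    dists = np.linalg.norm(queries[:, None, :] - support[None, :, :], axis=-1)
    # each query takes the label of its nearest support example (1-NN)
    predictions = np.asarray(support_labels)[np.argmin(dists, axis=1)]
    return float(np.mean(predictions == np.asarray(query_labels)))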

Installation

Below are the instructions to set up the environment for this repository. To set up the original version of the framework, please refer to the installation notes in the original repository.

Clone the repository using:

git clone git@github.com:kerteszg/EmbeddingNet.git

Install dependencies

Creating a virtual environment is recommended, but not necessary:

python3 -m venv env
source env/bin/activate
pip install --upgrade pip

The dependencies are more or less the same as for the original EmbeddingNet.

  • keras
  • tensorflow==1.14.0
  • tensorflow-gpu==1.14.0 - if applicable, strongly advised
  • scikit-learn
  • opencv
  • matplotlib
  • plotly - for interactive t-SNE plot visualization
  • albumentations - for online augmentation during training
  • image-classifiers - for different backbone models
  • keras-rectified-adam - for cool state-of-the-art optimization

To install all of the dependencies at once:

pip install -r requirements.txt

Train

Before training, the dataset should be prepared. The framework expects two directories, train and valid, where every subdirectory below these represents a category, as illustrated below.
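
An illustrative layout (class and file names below are placeholders):

dataset/
├── train/
│   ├── class_a/
│   │   ├── 0001.png
│   │   └── ...
│   └── class_b/
│       └── ...
└── valid/
    ├── class_a/
    │   └── ...
    └── class_b/
        └── ...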

In the paper Metric Embedding Learning on Multi-Directional Projections, the NIST SD19 and the ATS-CVPR2016 datasets were used. To create the necessary directory structure and for preprocessing, refer to the Jupyter Notebooks.

The prepared configuration files can be found in the configs folder.

$ python train.py [path to configuration_file]

Test one-shot classification accuracy on all available configs

After training with all configs in the configs directory, performance can be measured using:

$ python test_allconfigs.py

To filter out some of the configs (e.g. by name), refer to the source code.
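
As an illustration only, such a name-based filter could look like the snippet below; the actual logic in test_allconfigs.py may differ, and the .yml extension and the substring are assumptions.

import glob
import os

# collect every configuration file and keep only those matching a name pattern
configs = sorted(glob.glob(os.path.join('configs', '*.yml')))
selected = [c for c in configs if 'sd19' in os.path.basename(c).lower()]
for config_path in selected:
    print(config_path)  # or hand it over to the evaluation routine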

Once again, kudos to the authors of the original EmbeddingNet framework [1].

References

[1] Rauf Yagfarov, Vladislav Ostankovich, Aydar Akhmetzyanov. "Traffic Sign Classification Using Embedding Learning Approach for Self-driving Cars." IHIET-AI 2020.

[2] Gábor Kertész, Sándor Szénási, Zoltán Vámossy. "Multi-directional image projections with fixed resolution for object matching." Acta Polytechnica Hungarica 15.2 (2018): 211-229.
