Merge pull request timesler#17 from timesler/detailed_notebook
Add inference notebook instead of example.py.
Showing 6 changed files with 251 additions and 13 deletions.
@@ -1,2 +1,3 @@
 __pycache__
 .vscode
+.ipynb_checkpoints
@@ -0,0 +1,240 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Face detection and recognition inference pipeline\n",
    "\n",
"The following example illustrates how to use the `facenet_pytorch` python package to perform face detection and recogition on an image dataset using an Inception Resnet V1 pretrained on the VGGFace2 dataset.\n", | ||
"\n", | ||
"The following Pytorch methods are included:\n", | ||
"* Datasets\n", | ||
"* Dataloaders\n", | ||
"* GPU/CPU processing" | ||
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from facenet_pytorch import MTCNN, InceptionResnetV1\n",
    "import torch\n",
    "from torch.utils.data import DataLoader\n",
    "from torchvision import datasets\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "import multiprocessing as mp"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
"#### Determine if an nvidia GPU is available" | ||
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Running on device: cpu\n"
     ]
    }
   ],
   "source": [
    "device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')\n",
    "print('Running on device: {}'.format(device))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Define MTCNN module\n",
    "\n",
"Default params shown for illustration, but not needed. Note that, since MTCNN is a collection of neural nets and other code, the device must be passed in the following way to enable copying of objects when needed internally.\n", | ||
"\n", | ||
"See `help(MTCNN)` for more details." | ||
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "mtcnn = MTCNN(\n",
    "    image_size=160, margin=0, min_face_size=20,\n",
    "    thresholds=[0.6, 0.7, 0.7], factor=0.709, prewhiten=True,\n",
    "    device=device\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Define Inception Resnet V1 module\n",
    "\n",
"Set classify=True for pretrained classifier. For this example, we will use the model to output embeddings/CNN features. Note that for inference, it is important to set the model to `eval` mode.\n", | ||
"\n", | ||
"See `help(InceptionResnetV1)` for more details." | ||
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "resnet = InceptionResnetV1(pretrained='vggface2').eval().to(device)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Define a dataset and data loader\n",
    "\n",
"We add the `idx_to_class` attribute to the dataset to enable easy recoding of label indices to identity names later one." | ||
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "dataset = datasets.ImageFolder('../data/test_images')\n",
    "dataset.idx_to_class = {i: c for c, i in dataset.class_to_idx.items()}\n",
    "loader = DataLoader(dataset, collate_fn=lambda x: x[0], num_workers=mp.cpu_count())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
"#### Perfom MTCNN facial detection\n", | ||
"\n", | ||
"Iterate through the DataLoader object and detect faces and associated detection probabilities for each. The `MTCNN` forward method returns images cropped to the detected face, if a face was detected. By default only a single detected face is returned - to have `MTCNN` return all detected faces, set `keep_all=True` when creating the MTCNN object above.\n", | ||
"\n", | ||
"To obtain bounding boxes rather than cropped face images, you can instead call the lower-level `mtcnn.detect()` function. See `help(mtcnn.detect)` for details." | ||
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Face detected with probability: 0.999957\n",
      "Face detected with probability: 0.999927\n",
      "Face detected with probability: 0.999662\n",
      "Face detected with probability: 0.999873\n",
      "Face detected with probability: 0.999991\n"
     ]
    }
   ],
   "source": [
    "aligned = []\n",
    "names = []\n",
    "for x, y in loader:\n",
    "    x_aligned, prob = mtcnn(x, return_prob=True)\n",
    "    if x_aligned is not None:\n",
    "        print('Face detected with probability: {:8f}'.format(prob))\n",
    "        aligned.append(x_aligned)\n",
    "        names.append(dataset.idx_to_class[y])"
   ]
  },
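  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a minimal sketch of the bounding-box alternative mentioned above (an added illustration, not part of the original run; `img` is just the first dataset image):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sketch only: get bounding boxes instead of cropped faces.\n",
    "img, _ = dataset[0]  # a PIL image and its label index\n",
    "boxes, probs = mtcnn.detect(img)\n",
    "# `boxes` is an Nx4 array of [x1, y1, x2, y2]; `probs` are detection probabilities."
   ]
  },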
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Calculate image embeddings\n",
    "\n",
"MTCNN will return images of faces all the same size, enabling easy batch processing with the Resnet recognition module. Here, since we only have a few images, we build a single batch and perform inference on it. \n", | ||
"\n", | ||
"For real datasets, code should be modified to control batch sizes being passed to the Resnet, particularly if being processed on a GPU. For repeated testing, it is best to separate face detection (using MTCNN) from embedding or classification (using InceptionResnetV1), as calculation of cropped faces or bounding boxes can then be performed a single time and detected faces saved for future use." | ||
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "aligned = torch.stack(aligned).to(device)\n",
    "embeddings = resnet(aligned).detach().cpu()"
   ]
  },
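  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal batching sketch for the larger-dataset case mentioned above (an added illustration; `batch_size` is an assumed, tunable value):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sketch: embed faces in mini-batches to bound GPU memory use.\n",
    "batch_size = 32  # illustrative; tune to available memory\n",
    "embeddings_batched = torch.cat([\n",
    "    resnet(batch).detach().cpu()\n",
    "    for batch in torch.split(aligned, batch_size)\n",
    "])"
   ]
  },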
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Print distance matrix for classes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "                angelina_jolie  bradley_cooper  kate_siegel  paul_rudd  \\\n",
      "angelina_jolie        0.000000        1.344806     0.781201   1.425579   \n",
      "bradley_cooper        1.344806        0.000000     1.256238   0.922126   \n",
      "kate_siegel           0.781201        1.256238     0.000000   1.366423   \n",
      "paul_rudd             1.425579        0.922126     1.366423   0.000000   \n",
      "shea_whigham          1.448495        0.891145     1.416447   0.985438   \n",
      "\n",
      "                shea_whigham  \n",
      "angelina_jolie      1.448495  \n",
      "bradley_cooper      0.891145  \n",
      "kate_siegel         1.416447  \n",
      "paul_rudd           0.985438  \n",
      "shea_whigham        0.000000  \n"
     ]
    }
   ],
   "source": [
    "dists = [[(e1 - e2).norm().item() for e2 in embeddings] for e1 in embeddings]\n",
    "print(pd.DataFrame(dists, columns=names, index=names))"
   ]
  },
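  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a closing sketch (an added illustration, not in the original example): the distance matrix supports simple nearest-neighbor identification by ignoring the zero diagonal and taking each face's closest other identity:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sketch: closest other identity for each face, by embedding distance.\n",
    "d = np.array(dists)\n",
    "np.fill_diagonal(d, np.inf)  # exclude self-comparisons\n",
    "print(pd.Series([names[j] for j in d.argmin(axis=1)], index=names))"
   ]
  }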
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
File renamed without changes.