Merge pull request timesler#17 from timesler/detailed_notebook
Add inference notebook instead of example.py.
Showing 6 changed files with 251 additions and 13 deletions.
@@ -1,2 +1,3 @@
 __pycache__
 .vscode
+.ipynb_checkpoints
@@ -0,0 +1,240 @@
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Face detection and recognition inference pipeline\n",
    "\n",
"The following example illustrates how to use the `facenet_pytorch` python package to perform face detection and recogition on an image dataset using an Inception Resnet V1 pretrained on the VGGFace2 dataset.\n", | ||
"\n", | ||
"The following Pytorch methods are included:\n", | ||
"* Datasets\n", | ||
"* Dataloaders\n", | ||
"* GPU/CPU processing" | ||
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "from facenet_pytorch import MTCNN, InceptionResnetV1\n",
    "import torch\n",
    "from torch.utils.data import DataLoader\n",
    "from torchvision import datasets\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "import multiprocessing as mp"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
"#### Determine if an nvidia GPU is available" | ||
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Running on device: cpu\n"
     ]
    }
   ],
   "source": [
    "device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')\n",
    "print('Running on device: {}'.format(device))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Define MTCNN module\n",
    "\n",
"Default params shown for illustration, but not needed. Note that, since MTCNN is a collection of neural nets and other code, the device must be passed in the following way to enable copying of objects when needed internally.\n", | ||
"\n", | ||
"See `help(MTCNN)` for more details." | ||
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "mtcnn = MTCNN(\n",
    "    image_size=160, margin=0, min_face_size=20,\n",
    "    thresholds=[0.6, 0.7, 0.7], factor=0.709, prewhiten=True,\n",
    "    device=device\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Define Inception Resnet V1 module\n",
    "\n",
"Set classify=True for pretrained classifier. For this example, we will use the model to output embeddings/CNN features. Note that for inference, it is important to set the model to `eval` mode.\n", | ||
"\n", | ||
"See `help(InceptionResnetV1)` for more details." | ||
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "resnet = InceptionResnetV1(pretrained='vggface2').eval().to(device)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Define a dataset and data loader\n",
    "\n",
"We add the `idx_to_class` attribute to the dataset to enable easy recoding of label indices to identity names later one." | ||
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [],
   "source": [
    "dataset = datasets.ImageFolder('../data/test_images')\n",
    "dataset.idx_to_class = {i: c for c, i in dataset.class_to_idx.items()}\n",
    "loader = DataLoader(dataset, collate_fn=lambda x: x[0], num_workers=mp.cpu_count())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
"#### Perfom MTCNN facial detection\n", | ||
"\n", | ||
"Iterate through the DataLoader object and detect faces and associated detection probabilities for each. The `MTCNN` forward method returns images cropped to the detected face, if a face was detected. By default only a single detected face is returned - to have `MTCNN` return all detected faces, set `keep_all=True` when creating the MTCNN object above.\n", | ||
"\n", | ||
"To obtain bounding boxes rather than cropped face images, you can instead call the lower-level `mtcnn.detect()` function. See `help(mtcnn.detect)` for details." | ||
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Face detected with probability: 0.999957\n",
      "Face detected with probability: 0.999927\n",
      "Face detected with probability: 0.999662\n",
      "Face detected with probability: 0.999873\n",
      "Face detected with probability: 0.999991\n"
     ]
    }
   ],
   "source": [
    "aligned = []\n",
    "names = []\n",
    "for x, y in loader:\n",
    "    x_aligned, prob = mtcnn(x, return_prob=True)\n",
    "    if x_aligned is not None:\n",
    "        print('Face detected with probability: {:8f}'.format(prob))\n",
    "        aligned.append(x_aligned)\n",
    "        names.append(dataset.idx_to_class[y])"
   ]
  },
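  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a minimal sketch of the bounding-box alternative mentioned above (an added illustration, not part of the original run; `img` is just the first dataset image):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sketch only: get bounding boxes instead of cropped faces.\n",
    "img, _ = dataset[0]  # a PIL image and its label index\n",
    "boxes, probs = mtcnn.detect(img)\n",
    "# `boxes` is an Nx4 array of [x1, y1, x2, y2]; `probs` are detection probabilities."
   ]
  },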
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Calculate image embeddings\n",
    "\n",
"MTCNN will return images of faces all the same size, enabling easy batch processing with the Resnet recognition module. Here, since we only have a few images, we build a single batch and perform inference on it. \n", | ||
"\n", | ||
"For real datasets, code should be modified to control batch sizes being passed to the Resnet, particularly if being processed on a GPU. For repeated testing, it is best to separate face detection (using MTCNN) from embedding or classification (using InceptionResnetV1), as calculation of cropped faces or bounding boxes can then be performed a single time and detected faces saved for future use." | ||
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "aligned = torch.stack(aligned).to(device)\n",
    "embeddings = resnet(aligned).detach().cpu()"
   ]
  },
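  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A minimal batching sketch for the larger-dataset case mentioned above (an added illustration; `batch_size` is an assumed, tunable value):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sketch: embed faces in mini-batches to bound GPU memory use.\n",
    "batch_size = 32  # illustrative; tune to available memory\n",
    "embeddings_batched = torch.cat([\n",
    "    resnet(batch).detach().cpu()\n",
    "    for batch in torch.split(aligned, batch_size)\n",
    "])"
   ]
  },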
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Print distance matrix for classes"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "                angelina_jolie  bradley_cooper  kate_siegel  paul_rudd  \\\n",
      "angelina_jolie        0.000000        1.344806     0.781201   1.425579   \n",
      "bradley_cooper        1.344806        0.000000     1.256238   0.922126   \n",
      "kate_siegel           0.781201        1.256238     0.000000   1.366423   \n",
      "paul_rudd             1.425579        0.922126     1.366423   0.000000   \n",
      "shea_whigham          1.448495        0.891145     1.416447   0.985438   \n",
      "\n",
      "                shea_whigham  \n",
      "angelina_jolie      1.448495  \n",
      "bradley_cooper      0.891145  \n",
      "kate_siegel         1.416447  \n",
      "paul_rudd           0.985438  \n",
      "shea_whigham        0.000000  \n"
     ]
    }
   ],
   "source": [
    "dists = [[(e1 - e2).norm().item() for e2 in embeddings] for e1 in embeddings]\n",
    "print(pd.DataFrame(dists, columns=names, index=names))"
   ]
  },
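  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a closing sketch (an added illustration, not in the original example): the distance matrix supports simple nearest-neighbor identification by ignoring the zero diagonal and taking each face's closest other identity:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Sketch: closest other identity for each face, by embedding distance.\n",
    "d = np.array(dists)\n",
    "np.fill_diagonal(d, np.inf)  # exclude self-comparisons\n",
    "print(pd.Series([names[j] for j in d.argmin(axis=1)], index=names))"
   ]
  }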
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
File renamed without changes.