David Svitov, Dmitrii Gudkov, Renat Bashirov, Victor Lempitsky
Paper: https://arxiv.org/abs/2303.09375
Abstract: We present DINAR, an approach for creating realistic rigged fullbody avatars from single RGB images. Similarly to previous works, our method uses neural textures combined with the SMPL-X body model to achieve photo-realistic quality of avatars while keeping them easy to animate and fast to infer. To restore the texture, we use a latent diffusion model and show how such model can be trained in the neural texture space. The use of the diffusion model allows us to realistically reconstruct large unseen regions such as the back of a person given the frontal view. The models in our pipeline are trained using 2D images and videos only. In the experiments, our approach achieves state-of-the-art rendering quality and good generalization to new poses and viewpoints. In particular, the approach improves state-of-the-art on the SnapshotPeople public benchmark.
The easiest way to build an environment for this repository is to use docker image. To build it, make the following steps:
- Build the image with the following command:
bash docker/build.sh
- Start a container:
bash docker/run.sh
It mounts root directory of the host system to /mounted/
inside docker and sets cloned repository path as a starting directory.
- Inside the container install
minimal_pytorch_rasterizer
. (Unfortunately, docker fails to install it during image building)
pip install git+https://github.com/rmbashirov/minimal_pytorch_rasterizer
- (Optional) You can then commit changes to the image so that you don't need to install
minimal_pytorch_rasterizer
for every new container. See docker documentation.
To get one-shot human avatar with your images:
- Prepare data:
Dataset folder structure:
.
├── rgb # *.png images of humans
├── segm # *.png segmentation masks generated by https://github.com/Gaoyiminggithub/Graphonomy
├── openpose # *.json files with keypoints generated by https://github.com/CMU-Perceptual-Computing-Lab/openpose
└── smplx # body parameters trained with modification of https://github.com/vchoutas/smplify-x
Check SnapshotPeople prepared data for example.
Rendered examples of SnapshotPeople avatars for front and back views.
-
Download:
- Checkpoint
- SMPL-X models and put them to the
./smplx_data/smplx_models
- Animation sequence and put it to the
./smplx_data/
-
Launch the script:
python inference.py \
--ckpt_path=checkpont/path/filename.ckpt \
--log_dir=path/to/logs \
--data_root=path/to/your/data
Example:
python inference.py \
--ckpt_path=./checkponts/ddpm-epoch=24.ckpt \
--log_dir=./logs \
--data_root=./Dataset/SnapshotPeople
Look for result video in <log_dir>/eval/<exp_name>/textures/video
@article{svitov2023dinar,
title={DINAR: Diffusion Inpainting of Neural Textures for One-Shot Human Avatars},
author={Svitov, David and Gudkov, Dmitrii and Bashirov, Renat and Lemptisky, Victor},
journal={arXiv preprint arXiv:2303.09375},
year={2023}
}