High-order-GNN-LF-iter framework for 3D Human Pose Estimation
This repository holds a PyTorch implementation that extends Interpreting and Unifying Graph Neural Networks with An Optimization Framework by Zhu, Meiqi and Wang, Xiao and Shi, Chuan and Ji, Houye and Cui, Peng. Citation information:
@article{zhu2021interpreting,
title={Interpreting and Unifying Graph Neural Networks with An Optimization Framework},
author={Zhu, Meiqi and Wang, Xiao and Shi, Chuan and Ji, Houye and Cui, Peng},
journal={arXiv preprint arXiv:2101.11859},
year={2021}
}
In this repository we apply the GNN-LF-iter framework with high-order concatenation techniques to build a model for 3D human pose estimation. We reduce the number of parameters from 1.20M to 0.69M while reaching results similar to those of High-order-GCNII (https://github.com/happyvictor008/HIgh-order-GCNII).
CPN detection results under Protocol 1 (mean per-joint position error, MPJPE) and Protocol 2 (mean per-joint position error after rigid alignment, P-MPJPE).
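As a quick check of the parameter figures quoted above, the following sketch computes the relative saving from the two counts given in the text (1.20M and 0.69M). The function and variable names are illustrative only and do not come from this repository.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the fractional reduction when going from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before


# Parameter counts in millions, as quoted in this README.
params_before = 1.20
params_after = 0.69

saving = relative_reduction(params_before, params_after)
print(f"Parameter reduction: {saving:.1%}")
```

With these two numbers the saving comes out at a little over two fifths of the original parameter count.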
Method | 2D Detections | # of Epochs | # of Parameters | MPJPE (P1) | P-MPJPE (P2) |
---|---|---|---|---|---|
HGCN | CPN detection | 50 | 1.20M | 55.60 mm | 43.70 mm |
HGCNII(Ours) | CPN detection | 50 | 1.20M | 54.80 mm | 42.90 mm |
HGFL(Ours) | CPN detection | 50 | 0.69M | 55.01 mm | 42.98 mm |
The results are borrowed from SemGCN, High-order GCN and High-order GCNII.
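The two protocols in the table can be sketched in NumPy as follows. This is a minimal illustration and not the evaluation code of this repository: it assumes poses are arrays of shape (num_poses, num_joints, 3) and that the "rigid alignment" of Protocol 2 is a per-pose Procrustes alignment with rotation, uniform scale and translation, as is common for Human3.6M evaluation. The function names `mpjpe` and `p_mpjpe` are chosen here for illustration.

```python
import numpy as np


def mpjpe(pred: np.ndarray, target: np.ndarray) -> float:
    """Protocol 1: mean Euclidean distance between predicted and true joints."""
    return float(np.mean(np.linalg.norm(pred - target, axis=-1)))


def p_mpjpe(pred: np.ndarray, target: np.ndarray) -> float:
    """Protocol 2: MPJPE after aligning each predicted pose to its target
    with the best rotation, uniform scale and translation (Procrustes)."""
    mu_pred = pred.mean(axis=1, keepdims=True)
    mu_target = target.mean(axis=1, keepdims=True)
    p0 = pred - mu_pred
    t0 = target - mu_target

    # Per-pose 3x3 cross-covariance and its SVD (Kabsch algorithm).
    h = np.matmul(p0.transpose(0, 2, 1), t0)
    u, s, vt = np.linalg.svd(h)
    v = vt.transpose(0, 2, 1)
    rot = np.matmul(v, u.transpose(0, 2, 1))

    # Where the solution is a reflection, flip the last singular direction
    # so that the final matrix is a proper rotation (determinant +1).
    sign = np.where(np.linalg.det(rot) < 0, -1.0, 1.0)
    v[:, :, -1] *= sign[:, None]
    s[:, -1] *= sign
    rot = np.matmul(v, u.transpose(0, 2, 1))

    # Optimal uniform scale, then apply rotation, scale and translation.
    scale = s.sum(axis=1) / (p0 ** 2).sum(axis=(1, 2))
    aligned = scale[:, None, None] * np.matmul(p0, rot.transpose(0, 2, 1)) + mu_target
    return mpjpe(aligned, target)


# Synthetic demo: the joint count of 17 is an arbitrary choice for this example.
rng = np.random.default_rng(0)
target = rng.normal(size=(2, 17, 3))
theta = np.pi / 6
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta), np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
# A rotated, scaled and shifted copy of the target poses.
pred = 1.5 * target @ rot_z.T + np.array([0.1, -0.2, 0.3])

print("MPJPE:  ", mpjpe(pred, target))
print("P-MPJPE:", p_mpjpe(pred, target))
```

Because the synthetic prediction differs from the target only by a similarity transform, the aligned error is numerically zero while the unaligned error is not, which is the difference between the two columns of the table.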
This repository is built on Python v3.7 and PyTorch v1.3.1 with Anaconda. All experiments were conducted on a single NVIDIA RTX 2070 Super GPU. See requirements.txt for other dependencies. We recommend installing Python v3.7 from Anaconda and installing PyTorch (>= 1.3.1) by following the official instructions for your specific CUDA version. You can then install the dependencies with the following commands.
git clone https://github.com/happyvictor008/High-order-GNN-LF-iter.git
cd High-order-GNN-LF-iter
pip install -r requirements.txt
CPN 2D detections for the Human3.6M dataset are provided by VideoPose3D (Pavllo et al. [2]) and can be downloaded with the following steps:
cd data
wget https://dl.fbaipublicfiles.com/video-pose-3d/data_2d_h36m_cpn_ft_h36m_dbb.npz
wget https://dl.fbaipublicfiles.com/video-pose-3d/data_2d_h36m_detectron_ft_h36m.npz
cd ..
Ground-truth (GT) 2D keypoints for the Human3.6M dataset are provided by SemGCN (Zhao et al. [3]) and can be downloaded with the following steps:
cd data
pip install gdown
gdown https://drive.google.com/uc?id=1Ac-gUXAg-6UiwThJVaw6yw2151Bot3L1
python prepare_data_h36m.py --from-archive h36m.zip
cd ..
After this step, you should end up with two files in the data directory: data_3d_h36m.npz for the 3D poses, and data_2d_h36m_gt.npz for the ground-truth 2D poses.
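A small sanity check for this step could look like the sketch below. It only tests that the two files named above exist in the data directory. The helper `missing_data_files` is a hypothetical name introduced for this example and is not part of the repository.

```python
from pathlib import Path
from typing import List, Sequence

# File names taken from the data preparation step described above.
EXPECTED_FILES = ("data_3d_h36m.npz", "data_2d_h36m_gt.npz")


def missing_data_files(data_dir: str = "data",
                       expected: Sequence[str] = EXPECTED_FILES) -> List[str]:
    """Return the expected file names that are not present in `data_dir`."""
    root = Path(data_dir)
    return [name for name in expected if not (root / name).is_file()]


missing = missing_data_files("data")
if missing:
    print(f"Missing from data/: {missing}")
else:
    print("All expected data files are present.")
```

Run it from the repository root after the preparation step; an empty list means both files were produced.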
To train the model, run the following commands. For HGCNII on ground truth input:
python main_gcn.py --keypoints gt
By default the application runs in training mode. This will train a new model for 50 epochs, using ground truth 2D detections.
If you want to try different network settings, please refer to main_gcn.py for more details. Note that the default hyper-parameters are intended for training a model with CPN detections as input; please refer to the paper for implementation details.
To train and evaluate models using 2D detections generated by the Cascaded Pyramid Network, add --keypoints cpn_ft_h36m_dbb to the commands:
python main_gcn.py --keypoints cpn_ft_h36m_dbb
python main_gcn.py --evaluate ${CHECKPOINT_PATH} --keypoints cpn_ft_h36m_dbb
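The values passed to --keypoints line up with the 2D data file names used in this README. The sketch below makes that relationship explicit, under the assumption that the script follows the naming pattern data_2d_<dataset>_<keypoints>.npz suggested by those file names. The helper `keypoints_file` is hypothetical and is not taken from main_gcn.py.

```python
from pathlib import Path


def keypoints_file(keypoints: str, dataset: str = "h36m",
                   data_dir: str = "data") -> Path:
    """Build the assumed path of the 2D keypoint file for a --keypoints value."""
    return Path(data_dir) / f"data_2d_{dataset}_{keypoints}.npz"


# The two --keypoints values used in the commands of this README.
for flag in ("gt", "cpn_ft_h36m_dbb"):
    print(f"--keypoints {flag} -> {keypoints_file(flag)}")
```

If a run fails with a missing-file error, comparing the path built this way with the contents of the data directory is a reasonable first check.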
You can generate visualizations of the model predictions by running:
python viz.py --architecture gcn --evaluate ${CHECKPOINT_PATH} --viz_subject S11 --viz_action Walking --viz_camera 0 --viz_output output.gif --viz_size 3 --viz_downsample 2 --viz_limit 60
The script can also export MP4 videos and supports a variety of parameters (e.g. downsampling/FPS, size, bitrate). See viz.py for more details.
[1] Martinez et al. A simple yet effective baseline for 3d human pose estimation. ICCV 2017.
[2] Pavllo et al. 3D human pose estimation in video with temporal convolutions and semi-supervised training. CVPR 2019.
[3] Zhao et al. Semantic Graph Convolutional Networks for 3D Human Pose Regression. CVPR 2019.
This code is extended from the following repositories.
We thank the authors for releasing their code. Please also consider citing their work.