We list some common issues faced by many users, together with their solutions. Feel free to enrich the list if you find any frequent issues and have ways to help others solve them. If the contents here do not cover your issue, please create an issue using the provided templates and make sure you fill in all the required information in the template.
- **Unable to install xtcocotools**

  1. Try to install it from PyPI manually:

     ```shell
     pip install xtcocotools
     ```

  2. If step 1 does not work, install it from source:

     ```shell
     git clone https://github.com/jin-s13/xtcocoapi
     cd xtcocoapi
     python setup.py install
     ```
- **No matching distribution found for xtcocotools>=1.6**

  1. Install cython:

     ```shell
     pip install cython
     ```

  2. Install xtcocotools from source:

     ```shell
     git clone https://github.com/jin-s13/xtcocoapi
     cd xtcocoapi
     python setup.py install
     ```
- **"No module named 'mmcv.ops'"; "No module named 'mmcv._ext'"**

  1. Uninstall the existing mmcv in the environment:

     ```shell
     pip uninstall mmcv
     ```

  2. Install mmcv-full following the installation instruction.
- **What if my custom dataset does not have bounding box labels?**

  We can estimate the bounding box of a person as the minimal box that tightly bounds all the keypoints.
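A minimal sketch of this estimate (the `keypoints_to_bbox` helper and its `(K, 3)` keypoint layout are illustrative, not part of mmpose):

```python
import numpy as np

def keypoints_to_bbox(keypoints, padding=1.0):
    """Estimate a person bounding box as the minimal box that tightly
    bounds all labeled keypoints.

    Args:
        keypoints: array-like of shape (K, 3) with (x, y, visibility) rows.
        padding: optional factor to enlarge the box around its center.

    Returns:
        (x, y, w, h) in the COCO bbox convention, or None if no keypoint
        is labeled.
    """
    kpts = np.asarray(keypoints, dtype=np.float32)
    visible = kpts[kpts[:, 2] > 0, :2]  # keep only labeled keypoints
    if len(visible) == 0:
        return None
    x1, y1 = visible.min(axis=0)
    x2, y2 = visible.max(axis=0)
    # optionally pad the tight box around its center
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    w, h = (x2 - x1) * padding, (y2 - y1) * padding
    return (cx - w / 2, cy - h / 2, w, h)
```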
- **What if my custom dataset does not have segmentation labels?**

  Just set the `area` of the person to the area of its bounding box. During evaluation, set `use_area=False`, as in this example.
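For instance, with a COCO-style `(x, y, w, h)` box, the stand-in `area` is simply `w * h` (the `bbox_area` helper below is illustrative):

```python
def bbox_area(bbox):
    """Fallback 'area' for a COCO-style annotation when no segmentation
    mask is available: use the bounding-box area instead.

    Args:
        bbox: (x, y, w, h) in the COCO convention.
    """
    x, y, w, h = bbox
    return w * h
```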
- **What is `COCO_val2017_detections_AP_H_56_person.json`? Can I train pose models without it?**

  `COCO_val2017_detections_AP_H_56_person.json` contains the "detected" human bounding boxes for the COCO validation set, generated by Faster R-CNN. One can choose to evaluate models with ground-truth bounding boxes by setting `use_gt_bbox=True` and `bbox_file=''`, or to evaluate the generalizability of models with detected boxes by setting `use_gt_bbox=False` and `bbox_file='COCO_val2017_detections_AP_H_56_person.json'`.
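A sketch of the two settings (field names follow the `data_cfg` convention of mmpose top-down COCO configs; the exact location may differ across versions, so check your own config):

```python
# Evaluate with ground-truth bounding boxes:
data_cfg = dict(
    use_gt_bbox=True,
    bbox_file='',
)

# Or evaluate with detected bounding boxes to test generalizability:
data_cfg = dict(
    use_gt_bbox=False,
    bbox_file='COCO_val2017_detections_AP_H_56_person.json',
)
```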
- **RuntimeError: Address already in use**

  Set the environment variable `MASTER_PORT=XXX`. For example:

  ```shell
  MASTER_PORT=29517 GPUS=16 GPUS_PER_NODE=8 CPUS_PER_TASK=2 ./tools/slurm_train.sh Test res50 configs/body/2D_Kpt_SV_RGB_Img/topdown_hm/coco/res50_coco_256x192.py work_dirs/res50_coco_256x192
  ```
- **"Unexpected keys in source state dict" when loading pre-trained weights**

  It is normal that some layers of the pretrained model are not used in the pose model. The ImageNet-pretrained classification network and the pose network may have different architectures (e.g., the pose network has no classification head), so some unexpected keys in the source state dict are actually expected.
- **How to use trained models for backbone pre-training?**

  Refer to Use Pre-Trained Model. To use a pre-trained model for the whole network (backbone + head), the new config adds the link to the pre-trained model in `load_from`. To use a backbone for pre-training, change the `pretrained` value in the backbone dict of the config file to the checkpoint path or URL. During training, the unexpected keys will be ignored.
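A sketch of the two options (the checkpoint path is illustrative, and where `pretrained` lives can vary across mmpose versions, so check your backbone's config):

```python
# Option 1: initialize the whole network (backbone + head) from a
# trained pose model:
load_from = 'work_dirs/res50_coco_256x192/latest.pth'

# Option 2: use a trained checkpoint only to pre-train the backbone;
# keys that do not match the backbone (e.g. head weights) are ignored
# during training:
model = dict(
    backbone=dict(
        type='ResNet',
        depth=50,
        pretrained='work_dirs/res50_coco_256x192/latest.pth',
    ),
)
```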
- **How to visualize the training accuracy/loss curves in real time?**

  Use `TensorboardLoggerHook` in `log_config`, e.g.

  ```python
  log_config = dict(interval=20, hooks=[dict(type='TensorboardLoggerHook')])
  ```

  You can refer to tutorials/6_customize_runtime.md and the example config.
- **Log info is NOT printed**

  Use a smaller log interval. For example, change `interval=50` to `interval=1` in the config.
- **How to fix stages of the backbone when finetuning a model?**

  You can refer to `def _freeze_stages()` and `frozen_stages`. Remember to set `find_unused_parameters = True` in the config files for distributed training or testing.
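For example, with a ResNet backbone a config sketch might look like this (values are illustrative; `frozen_stages=-1` conventionally means no stage is frozen):

```python
model = dict(
    backbone=dict(
        type='ResNet',
        depth=50,
        frozen_stages=1,  # freeze the stem and the first stage
    ),
)

# Needed because frozen parameters receive no gradients in
# distributed training/testing:
find_unused_parameters = True
```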
- **How to evaluate on the MPII test dataset?**

  Since we do not have the ground truth for the test dataset, we cannot evaluate it locally. If you would like to evaluate the performance on the test set, you have to upload the `pred.mat` (which is generated during testing) to the official server via email, according to the MPII guideline.
- **How to run mmpose on CPU?**

  Run demos with `--device=cpu`.
- **How to speed up inference?**

  For top-down models, try to edit the config file. For example:

  1. set `flip_test=False` in topdown-res50.
  2. set `post_process='default'` in topdown-res50.
  3. use a faster human bounding box detector (see MMDetection).

  For bottom-up models, try to edit the config file. For example:

  1. set
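The top-down settings above can be sketched as follows (where these options live, e.g. a top-level `test_cfg` dict or inside the model config, depends on the mmpose version):

```python
test_cfg = dict(
    flip_test=False,         # skip the extra flipped forward pass
    post_process='default',  # cheaper post-processing of heatmaps
)
```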
- **Why does the onnx model converted by mmpose throw errors when being converted to other frameworks such as TensorRT?**

  For now, we can only make sure that models in mmpose are onnx-compatible. However, some operations in onnx may be unsupported by your target framework for deployment, e.g., TensorRT in this issue. When such a situation occurs, we suggest you raise an issue and ask the community for help, as long as `pytorch2onnx.py` works well and is verified numerically.