Error in Colab training #7

Open
SweetStripes74 opened this issue Mar 23, 2021 · 5 comments

@SweetStripes74

I'm using your annotated data and training without changing any of the code, since I just wanted to check the steps, but I'm not getting any results at all with your sample data. A results folder is created in my Drive, but there is nothing in it, so I'm not sure why this error occurs under step nine.

Frame will be saved in /gdrive/result_folder/oriFrameFromVideo//sample_video/frame_folder/
extracting frames from video...
processing /gdrive/sample_video.mp4
read failed!make sure that the video format is supported by cv2.VideoCapture
0% 0/300 [00:00<?, ?it/s]read frame failed!
0% 0/300 [00:00<?, ?it/s]
getting demo image:
CUDA_VISIBLE_DEVICES='0' python3 demo.py
--nClasses 4
--indir /gdrive/result_folder/oriFrameFromVideo//sample_video/frame_folder/
--outdir /gdrive/result_folder
--yolo_model_path /gdrive/AlphaTracker/Tracking/AlphaTracker/train_yolo/darknet//backup/Trial/yolov3-mice_final.weights
--yolo_model_cfg /gdrive/AlphaTracker/Tracking/AlphaTracker/train_yolo/darknet//cfg/yolov3-mice.cfg
--pose_model_path /gdrive/AlphaTracker/Tracking/AlphaTracker/train_sppe/exp/coco/Trial/model_10.pkl
--use_boxGT 0
Loading YOLO model..
not using ground truth box to do the eval.
Traceback (most recent call last):
  File "demo.py", line 60, in <module>
    det_loader = DetectionLoader(data_loader, batchSize=args.detbatch,use_boxGT=args.use_boxGT,gt_json=args.gt_json).start()
  File "/content/drive/My Drive/AlphaTracker/Tracking/AlphaTracker/dataloader.py", line 338, in __init__
    self.det_model.load_weights(opt.yolo_model_path)
  File "/content/drive/My Drive/AlphaTracker/Tracking/AlphaTracker/yolo/darknet.py", line 407, in load_weights
    fp = open(weightfile, "rb")
FileNotFoundError: [Errno 2] No such file or directory: '/gdrive/AlphaTracker/Tracking/AlphaTracker/train_yolo/darknet//backup/Trial/yolov3-mice_final.weights'

tracking pose:
python ./PoseFlow/tracker-general-fixNum-newSelect-noOrb.py
--imgdir /gdrive/result_folder/oriFrameFromVideo//sample_video/frame_folder/
--in_json /gdrive/result_folder/alphapose-results.json
--out_json /gdrive/result_folder/alphapose-results-forvis-tracked.json
--visdir /gdrive/result_folder/pose_track_vis/ --vis 1
--image_format %s.png --max_pid_id_setting 2 --match 0 --weights 0 6 0 0 0 0
--out_video_path /gdrive/result_folder/Trial_2_0_060000.mp4
Traceback (most recent call last):
  File "./PoseFlow/tracker-general-fixNum-newSelect-noOrb.py", line 215, in <module>
    with open(notrack_json) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/gdrive/result_folder/alphapose-results.json'

@aneeshbal
Collaborator

Hi, thanks for reaching out again!

  1. Can you confirm that there is a video called sample_video.mp4 under your My Drive folder in Google Drive?
  2. Could you attach the terminal output for the train.py step? It appears that the YOLO model was not saved, which may be another cause of the error.
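
Both checks can be run from a notebook cell before tracking. The following is a minimal sketch, assuming the paths printed in the log above; `missing_paths` and `EXPECTED` are hypothetical names for illustration and are not part of AlphaTracker.

```python
from pathlib import Path

# Paths copied from the log above (assumption: adjust to your own Drive layout).
EXPECTED = [
    "/gdrive/sample_video.mp4",
    "/gdrive/AlphaTracker/Tracking/AlphaTracker/train_yolo/darknet//backup/Trial/yolov3-mice_final.weights",
]


def missing_paths(paths):
    """Return the paths that do not exist as regular files."""
    return [p for p in paths if not Path(p).is_file()]


if __name__ == "__main__":
    # Print every expected input that the tracking step would fail to open.
    for p in missing_paths(EXPECTED):
        print("missing:", p)
```

If the weights file is reported missing, the training step most likely did not finish, and its output is the place to look next.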

@SweetStripes74
Author

  1. The sample video is in the sample data folder on My Drive. I didn't see any instruction to extract it and put it into my base Drive folder.

  2. I know this isn't what you're asking for, but here is the terminal output of step 7, since I noticed some other errors there as well:

nvcc -gencode arch=compute_30,code=sm_30 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_50,code=[sm_50,compute_50] -gencode arch=compute_52,code=[sm_52,compute_52] -Iinclude/ -Isrc/ -DGPU -I/usr/local/cuda/include/ --compiler-options "-Wall -Wno-unused-result -Wno-unknown-pragmas -Wfatal-errors -fPIC -Ofast -DGPU" -c ./src/convolutional_kernels.cu -o obj/convolutional_kernels.o
nvcc fatal : Unsupported gpu architecture 'compute_30'
Makefile:92: recipe for target 'obj/convolutional_kernels.o' failed
make: *** [obj/convolutional_kernels.o] Error 1
Collecting package metadata (current_repodata.json): done
Solving environment: -
The environment is inconsistent, please check the package plan carefully
The following packages are causing the inconsistency:

  • pytorch/linux-64::pytorch==1.4.0=py3.6_cuda10.1.243_cudnn7.6.3_0
  • pytorch/linux-64::torchvision==0.5.0=py36_cu10done
  • Package Plan

environment location: /usr/local

added / updated specs:
- pytorch==1.4.0
- torchvision==0.5.0

The following packages will be downloaded:

package                    |            build
---------------------------|-----------------
ca-certificates-2021.1.19  |       h06a4308_1         118 KB
certifi-2020.12.5          |   py36h06a4308_0         140 KB
openssl-1.0.2u             |       h7b6447c_0         2.2 MB
pytorch-1.0.0              |py3.6_cuda9.0.176_cudnn7.4.1_1       498.6 MB  pytorch
torchvision-0.2.2          |             py_3          44 KB  pytorch
------------------------------------------------------------
                                       Total:       501.1 MB

The following packages will be REMOVED:

cudatoolkit-8.0-3

The following packages will be UPDATED:

ca-certificates 2019.1.23-0 --> 2021.1.19-h06a4308_1
certifi 2019.3.9-py36_0 --> 2020.12.5-py36h06a4308_0
openssl 1.0.2r-h7b6447c_0 --> 1.0.2u-h7b6447c_0

The following packages will be SUPERSEDED by a higher-priority channel:

torchvision pytorch/linux-64::torchvision-0.5.0-p~ --> pytorch/noarch::torchvision-0.2.2-py_3

The following packages will be DOWNGRADED:

pytorch 1.4.0-py3.6_cuda10.1.243_cudnn7.6.3_0 --> 1.0.0-py3.6_cuda9.0.176_cudnn7.4.1_1

Downloading and Extracting Packages
pytorch-1.0.0 | 498.6 MB | : 100% 1.0/1 [01:30<00:00, 90.58s/it]
certifi-2020.12.5 | 140 KB | : 100% 1.0/1 [00:00<00:00, 6.52it/s]
torchvision-0.2.2 | 44 KB | : 100% 1.0/1 [00:01<00:00, 1.12s/it]
openssl-1.0.2u | 2.2 MB | : 100% 1.0/1 [00:00<00:00, 5.99it/s]
ca-certificates-2021 | 118 KB | : 100% 1.0/1 [00:00<00:00, 15.62it/s]
Preparing transaction: done
Verifying transaction: done
Executing transaction: done

  3. Here is the terminal output for step 8:

*** training detector ***
train.sh: line 1: ./darknet: No such file or directory
training finished.

@SweetStripes74
Author

After retraining and tracking again with the video in the main Drive folder, I believe I got exactly the same error (posted below). The frames are extracted now, but the YOLO weights file is still not found.

Frame will be saved in /gdrive/result_folder/oriFrameFromVideo//sample_video/frame_folder/
extracting frames from video...
processing /gdrive/sample_video.mp4
100% 300/300 [01:05<00:00, 4.89it/s]
getting demo image:
CUDA_VISIBLE_DEVICES='0' python3 demo.py
--nClasses 4
--indir /gdrive/result_folder/oriFrameFromVideo//sample_video/frame_folder/
--outdir /gdrive/result_folder
--yolo_model_path /gdrive/AlphaTracker/Tracking/AlphaTracker/train_yolo/darknet//backup/Trial/yolov3-mice_final.weights
--yolo_model_cfg /gdrive/AlphaTracker/Tracking/AlphaTracker/train_yolo/darknet//cfg/yolov3-mice.cfg
--pose_model_path /gdrive/AlphaTracker/Tracking/AlphaTracker/train_sppe/exp/coco/Trial/model_10.pkl
--use_boxGT 0
Loading YOLO model..
not using ground truth box to do the eval.
Traceback (most recent call last):
  File "demo.py", line 60, in <module>
    det_loader = DetectionLoader(data_loader, batchSize=args.detbatch,use_boxGT=args.use_boxGT,gt_json=args.gt_json).start()
  File "/content/drive/My Drive/AlphaTracker/Tracking/AlphaTracker/dataloader.py", line 338, in __init__
    self.det_model.load_weights(opt.yolo_model_path)
  File "/content/drive/My Drive/AlphaTracker/Tracking/AlphaTracker/yolo/darknet.py", line 407, in load_weights
    fp = open(weightfile, "rb")
FileNotFoundError: [Errno 2] No such file or directory: '/gdrive/AlphaTracker/Tracking/AlphaTracker/train_yolo/darknet//backup/Trial/yolov3-mice_final.weights'

tracking pose:
python ./PoseFlow/tracker-general-fixNum-newSelect-noOrb.py
--imgdir /gdrive/result_folder/oriFrameFromVideo//sample_video/frame_folder/
--in_json /gdrive/result_folder/alphapose-results.json
--out_json /gdrive/result_folder/alphapose-results-forvis-tracked.json
--visdir /gdrive/result_folder/pose_track_vis/ --vis 1
--image_format %s.png --max_pid_id_setting 2 --match 0 --weights 0 6 0 0 0 0
--out_video_path /gdrive/result_folder/Trial_2_0_060000.mp4
Traceback (most recent call last):
  File "./PoseFlow/tracker-general-fixNum-newSelect-noOrb.py", line 215, in <module>
    with open(notrack_json) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/gdrive/result_folder/alphapose-results.json'

@aneeshbal
Collaborator

I see the error now: it is primarily a failure in the make step for YOLO. It appears that support for compute_30 has been removed in newer CUDA versions, so nvcc aborts, the darknet binary is never built, and no weights are saved. I will need to edit the code to account for that, and I will let you know when I have an updated version ready.
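
In the meantime, one possible workaround (a sketch only, not the fix that will ship) is to drop the unsupported architecture from the nvcc flags before running make. The helper below, with the hypothetical name `drop_gencode`, works on a flag string like the one in the step 7 output. Applying it to the darknet Makefile's ARCH setting is an assumption about where those flags come from.

```python
import re

# Matches one "-gencode arch=compute_NN,code=..." flag; the code value is
# either a bracketed list such as [sm_50,compute_50] or a bare name such as sm_30.
GENCODE = re.compile(r"-gencode\s+arch=(compute_\d+),code=(?:\[[^\]]*\]|\S+)")


def drop_gencode(flags, unsupported):
    """Remove every -gencode flag whose arch is listed in `unsupported`."""
    def keep_or_drop(match):
        return "" if match.group(1) in unsupported else match.group(0)

    cleaned = GENCODE.sub(keep_or_drop, flags)
    # Collapse the whitespace left behind by removed flags.
    return " ".join(cleaned.split())


if __name__ == "__main__":
    flags = (
        "-gencode arch=compute_30,code=sm_30 "
        "-gencode arch=compute_35,code=sm_35 "
        "-gencode arch=compute_50,code=[sm_50,compute_50] "
        "-gencode arch=compute_52,code=[sm_52,compute_52]"
    )
    print(drop_gencode(flags, {"compute_30"}))
```

Which architectures the installed toolkit still accepts depends on the CUDA version on the Colab runtime, so the set passed as `unsupported` may need more entries than compute_30.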

Thanks!

@SweetStripes74
Author

Gotcha; thank you!
