Missing checkpoint weights and logs in the data_models dir #1

Open
mirbehroznoor opened this issue Sep 25, 2022 · 2 comments

Comments

@mirbehroznoor

mirbehroznoor commented Sep 25, 2022

Thanks for the repository. The checkpoint weights and logs are missing from data_models after training. Also, I could not get anything on Wandb.

The data_models dir structure after training:

data_models/
  |--process_data/
       |--image00.jpg
       |--annotations-export.csv
  |--class.names
  |--data_file_out.txt
  |--train.txt
  |--val.txt

There are no errors as such.
Thanks

@LahiRumesh
Owner

LahiRumesh commented Sep 26, 2022

@mirbehroznoor The data model folder structure should look like this. All the logs and data are saved in data_models/YOUR_IMAGE_FOLDER_NAME/. Check your data dir and the data CSV file.

data_models/
    |--Image_Folder_Name
             |--process_data/
                    |--image00.jpg
                    |--annotations-export.csv
             |--class.names
             |--data_file_out.txt
             |--train.txt
             |--val.txt
             |--log
             |--checkpoints
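
A minimal sketch for checking a folder against the layout above. The entry names come from the tree; the function name `missing_entries` and the example path are hypothetical and not part of the repository.

```python
from pathlib import Path

# Entries listed in the tree above. This only checks that each one exists,
# not whether it is a file or a directory.
EXPECTED = [
    "process_data",
    "class.names",
    "data_file_out.txt",
    "train.txt",
    "val.txt",
    "log",
    "checkpoints",
]


def missing_entries(model_dir):
    """Return the expected entries that do not exist under model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED if not (root / name).exists()]


if __name__ == "__main__":
    # Hypothetical path: replace with your own data_models/<Image_Folder_Name>.
    print(missing_entries("data_models/Image_Folder_Name"))
```

If `log` and `checkpoints` come back as missing while the other entries exist, training most likely stopped before it wrote anything.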

Please check this: https://lahrumesh28.medium.com/semi-supervised-learning-pseudo-labeling-custom-dataset-with-yolov4-53b896140894.

@mirbehroznoor
Author

Actually, I came here from the Medium article. I could not get that folder structure, so I did some troubleshooting and noted the errors below, which occurred while running:

python train_models.py --model YOLOV4 --data_dir /home/data/Image_Folder  \
--weights yolov4.conv.137.pth \
--validation 0.1 --epochs 80 --batch_size 8

The packages were installed from requirements.txt, as mentioned in the article.


OS

Distributor ID: Ubuntu
Description:    Ubuntu 22.04.1 LTS
Release:        22.04
Codename:       jammy

Errors on Local system and Google Colab:

Local System:

Conda Env

name: ssl
channels:
- conda-forge
dependencies:
- python=3.7 # onnxruntime does not support python=3.10
- pip
Python 3.7.12

First error from train_models.py

TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible workarounds are:
 1. Downgrade the protobuf package to 3.20.x or lower.
 2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will use pure-Python parsing and will be much slower).

More information: https://developers.google.com/protocol-buffers/docs/news/2022-05-06#python-updates
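
A rough sketch for checking whether the installed protobuf falls in the "3.20.x or lower" range that workaround 1 asks for. The helper name `within_suggested_range` is hypothetical. The commented-out lines show workaround 2, which has to run before anything imports protobuf.

```python
def within_suggested_range(version):
    """True if a protobuf version string is 3.20.x or lower (workaround 1)."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) <= (3, 20)


if __name__ == "__main__":
    # Workaround 2 from the message: force the pure-Python implementation.
    # It must be set before protobuf is imported, and parsing will be slower.
    # import os
    # os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"
    try:
        import google.protobuf
    except ImportError:
        print("protobuf is not installed in this environment")
    else:
        installed = google.protobuf.__version__
        print(installed, within_suggested_range(installed))
```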

Second error after pip install protobuf==3.12.0

ValueError: numpy.ndarray size changed, may indicate binary incompatibility. 
Expected 88 from C header, got 80 from PyObject
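
A small sketch that pulls the two sizes out of this message. This error commonly appears when a compiled package was built against a different NumPy than the one installed, so the usual next step is to compare `numpy.__version__` with what the failing package expects. That reading is an assumption, not something confirmed for this repository, and the helper name `parse_size_mismatch` is hypothetical.

```python
import re

# Matches the second line of the error message quoted above.
_PATTERN = re.compile(r"Expected (\d+) from C header, got (\d+) from PyObject")


def parse_size_mismatch(message):
    """Return (expected, got) struct sizes from the error text, or None."""
    match = _PATTERN.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    text = (
        "ValueError: numpy.ndarray size changed, may indicate binary "
        "incompatibility.\nExpected 88 from C header, got 80 from PyObject"
    )
    expected, got = parse_size_mismatch(text)
    # expected is the size the extension was compiled for;
    # got is the size reported by the NumPy that is actually installed.
    print(expected, got)
```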

Google Colab Env:

Python 3.7.14

train_models.py error
Same as Second Error in Local System

ValueError: numpy.ndarray size changed, may indicate binary incompatibility. 
Expected 88 from C header, got 80 from PyObject

Thanks
