[FEAT]: Testing notebook about depthai-ml-training #82

Closed
gteti opened this issue Jan 22, 2022 · 25 comments · Fixed by #127

gteti commented Jan 22, 2022

Description

Testing the notebook https://github.com/luxonis/depthai-ml-training/blob/master/colab-notebooks/Easy_Object_Detection_With_Custom_Data_Demo_Training.ipynb present in https://github.com/luxonis/depthai-ml-training

Additional Information

Try to run the notebook on Windows and, if that doesn't work, on an Ubuntu machine.

gteti added the enhancement (New feature or request) label on Jan 22, 2022
gteti added this to the dev-cw04 milestone on Jan 22, 2022
gteti self-assigned this on Jan 22, 2022

gteti commented Jan 22, 2022

In addition to what the notebook already installs, one more install command is needed:

pip install pandas absl-py object-detection

pandas requires numpy 1.18.5 at minimum, so I think it is safe to update one of the first code cells in the notebook to

pip install numpy==1.18.5

instead of 1.17.5.

When I try to manually run the code for the "Installation of Tensorflow Object Detection API":

root@mio-VirtualBox:/home/mio/Desktop/Python/depthai-ml-training/colab-notebooks# protoc --python_out=./ content/models/research/object_detection/protos/*.proto 

I get:

object_detection/protos/flexible_grid_anchor_generator.proto: File not found.
object_detection/protos/grid_anchor_generator.proto: File not found.
object_detection/protos/multiscale_anchor_generator.proto: File not found.
object_detection/protos/ssd_anchor_generator.proto: File not found.
content/models/research/object_detection/protos/anchor_generator.proto:5:1: Import "object_detection/protos/flexible_grid_anchor_generator.proto" was not found or had errors.
content/models/research/object_detection/protos/anchor_generator.proto:6:1: Import "object_detection/protos/grid_anchor_generator.proto" was not found or had errors.
content/models/research/object_detection/protos/anchor_generator.proto:7:1: Import "object_detection/protos/multiscale_anchor_generator.proto" was not found or had errors.
content/models/research/object_detection/protos/anchor_generator.proto:8:1: Import "object_detection/protos/ssd_anchor_generator.proto" was not found or had errors.
content/models/research/object_detection/protos/anchor_generator.proto:14:5: "GridAnchorGenerator" is not defined.
content/models/research/object_detection/protos/anchor_generator.proto:15:5: "SsdAnchorGenerator" is not defined.
content/models/research/object_detection/protos/anchor_generator.proto:16:5: "MultiscaleAnchorGenerator" is not defined.
content/models/research/object_detection/protos/anchor_generator.proto:17:5: "FlexibleGridAnchorGenerator" is not defined.
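
These errors look like an import-path problem rather than missing files. anchor_generator.proto imports its siblings as object_detection/protos/..., and protoc resolves imports relative to --proto_path, which defaults to the current directory. From colab-notebooks that relative path does not exist. Below is a minimal sketch of a command builder that sets the proto path explicitly. It assumes the clone lives in content/models/research, as in the command above; build_protoc_command is a hypothetical helper, not part of the notebook.

```python
from pathlib import Path


def build_protoc_command(research_dir):
    """Build a protoc argv whose imports resolve relative to research_dir."""
    research = Path(research_dir)
    protos = sorted(research.glob("object_detection/protos/*.proto"))
    if not protos:
        raise FileNotFoundError(f"no .proto files found under {research}")
    return [
        "protoc",
        f"--proto_path={research}",  # root for `import "object_detection/protos/..."`
        f"--python_out={research}",  # *_pb2.py files land next to the .proto files
        *(str(p) for p in protos),
    ]


# Usage (assumed path, requires protoc on PATH):
#   import subprocess
#   subprocess.run(build_protoc_command("content/models/research"), check=True)
```

Changing into content/models/research and running the notebook's original protoc line should have the same effect, since the default proto path is then the research directory.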

gteti removed their assignment on Jan 22, 2022

gteti commented Jan 22, 2022

When I try to manually run

python3 content/models/research/object_detection/builders/model_builder_test.py 

I get

2022-01-22 16:52:16.333544: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-01-22 16:52:16.333627: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
  File "/home/mio/Desktop/Python/depthai-ml-training/colab-notebooks/content/models/research/object_detection/builders/model_builder_test.py", line 21, in <module>
    from object_detection.builders import model_builder
  File "/usr/local/lib/python3.9/dist-packages/object_detection/builders/model_builder.py", line 20, in <module>
    from object_detection.builders import anchor_generator_builder
  File "/usr/local/lib/python3.9/dist-packages/object_detection/builders/anchor_generator_builder.py", line 21, in <module>
    from object_detection.protos import anchor_generator_pb2
ImportError: cannot import name 'anchor_generator_pb2' from 'object_detection.protos' (/usr/local/lib/python3.9/dist-packages/object_detection/protos/__init__.py)
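
The traceback mixes two copies of the package. The test file comes from the clone under content/models/research, but object_detection.builders is loaded from /usr/local/lib/python3.9/dist-packages, presumably the copy installed by pip install object-detection. That copy has no generated anchor_generator_pb2, hence the ImportError. Here is a small sketch for checking which copy Python would pick up (module_origin is a hypothetical helper):

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Usage: print(module_origin("object_detection"))
```

If the printed path points into dist-packages rather than the clone, the generated *_pb2.py files in the clone are not the ones being imported.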


gteti commented Jan 22, 2022

I had an issue with tf_slim, which my Ubuntu system did not recognize as TensorFlow (even though it is), so I had to install tensorflow. After that I had to manually change some files that require TensorFlow.

The file generate_tfrecord.py under helpers should have:

  • line 31 changed from flags = tf.app.flags to flags = tf.compat.v1.flags
  • line 112 changed from writer = tf.python_io.TFRecordWriter(FLAGS.output_path) to writer = tf.io.TFRecordWriter(FLAGS.output_path)
  • line 139 changed from tf.app.run() to tf.compat.v1.app.run()
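
The three changes above can also be applied as plain text substitutions instead of hand edits. A minimal sketch, assuming the old spellings appear literally in the file; migrate_source and TF1_TO_COMPAT are hypothetical names, and nothing here imports TensorFlow:

```python
# Old TF 1.x spellings mapped to the replacements listed above.
TF1_TO_COMPAT = {
    "tf.app.flags": "tf.compat.v1.flags",
    "tf.python_io.TFRecordWriter": "tf.io.TFRecordWriter",
    "tf.app.run": "tf.compat.v1.app.run",
}


def migrate_source(source):
    """Return source with each old spelling replaced by its newer equivalent."""
    for old, new in TF1_TO_COMPAT.items():
        source = source.replace(old, new)
    return source


# Usage (path is an assumption based on "under helpers"):
#   from pathlib import Path
#   path = Path("helpers/generate_tfrecord.py")
#   path.write_text(migrate_source(path.read_text()))
```

The substitution is idempotent, so running it twice leaves the file unchanged.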


gmacario commented Jan 24, 2022

> Testing the notebook https://github.com/luxonis/depthai-ml-training/blob/master/colab-notebooks/Easy_Object_Detection_With_Custom_Data_Demo_Training.ipynb

Have you already tried to run the original notebook in Google Colaboratory?
If so, did you get the same error?


gteti commented Jan 24, 2022

> > Testing the notebook https://github.com/luxonis/depthai-ml-training/blob/master/colab-notebooks/Easy_Object_Detection_With_Custom_Data_Demo_Training.ipynb
>
> Have you already tried to run the original notebook in Google Colaboratory? If so, did you get the same error?

I haven't tried it on Google Colaboratory yet. I will probably look into it tomorrow, since I have no free time today.


gteti commented Jan 24, 2022

I have executed part of the notebook in Google Colab, started the training (5000 steps), and got this (partial output):

WARNING:tensorflow:Forced number of epochs for all eval validations to be 1.
W0124 19:58:55.851711 139760147408768 model_lib.py:717] Forced number of epochs for all eval validations to be 1.
INFO:tensorflow:Maybe overwriting train_steps: 5000
I0124 19:58:55.851987 139760147408768 config_util.py:552] Maybe overwriting train_steps: 5000
INFO:tensorflow:Maybe overwriting use_bfloat16: False
I0124 19:58:55.852071 139760147408768 config_util.py:552] Maybe overwriting use_bfloat16: False
INFO:tensorflow:Maybe overwriting sample_1_of_n_eval_examples: 1
I0124 19:58:55.852135 139760147408768 config_util.py:552] Maybe overwriting sample_1_of_n_eval_examples: 1
INFO:tensorflow:Maybe overwriting eval_num_epochs: 1
I0124 19:58:55.852210 139760147408768 config_util.py:552] Maybe overwriting eval_num_epochs: 1
INFO:tensorflow:Maybe overwriting load_pretrained: True
I0124 19:58:55.852302 139760147408768 config_util.py:552] Maybe overwriting load_pretrained: True
INFO:tensorflow:Ignoring config override key: load_pretrained
I0124 19:58:55.852364 139760147408768 config_util.py:562] Ignoring config override key: load_pretrained
WARNING:tensorflow:Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
W0124 19:58:55.852458 139760147408768 model_lib.py:733] Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
INFO:tensorflow:create_estimator_and_inputs: use_tpu False, export_to_tpu False
I0124 19:58:55.852549 139760147408768 model_lib.py:768] create_estimator_and_inputs: use_tpu False, export_to_tpu False
INFO:tensorflow:Using config: {'_model_dir': 'training/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f1c096f2e90>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
I0124 19:58:55.853105 139760147408768 estimator.py:212] Using config: {'_model_dir': 'training/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f1c096f2e90>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
WARNING:tensorflow:Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x7f1c096f4440>) includes params argument, but params are not passed to Estimator.
W0124 19:58:55.853359 139760147408768 model_fn.py:630] Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x7f1c096f4440>) includes params argument, but params are not passed to Estimator.
INFO:tensorflow:Not using Distribute Coordinator.
I0124 19:58:55.853889 139760147408768 estimator_training.py:186] Not using Distribute Coordinator.
INFO:tensorflow:Running training and evaluation locally (non-distributed).
I0124 19:58:55.854079 139760147408768 training.py:612] Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
I0124 19:58:55.854305 139760147408768 training.py:700] Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
WARNING:tensorflow:From /tensorflow-1.15.2/python3.7/tensorflow_core/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
W0124 19:58:55.863761 139760147408768 deprecation.py:323] From /tensorflow-1.15.2/python3.7/tensorflow_core/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
W0124 19:58:55.893014 139760147408768 dataset_builder.py:83] num_readers has been reduced to 1 to match input file shards.
WARNING:tensorflow:From /content/models/research/object_detection/builders/dataset_builder.py:100: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_determinstic`.
W0124 19:58:55.899045 139760147408768 deprecation.py:323] From /content/models/research/object_detection/builders/dataset_builder.py:100: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_determinstic`.
WARNING:tensorflow:From /content/models/research/object_detection/builders/dataset_builder.py:175: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map()
W0124 19:58:55.922095 139760147408768 deprecation.py:323] From /content/models/research/object_detection/builders/dataset_builder.py:175: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map()
WARNING:tensorflow:Entity <bound method TfExampleDecoder.decode of <object_detection.data_decoders.tf_example_decoder.TfExampleDecoder object at 0x7f1c0968f590>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: module 'gast' has no attribute 'Index'
W0124 19:58:55.958498 139760147408768 ag_logging.py:146] Entity <bound method TfExampleDecoder.decode of <object_detection.data_decoders.tf_example_decoder.TfExampleDecoder object at 0x7f1c0968f590>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: module 'gast' has no attribute 'Index'
WARNING:tensorflow:Entity <function train_input.<locals>.transform_and_pad_input_data_fn at 0x7f1c096f48c0> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Bad argument number for Name: 3, expecting 4
W0124 19:58:56.169199 139760147408768 ag_logging.py:146] Entity <function train_input.<locals>.transform_and_pad_input_data_fn at 0x7f1c096f48c0> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Bad argument number for Name: 3, expecting 4
WARNING:tensorflow:From /content/models/research/object_detection/inputs.py:79: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
W0124 19:58:56.176440 139760147408768 deprecation.py:323] From /content/models/research/object_detection/inputs.py:79: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
WARNING:tensorflow:From /content/models/research/object_detection/utils/ops.py:493: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
W0124 19:58:56.184695 139760147408768 deprecation.py:323] From /content/models/research/object_detection/utils/ops.py:493: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From /content/models/research/object_detection/core/preprocessor.py:199: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
Instructions for updating:
`seed2` arg is deprecated.Use sample_distorted_bounding_box_v2 instead.
W0124 19:58:56.287968 139760147408768 deprecation.py:323] From /content/models/research/object_detection/core/preprocessor.py:199: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
Instructions for updating:
`seed2` arg is deprecated.Use sample_distorted_bounding_box_v2 instead.
WARNING:tensorflow:From /content/models/research/object_detection/inputs.py:260: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
W0124 19:58:57.168424 139760147408768 deprecation.py:323] From /content/models/research/object_detection/inputs.py:260: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
INFO:tensorflow:Calling model_fn.
I0124 19:58:57.677293 139760147408768 estimator.py:1148] Calling model_fn.
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tf_slim/layers/layers.py:1089: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
W0124 19:58:58.253639 139760147408768 deprecation.py:323] From /usr/local/lib/python3.7/dist-packages/tf_slim/layers/layers.py:1089: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
INFO:tensorflow:depth of additional conv before box predictor: 0
I0124 19:59:01.308069 139760147408768 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0124 19:59:01.345531 139760147408768 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0124 19:59:01.385142 139760147408768 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0124 19:59:01.424049 139760147408768 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0124 19:59:01.462349 139760147408768 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0124 19:59:01.498401 139760147408768 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
W0124 19:59:01.545248 139760147408768 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_2_3x3_s2_512/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 256, 512]], model variable shape: [[3, 3, 256, 512]]. This variable will not be initialized from the checkpoint.
W0124 19:59:01.545510 139760147408768 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_3_3x3_s2_256/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 128, 256]], model variable shape: [[3, 3, 128, 256]]. This variable will not be initialized from the checkpoint.
W0124 19:59:01.545620 139760147408768 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_4_3x3_s2_256/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 128, 256]], model variable shape: [[3, 3, 128, 256]]. This variable will not be initialized from the checkpoint.
W0124 19:59:01.545768 139760147408768 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_5_3x3_s2_128/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 64, 128]], model variable shape: [[3, 3, 64, 128]]. This variable will not be initialized from the checkpoint.
WARNING:tensorflow:From /tensorflow-1.15.2/python3.7/tensorflow_core/python/training/rmsprop.py:119: calling Ones.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W0124 19:59:09.956084 139760147408768 deprecation.py:506] From /tensorflow-1.15.2/python3.7/tensorflow_core/python/training/rmsprop.py:119: calling Ones.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
INFO:tensorflow:Done calling model_fn.
I0124 19:59:17.511219 139760147408768 estimator.py:1150] Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
I0124 19:59:17.513154 139760147408768 basic_session_run_hooks.py:541] Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
I0124 19:59:21.905810 139760147408768 monitored_session.py:240] Graph was finalized.
2022-01-24 19:59:21.911875: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2200125000 Hz
2022-01-24 19:59:21.912132: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x559cb92652c0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-01-24 19:59:21.912168: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2022-01-24 19:59:21.914559: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2022-01-24 19:59:21.930257: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2022-01-24 19:59:21.930318: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (69abbd1c0b54): /proc/driver/nvidia/version does not exist
INFO:tensorflow:Restoring parameters from training/model.ckpt-0
I0124 19:59:21.932186 139760147408768 saver.py:1284] Restoring parameters from training/model.ckpt-0
WARNING:tensorflow:From /tensorflow-1.15.2/python3.7/tensorflow_core/python/training/saver.py:1069: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file utilities to get mtimes.
W0124 19:59:23.979806 139760147408768 deprecation.py:323] From /tensorflow-1.15.2/python3.7/tensorflow_core/python/training/saver.py:1069: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file utilities to get mtimes.
INFO:tensorflow:Running local_init_op.
I0124 19:59:25.345736 139760147408768 session_manager.py:500] Running local_init_op.
INFO:tensorflow:Done running local_init_op.
I0124 19:59:25.852584 139760147408768 session_manager.py:502] Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into training/model.ckpt.
I0124 19:59:38.208925 139760147408768 basic_session_run_hooks.py:606] Saving checkpoints for 0 into training/model.ckpt.
INFO:tensorflow:loss = 5.796848, step = 1
I0124 20:00:06.162297 139760147408768 basic_session_run_hooks.py:262] loss = 5.796848, step = 1


gteti commented Jan 24, 2022

At the current speed I think it will need several hours to run on the Google machine. After a while it stops making progress.

Full output saved on a github test repository: https://github.com/gteti/test_colab
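
To put a number on "several hours", the "loss = ..., step = ..." lines in the saved output can be parsed and extrapolated. The log above also reports CUDA_ERROR_NO_DEVICE, so this run appears to have used the CPU only, which would be consistent with the slow progress. The sketch below assumes the absl line format seen in the log and a constant step rate. parse_step_lines and estimate_remaining_seconds are hypothetical helpers, and the year is not in the log, so it is passed in.

```python
import re
from datetime import datetime

# Matches absl-style lines such as:
# I0124 20:00:06.162297 139760147408768 basic_session_run_hooks.py:262] loss = 5.796848, step = 1
STEP_LINE = re.compile(
    r"^I(\d{4}) (\d{2}:\d{2}:\d{2}\.\d+) \d+ \S+\] loss = ([\d.]+), step = (\d+)"
)


def parse_step_lines(log_text, year=2022):
    """Return (timestamp, step, loss) tuples for every 'loss = ..., step = ...' line."""
    records = []
    for line in log_text.splitlines():
        match = STEP_LINE.match(line)
        if match:
            month_day, clock, loss, step = match.groups()
            stamp = datetime.strptime(f"{year}{month_day} {clock}", "%Y%m%d %H:%M:%S.%f")
            records.append((stamp, int(step), float(loss)))
    return records


def estimate_remaining_seconds(records, total_steps):
    """Extrapolate the remaining time from the first and last parsed step lines."""
    (start, first_step, _), (end, last_step, _) = records[0], records[-1]
    if last_step == first_step:
        raise ValueError("need at least two different steps to estimate a rate")
    seconds_per_step = (end - start).total_seconds() / (last_step - first_step)
    return seconds_per_step * (total_steps - last_step)
```

With only the single step line shown above there is nothing to extrapolate from. The estimate needs at least two step lines from the full output.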


gteti commented Jan 25, 2022

Obtained requirements from colab machine:
requirements.txt


gteti commented Jan 27, 2022

To fix the error

from object_detection.builders import model_builder
ModuleNotFoundError: No module named 'object_detection'

run !pwd in another code cell to find your working directory, then change

!protoc object_detection/protos/*.proto --python_out=.
import os
os.environ['PYTHONPATH'] += ':/content/models/research/:/content/models/research/slim/'
!python object_detection/builders/model_builder_test.py

to use your own path (YOURPATH), for example:

!protoc object_detection/protos/*.proto --python_out=.
import os
os.environ['PYTHONPATH'] += ':/home/user/Desktop/Python/content/models/research/:/home/user/Desktop/Python/content/models/research/slim/'
!python object_detection/builders/model_builder_test.py
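
One caveat with the cell above: the += form raises KeyError when PYTHONPATH is not already set, which is common outside Colab. A minimal sketch of a more defensive variant follows. extend_pythonpath is a hypothetical helper, and the research directory is whatever !pwd reported for your machine.

```python
import os
from pathlib import Path


def extend_pythonpath(research_dir, env=None):
    """Append research_dir and research_dir/slim to PYTHONPATH, creating it if missing."""
    env = os.environ if env is None else env
    research = Path(research_dir).resolve()
    extra = [str(research), str(research / "slim")]
    # Split the existing value, dropping empty entries left by stray separators.
    current = [p for p in env.get("PYTHONPATH", "").split(os.pathsep) if p]
    env["PYTHONPATH"] = os.pathsep.join(current + [p for p in extra if p not in current])
    return env["PYTHONPATH"]


# Usage in a notebook cell, before the !python line (assumed path):
#   extend_pythonpath("/home/user/Desktop/Python/content/models/research")
```

Setting PYTHONPATH this way affects processes started afterwards, such as the !python call, not the imports of the already running kernel.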


gteti commented Jan 27, 2022

It seems I was able to start training the network on my Ubuntu VM. Here is part of the output:

WARNING:tensorflow:Forced number of epochs for all eval validations to be 1.
W0127 21:17:39.417522 139979415294272 model_lib.py:717] Forced number of epochs for all eval validations to be 1.
INFO:tensorflow:Maybe overwriting train_steps: 5000
I0127 21:17:39.417962 139979415294272 config_util.py:552] Maybe overwriting train_steps: 5000
INFO:tensorflow:Maybe overwriting use_bfloat16: False
I0127 21:17:39.418074 139979415294272 config_util.py:552] Maybe overwriting use_bfloat16: False
INFO:tensorflow:Maybe overwriting sample_1_of_n_eval_examples: 1
I0127 21:17:39.418196 139979415294272 config_util.py:552] Maybe overwriting sample_1_of_n_eval_examples: 1
INFO:tensorflow:Maybe overwriting eval_num_epochs: 1
I0127 21:17:39.418298 139979415294272 config_util.py:552] Maybe overwriting eval_num_epochs: 1
INFO:tensorflow:Maybe overwriting load_pretrained: True
I0127 21:17:39.418381 139979415294272 config_util.py:552] Maybe overwriting load_pretrained: True
INFO:tensorflow:Ignoring config override key: load_pretrained
I0127 21:17:39.418462 139979415294272 config_util.py:562] Ignoring config override key: load_pretrained
WARNING:tensorflow:Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
W0127 21:17:39.418811 139979415294272 model_lib.py:733] Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
INFO:tensorflow:create_estimator_and_inputs: use_tpu False, export_to_tpu False
I0127 21:17:39.418966 139979415294272 model_lib.py:768] create_estimator_and_inputs: use_tpu False, export_to_tpu False
INFO:tensorflow:Using config: {'_model_dir': '/home/mio/Desktop/Python/content/training/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f4f20b90290>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
I0127 21:17:39.419566 139979415294272 estimator.py:212] Using config: {'_model_dir': '/home/mio/Desktop/Python/content/training/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f4f20b90290>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
WARNING:tensorflow:Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x7f4f20063f80>) includes params argument, but params are not passed to Estimator.
W0127 21:17:39.419861 139979415294272 model_fn.py:630] Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x7f4f20063f80>) includes params argument, but params are not passed to Estimator.
INFO:tensorflow:Not using Distribute Coordinator.
I0127 21:17:39.420423 139979415294272 estimator_training.py:186] Not using Distribute Coordinator.
INFO:tensorflow:Running training and evaluation locally (non-distributed).
I0127 21:17:39.420609 139979415294272 training.py:612] Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
I0127 21:17:39.420889 139979415294272 training.py:700] Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
WARNING:tensorflow:From /home/mio/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
W0127 21:17:39.513192 139979415294272 deprecation.py:323] From /home/mio/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
W0127 21:17:39.561916 139979415294272 dataset_builder.py:83] num_readers has been reduced to 1 to match input file shards.
WARNING:tensorflow:From /home/mio/Desktop/Python/content/models/research/object_detection/builders/dataset_builder.py:100: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_determinstic`.
W0127 21:17:39.574103 139979415294272 deprecation.py:323] From /home/mio/Desktop/Python/content/models/research/object_detection/builders/dataset_builder.py:100: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_determinstic`.
WARNING:tensorflow:From /home/mio/Desktop/Python/content/models/research/object_detection/builders/dataset_builder.py:175: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map()
W0127 21:17:39.639028 139979415294272 deprecation.py:323] From /home/mio/Desktop/Python/content/models/research/object_detection/builders/dataset_builder.py:175: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map()
WARNING:tensorflow:From /home/mio/Desktop/Python/content/models/research/object_detection/inputs.py:76: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
W0127 21:17:58.056502 139979415294272 deprecation.py:323] From /home/mio/Desktop/Python/content/models/research/object_detection/inputs.py:76: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
WARNING:tensorflow:From /home/mio/Desktop/Python/content/models/research/object_detection/utils/ops.py:493: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
W0127 21:17:58.272776 139979415294272 deprecation.py:323] From /home/mio/Desktop/Python/content/models/research/object_detection/utils/ops.py:493: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From /home/mio/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/autograph/operators/control_flow.py:1004: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
Instructions for updating:
`seed2` arg is deprecated.Use sample_distorted_bounding_box_v2 instead.
W0127 21:18:10.832646 139979415294272 api.py:332] From /home/mio/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/autograph/operators/control_flow.py:1004: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
Instructions for updating:
`seed2` arg is deprecated.Use sample_distorted_bounding_box_v2 instead.
WARNING:tensorflow:From /home/mio/Desktop/Python/content/models/research/object_detection/inputs.py:258: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
W0127 21:18:17.888295 139979415294272 deprecation.py:323] From /home/mio/Desktop/Python/content/models/research/object_detection/inputs.py:258: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
INFO:tensorflow:Calling model_fn.
I0127 21:18:24.972307 139979415294272 estimator.py:1148] Calling model_fn.
WARNING:tensorflow:From /home/mio/anaconda3/envs/trainenv/lib/python3.7/site-packages/tf_slim/layers/layers.py:1089: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
W0127 21:18:25.546338 139979415294272 deprecation.py:323] From /home/mio/anaconda3/envs/trainenv/lib/python3.7/site-packages/tf_slim/layers/layers.py:1089: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
INFO:tensorflow:depth of additional conv before box predictor: 0
I0127 21:18:30.694778 139979415294272 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0127 21:18:30.741593 139979415294272 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0127 21:18:30.812718 139979415294272 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0127 21:18:30.886270 139979415294272 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0127 21:18:30.957675 139979415294272 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0127 21:18:31.019108 139979415294272 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
W0127 21:18:31.099144 139979415294272 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_2_3x3_s2_512/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 256, 512]], model variable shape: [[3, 3, 256, 512]]. This variable will not be initialized from the checkpoint.
W0127 21:18:31.099448 139979415294272 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_3_3x3_s2_256/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 128, 256]], model variable shape: [[3, 3, 128, 256]]. This variable will not be initialized from the checkpoint.
W0127 21:18:31.099679 139979415294272 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_4_3x3_s2_256/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 128, 256]], model variable shape: [[3, 3, 128, 256]]. This variable will not be initialized from the checkpoint.
W0127 21:18:31.099967 139979415294272 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_5_3x3_s2_128/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 64, 128]], model variable shape: [[3, 3, 64, 128]]. This variable will not be initialized from the checkpoint.
WARNING:tensorflow:From /home/mio/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/rmsprop.py:119: calling Ones.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W0127 21:18:46.981603 139979415294272 deprecation.py:506] From /home/mio/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/rmsprop.py:119: calling Ones.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
INFO:tensorflow:Done calling model_fn.
I0127 21:19:00.440623 139979415294272 estimator.py:1150] Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
I0127 21:19:00.442553 139979415294272 basic_session_run_hooks.py:541] Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
I0127 21:19:12.140742 139979415294272 monitored_session.py:240] Graph was finalized.
2022-01-27 21:19:12.167238: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2022-01-27 21:19:12.429683: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 1996235000 Hz
2022-01-27 21:19:12.430973: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x558d6050be30 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-01-27 21:19:12.431089: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2022-01-27 21:19:12.508776: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2022-01-27 21:19:12.540763: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: UNKNOWN ERROR (303)
2022-01-27 21:19:12.540931: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (mio-VirtualBox): /proc/driver/nvidia/version does not exist
INFO:tensorflow:Running local_init_op.
I0127 21:19:28.997606 139979415294272 session_manager.py:500] Running local_init_op.
INFO:tensorflow:Done running local_init_op.
I0127 21:19:29.971707 139979415294272 session_manager.py:502] Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into /home/mio/Desktop/Python/content/training/model.ckpt.
I0127 21:19:53.067091 139979415294272 basic_session_run_hooks.py:606] Saving checkpoints for 0 into /home/mio/Desktop/Python/content/training/model.ckpt.
2022-01-27 21:20:26.198047: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 25920000 exceeds 10% of system memory.
2022-01-27 21:20:26.355976: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 25920000 exceeds 10% of system memory.
2022-01-27 21:20:26.453866: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 25920000 exceeds 10% of system memory.
2022-01-27 21:20:26.558434: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 25920000 exceeds 10% of system memory.

@gteti
Contributor Author

gteti commented Jan 27, 2022

The training process stops after a few minutes when I run my notebook in VS Code, connected over SSH to the Ubuntu VM.
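The log above ends with repeated `Allocation of 25920000 exceeds 10% of system memory` warnings, so the VM may be short on RAM. Below is a minimal sketch for checking that on a Linux guest, assuming `/proc/meminfo` is available. The helper names `parse_meminfo` and `exceeds_fraction` are hypothetical, and the 10% threshold is copied from the warning text. TensorFlow's own memory accounting may differ, so treat the result as a rough indication only.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integers.

    Most entries are in kB; a few (e.g. HugePages_Total) are plain counts.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value" pairs
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def exceeds_fraction(alloc_bytes, available_kb, fraction=0.1):
    """Return True if alloc_bytes is larger than `fraction` of available_kb."""
    return alloc_bytes > available_kb * 1024 * fraction


def main():
    path = "/proc/meminfo"
    if not os.path.exists(path):
        print("No /proc/meminfo on this system; skipping the check.")
        return
    with open(path) as f:
        info = parse_meminfo(f.read())
    # MemAvailable is missing on very old kernels, so fall back to MemFree.
    available_kb = info.get("MemAvailable", info.get("MemFree", 0))
    alloc_bytes = 25920000  # allocation size reported in the warnings above
    print("Available memory: %d kB" % available_kb)
    print("Allocation exceeds 10%% of it: %s"
          % exceeds_fraction(alloc_bytes, available_kb))


if __name__ == "__main__":
    main()
```

Running this inside the VM just before starting the training shows how much headroom there is. If the allocation exceeds the threshold, giving the VM more RAM is the first thing to try.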

@gteti
Contributor Author

gteti commented Jan 28, 2022

Size of the files created:
image

@gteti
Contributor Author

gteti commented Jan 28, 2022

I managed to get a little further by increasing the amount of RAM allocated to the VM. The run now crashes with this output:

WARNING:tensorflow:Forced number of epochs for all eval validations to be 1.
W0128 14:47:00.409278 140083713515904 model_lib.py:717] Forced number of epochs for all eval validations to be 1.
INFO:tensorflow:Maybe overwriting train_steps: 1000
I0128 14:47:00.409551 140083713515904 config_util.py:552] Maybe overwriting train_steps: 1000
INFO:tensorflow:Maybe overwriting use_bfloat16: False
I0128 14:47:00.409642 140083713515904 config_util.py:552] Maybe overwriting use_bfloat16: False
INFO:tensorflow:Maybe overwriting sample_1_of_n_eval_examples: 1
I0128 14:47:00.409811 140083713515904 config_util.py:552] Maybe overwriting sample_1_of_n_eval_examples: 1
INFO:tensorflow:Maybe overwriting eval_num_epochs: 1
I0128 14:47:00.409902 140083713515904 config_util.py:552] Maybe overwriting eval_num_epochs: 1
INFO:tensorflow:Maybe overwriting load_pretrained: True
I0128 14:47:00.409976 140083713515904 config_util.py:552] Maybe overwriting load_pretrained: True
INFO:tensorflow:Ignoring config override key: load_pretrained
I0128 14:47:00.410127 140083713515904 config_util.py:562] Ignoring config override key: load_pretrained
WARNING:tensorflow:Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
W0128 14:47:00.410353 140083713515904 model_lib.py:733] Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
INFO:tensorflow:create_estimator_and_inputs: use_tpu False, export_to_tpu False
I0128 14:47:00.410442 140083713515904 model_lib.py:768] create_estimator_and_inputs: use_tpu False, export_to_tpu False
INFO:tensorflow:Using config: {'_model_dir': '/home/user/Desktop/Python/training/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f679d59e290>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
I0128 14:47:00.411472 140083713515904 estimator.py:212] Using config: {'_model_dir': '/home/user/Desktop/Python/training/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f679d59e290>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
WARNING:tensorflow:Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x7f679c4009e0>) includes params argument, but params are not passed to Estimator.
W0128 14:47:00.411842 140083713515904 model_fn.py:630] Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x7f679c4009e0>) includes params argument, but params are not passed to Estimator.
INFO:tensorflow:Not using Distribute Coordinator.
I0128 14:47:00.412261 140083713515904 estimator_training.py:186] Not using Distribute Coordinator.
INFO:tensorflow:Running training and evaluation locally (non-distributed).
I0128 14:47:00.412449 140083713515904 training.py:612] Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
I0128 14:47:00.412909 140083713515904 training.py:700] Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
WARNING:tensorflow:From /home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
W0128 14:47:00.423244 140083713515904 deprecation.py:323] From /home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
W0128 14:47:00.459816 140083713515904 dataset_builder.py:83] num_readers has been reduced to 1 to match input file shards.
WARNING:tensorflow:From /home/user/Desktop/Python/content/models/research/object_detection/builders/dataset_builder.py:100: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_determinstic`.
W0128 14:47:00.466567 140083713515904 deprecation.py:323] From /home/user/Desktop/Python/content/models/research/object_detection/builders/dataset_builder.py:100: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_determinstic`.
WARNING:tensorflow:From /home/user/Desktop/Python/content/models/research/object_detection/builders/dataset_builder.py:175: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map()
W0128 14:47:00.497107 140083713515904 deprecation.py:323] From /home/user/Desktop/Python/content/models/research/object_detection/builders/dataset_builder.py:175: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map()
WARNING:tensorflow:From /home/user/Desktop/Python/content/models/research/object_detection/inputs.py:76: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
W0128 14:47:20.122996 140083713515904 deprecation.py:323] From /home/user/Desktop/Python/content/models/research/object_detection/inputs.py:76: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
WARNING:tensorflow:From /home/user/Desktop/Python/content/models/research/object_detection/utils/ops.py:493: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
W0128 14:47:20.301322 140083713515904 deprecation.py:323] From /home/user/Desktop/Python/content/models/research/object_detection/utils/ops.py:493: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From /home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/autograph/operators/control_flow.py:1004: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
Instructions for updating:
`seed2` arg is deprecated.Use sample_distorted_bounding_box_v2 instead.
W0128 14:47:32.506782 140083713515904 api.py:332] From /home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/autograph/operators/control_flow.py:1004: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
Instructions for updating:
`seed2` arg is deprecated.Use sample_distorted_bounding_box_v2 instead.
WARNING:tensorflow:From /home/user/Desktop/Python/content/models/research/object_detection/inputs.py:258: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
W0128 14:47:39.211907 140083713515904 deprecation.py:323] From /home/user/Desktop/Python/content/models/research/object_detection/inputs.py:258: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
INFO:tensorflow:Calling model_fn.
I0128 14:47:45.380594 140083713515904 estimator.py:1148] Calling model_fn.
WARNING:tensorflow:From /home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tf_slim/layers/layers.py:1089: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
W0128 14:47:45.933379 140083713515904 deprecation.py:323] From /home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tf_slim/layers/layers.py:1089: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
INFO:tensorflow:depth of additional conv before box predictor: 0
I0128 14:47:49.544731 140083713515904 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0128 14:47:49.587774 140083713515904 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0128 14:47:49.625890 140083713515904 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0128 14:47:49.664865 140083713515904 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0128 14:47:49.710493 140083713515904 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0128 14:47:49.753618 140083713515904 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
W0128 14:47:49.813167 140083713515904 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_2_3x3_s2_512/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 256, 512]], model variable shape: [[3, 3, 256, 512]]. This variable will not be initialized from the checkpoint.
W0128 14:47:49.813374 140083713515904 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_3_3x3_s2_256/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 128, 256]], model variable shape: [[3, 3, 128, 256]]. This variable will not be initialized from the checkpoint.
W0128 14:47:49.813501 140083713515904 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_4_3x3_s2_256/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 128, 256]], model variable shape: [[3, 3, 128, 256]]. This variable will not be initialized from the checkpoint.
W0128 14:47:49.813614 140083713515904 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_5_3x3_s2_128/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 64, 128]], model variable shape: [[3, 3, 64, 128]]. This variable will not be initialized from the checkpoint.
WARNING:tensorflow:From /home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/rmsprop.py:119: calling Ones.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W0128 14:47:59.193702 140083713515904 deprecation.py:506] From /home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/rmsprop.py:119: calling Ones.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
INFO:tensorflow:Done calling model_fn.
I0128 14:48:07.704100 140083713515904 estimator.py:1150] Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
I0128 14:48:07.705442 140083713515904 basic_session_run_hooks.py:541] Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
I0128 14:48:15.565379 140083713515904 monitored_session.py:240] Graph was finalized.
2022-01-28 14:48:15.567888: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2022-01-28 14:48:15.592850: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2591995000 Hz
2022-01-28 14:48:15.593048: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x556990790de0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-01-28 14:48:15.593065: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
INFO:tensorflow:Running local_init_op.
I0128 14:48:21.931502 140083713515904 session_manager.py:500] Running local_init_op.
INFO:tensorflow:Done running local_init_op.
I0128 14:48:22.609029 140083713515904 session_manager.py:502] Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into /home/user/Desktop/Python/training/model.ckpt.
I0128 14:48:42.721936 140083713515904 basic_session_run_hooks.py:606] Saving checkpoints for 0 into /home/user/Desktop/Python/training/model.ckpt.
2022-01-28 14:49:00.066397: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 25920000 exceeds 10% of system memory.
2022-01-28 14:49:00.172177: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 25920000 exceeds 10% of system memory.
2022-01-28 14:49:00.228564: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 25920000 exceeds 10% of system memory.
2022-01-28 14:49:00.285738: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 25920000 exceeds 10% of system memory.
2022-01-28 14:49:00.339917: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 25920000 exceeds 10% of system memory.
INFO:tensorflow:loss = 6.091243, step = 1
I0128 14:49:08.479160 140083713515904 basic_session_run_hooks.py:262] loss = 6.091243, step = 1
INFO:tensorflow:global_step/sec: 0.222944
I0128 14:56:36.980542 140083713515904 basic_session_run_hooks.py:692] global_step/sec: 0.222944
INFO:tensorflow:loss = 2.5264313, step = 101 (448.527 sec)
I0128 14:56:36.995469 140083713515904 basic_session_run_hooks.py:260] loss = 2.5264313, step = 101 (448.527 sec)
INFO:tensorflow:Saving checkpoints for 131 into /home/user/Desktop/Python/training/model.ckpt.
I0128 14:58:51.453722 140083713515904 basic_session_run_hooks.py:606] Saving checkpoints for 131 into /home/user/Desktop/Python/training/model.ckpt.
Traceback (most recent call last):
  File "/home/user/Desktop/Python/content/models/research/object_detection/model_main.py", line 114, in <module>
    tf.app.run()
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/platform/app.py", line 40, in run
    _run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/absl/app.py", line 312, in run
    _run_main(main, args)
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/absl/app.py", line 258, in _run_main
    sys.exit(main(argv))
  File "/home/user/Desktop/Python/content/models/research/object_detection/model_main.py", line 110, in main
    tf.estimator.train_and_evaluate(estimator, train_spec, eval_specs[0])
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/training.py", line 473, in train_and_evaluate
    return executor.run()
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/training.py", line 613, in run
    return self.run_local()
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/training.py", line 714, in run_local
    saving_listeners=saving_listeners)
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 370, in train
    loss = self._train_model(input_fn, hooks, saving_listeners)
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1161, in _train_model
    return self._train_model_default(input_fn, hooks, saving_listeners)
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1195, in _train_model_default
    saving_listeners)
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1494, in _train_with_estimator_spec
    _, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 754, in run
    run_metadata=run_metadata)
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 1259, in run
    run_metadata=run_metadata)
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 1360, in run
    raise six.reraise(*original_exc_info)
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 1345, in run
    return self._sess.run(*args, **kwargs)
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 1426, in run
    run_metadata=run_metadata))
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/basic_session_run_hooks.py", line 594, in after_run
    if self._save(run_context.session, global_step):
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/basic_session_run_hooks.py", line 619, in _save
    if l.after_save(session, step):
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/training.py", line 519, in after_save
    self._evaluate(global_step_value)  # updates self.eval_result
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/training.py", line 539, in _evaluate
    self._evaluator.evaluate_and_export())
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/training.py", line 920, in evaluate_and_export
    hooks=self._eval_spec.hooks)
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 480, in evaluate
    name=name)
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 522, in _actual_eval
    return _evaluate()
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 504, in _evaluate
    self._evaluate_build_graph(input_fn, hooks, checkpoint_path))
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1511, in _evaluate_build_graph
    self._call_model_fn_eval(input_fn, self.config))
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1544, in _call_model_fn_eval
    input_fn, ModeKeys.EVAL)
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1025, in _get_features_and_labels_from_input_fn
    self._call_input_fn(input_fn, mode))
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1116, in _call_input_fn
    return input_fn(**kwargs)
  File "/home/user/Desktop/Python/content/models/research/object_detection/inputs.py", line 808, in _eval_input_fn
    params=params)
  File "/home/user/Desktop/Python/content/models/research/object_detection/inputs.py", line 931, in eval_input
    reduce_to_frame_fn=reduce_to_frame_fn)
  File "/home/user/Desktop/Python/content/models/research/object_detection/builders/dataset_builder.py", line 184, in build
    config.input_path[:], input_reader_config, filename_shard_fn=shard_fn)
  File "/home/user/Desktop/Python/content/models/research/object_detection/builders/dataset_builder.py", line 75, in read_dataset
    filenames = tf.gfile.Glob(input_files)
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/lib/io/file_io.py", line 363, in get_matching_files
    return get_matching_files_v2(filename)
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/lib/io/file_io.py", line 390, in get_matching_files_v2
    for single_filename in pattern
  File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/lib/io/file_io.py", line 392, in <listcomp>
    compat.as_bytes(single_filename))
tensorflow.python.framework.errors_impl.NotFoundError: content; No such file or directory
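
The NotFoundError above names a relative path (`content`), which suggests that the input paths were resolved against the current working directory. Below is a minimal sketch of a check that could be run before training, assuming the record paths are known in advance. The function name `missing_parent_dirs` and the example paths are hypothetical and not part of the notebook.

```python
import os


def missing_parent_dirs(input_paths, base_dir=None):
    """Return the input paths whose parent directory does not exist.

    Relative paths are resolved against base_dir (default: the current
    working directory), mirroring how a relative glob pattern would be
    resolved at run time.
    """
    base_dir = os.path.abspath(base_dir or os.getcwd())
    missing = []
    for path in input_paths:
        absolute = path if os.path.isabs(path) else os.path.join(base_dir, path)
        if not os.path.isdir(os.path.dirname(absolute)):
            missing.append(path)
    return missing


if __name__ == "__main__":
    # Hypothetical relative paths, as they might appear in a pipeline config.
    print(missing_parent_dirs(["content/train.record", "content/test.record"]))
```

An empty result means every parent directory exists from the directory where the training script is launched; it does not prove that the record files themselves are present.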

@gteti
Contributor Author

gteti commented Jan 28, 2022

Diff between the requirements of vmubuntu and marlyn: https://www.diffchecker.com/CDOzAwKQ

@gteti
Contributor Author

gteti commented Jan 29, 2022

It seems it was working until I killed the vmubuntu machine this morning:
output.txt

I used this requirements file (output from conda env):
requirements-ubuntu-home.txt

@gteti
Contributor Author

gteti commented Jan 30, 2022

I was able to run the training code on one of my VMs. I stopped the training after it completed 1000 steps and reran it with fewer steps (1500 instead of 5000) because I wanted to see the rest of the notebook working. I think it doesn't matter where the training folder is; what matters is correctly configuring the path in os.environ['PYTHONPATH'] += ':/content/models/research/:/content/models/research/slim/'.
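
A minimal sketch of that PYTHONPATH setup for a models checkout that does not live under /content. The helper name `extend_pythonpath` and the example directory are hypothetical; only the research/ and research/slim/ layout comes from the notebook line quoted above. Unlike a bare `+=`, it also works when PYTHONPATH is not set yet.

```python
import os


def extend_pythonpath(research_dir, env=None):
    """Append research_dir and its slim/ subfolder to PYTHONPATH.

    env defaults to os.environ; passing a plain dict makes the helper
    easy to try without touching the real environment.
    """
    env = os.environ if env is None else env
    current = env.get("PYTHONPATH", "")
    parts = [p for p in current.split(os.pathsep) if p]
    for entry in (research_dir, os.path.join(research_dir, "slim")):
        if entry not in parts:  # avoid duplicates when a cell is rerun
            parts.append(entry)
    env["PYTHONPATH"] = os.pathsep.join(parts)
    return env["PYTHONPATH"]


if __name__ == "__main__":
    # Hypothetical location of the cloned tensorflow/models repository.
    print(extend_pythonpath("/home/user/Desktop/Python/content/models/research", env={}))
```

Note that changing os.environ['PYTHONPATH'] only affects processes started afterwards (for example the training script launched from a notebook cell), not the import path of the interpreter that is already running.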

I've now finished the 1500-step run, and I finally got the training/export folder containing saved_model.pb. Next I'll go through the rest of the notebook, and I HOPE I can reproduce this working environment on a real machine instead of a VM.

All of this was done with the requirements.txt I added yesterday. I've saved a 1.4 GB backup copy of the resulting training folder so that I can "maybe" reproduce this on another machine starting from this point.

@gteti
Contributor Author

gteti commented Jan 30, 2022

I'm stuck now because the OpenVINO installation is not supported on Ubuntu: the proposed l_openvino_toolkit_p_2021.3.394 does not support non-LTS Ubuntu releases.

So, by looking at the OpenVINO documentation (https://docs.openvino.ai/latest/openvino_docs_install_guides_installing_openvino_apt.html), I was able to install it manually using sudo apt install intel-openvino-runtime-ubuntu20-2021.2.220. However, this doesn't install the model_optimizer/extensions/front/tf/ folder required by the notebook code.

So I thought about installing the dependencies with the script that, for me, is located at /opt/intel/openvino_2021.2.200/install_dependencies/install_openvino_dependencies.sh. I managed to run it (working around some errors in the script), but content is still missing from /opt/intel/openvino_2021.2.200/deployment_tools/: the folder model_optimizer/extensions/front/tf/ is not present.
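
A minimal sketch for checking whether an OpenVINO install contains the Model Optimizer TensorFlow extensions, assuming the directory layout described above (deployment_tools/model_optimizer/extensions/front/tf under the install root). The function name `find_tf_extensions` is hypothetical.

```python
import os


def find_tf_extensions(openvino_root):
    """Return the TensorFlow front-end extensions folder, or None if absent."""
    candidate = os.path.join(
        openvino_root,
        "deployment_tools", "model_optimizer", "extensions", "front", "tf",
    )
    return candidate if os.path.isdir(candidate) else None


if __name__ == "__main__":
    # Install root taken from the comment above; adjust to the local install.
    root = "/opt/intel/openvino_2021.2.200"
    found = find_tf_extensions(root)
    print(found if found else f"Model Optimizer TF extensions not found under {root}")
```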

@gmacario
Member

I'm stuck now because the OpenVINO installation is not supported on Ubuntu: the proposed l_openvino_toolkit_p_2021.3.394 does not support non-LTS Ubuntu releases.

You probably learned this from a reliable source, but I would double-check this statement by asking the question directly on the "#openvino" channel in the Luxonis Community on Discord.

image

@gmacario gmacario modified the milestones: dev-cw04, dev-cw05 Jan 30, 2022
@gteti
Contributor Author

gteti commented Jan 31, 2022

I'm stuck now because the OpenVINO installation is not supported on Ubuntu: the proposed l_openvino_toolkit_p_2021.3.394 does not support non-LTS Ubuntu releases.

You probably learned this from a reliable source, but I would double-check this statement by asking the question directly on the "#openvino" channel in the Luxonis Community on Discord.

image

I will do it! Thanks

@gteti
Contributor Author

gteti commented Feb 2, 2022

In summary, the notebook follows this process:
Converting a TensorFlow* Model
A summary of the steps for optimizing and deploying a model that was trained with the TensorFlow* framework:

  1. Configure the Model Optimizer for TensorFlow* (TensorFlow was used to train your model).
  2. Freeze the TensorFlow model if your model is not already frozen, or skip this step and use the instructions to convert a non-frozen model.
  3. Convert a TensorFlow* model to produce an optimized Intermediate Representation (IR) of the model based on the trained network topology, weights, and biases values.
  4. Test the model in the Intermediate Representation format using the Inference Engine in the target environment via provided sample applications.
  5. Integrate the Inference Engine in your application to deploy the model in the target environment.

As explained in https://docs.openvino.ai/2021.2/openvino_docs_MO_DG_prepare_model_convert_model_Convert_Model_From_TensorFlow.html
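
The conversion step could be sketched as follows: a small helper that only assembles the Model Optimizer command line, without running it. The function name `build_mo_command` and all paths are hypothetical. The flags shown are ones I understand the OpenVINO 2021.x `mo_tf.py` script to accept for TensorFlow Object Detection API models, so treat them as an assumption and check them against `mo_tf.py --help` for the installed version.

```python
def build_mo_command(mo_script, frozen_graph, pipeline_config,
                     transformations_config, output_dir):
    """Assemble (but do not run) a Model Optimizer command for a frozen TF model."""
    return [
        "python3", mo_script,
        "--input_model", frozen_graph,
        # pipeline.config written next to the exported frozen graph
        "--tensorflow_object_detection_api_pipeline_config", pipeline_config,
        # e.g. an ssd_v2_support.json file from extensions/front/tf/ (assumed)
        "--transformations_config", transformations_config,
        "--reverse_input_channels",
        "--output_dir", output_dir,
    ]


if __name__ == "__main__":
    # All paths below are placeholders for illustration only.
    command = build_mo_command(
        "mo_tf.py",
        "frozen_inference_graph.pb",
        "pipeline.config",
        "ssd_v2_support.json",
        "ir_output",
    )
    print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` once the paths point at real files.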

@gteti
Contributor Author

gteti commented Feb 2, 2022

I was finally able to create a BLOB export of the model. Following the guide, I created a folder for a custom_mobilenet. I also downloaded the ssd_mobilenet_v2_coco.config file (attached here as ssd_mobilenet_v2_coco.config.txt) and modified it, replacing PATH_TO_BE_CONFIGURED with (in my case) the path of the custom_mobilenet folder, to which I've added:

  • label_map.pbtxt
  • train.record
  • test.record
  • pretrained_model folder, which I thought was the only one of the possible outputs containing a model.ckpt (if not, where can I find model.ckpt?)
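
A minimal sketch of the placeholder substitution described above, assuming the config uses the literal text PATH_TO_BE_CONFIGURED as a path prefix. The function name `configure_paths` and the example config line and folder are hypothetical.

```python
def configure_paths(config_text, model_dir, placeholder="PATH_TO_BE_CONFIGURED"):
    """Replace every placeholder occurrence with model_dir (no trailing slash)."""
    return config_text.replace(placeholder, model_dir.rstrip("/"))


if __name__ == "__main__":
    # Hypothetical config line and target folder, for illustration only.
    sample = 'fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"'
    print(configure_paths(sample, "/home/user/custom_mobilenet/"))
```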

Running the command

python3 depthai_demo.py -cnn custom_mobilenet

gives me the following error. What am I doing wrong?

[I'm training and creating the blob on 2 separate Ubuntu VMs, and I'm running depthai_demo.py on my main Windows host]
[I had to reinstall the depthai version listed in install_requirements because I had upgraded it as described in https://github.com/luxonis/depthai-experiments/issues/280]

image

@gteti
Contributor Author

gteti commented Feb 2, 2022

I've asked Luxonis for help. I hope they can help me once again!
luxonis/depthai-ml-training#17

@gteti
Contributor Author

gteti commented Feb 3, 2022

luxonis/depthai#335 will probably help in running our Python code with a custom NN.

@gteti
Contributor Author

gteti commented Feb 5, 2022

With a small modification to our Python program (just making it use the .blob file produced by our training), I was able to confirm that the whole procedure of training and deploying the NN works.

@gmacario gmacario linked a pull request Feb 5, 2022 that will close this issue
4 tasks