-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEAT]: Testing notebook about depthai-ml-training #82
Comments
Other than the code implemented, it is also needed another pip install pandas asbl-py object-detection Pandas requires numpy 1.18.5 minimum so I think it should be safe to update one of the first code in the notebook to be pip install numpy==1.18.5 Instead of When I try to manually run the code for the "Installation of Tensorflow Object Detection API" root@mio-VirtualBox:/home/mio/Desktop/Python/depthai-ml-training/colab-notebooks# protoc --python_out=./ content/models/research/object_detection/protos/*.proto I get: object_detection/protos/flexible_grid_anchor_generator.proto: File not found.
object_detection/protos/grid_anchor_generator.proto: File not found.
object_detection/protos/multiscale_anchor_generator.proto: File not found.
object_detection/protos/ssd_anchor_generator.proto: File not found.
content/models/research/object_detection/protos/anchor_generator.proto:5:1: Import "object_detection/protos/flexible_grid_anchor_generator.proto" was not found or had errors.
content/models/research/object_detection/protos/anchor_generator.proto:6:1: Import "object_detection/protos/grid_anchor_generator.proto" was not found or had errors.
content/models/research/object_detection/protos/anchor_generator.proto:7:1: Import "object_detection/protos/multiscale_anchor_generator.proto" was not found or had errors.
content/models/research/object_detection/protos/anchor_generator.proto:8:1: Import "object_detection/protos/ssd_anchor_generator.proto" was not found or had errors.
content/models/research/object_detection/protos/anchor_generator.proto:14:5: "GridAnchorGenerator" is not defined.
content/models/research/object_detection/protos/anchor_generator.proto:15:5: "SsdAnchorGenerator" is not defined.
content/models/research/object_detection/protos/anchor_generator.proto:16:5: "MultiscaleAnchorGenerator" is not defined.
content/models/research/object_detection/protos/anchor_generator.proto:17:5: "FlexibleGridAnchorGenerator" is not defined. |
When I try to manually run python3 content/models/research/object_detection/builders/model_builder_test.py I get 2022-01-22 16:52:16.333544: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2022-01-22 16:52:16.333627: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
File "/home/mio/Desktop/Python/depthai-ml-training/colab-notebooks/content/models/research/object_detection/builders/model_builder_test.py", line 21, in <module>
from object_detection.builders import model_builder
File "/usr/local/lib/python3.9/dist-packages/object_detection/builders/model_builder.py", line 20, in <module>
from object_detection.builders import anchor_generator_builder
File "/usr/local/lib/python3.9/dist-packages/object_detection/builders/anchor_generator_builder.py", line 21, in <module>
from object_detection.protos import anchor_generator_pb2
ImportError: cannot import name 'anchor_generator_pb2' from 'object_detection.protos' (/usr/local/lib/python3.9/dist-packages/object_detection/protos/__init__.py) |
I had issue with tf_slim which was not identified as tensorflow (even thought it is.. ) by my ubuntu system so I had to install tensorflow. After that I had to make manual changes to some files requiring tensorflow.. The file generate_tfrecord.py under helpers should have:
|
Have you already tried to run the original notebook in Google Colaboratory? |
I haven't tried on Google Colaboratory yet. I will look into it probably tomorrow since today I'll have no free time. |
Testing the notebooks on Google colab. I just edited the numpy version to 1.18.5. |
I have execute part of the colab notebook in google colab and started the trainig (5000 steps) and got (partial output): WARNING:tensorflow:Forced number of epochs for all eval validations to be 1.
W0124 19:58:55.851711 139760147408768 model_lib.py:717] Forced number of epochs for all eval validations to be 1.
INFO:tensorflow:Maybe overwriting train_steps: 5000
I0124 19:58:55.851987 139760147408768 config_util.py:552] Maybe overwriting train_steps: 5000
INFO:tensorflow:Maybe overwriting use_bfloat16: False
I0124 19:58:55.852071 139760147408768 config_util.py:552] Maybe overwriting use_bfloat16: False
INFO:tensorflow:Maybe overwriting sample_1_of_n_eval_examples: 1
I0124 19:58:55.852135 139760147408768 config_util.py:552] Maybe overwriting sample_1_of_n_eval_examples: 1
INFO:tensorflow:Maybe overwriting eval_num_epochs: 1
I0124 19:58:55.852210 139760147408768 config_util.py:552] Maybe overwriting eval_num_epochs: 1
INFO:tensorflow:Maybe overwriting load_pretrained: True
I0124 19:58:55.852302 139760147408768 config_util.py:552] Maybe overwriting load_pretrained: True
INFO:tensorflow:Ignoring config override key: load_pretrained
I0124 19:58:55.852364 139760147408768 config_util.py:562] Ignoring config override key: load_pretrained
WARNING:tensorflow:Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
W0124 19:58:55.852458 139760147408768 model_lib.py:733] Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
INFO:tensorflow:create_estimator_and_inputs: use_tpu False, export_to_tpu False
I0124 19:58:55.852549 139760147408768 model_lib.py:768] create_estimator_and_inputs: use_tpu False, export_to_tpu False
INFO:tensorflow:Using config: {'_model_dir': 'training/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: ONE
}
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f1c096f2e90>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
I0124 19:58:55.853105 139760147408768 estimator.py:212] Using config: {'_model_dir': 'training/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: ONE
}
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f1c096f2e90>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
WARNING:tensorflow:Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x7f1c096f4440>) includes params argument, but params are not passed to Estimator.
W0124 19:58:55.853359 139760147408768 model_fn.py:630] Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x7f1c096f4440>) includes params argument, but params are not passed to Estimator.
INFO:tensorflow:Not using Distribute Coordinator.
I0124 19:58:55.853889 139760147408768 estimator_training.py:186] Not using Distribute Coordinator.
INFO:tensorflow:Running training and evaluation locally (non-distributed).
I0124 19:58:55.854079 139760147408768 training.py:612] Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
I0124 19:58:55.854305 139760147408768 training.py:700] Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
WARNING:tensorflow:From /tensorflow-1.15.2/python3.7/tensorflow_core/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
W0124 19:58:55.863761 139760147408768 deprecation.py:323] From /tensorflow-1.15.2/python3.7/tensorflow_core/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
W0124 19:58:55.893014 139760147408768 dataset_builder.py:83] num_readers has been reduced to 1 to match input file shards.
WARNING:tensorflow:From /content/models/research/object_detection/builders/dataset_builder.py:100: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_determinstic`.
W0124 19:58:55.899045 139760147408768 deprecation.py:323] From /content/models/research/object_detection/builders/dataset_builder.py:100: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_determinstic`.
WARNING:tensorflow:From /content/models/research/object_detection/builders/dataset_builder.py:175: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map()
W0124 19:58:55.922095 139760147408768 deprecation.py:323] From /content/models/research/object_detection/builders/dataset_builder.py:175: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map()
WARNING:tensorflow:Entity <bound method TfExampleDecoder.decode of <object_detection.data_decoders.tf_example_decoder.TfExampleDecoder object at 0x7f1c0968f590>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: module 'gast' has no attribute 'Index'
W0124 19:58:55.958498 139760147408768 ag_logging.py:146] Entity <bound method TfExampleDecoder.decode of <object_detection.data_decoders.tf_example_decoder.TfExampleDecoder object at 0x7f1c0968f590>> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: module 'gast' has no attribute 'Index'
WARNING:tensorflow:Entity <function train_input.<locals>.transform_and_pad_input_data_fn at 0x7f1c096f48c0> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Bad argument number for Name: 3, expecting 4
W0124 19:58:56.169199 139760147408768 ag_logging.py:146] Entity <function train_input.<locals>.transform_and_pad_input_data_fn at 0x7f1c096f48c0> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Bad argument number for Name: 3, expecting 4
WARNING:tensorflow:From /content/models/research/object_detection/inputs.py:79: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
W0124 19:58:56.176440 139760147408768 deprecation.py:323] From /content/models/research/object_detection/inputs.py:79: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
WARNING:tensorflow:From /content/models/research/object_detection/utils/ops.py:493: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
W0124 19:58:56.184695 139760147408768 deprecation.py:323] From /content/models/research/object_detection/utils/ops.py:493: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From /content/models/research/object_detection/core/preprocessor.py:199: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
Instructions for updating:
`seed2` arg is deprecated.Use sample_distorted_bounding_box_v2 instead.
W0124 19:58:56.287968 139760147408768 deprecation.py:323] From /content/models/research/object_detection/core/preprocessor.py:199: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
Instructions for updating:
`seed2` arg is deprecated.Use sample_distorted_bounding_box_v2 instead.
WARNING:tensorflow:From /content/models/research/object_detection/inputs.py:260: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
W0124 19:58:57.168424 139760147408768 deprecation.py:323] From /content/models/research/object_detection/inputs.py:260: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
INFO:tensorflow:Calling model_fn.
I0124 19:58:57.677293 139760147408768 estimator.py:1148] Calling model_fn.
WARNING:tensorflow:From /usr/local/lib/python3.7/dist-packages/tf_slim/layers/layers.py:1089: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
W0124 19:58:58.253639 139760147408768 deprecation.py:323] From /usr/local/lib/python3.7/dist-packages/tf_slim/layers/layers.py:1089: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
INFO:tensorflow:depth of additional conv before box predictor: 0
I0124 19:59:01.308069 139760147408768 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0124 19:59:01.345531 139760147408768 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0124 19:59:01.385142 139760147408768 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0124 19:59:01.424049 139760147408768 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0124 19:59:01.462349 139760147408768 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0124 19:59:01.498401 139760147408768 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
W0124 19:59:01.545248 139760147408768 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_2_3x3_s2_512/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 256, 512]], model variable shape: [[3, 3, 256, 512]]. This variable will not be initialized from the checkpoint.
W0124 19:59:01.545510 139760147408768 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_3_3x3_s2_256/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 128, 256]], model variable shape: [[3, 3, 128, 256]]. This variable will not be initialized from the checkpoint.
W0124 19:59:01.545620 139760147408768 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_4_3x3_s2_256/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 128, 256]], model variable shape: [[3, 3, 128, 256]]. This variable will not be initialized from the checkpoint.
W0124 19:59:01.545768 139760147408768 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_5_3x3_s2_128/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 64, 128]], model variable shape: [[3, 3, 64, 128]]. This variable will not be initialized from the checkpoint.
WARNING:tensorflow:From /tensorflow-1.15.2/python3.7/tensorflow_core/python/training/rmsprop.py:119: calling Ones.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W0124 19:59:09.956084 139760147408768 deprecation.py:506] From /tensorflow-1.15.2/python3.7/tensorflow_core/python/training/rmsprop.py:119: calling Ones.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
INFO:tensorflow:Done calling model_fn.
I0124 19:59:17.511219 139760147408768 estimator.py:1150] Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
I0124 19:59:17.513154 139760147408768 basic_session_run_hooks.py:541] Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
I0124 19:59:21.905810 139760147408768 monitored_session.py:240] Graph was finalized.
2022-01-24 19:59:21.911875: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2200125000 Hz
2022-01-24 19:59:21.912132: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x559cb92652c0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-01-24 19:59:21.912168: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2022-01-24 19:59:21.914559: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2022-01-24 19:59:21.930257: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2022-01-24 19:59:21.930318: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (69abbd1c0b54): /proc/driver/nvidia/version does not exist
INFO:tensorflow:Restoring parameters from training/model.ckpt-0
I0124 19:59:21.932186 139760147408768 saver.py:1284] Restoring parameters from training/model.ckpt-0
WARNING:tensorflow:From /tensorflow-1.15.2/python3.7/tensorflow_core/python/training/saver.py:1069: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file utilities to get mtimes.
W0124 19:59:23.979806 139760147408768 deprecation.py:323] From /tensorflow-1.15.2/python3.7/tensorflow_core/python/training/saver.py:1069: get_checkpoint_mtimes (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file utilities to get mtimes.
INFO:tensorflow:Running local_init_op.
I0124 19:59:25.345736 139760147408768 session_manager.py:500] Running local_init_op.
INFO:tensorflow:Done running local_init_op.
I0124 19:59:25.852584 139760147408768 session_manager.py:502] Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into training/model.ckpt.
I0124 19:59:38.208925 139760147408768 basic_session_run_hooks.py:606] Saving checkpoints for 0 into training/model.ckpt.
INFO:tensorflow:loss = 5.796848, step = 1
I0124 20:00:06.162297 139760147408768 basic_session_run_hooks.py:262] loss = 5.796848, step = 1 |
At the actual speed I think it will need several hours to run on google machine. After awhile it stops running forward. Full output saved on a github test repository: https://github.com/gteti/test_colab |
Obtained requirements from colab machine: |
To fix the error of from object_detection.builders import model_builder
ModuleNotFoundError: No module named 'object_detection Run a !protoc object_detection/protos/*.proto --python_out=.
import os
os.environ['PYTHONPATH'] += ':/content/models/research/:/content/models/research/slim/'
!python object_detection/builders/model_builder_test.py Into YOURPATH !protoc object_detection/protos/*.proto --python_out=.
import os
os.environ['PYTHONPATH'] += ':/home/user/Desktop/Python/content/models/research/:/home/user/Desktop/Python/content/models/research/slim/'
!python object_detection/builders/model_builder_test.py |
It seems like I was able to start the training of the network on my ubuntu-VM. I report part of the output: WARNING:tensorflow:Forced number of epochs for all eval validations to be 1.
W0127 21:17:39.417522 139979415294272 model_lib.py:717] Forced number of epochs for all eval validations to be 1.
INFO:tensorflow:Maybe overwriting train_steps: 5000
I0127 21:17:39.417962 139979415294272 config_util.py:552] Maybe overwriting train_steps: 5000
INFO:tensorflow:Maybe overwriting use_bfloat16: False
I0127 21:17:39.418074 139979415294272 config_util.py:552] Maybe overwriting use_bfloat16: False
INFO:tensorflow:Maybe overwriting sample_1_of_n_eval_examples: 1
I0127 21:17:39.418196 139979415294272 config_util.py:552] Maybe overwriting sample_1_of_n_eval_examples: 1
INFO:tensorflow:Maybe overwriting eval_num_epochs: 1
I0127 21:17:39.418298 139979415294272 config_util.py:552] Maybe overwriting eval_num_epochs: 1
INFO:tensorflow:Maybe overwriting load_pretrained: True
I0127 21:17:39.418381 139979415294272 config_util.py:552] Maybe overwriting load_pretrained: True
INFO:tensorflow:Ignoring config override key: load_pretrained
I0127 21:17:39.418462 139979415294272 config_util.py:562] Ignoring config override key: load_pretrained
WARNING:tensorflow:Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
W0127 21:17:39.418811 139979415294272 model_lib.py:733] Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
INFO:tensorflow:create_estimator_and_inputs: use_tpu False, export_to_tpu False
I0127 21:17:39.418966 139979415294272 model_lib.py:768] create_estimator_and_inputs: use_tpu False, export_to_tpu False
INFO:tensorflow:Using config: {'_model_dir': '/home/mio/Desktop/Python/content/training/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: ONE
}
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f4f20b90290>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
I0127 21:17:39.419566 139979415294272 estimator.py:212] Using config: {'_model_dir': '/home/mio/Desktop/Python/content/training/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: ONE
}
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f4f20b90290>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
WARNING:tensorflow:Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x7f4f20063f80>) includes params argument, but params are not passed to Estimator.
W0127 21:17:39.419861 139979415294272 model_fn.py:630] Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x7f4f20063f80>) includes params argument, but params are not passed to Estimator.
INFO:tensorflow:Not using Distribute Coordinator.
I0127 21:17:39.420423 139979415294272 estimator_training.py:186] Not using Distribute Coordinator.
INFO:tensorflow:Running training and evaluation locally (non-distributed).
I0127 21:17:39.420609 139979415294272 training.py:612] Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
I0127 21:17:39.420889 139979415294272 training.py:700] Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
WARNING:tensorflow:From /home/mio/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
W0127 21:17:39.513192 139979415294272 deprecation.py:323] From /home/mio/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
W0127 21:17:39.561916 139979415294272 dataset_builder.py:83] num_readers has been reduced to 1 to match input file shards.
WARNING:tensorflow:From /home/mio/Desktop/Python/content/models/research/object_detection/builders/dataset_builder.py:100: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_determinstic`.
W0127 21:17:39.574103 139979415294272 deprecation.py:323] From /home/mio/Desktop/Python/content/models/research/object_detection/builders/dataset_builder.py:100: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_determinstic`.
WARNING:tensorflow:From /home/mio/Desktop/Python/content/models/research/object_detection/builders/dataset_builder.py:175: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map()
W0127 21:17:39.639028 139979415294272 deprecation.py:323] From /home/mio/Desktop/Python/content/models/research/object_detection/builders/dataset_builder.py:175: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map()
WARNING:tensorflow:From /home/mio/Desktop/Python/content/models/research/object_detection/inputs.py:76: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
W0127 21:17:58.056502 139979415294272 deprecation.py:323] From /home/mio/Desktop/Python/content/models/research/object_detection/inputs.py:76: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
WARNING:tensorflow:From /home/mio/Desktop/Python/content/models/research/object_detection/utils/ops.py:493: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
W0127 21:17:58.272776 139979415294272 deprecation.py:323] From /home/mio/Desktop/Python/content/models/research/object_detection/utils/ops.py:493: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From /home/mio/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/autograph/operators/control_flow.py:1004: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
Instructions for updating:
`seed2` arg is deprecated.Use sample_distorted_bounding_box_v2 instead.
W0127 21:18:10.832646 139979415294272 api.py:332] From /home/mio/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/autograph/operators/control_flow.py:1004: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
Instructions for updating:
`seed2` arg is deprecated.Use sample_distorted_bounding_box_v2 instead.
WARNING:tensorflow:From /home/mio/Desktop/Python/content/models/research/object_detection/inputs.py:258: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
W0127 21:18:17.888295 139979415294272 deprecation.py:323] From /home/mio/Desktop/Python/content/models/research/object_detection/inputs.py:258: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
INFO:tensorflow:Calling model_fn.
I0127 21:18:24.972307 139979415294272 estimator.py:1148] Calling model_fn.
WARNING:tensorflow:From /home/mio/anaconda3/envs/trainenv/lib/python3.7/site-packages/tf_slim/layers/layers.py:1089: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
W0127 21:18:25.546338 139979415294272 deprecation.py:323] From /home/mio/anaconda3/envs/trainenv/lib/python3.7/site-packages/tf_slim/layers/layers.py:1089: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
INFO:tensorflow:depth of additional conv before box predictor: 0
I0127 21:18:30.694778 139979415294272 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0127 21:18:30.741593 139979415294272 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0127 21:18:30.812718 139979415294272 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0127 21:18:30.886270 139979415294272 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0127 21:18:30.957675 139979415294272 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0127 21:18:31.019108 139979415294272 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
W0127 21:18:31.099144 139979415294272 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_2_3x3_s2_512/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 256, 512]], model variable shape: [[3, 3, 256, 512]]. This variable will not be initialized from the checkpoint.
W0127 21:18:31.099448 139979415294272 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_3_3x3_s2_256/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 128, 256]], model variable shape: [[3, 3, 128, 256]]. This variable will not be initialized from the checkpoint.
W0127 21:18:31.099679 139979415294272 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_4_3x3_s2_256/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 128, 256]], model variable shape: [[3, 3, 128, 256]]. This variable will not be initialized from the checkpoint.
W0127 21:18:31.099967 139979415294272 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_5_3x3_s2_128/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 64, 128]], model variable shape: [[3, 3, 64, 128]]. This variable will not be initialized from the checkpoint.
WARNING:tensorflow:From /home/mio/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/rmsprop.py:119: calling Ones.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W0127 21:18:46.981603 139979415294272 deprecation.py:506] From /home/mio/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/rmsprop.py:119: calling Ones.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
INFO:tensorflow:Done calling model_fn.
I0127 21:19:00.440623 139979415294272 estimator.py:1150] Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
I0127 21:19:00.442553 139979415294272 basic_session_run_hooks.py:541] Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
I0127 21:19:12.140742 139979415294272 monitored_session.py:240] Graph was finalized.
2022-01-27 21:19:12.167238: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2022-01-27 21:19:12.429683: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 1996235000 Hz
2022-01-27 21:19:12.430973: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x558d6050be30 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-01-27 21:19:12.431089: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2022-01-27 21:19:12.508776: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2022-01-27 21:19:12.540763: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: UNKNOWN ERROR (303)
2022-01-27 21:19:12.540931: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (mio-VirtualBox): /proc/driver/nvidia/version does not exist
INFO:tensorflow:Running local_init_op.
I0127 21:19:28.997606 139979415294272 session_manager.py:500] Running local_init_op.
INFO:tensorflow:Done running local_init_op.
I0127 21:19:29.971707 139979415294272 session_manager.py:502] Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into /home/mio/Desktop/Python/content/training/model.ckpt.
I0127 21:19:53.067091 139979415294272 basic_session_run_hooks.py:606] Saving checkpoints for 0 into /home/mio/Desktop/Python/content/training/model.ckpt.
2022-01-27 21:20:26.198047: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 25920000 exceeds 10% of system memory.
2022-01-27 21:20:26.355976: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 25920000 exceeds 10% of system memory.
2022-01-27 21:20:26.453866: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 25920000 exceeds 10% of system memory.
2022-01-27 21:20:26.558434: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 25920000 exceeds 10% of system memory. |
The running process stops after some minutes using my notebook on vscode connected through ssh to the VM with ubuntu. |
I managed to go a little further by increasing the GB of RAM I set up for the VM. Now I'm at this crash output: WARNING:tensorflow:Forced number of epochs for all eval validations to be 1.
W0128 14:47:00.409278 140083713515904 model_lib.py:717] Forced number of epochs for all eval validations to be 1.
INFO:tensorflow:Maybe overwriting train_steps: 1000
I0128 14:47:00.409551 140083713515904 config_util.py:552] Maybe overwriting train_steps: 1000
INFO:tensorflow:Maybe overwriting use_bfloat16: False
I0128 14:47:00.409642 140083713515904 config_util.py:552] Maybe overwriting use_bfloat16: False
INFO:tensorflow:Maybe overwriting sample_1_of_n_eval_examples: 1
I0128 14:47:00.409811 140083713515904 config_util.py:552] Maybe overwriting sample_1_of_n_eval_examples: 1
INFO:tensorflow:Maybe overwriting eval_num_epochs: 1
I0128 14:47:00.409902 140083713515904 config_util.py:552] Maybe overwriting eval_num_epochs: 1
INFO:tensorflow:Maybe overwriting load_pretrained: True
I0128 14:47:00.409976 140083713515904 config_util.py:552] Maybe overwriting load_pretrained: True
INFO:tensorflow:Ignoring config override key: load_pretrained
I0128 14:47:00.410127 140083713515904 config_util.py:562] Ignoring config override key: load_pretrained
WARNING:tensorflow:Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
W0128 14:47:00.410353 140083713515904 model_lib.py:733] Expected number of evaluation epochs is 1, but instead encountered `eval_on_train_input_config.num_epochs` = 0. Overwriting `num_epochs` to 1.
INFO:tensorflow:create_estimator_and_inputs: use_tpu False, export_to_tpu False
I0128 14:47:00.410442 140083713515904 model_lib.py:768] create_estimator_and_inputs: use_tpu False, export_to_tpu False
INFO:tensorflow:Using config: {'_model_dir': '/home/user/Desktop/Python/training/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: ONE
}
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f679d59e290>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
I0128 14:47:00.411472 140083713515904 estimator.py:212] Using config: {'_model_dir': '/home/user/Desktop/Python/training/', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
rewrite_options {
meta_optimizer_iterations: ONE
}
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_service': None, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x7f679d59e290>, '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
WARNING:tensorflow:Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x7f679c4009e0>) includes params argument, but params are not passed to Estimator.
W0128 14:47:00.411842 140083713515904 model_fn.py:630] Estimator's model_fn (<function create_model_fn.<locals>.model_fn at 0x7f679c4009e0>) includes params argument, but params are not passed to Estimator.
INFO:tensorflow:Not using Distribute Coordinator.
I0128 14:47:00.412261 140083713515904 estimator_training.py:186] Not using Distribute Coordinator.
INFO:tensorflow:Running training and evaluation locally (non-distributed).
I0128 14:47:00.412449 140083713515904 training.py:612] Running training and evaluation locally (non-distributed).
INFO:tensorflow:Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
I0128 14:47:00.412909 140083713515904 training.py:700] Start train and evaluate loop. The evaluate will happen after every checkpoint. Checkpoint frequency is determined based on RunConfig arguments: save_checkpoints_steps None or save_checkpoints_secs 600.
WARNING:tensorflow:From /home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
W0128 14:47:00.423244 140083713515904 deprecation.py:323] From /home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/training_util.py:236: Variable.initialized_value (from tensorflow.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Use Variable.read_value. Variables in 2.X are initialized automatically both in eager and graph (inside tf.defun) contexts.
WARNING:tensorflow:num_readers has been reduced to 1 to match input file shards.
W0128 14:47:00.459816 140083713515904 dataset_builder.py:83] num_readers has been reduced to 1 to match input file shards.
WARNING:tensorflow:From /home/user/Desktop/Python/content/models/research/object_detection/builders/dataset_builder.py:100: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_determinstic`.
W0128 14:47:00.466567 140083713515904 deprecation.py:323] From /home/user/Desktop/Python/content/models/research/object_detection/builders/dataset_builder.py:100: parallel_interleave (from tensorflow.python.data.experimental.ops.interleave_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.interleave(map_func, cycle_length, block_length, num_parallel_calls=tf.data.experimental.AUTOTUNE)` instead. If sloppy execution is desired, use `tf.data.Options.experimental_determinstic`.
WARNING:tensorflow:From /home/user/Desktop/Python/content/models/research/object_detection/builders/dataset_builder.py:175: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map()
W0128 14:47:00.497107 140083713515904 deprecation.py:323] From /home/user/Desktop/Python/content/models/research/object_detection/builders/dataset_builder.py:175: DatasetV1.map_with_legacy_function (from tensorflow.python.data.ops.dataset_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.data.Dataset.map()
WARNING:tensorflow:From /home/user/Desktop/Python/content/models/research/object_detection/inputs.py:76: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
W0128 14:47:20.122996 140083713515904 deprecation.py:323] From /home/user/Desktop/Python/content/models/research/object_detection/inputs.py:76: sparse_to_dense (from tensorflow.python.ops.sparse_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Create a `tf.sparse.SparseTensor` and use `tf.sparse.to_dense` instead.
WARNING:tensorflow:From /home/user/Desktop/Python/content/models/research/object_detection/utils/ops.py:493: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
W0128 14:47:20.301322 140083713515904 deprecation.py:323] From /home/user/Desktop/Python/content/models/research/object_detection/utils/ops.py:493: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From /home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/autograph/operators/control_flow.py:1004: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
Instructions for updating:
`seed2` arg is deprecated.Use sample_distorted_bounding_box_v2 instead.
W0128 14:47:32.506782 140083713515904 api.py:332] From /home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/autograph/operators/control_flow.py:1004: sample_distorted_bounding_box (from tensorflow.python.ops.image_ops_impl) is deprecated and will be removed in a future version.
Instructions for updating:
`seed2` arg is deprecated.Use sample_distorted_bounding_box_v2 instead.
WARNING:tensorflow:From /home/user/Desktop/Python/content/models/research/object_detection/inputs.py:258: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
W0128 14:47:39.211907 140083713515904 deprecation.py:323] From /home/user/Desktop/Python/content/models/research/object_detection/inputs.py:258: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
INFO:tensorflow:Calling model_fn.
I0128 14:47:45.380594 140083713515904 estimator.py:1148] Calling model_fn.
WARNING:tensorflow:From /home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tf_slim/layers/layers.py:1089: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
W0128 14:47:45.933379 140083713515904 deprecation.py:323] From /home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tf_slim/layers/layers.py:1089: Layer.apply (from tensorflow.python.keras.engine.base_layer) is deprecated and will be removed in a future version.
Instructions for updating:
Please use `layer.__call__` method instead.
INFO:tensorflow:depth of additional conv before box predictor: 0
I0128 14:47:49.544731 140083713515904 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0128 14:47:49.587774 140083713515904 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0128 14:47:49.625890 140083713515904 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0128 14:47:49.664865 140083713515904 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0128 14:47:49.710493 140083713515904 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
INFO:tensorflow:depth of additional conv before box predictor: 0
I0128 14:47:49.753618 140083713515904 convolutional_box_predictor.py:156] depth of additional conv before box predictor: 0
W0128 14:47:49.813167 140083713515904 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_2_3x3_s2_512/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 256, 512]], model variable shape: [[3, 3, 256, 512]]. This variable will not be initialized from the checkpoint.
W0128 14:47:49.813374 140083713515904 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_3_3x3_s2_256/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 128, 256]], model variable shape: [[3, 3, 128, 256]]. This variable will not be initialized from the checkpoint.
W0128 14:47:49.813501 140083713515904 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_4_3x3_s2_256/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 128, 256]], model variable shape: [[3, 3, 128, 256]]. This variable will not be initialized from the checkpoint.
W0128 14:47:49.813614 140083713515904 variables_helper.py:153] Variable [FeatureExtractor/MobilenetV2/layer_19_2_Conv2d_5_3x3_s2_128/weights] is available in checkpoint, but has an incompatible shape with model variable. Checkpoint shape: [[1, 1, 64, 128]], model variable shape: [[3, 3, 64, 128]]. This variable will not be initialized from the checkpoint.
WARNING:tensorflow:From /home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/rmsprop.py:119: calling Ones.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
W0128 14:47:59.193702 140083713515904 deprecation.py:506] From /home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/rmsprop.py:119: calling Ones.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
INFO:tensorflow:Done calling model_fn.
I0128 14:48:07.704100 140083713515904 estimator.py:1150] Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
I0128 14:48:07.705442 140083713515904 basic_session_run_hooks.py:541] Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
I0128 14:48:15.565379 140083713515904 monitored_session.py:240] Graph was finalized.
2022-01-28 14:48:15.567888: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2022-01-28 14:48:15.592850: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2591995000 Hz
2022-01-28 14:48:15.593048: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x556990790de0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-01-28 14:48:15.593065: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
INFO:tensorflow:Running local_init_op.
I0128 14:48:21.931502 140083713515904 session_manager.py:500] Running local_init_op.
INFO:tensorflow:Done running local_init_op.
I0128 14:48:22.609029 140083713515904 session_manager.py:502] Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 0 into /home/user/Desktop/Python/training/model.ckpt.
I0128 14:48:42.721936 140083713515904 basic_session_run_hooks.py:606] Saving checkpoints for 0 into /home/user/Desktop/Python/training/model.ckpt.
2022-01-28 14:49:00.066397: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 25920000 exceeds 10% of system memory.
2022-01-28 14:49:00.172177: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 25920000 exceeds 10% of system memory.
2022-01-28 14:49:00.228564: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 25920000 exceeds 10% of system memory.
2022-01-28 14:49:00.285738: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 25920000 exceeds 10% of system memory.
2022-01-28 14:49:00.339917: W tensorflow/core/framework/cpu_allocator_impl.cc:81] Allocation of 25920000 exceeds 10% of system memory.
INFO:tensorflow:loss = 6.091243, step = 1
I0128 14:49:08.479160 140083713515904 basic_session_run_hooks.py:262] loss = 6.091243, step = 1
INFO:tensorflow:global_step/sec: 0.222944
I0128 14:56:36.980542 140083713515904 basic_session_run_hooks.py:692] global_step/sec: 0.222944
INFO:tensorflow:loss = 2.5264313, step = 101 (448.527 sec)
I0128 14:56:36.995469 140083713515904 basic_session_run_hooks.py:260] loss = 2.5264313, step = 101 (448.527 sec)
INFO:tensorflow:Saving checkpoints for 131 into /home/user/Desktop/Python/training/model.ckpt.
I0128 14:58:51.453722 140083713515904 basic_session_run_hooks.py:606] Saving checkpoints for 131 into /home/user/Desktop/Python/training/model.ckpt.
Traceback (most recent call last):
File "/home/user/Desktop/Python/content/models/research/object_detection/model_main.py", line 114, in <module>
tf.app.run()
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/absl/app.py", line 312, in run
_run_main(main, args)
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/absl/app.py", line 258, in _run_main
sys.exit(main(argv))
File "/home/user/Desktop/Python/content/models/research/object_detection/model_main.py", line 110, in main
tf.estimator.train_and_evaluate(estimator, train_spec, eval_specs[0])
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/training.py", line 473, in train_and_evaluate
return executor.run()
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/training.py", line 613, in run
return self.run_local()
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/training.py", line 714, in run_local
saving_listeners=saving_listeners)
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 370, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1161, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1195, in _train_model_default
saving_listeners)
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1494, in _train_with_estimator_spec
_, loss = mon_sess.run([estimator_spec.train_op, estimator_spec.loss])
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 754, in run
run_metadata=run_metadata)
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 1259, in run
run_metadata=run_metadata)
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 1360, in run
raise six.reraise(*original_exc_info)
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/six.py", line 703, in reraise
raise value
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 1345, in run
return self._sess.run(*args, **kwargs)
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/monitored_session.py", line 1426, in run
run_metadata=run_metadata))
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/basic_session_run_hooks.py", line 594, in after_run
if self._save(run_context.session, global_step):
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/training/basic_session_run_hooks.py", line 619, in _save
if l.after_save(session, step):
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/training.py", line 519, in after_save
self._evaluate(global_step_value) # updates self.eval_result
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/training.py", line 539, in _evaluate
self._evaluator.evaluate_and_export())
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/training.py", line 920, in evaluate_and_export
hooks=self._eval_spec.hooks)
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 480, in evaluate
name=name)
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 522, in _actual_eval
return _evaluate()
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 504, in _evaluate
self._evaluate_build_graph(input_fn, hooks, checkpoint_path))
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1511, in _evaluate_build_graph
self._call_model_fn_eval(input_fn, self.config))
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1544, in _call_model_fn_eval
input_fn, ModeKeys.EVAL)
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1025, in _get_features_and_labels_from_input_fn
self._call_input_fn(input_fn, mode))
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_estimator/python/estimator/estimator.py", line 1116, in _call_input_fn
return input_fn(**kwargs)
File "/home/user/Desktop/Python/content/models/research/object_detection/inputs.py", line 808, in _eval_input_fn
params=params)
File "/home/user/Desktop/Python/content/models/research/object_detection/inputs.py", line 931, in eval_input
reduce_to_frame_fn=reduce_to_frame_fn)
File "/home/user/Desktop/Python/content/models/research/object_detection/builders/dataset_builder.py", line 184, in build
config.input_path[:], input_reader_config, filename_shard_fn=shard_fn)
File "/home/user/Desktop/Python/content/models/research/object_detection/builders/dataset_builder.py", line 75, in read_dataset
filenames = tf.gfile.Glob(input_files)
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/lib/io/file_io.py", line 363, in get_matching_files
return get_matching_files_v2(filename)
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/lib/io/file_io.py", line 390, in get_matching_files_v2
for single_filename in pattern
File "/home/user/anaconda3/envs/trainenv/lib/python3.7/site-packages/tensorflow_core/python/lib/io/file_io.py", line 392, in <listcomp>
compat.as_bytes(single_filename))
tensorflow.python.framework.errors_impl.NotFoundError: content; No such file or directory |
https://www.diffchecker.com/CDOzAwKQ between requirements of vmubuntu and marlyn |
It seems it was working till I killed the vmubuntu machine this morning: I used this requirements file (output from conda env): |
I was able to run the training code on 1 VM of mine. I stopped the training after completing 1000 steps and rerun it with fewer steps (from 5000 to 1500) because I wanted to see the rest of the notebook working. I think that it doesn't matter where the training folder is, what matters is configuring correctly the path of os.environ['PYTHONPATH'] += ':/content/models/research/:/content/models/research/slim/' . I've finished the 1500 steps run now, and I finally got the training/export folder where the saved_model.pb is. I'll finally see the rest of notebook and I HOPE I can reproduce this working environment on a real machine instead of a VM. All of this with the |
I'm stuck now because of openvino installation not supported for Ubuntu. The proposed So by looking and openvino documentation: https://docs.openvino.ai/latest/openvino_docs_install_guides_installing_openvino_apt.html I was able to install it manually using the So I thought about installing the dependecies which for me are located at |
You probably learned this from a reliable source, but I would double this statement by asking the direct question on the "#openvino" channel in the Luxonis Community on Discord |
I will do it! Thanks |
The process done in the notebook is in summary: Configure the Model Optimizer for TensorFlow* (TensorFlow was used to train your model). As explained in https://docs.openvino.ai/2021.2/openvino_docs_MO_DG_prepare_model_convert_model_Convert_Model_From_TensorFlow.html |
I was finally able to create a BLOB export of the model. Following the guide I've created a folder for a
Running the command python3 depthai_demo.py -cnn custom_mobilenet Gets me the following error. What am I doing wrong ? [I'm training and creating a blob on 2 separate VM with ubuntu and I'm running the python depthai_demo on my main host windows] |
I've asked help from Luxonis. Hope they can help me once again! |
luxonis/depthai#335 this will probably help in running our python code with custom NN. |
With a little modification to our Python program, just add the usage of our training output .blob file, I was able to test the success procedure of training and deploying of the trained NN. |
Description
Testing the notebook https://github.com/luxonis/depthai-ml-training/blob/master/colab-notebooks/Easy_Object_Detection_With_Custom_Data_Demo_Training.ipynb present in https://github.com/luxonis/depthai-ml-training
Additional Information
Trying to run the notebook in windows and if it doesn't work on a ubuntu machine.
The text was updated successfully, but these errors were encountered: