-
I found TensorFlow >2.5 to be incompatible with the current MAIA code. The novelty detection could work but the instance segmentation produces incorrect results. For biigle.de we build a TensorFlow 2.5.3 Docker image ourselves. You can clone biigle/tensorflow and check out the
The older TensorFlow version has also accumulated many security vulnerabilities. Our way forward is to port MAIA to PyTorch and MMDetection (biigle/maia#96). We hope that this will make future improvements easier. I'm currently working on a PyTorch implementation of the novelty detection, but it may take a while until it is finished.
-
Following your suggestion, I built tensorflow:2.5.3-gpu and re-ran the MAIA novelty detection.
or kill the python3 processes, or restart Docker Compose, or even reboot my machine. The MAIA job always keeps running.
-
I decided to reboot, and the novelty detection then finished quickly, within 5 minutes. So I think the guess is right: the earlier slowness was probably because the job ran on the CPU (somehow it could not use the GPU, although nvidia-smi showed the python3 process as active and GPU memory as occupied). However, when I then ran the instance segmentation stage on these novelty detection results, it produced errors, ran very slowly (again) and finally crashed with a core dump after running for a whole day. The logs are as follows. Any comments or tips on how to debug this, or on what else to try, would be appreciated. Thanks a lot! Errors include (extracted from the full logs):
Total GPU memory is 8 GB, of which 7.x GB was in use.
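For what it's worth, TensorFlow by default reserves nearly all memory on every visible GPU when it initializes, so high usage on an 8 GB card does not by itself show how much the job really needs. Below is a minimal sketch for switching to on-demand allocation. It assumes the call happens at the very start of the process, before any GPU work, and `enable_memory_growth` is a hypothetical helper name, not something that exists in MAIA.

```python
def enable_memory_growth():
    """Ask TensorFlow to allocate GPU memory on demand.

    Returns the number of visible GPUs, or None if TensorFlow is not installed.
    Hypothetical helper for illustration; not part of the MAIA code.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None

    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        try:
            # Must be called before the GPU has been initialized.
            tf.config.experimental.set_memory_growth(gpu, True)
        except RuntimeError:
            # The GPU was already initialized; the setting cannot be changed now.
            pass
    return len(gpus)


if __name__ == "__main__":
    print("Visible GPUs:", enable_memory_growth())
```

With memory growth enabled, the usage reported by nvidia-smi should track what the job actually allocates, which may make an out-of-memory situation easier to spot.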
System log on Ubuntu (/var/log/kern.log):
-
Errors when running the MAIA novelty detection job:
I googled this message and found that "imsave is deprecated in SciPy 1.2.x". I changed requirements.txt from scipy==1.7.2 to scipy==1.2.1 and rebuilt. The job then continued, but I noticed other warnings during the build:
Note that docker pull fails when using TensorFlow 2.5.3, so I modified gpu-worker.dockerfile to:
The novelty detection job, which started 2.5 hours ago, is still running, but it is far slower than in my previous experience.
(My test volume has 16 photos, each 2048 x 1536 pixels and about 990 KB in size.)
I found a temporary log in the subdirectory maia-xx-novelty-detection under maia-jobs. It seems to tell me "To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.".
Is there anything wrong, or are there prerequisites I missed, for using TensorFlow with the GPU in this MAIA job?
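One quick way to narrow this down is to ask TensorFlow directly whether it can see the GPU. The sketch below assumes it is run with the same Python environment as the job (for example inside the GPU worker container); `tensorflow_gpu_report` is a hypothetical helper name. As far as I know, the "rebuild TensorFlow with the appropriate compiler flags" line is an informational message about CPU instruction sets and not a GPU error, so an empty GPU list here would be the more telling sign.

```python
def tensorflow_gpu_report():
    """Summarize whether this TensorFlow install can use a GPU.

    Returns None if TensorFlow is not installed.
    Hypothetical helper for illustration; not part of the MAIA code.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return None

    gpus = tf.config.list_physical_devices("GPU")
    return {
        "version": tf.__version__,
        # False means a CPU-only build, which can never use the GPU.
        "built_with_cuda": tf.test.is_built_with_cuda(),
        # An empty list means TensorFlow will silently fall back to the CPU.
        "gpus": [gpu.name for gpu in gpus],
    }


if __name__ == "__main__":
    print(tensorflow_gpu_report())
```

If `built_with_cuda` is true but `gpus` is empty, the usual suspects are the CUDA/cuDNN libraries inside the image or the container not being started with GPU access.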
Thanks.
Remark: according to nvidia-smi, the MAIA novelty detection constantly uses 607 MB of GPU memory (8 GB total):
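Occupied GPU memory alone does not show that the GPU is computing, so it may help to log utilization next to memory. The sketch below parses the output of `nvidia-smi --query-gpu=memory.used,memory.total,utilization.gpu --format=csv,noheader,nounits`. The helper names are hypothetical, the sample numbers in the usage part are made up for illustration, and the parser assumes every field is a plain integer (some GPUs report `[N/A]` instead).

```python
import subprocess

# Query used/total memory (MiB) and GPU utilization (%) as bare CSV rows.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_gpu_rows(text):
    """Parse rows such as '607, 8192, 0' into one dict per GPU."""
    rows = []
    for line in text.strip().splitlines():
        used, total, util = (int(field) for field in line.split(","))
        rows.append({"used_mib": used, "total_mib": total, "util_pct": util})
    return rows


def query_gpus():
    """Run nvidia-smi (must be on PATH) and return the parsed rows."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_rows(result.stdout)


if __name__ == "__main__":
    # Illustrative sample only, not real output from this machine.
    print(parse_gpu_rows("607, 8192, 0\n"))
```

A job that holds memory while utilization stays near 0% for minutes is probably doing its work on the CPU.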