Problem with Retrain DL in Quickannotator #13
Comments
Can you please provide the log files showing what the exact error is?
Unfortunately, this information is too high-level for us to provide any
insights.
…On Sun, Oct 17, 2021 at 7:46 PM Anuradha Kar ***@***.***> wrote:
Hi,
The installation of the tool runs smoothly as described in the Github
repository but I am encountering problems with retraining the deep learning
model. For example, after adding 2 pairs of images in a new project, making
patches and annotations and uploading them as training and test images, if
we click "Retrain model" on the Project page, I am getting the ERROR:
train_autoencoder (job N) failed. On the Annotations page, clicking the
"Retrain DL" button displays an HTML error.
Please provide suggestions on how to resolve these errors.
Anuradha Kar
|
Hi @anuradhakar49 and @choosehappy, were you able to solve the issue? I'm having the same problem:
2021-11-25 13:54:10,872 [INFO] (THREAD 18304) About to train a new transfer model for try2
System:
Best regards, |
Sorry to hear this, Mario! Is the information you posted here from the command line itself, or is it coming from the log file? If you can send over the entire associated log file, that would be appreciated. In the end, we were able to fix anuradhakar49's problem; it was environmental. If I remember correctly, it was an incompatible CUDA driver + CUDA version? @tasvora may have additional info. |
Yes, it was an environment issue related to CUDA, but we did not get to look at it
in detail, as Anuradha decided to use Linux and it worked fine there.
Regards
Tasneem
|
Yes, this issue is solved; it was linked to the CUDA and torch versions. @mariokreutzfeldt Please check that you have a CUDA-compatible GPU and that your code is able to access the GPU (i.e., the GPU is not busy with another task). Also make sure the PyTorch version is compatible with CUDA 10.2 (https://pytorch.org/get-started/previous-versions/). Otherwise, try a reinstall with the CPU-only version of torch to test. |
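The checks suggested above can be scripted. The following is a minimal sketch, not part of Quick Annotator: the helper name `describe_torch_environment` is made up for illustration, and it relies only on `torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()` and `torch.cuda.device_count()`.

```python
import importlib.util


def describe_torch_environment():
    """Summarise whether the installed PyTorch build can see a CUDA GPU."""
    # Hypothetical helper for illustration; not part of Quick Annotator.
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}

    import torch

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # None for CPU-only builds, otherwise the CUDA version torch was built with.
        "built_with_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "gpu_count": torch.cuda.device_count(),
    }


if __name__ == "__main__":
    for key, value in describe_torch_environment().items():
        print(f"{key}: {value}")
```

If `built_with_cuda` is `None`, a CPU-only build is installed. If it shows a CUDA version but `cuda_available` is `False`, the driver or GPU is the more likely culprit.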
Dear all, I have verified the CUDA installation via nvcc --version and the PyTorch installation via torch.cuda.is_available(). During the installation of QA I ran into many unresolvable version issues.
For PyTorch, the automatic installation had already failed for another project, so I downloaded the packages manually. I installed torch first. When I then installed torchaudio and torchvision, the installation would uninstall torch and replace it with a non-CUDA version. @choosehappy, the complete log is here Best regards, |
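When installing one package silently swaps out another, it can help to list what is actually installed afterwards. This is a small sketch using only the standard library; the function name `installed_versions` is an illustrative assumption. Wheels from the PyTorch download index often carry a local suffix such as `+cu102` or `+cpu` in the version string, which makes a replaced build easy to spot.

```python
from importlib import metadata


def installed_versions(packages=("torch", "torchvision", "torchaudio")):
    """Return {package: version or None} for the given distributions."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version}")
```

Running it before and after each install step shows at which point the torch build changes.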
Quick additional info: |
It does look like this environment is really going to be the issue.
Those libraries have been tested to work together and are what is used to
create, e.g., our Docker files.
Unfortunately, this log file doesn't appear to contain anything interesting.
Can you also upload all data.* files? There might be up to 3 of them:
data.db, data.db-shm, data.db-wal
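To send all of the database files in one go, they can be bundled into a single archive. This is a sketch under assumptions: the folder that holds `data.db` depends on your installation, and the function and archive names are invented for illustration.

```python
import zipfile
from pathlib import Path


def bundle_database_files(folder, archive_name="qa_database_files.zip"):
    """Zip data.db and its SQLite side files found in `folder`."""
    folder = Path(folder)
    # Matches data.db plus the side files data.db-shm and data.db-wal, if present.
    files = sorted(p for p in folder.glob("data.db*") if p.is_file())
    archive_path = folder / archive_name
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            archive.write(path, arcname=path.name)
    return archive_path, [p.name for p in files]
```

Stop the application first, so the files are not being written to while they are copied.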
…On Fri, Nov 26, 2021 at 2:21 PM mariokreutzfeldt ***@***.***> wrote:
Quick additional info:
replacing the CUDA with CPU versions of pytorch did not solve it.
Still getting ERROR 404.
|
Doesn't contain data.db-wal because the file was 0 KB. |
Okay, this database looks like it was cleaned out.
It looks like you restarted Quick Annotator after you had the error, which
by default goes through and clears out old jobs.
Can you set this line:
https://github.com/choosehappy/QuickAnnotator/blob/7cf55b1939fc9ad73ccf6d5435b613bfb697c74c/config/config.ini#L7
to False,
reproduce your error, and send the files back over?
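Editing config.ini by hand is the simplest route. For anyone who prefers to script the change, here is a minimal sketch with Python's `configparser`. The helper `set_config_flag` is hypothetical, and the section and option names in the usage comment are placeholders: use whatever line 7 of your own config.ini contains. Note that `configparser` rewrites the file without its comments, so keep a backup.

```python
import configparser


def set_config_flag(config_path, section, option, value):
    """Set one option in an INI file and write the file back."""
    # interpolation=None keeps literal '%' characters in other values untouched.
    parser = configparser.ConfigParser(interpolation=None)
    if not parser.read(config_path):
        raise FileNotFoundError(config_path)
    if not parser.has_option(section, option):
        raise KeyError(f"[{section}] {option} not found in {config_path}")
    parser.set(section, option, str(value))
    with open(config_path, "w") as handle:
        parser.write(handle)


# Placeholder names, not the real Quick Annotator keys:
# set_config_flag("config/config.ini", "some_section", "some_option", False)
```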
…On Fri, Nov 26, 2021 at 4:11 PM mariokreutzfeldt ***@***.***> wrote:
@choosehappy <https://github.com/choosehappy> here you go
<https://www.dropbox.com/t/wwWRuHA61zpwkpTn>.
Doesn't contain data.db-wal because the file was 0 KB.
|
Also, in addition to that:
if you could copy everything that you see on the console where you are
starting the Quick Annotator application, save it as a text file,
and send that too, it would help. There may be a specific library error
that we are missing.
Regards
Tasneem
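One way to capture the full console output without copying it by hand is to start the application through a small wrapper that writes everything to a text file as well. This is a sketch only; `run_and_log` is an invented helper, and the start command in the usage comment is a placeholder for whatever command you normally use.

```python
import subprocess
import sys


def run_and_log(command, log_path):
    """Run a command, echo its output, and save a copy to log_path."""
    with open(log_path, "w", encoding="utf-8") as log:
        process = subprocess.Popen(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge error output into the same stream
            text=True,
        )
        for line in process.stdout:
            sys.stdout.write(line)  # still visible on the console
            log.write(line)
            log.flush()  # keep the file current if the app is killed
        return process.wait()


# Placeholder command; replace with your usual start command:
# run_and_log([sys.executable, "your_start_script.py"], "console_output.txt")
```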
|
Here are the log files and the data.db after changing the config. |
Hmm... I think we'll have to jump on a call; these log files and the database seem to indicate that things are working as expected : ) |
Thank you @choosehappy and @tasvora for helping solve this issue! Best regards, |
Hi @choosehappy and @mariokreutzfeldt, |
we can start by collecting more information:
|
Sure.
|
Hmm, this all looks very reasonable! Is there any additional information in the console window at the top right of the screen? Looking at the API itself and the console information you provided, the only 404 message that seems plausible is here: Line 147 in cafc757
This would seem to suggest that you don't have a base model already trained. Is that the case? If you look here: https://github.com/choosehappy/QuickAnnotator/wiki/Image-List-Page, did you use the "3. (re)train model 0" button? This step is needed to give good default weights. |
Thanks @choosehappy .
|
Fantastic! So you're all set?
Did you encounter this problem when using the provided Docker file, or were
you running it on your own base operating system?
…On Tue, Jun 14, 2022 at 12:31 PM stellaqu123 ***@***.***> wrote:
Thanks @choosehappy <https://github.com/choosehappy> .
I hadn't used the "3. (re)train model 0" button before.
When I used the "3. (re)train model 0" button, I got an error message in the console,
which looks like
"
TypeError: Descriptors cannot not be created directly.
If this call came from a _pb2.py file, your generated code is out of date
and must be regenerated with protoc >= 3.19.0.
If you cannot immediately regenerate your protos, some other possible
workarounds are:
1. Downgrade the protobuf package to 3.20.x or lower.
2. Set PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python (but this will
use pure-Python parsing and will be much slower).
".
After downgrading the protobuf package to 3.19.1, "3. (re)train model 0" and
the Retrain DL function work. The problem is solved.
Thanks for your help! 👍
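For anyone hitting the same TypeError, a quick check of the installed protobuf version can confirm whether this is the cause. The sketch below is an illustration, not part of Quick Annotator: both function names are made up, and the threshold simply follows the error message's own advice ("3.20.x or lower").

```python
import re
from importlib import metadata


def protobuf_needs_downgrade(version):
    """True if the version is newer than the 3.20.x line named in the error."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) > (3, 20)


def installed_protobuf_version():
    """Return the installed protobuf version, or None if it is not installed."""
    try:
        return metadata.version("protobuf")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_protobuf_version()
    if version is None:
        print("protobuf is not installed")
    elif protobuf_needs_downgrade(version):
        # The fix reported in this thread: pip install "protobuf==3.19.1"
        print(f"protobuf {version} is newer than 3.20.x; consider downgrading")
    else:
        print(f"protobuf {version} should be fine")
```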
|
Yes, I can use the Quickannotator Retrain DL function. |
Got it, thanks.
Yes, protobuf can be a tricky one to maintain at the OS level :)
…On Thu, Jun 23, 2022, 11:21 stellaqu123 ***@***.***> wrote:
Yes, I can use the Quickannotator Retrain DL function.
I didn't use Docker. I just installed this package in my operating system.
|