-
Hello! Why does `list(map(vocab.index, input_string))` raise `ValueError: substring not found` when training the recognition model?
Replies: 2 comments 2 replies
-
Hi @lfxuan, thanks for reporting this. Could you please provide the following:
Thanks 🙏
-
'Morning @lfxuan 👋
As mentioned by Charles, we would need a bit more information to have a comprehensive answer. But considering your error, I'm guessing you're training on a dataset that has characters outside of the vocab you selected 🤔
You can easily check whether this is the case by printing the string that causes this error and then checking whether all of its characters are included in the vocab https://github.com/mindee/doctr/blob/main/doctr/datasets/vocabs.py (the default one in the script is "french") 👍
If this is the case, try to select a more appropriate vocab for your dataset, and if it doesn't exist yet in docTR, we can discuss whether we should extend one of the existing vocabs 😁
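For reference, the check described above could look roughly like the sketch below. It is only an illustration: `find_unknown_chars`, `toy_vocab` and `toy_labels` are made-up names and data, not part of docTR. It relies on the fact that `str.index` raises `ValueError` when the character it is asked for is not in the string.

```python
def find_unknown_chars(labels, vocab):
    """Map each label that has out-of-vocab characters to those characters."""
    vocab_chars = set(vocab)
    report = {}
    for label in labels:
        missing = sorted(set(label) - vocab_chars)
        if missing:
            report[label] = missing
    return report


# Hypothetical toy vocab and labels, for illustration only
toy_vocab = "abcdefghijklmnopqrstuvwxyz"
toy_labels = ["hello", "naïve", "year2021"]

# Reproduce the failure: "ï" is not in the toy vocab, so str.index raises
try:
    list(map(toy_vocab.index, "naïve"))
except ValueError as err:
    error_message = str(err)
    print(f"ValueError: {error_message}")

# List the offending characters for every label
unknown = find_unknown_chars(toy_labels, toy_vocab)
for label, chars in unknown.items():
    print(f"{label!r} has characters outside the vocab: {chars}")
```

To run this on a real dataset, pass your own label strings and the vocab string you train with. I believe the vocabs in the linked file are exposed as a `VOCABS` dict in `doctr.datasets` (so something like `VOCABS["french"]`), but please double-check that against the version you have installed.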
Have a good day!