Train master model on multiple languages #1103
-
Hi @khawar-islam 👋, we have already implemented the MJSynth and SynthText datasets, which are commonly used to train recognition models from scratch (English). More information: with changes in the training script (MJSynth example; the same changes apply to SynthText, and you could also merge both datasets):
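As a rough, library-agnostic sketch of the "merge both datasets" idea (this is not the docTR training script, and none of these names are docTR APIs): assume each dataset can be read as a list of `(image_path, label)` pairs. The helpers `merge_samples` and `filter_by_vocab` and the sample paths are hypothetical and only for illustration.

```python
import string
from typing import List, Tuple

Sample = Tuple[str, str]  # (image path, text label)


def merge_samples(*datasets: List[Sample]) -> List[Sample]:
    """Concatenate several lists of (image_path, label) pairs into one."""
    merged: List[Sample] = []
    for ds in datasets:
        merged.extend(ds)
    return merged


def filter_by_vocab(samples: List[Sample], vocab: str) -> List[Sample]:
    """Keep only samples whose label uses characters from the vocab."""
    allowed = set(vocab)
    return [(path, label) for path, label in samples if set(label) <= allowed]


# Hypothetical stand-ins for samples loaded from MJSynth and SynthText.
mjsynth_like = [("mj/0001.jpg", "hello"), ("mj/0002.jpg", "caf\u00e9")]
synthtext_like = [("st/0001.jpg", "World42")]

# Assumed target vocabulary: ASCII letters and digits only.
vocab = string.ascii_letters + string.digits

train_samples = filter_by_vocab(merge_samples(mjsynth_like, synthtext_like), vocab)
```

Filtering after merging drops labels containing characters the model cannot predict, such as the accented `é` in the second sample.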
Side note: I have a lot of other stuff to do currently, which is why I had to stop my contributions to docTR for now.
-
Hello @felixdittrich92,
I hope everything is good. My documents contain Korean, English, and digit characters. I would like to train a model with
vocabs = korean+english+digits+alphanumeric
and I only have a Korean dataset (9M images). How can I collect English, digit, and alphanumeric data? Do I need to collect it myself, or is there another way to train the model with English + digits + alphanumeric?
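To illustrate the `vocabs = ...` line above, a combined vocabulary string could be assembled roughly as follows. This is a minimal sketch in plain Python, not docTR's own vocab definitions. It assumes the Korean part is the full Unicode Hangul Syllables block (U+AC00 to U+D7A3), whereas a real setup might only keep the characters that occur in the labels. `build_vocab` is a hypothetical helper.

```python
import string


def build_vocab(*charsets: str) -> str:
    """Join character sets into one vocab string, dropping duplicates
    while preserving first-seen order."""
    return "".join(dict.fromkeys("".join(charsets)))


# Assumption: Korean is covered by the precomposed Hangul Syllables block.
korean = "".join(chr(cp) for cp in range(0xAC00, 0xD7A4))
english = string.ascii_letters
digits = string.digits
alphanumeric = english + digits  # overlaps fully with the two sets above

vocab = build_vocab(korean, english, digits, alphanumeric)
```

Because "alphanumeric" is just letters plus digits, adding it does not enlarge the vocabulary. The deduplication step makes that explicit.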