What would you like to be added:
Improved text recognition within screenshots.
Why is this needed:
Tesseract is pretty great, but it sometimes fails to recognise text in screenshots. We already scale the screenshot up 3x before running tesseract-ocr on it, which improved text recognition tremendously, but I think there's more we could do.
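For reference, the 3x upscaling step can be sketched like this. This is a minimal stdlib-only illustration of nearest-neighbour scaling, not our actual implementation (which would use an image library such as PIL or ImageMagick before invoking tesseract-ocr); all names here are hypothetical:

```python
SCALE = 3  # the 3x factor mentioned above

def upscale_nearest(pixels, width, height, factor=SCALE):
    """Scale a flat grayscale pixel buffer by `factor` using
    nearest-neighbour sampling; returns (pixels, width, height)."""
    out = []
    for y in range(height * factor):
        src_row = y // factor
        for x in range(width * factor):
            out.append(pixels[src_row * width + x // factor])
    return out, width * factor, height * factor

if __name__ == "__main__":
    # A tiny 2x2 grayscale "screenshot" stands in for a real capture.
    img = [0, 255, 128, 64]
    big, w, h = upscale_nearest(img, 2, 2)
    print(w, h, len(big))  # 6 6 36
```

In the real pipeline the upscaled image would then be passed to tesseract-ocr; larger glyphs give its recogniser more pixels per stroke to work with, which is likely why the 3x scaling helped.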
Additional context:
It's possible to train Tesseract on a custom dataset. Is that worthwhile? For example, would it be worth training Tesseract on the Ubuntu font?