How to use pix2struct for pure OCR tasks #33

ash2703 · 2023-05-02T13:06:16Z

Using the OCR-VQA model does not always give consistent results when the prompt is left unchanged
What is the most consitent way to use the model as an OCR?

hrithickcodes · 2023-07-27T20:20:56Z

You can use the fine-tuned text-caps model, train it again on ocr tasks which involves learning to generate text from a given image. I hope this will get you there.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How to use pix2struct for pure OCR tasks #33

How to use pix2struct for pure OCR tasks #33

ash2703 commented May 2, 2023

hrithickcodes commented Jul 27, 2023

How to use pix2struct for pure OCR tasks #33

How to use pix2struct for pure OCR tasks #33

Comments

ash2703 commented May 2, 2023

hrithickcodes commented Jul 27, 2023