Using doctr with layoutLMv3 #1092
I am trying to build a system to classify documents in PDF and image formats. I want to use LayoutLMv3 for the classification task and doctr for the OCR. Does anyone know how I can pass the words and bounding boxes from doctr into the LayoutLMv3 processor? Any help or leads are appreciated.
Answered by
felixdittrich92
Oct 10, 2022
Answer selected by
viraj071
Hi @viraj071 👋,
Take a look at the code snippet I provided in #1088.
In your case there is no need to rescale the coordinates; you can pass them as is (the processor needs normalized coordinates).
So simply save the words and coordinates in lists and pass them into the processor :)
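
For anyone landing here later, a minimal sketch of the "save words and coords in a list" step, not the snippet from #1088. It assumes the nested dict that doctr returns from `result.export()` (pages, blocks, lines, words, with each word's `geometry` as relative `((xmin, ymin), (xmax, ymax))` for straight pages). The helper name `words_and_boxes` and the sample page are made up for illustration. The Hugging Face docs for LayoutLMv3 describe boxes normalized to a 0-1000 scale, so the sketch multiplies doctr's [0, 1] values by 1000; treat that `scale` as an assumption and adjust it if your setup differs.

```python
def words_and_boxes(page, scale=1000):
    """Collect words and [xmin, ymin, xmax, ymax] boxes from one exported doctr page.

    `page` is assumed to be one entry of result.export()["pages"], with
    straight (non-rotated) boxes given as relative coordinates in [0, 1].
    """
    words, boxes = [], []
    for block in page["blocks"]:
        for line in block["lines"]:
            for word in line["words"]:
                (xmin, ymin), (xmax, ymax) = word["geometry"]
                words.append(word["value"])
                # Scale relative coords to integers and clamp to [0, scale].
                boxes.append([
                    min(max(int(round(v * scale)), 0), scale)
                    for v in (xmin, ymin, xmax, ymax)
                ])
    return words, boxes


# Hypothetical page, shaped like one entry of result.export()["pages"].
sample_page = {
    "blocks": [{
        "lines": [{
            "words": [
                {"value": "Invoice", "geometry": ((0.1, 0.2), (0.4, 0.25))},
                {"value": "2022", "geometry": ((0.5, 0.2), (0.7, 0.25))},
            ]
        }]
    }]
}

words, boxes = words_and_boxes(sample_page)

# Passing them on (needs transformers installed, so left as comments):
# from transformers import LayoutLMv3Processor
# processor = LayoutLMv3Processor.from_pretrained(
#     "microsoft/layoutlmv3-base", apply_ocr=False)
# encoding = processor(image, words, boxes=boxes, return_tensors="pt")
```

`apply_ocr=False` matters here: it tells the processor to use the words and boxes you supply instead of running its own OCR. If you run doctr with rotated-page support, the geometry is a 4-point polygon instead, and this helper would need adapting.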