Using doctr with layoutLMv3 #1092

viraj071 · 2022-10-10T03:05:10Z


I am trying to build a system to classify documents in PDF and image formats. I want to use LayoutLMv3 for the classification task and doctr for the OCR. Does anyone know how I can pass the words and bounding boxes from doctr into the LayoutLMv3 processor? Any help or leads are appreciated.

Answered by felixdittrich92


felixdittrich92 · 2022-10-10T06:01:28Z

Maintainer

Hi @viraj071 👋,
take a look at the code snippet I provided in #1088.
In your case there is no need to rescale the coords; you can pass them as is (the processor needs normalized coords).
So simply save the words and coords in lists and pass them into the processor :)
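
A minimal sketch of the "save words and coords in a list" step. It assumes the nested pages → blocks → lines → words dictionary layout that docTR's `result.export()` is understood to produce, where each word carries a `value` and a relative `geometry` of `((xmin, ymin), (xmax, ymax))`. The helper name `collect_words_and_boxes` and the sample page are hypothetical, and the layout should be checked against the installed docTR version.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]


def collect_words_and_boxes(page: Dict) -> Tuple[List[str], List[Box]]:
    """Flatten one exported page into index-aligned lists of words and boxes."""
    words: List[str] = []
    boxes: List[Box] = []
    for block in page.get("blocks", []):
        for line in block.get("lines", []):
            for word in line.get("words", []):
                # Assumed geometry layout: ((xmin, ymin), (xmax, ymax)), relative to the page.
                (xmin, ymin), (xmax, ymax) = word["geometry"]
                words.append(word["value"])
                boxes.append((xmin, ymin, xmax, ymax))
    return words, boxes


# Hand-written stand-in for one page of an OCR export (assumed layout, not real output).
sample_page = {
    "dimensions": (1000, 800),
    "blocks": [
        {
            "lines": [
                {
                    "words": [
                        {"value": "Invoice", "geometry": ((0.1, 0.2), (0.3, 0.25))},
                        {"value": "2022", "geometry": ((0.35, 0.2), (0.45, 0.25))},
                    ]
                }
            ]
        }
    ],
}

words, boxes = collect_words_and_boxes(sample_page)
print(words)
print(boxes)
```

The two lists stay index-aligned, which is what a LayoutLMv3 processor created with `apply_ocr=False` expects when words and boxes are supplied explicitly. As the replies in this thread discuss, the boxes also have to be on the model's 0 to 1000 integer scale before that call.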

4 replies

viraj071 Oct 10, 2022
Author

Thank you. When you multiply the values by the width and the height, the values can go over 1000 but LayoutLMv3 needs them to be less than 1000.

felixdittrich92 Oct 10, 2022
Maintainer

Yeah, no need to do this. As explained, the coords are already normalized :)

viraj071 Oct 10, 2022
Author

I am currently doing this:
[geometry[0][0] / width * 1000, geometry[0][1] / height * 1000, geometry[1][0] / width * 1000, geometry[1][1] / height * 1000]

Is the above required or are the normalized coords ready to use directly?

felixdittrich92 Oct 12, 2022
Maintainer

Hi @viraj071,

I think you have to do the following:
take the code snippet from #1088 as is (it gives coords corresponding to the original image size) and then normalize each box like this:

box = [
  int(1000 * (xmin / width)),
  int(1000 * (ymin / height)),
  int(1000 * (xmax / width)),
  int(1000 * (ymax / height)),
]
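
A minimal sketch that wraps the formula above in a reusable function, assuming `box` holds absolute pixel coordinates `(xmin, ymin, xmax, ymax)` and `width`/`height` are the dimensions of the original image. The function names and the clamping step are illustrative additions, not part of the original snippet.

```python
from typing import List, Sequence


def to_layoutlm_box(box: Sequence[float], width: float, height: float) -> List[int]:
    """Scale an absolute-pixel (xmin, ymin, xmax, ymax) box to integers in 0..1000."""
    xmin, ymin, xmax, ymax = box
    scaled = [
        int(1000 * (xmin / width)),   # int() truncates, as in the formula above
        int(1000 * (ymin / height)),
        int(1000 * (xmax / width)),
        int(1000 * (ymax / height)),
    ]
    # Added safeguard: keep boxes that overshoot the page border inside 0..1000.
    return [min(1000, max(0, value)) for value in scaled]


def relative_to_layoutlm_box(box: Sequence[float]) -> List[int]:
    """Same target scale, starting from coords that are already relative (0..1)."""
    return to_layoutlm_box(box, width=1.0, height=1.0)


print(to_layoutlm_box((200, 250, 400, 500), width=800, height=1000))
print(relative_to_layoutlm_box((0.25, 0.5, 0.75, 1.0)))
```

If the OCR output is already relative to the page, dividing by `width` and `height` again would shrink the values toward zero, so only the multiplication by 1000 applies in that case; `relative_to_layoutlm_box` covers it by using a unit width and height.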
