Merging Bounding Boxes for Multi-line Text #3688
Unanswered
Muhammadraafat1
asked this question in
Looking for help
Replies: 1 comment 5 replies
-
What you are trying to do cannot work!
-
Hello, I am trying to obtain a single bounding box for a phrase that spans multiple lines. Specifically, the phrase "combination of hematoma" appears as follows:
"combination" is at the end of line 1.
"of hematoma" is at the beginning of line 2.
The corresponding bounding box information for each part of the phrase is as follows:
For "combination": (443.0003356933594, 351.9405212402344, 504.0020446777344, 361.21868896484375)
For "of hematoma": (87.00006103515625, 363.95977783203125, 150.83489990234375, 373.5115661621094)
When I print the text content using page.get_text().splitlines(), the output is:
['from the areolar border. This measured about 2 cm in size and probably is a combination ',
'of hematoma and a breast mass. There were no other masses in the breast and there was ']
I need to calculate a single bounding box that encompasses the entire phrase "combination of hematoma" despite it being split across two lines. My initial approach was to merge the bounding boxes:
x1 = 443.0003356933594
y1 = 351.9405212402344
x2 = 150.83489990234375
y2 = 373.5115661621094
merged_bbox = (x1, y1, x2, y2)
but this does not work correctly.
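For reference, the attempt above takes x1 from the first box and x2 from the second, so the right edge (about 150.83) ends up left of the left edge (about 443.00) and the rectangle is invalid. A minimal sketch of a conventional union follows, assuming the tuples are (x0, y0, x1, y1) as in the values quoted above. `union_bbox` is a hypothetical helper written for this illustration, not a PyMuPDF function.

```python
from typing import Iterable, Tuple

# Assumed layout: (x0, y0, x1, y1), i.e. left, top, right, bottom.
BBox = Tuple[float, float, float, float]


def union_bbox(boxes: Iterable[BBox]) -> BBox:
    """Return the smallest rectangle that contains every box (hypothetical helper)."""
    boxes = list(boxes)
    if not boxes:
        raise ValueError("need at least one bounding box")
    x0 = min(b[0] for b in boxes)  # leftmost edge
    y0 = min(b[1] for b in boxes)  # topmost edge
    x1 = max(b[2] for b in boxes)  # rightmost edge
    y1 = max(b[3] for b in boxes)  # bottommost edge
    return (x0, y0, x1, y1)


# Values copied from the question.
combination = (443.0003356933594, 351.9405212402344,
               504.0020446777344, 361.21868896484375)
of_hematoma = (87.00006103515625, 363.95977783203125,
               150.83489990234375, 373.5115661621094)

merged_bbox = union_bbox([combination, of_hematoma])
print(merged_bbox)
```

Note that this union runs from the start of "of hematoma" on the left to the end of "combination" on the right, so it covers almost the full width of both lines and includes unrelated words. If the goal is to highlight or annotate only the phrase, keeping the two rectangles separate (one per line) is probably the better fit.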