Segmentation fault (core dumped) #64
Hi @cbrunet @bzamecnik, I have encountered many such files. Is it possible to skip the font information in these cases, return just the text, and avoid the core dump?
@cbrunet @bzamecnik Sorry to tag you again, but this issue has become more frequent and now shows up on a large number of documents. It would be really helpful if you could take a look at it.
@bzamecnik By any chance, did you get a chance to take a look at this? Sorry for tagging again.
@avirala-eightfold Hi, I can possibly check that. I have requested access to the file. Have you managed to approve the request yet?
@bzamecnik Yes, I somehow missed it, sorry for the delay. I have shared it again. Can you please confirm whether you can access it?
Sorry for bugging you again, but @bzamecnik did you get a chance to look into it?
@avirala-eightfold Sorry, no, I didn't have a chance to look at it. Is there anything that prevents you from investigating it? UPDATE: I can confirm that it crashes with a segmentation fault. That's all I can see without rebuilding the code. 🤷 Running with
Enabling the
Some fiddling with the code:
Looking at the gdb output, the crash may come from this place: https://github.com/freedesktop/poppler/blob/master/cpp/poppler-page.cpp#L461
...which would mean some reference to the font info is wrong (either |
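One way to see which Python line triggers a crash like this without rebuilding anything is the standard-library `faulthandler` module, which prints the Python traceback when the process receives SIGSEGV. The following is only a sketch: `run_with_faulthandler` and `CRASH_DEMO` are hypothetical names, and the demo sends SIGSEGV to itself as a stand-in for the real poppler call (POSIX only).

```python
import subprocess
import sys


def run_with_faulthandler(code, args=()):
    """Run `code` in a child interpreter started with -X faulthandler.

    Returns (returncode, stderr). On a segmentation fault, stderr should
    contain the Python-level traceback of the crashing call.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code, *args],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stderr


# Stand-in for the crashing extraction: the child signals itself.
# Replace this with the real reproduction script to locate the failing line.
CRASH_DEMO = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"

if __name__ == "__main__":
    returncode, stderr = run_with_faulthandler(CRASH_DEMO)
    print("return code:", returncode)
    print(stderr)
```

This only shows the Python frame that was active at the time of the fault; the C++ frame inside poppler still needs gdb, as above.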
Thank you so much for looking into it. Let me take this as a starting point and see whether I can find anything else.
Hi, thank you for this amazing work. Recently I was working with some PDFs, and poppler worked great for most of them, but for some of them I am seeing the following error:-
Since this is a memory fault rather than a Python exception, I also can't wrap it in a try/catch, so my workers keep restarting again and again and get stuck on the same document. This has been a major problem for me.
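A possible workaround for the uncatchable crash, sketched under assumptions: run the extraction in a child Python process, so that a segmentation fault kills only the child and the parent sees a negative return code it can handle. `run_isolated` and `CHILD_CODE` are hypothetical names; the python-poppler calls inside `CHILD_CODE` mirror the ones quoted in this issue and have not been verified against the failing PDF. The signal detection relies on POSIX return-code semantics.

```python
import signal
import subprocess
import sys

# Hypothetical child script (assumption): extracts text boxes with font info
# and prints the text as JSON. Arguments: PDF path, zero-based page index.
CHILD_CODE = """
import json, sys
from poppler import load_from_file
doc = load_from_file(sys.argv[1])
page = doc.create_page(int(sys.argv[2]))
boxes = page.text_list(page.TextListOption.text_list_include_font)
json.dump([box.text for box in boxes], sys.stdout)
"""


def run_isolated(code, args=(), timeout=60):
    """Run `code` in a fresh interpreter and return (ok, payload).

    ok is False if the child was killed by a signal (e.g. SIGSEGV) or
    exited with an error; payload is then a short description.
    """
    proc = subprocess.run(
        [sys.executable, "-c", code, *args],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode < 0:
        # On POSIX, a negative return code means "terminated by signal N".
        name = signal.Signals(-proc.returncode).name
        return False, f"child killed by {name}"
    if proc.returncode != 0:
        return False, proc.stderr.strip()
    return True, proc.stdout


if __name__ == "__main__":
    # Stand-in crash: the child sends SIGSEGV to itself.
    demo = "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"
    print(run_isolated(demo))
```

In a worker, a failed `run_isolated(CHILD_CODE, [path, "0"])` call could be logged and the document skipped, or retried without the font option, instead of taking the whole worker down.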
To give you some context, here is the debugging I have done so far:-
page.text_list(page.TextListOption.text_list_include_font)
pdf_document.create_font_iterator(): this also works, but while getting this at the text_box level I face this error.
boxes = self._page.text_list(opt_flag) in page.py: the code is stopped here with the error.
The metadata for the PDFs that I see such errors with is mostly (not always):-
The code to repro the error:-
The link to the pdf:- https://drive.google.com/file/d/180CDGyiJRfytvuzVsAiYKppHvaBABGkJ/view?usp=sharing
Please request access to the PDF, as I can't share it publicly. (Really sorry for this, but I hope you understand.)