When reading some PDFs with lazy: true, the parser raises an exception and stops. The same PDFs are read without a problem with lazy: false, and no errors are reported.
Origami::PDF.read(pdf_content_stream, lazy: true, verbosity: Origami::Parser::VERBOSE_TRACE)
[info ] ...Reading header...
[error] Breaking on: "\xBF\xBD\xEF\xBF\xBD\x04|\r\xEF\xBF..." at offset 0x3445c
[error] Last exception: [Origami::InvalidObjectError] Object shall begin with '%d %d obj' statement
[debug] Skipping this indirect object.
[trace] Read Stream object, 33 0 R
Origami::Parser::ParsingError: Invalid xref stream
from /.rvm/gems/ruby-2.5.1/gems/origami-2.1.0/lib/origami/parsers/pdf/lazy.rb:159:in `parse_revision_from_xrefstm'
I've managed to trace the error to the snippet below: parse_object fails on its first attempt, logging the two [error]s, and then successfully returns an Origami::Stream object. Of course Origami::Stream != Origami::XRefStream, so the exception is raised. Interestingly, though, XRefStream < Stream, i.e. XRefStream is a subclass of Stream.
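To make the class relation concrete, here is a minimal sketch using stand-in classes. `HierarchyDemo` and its contents are hypothetical and mirror only the inheritance mentioned above; they are not the real Origami classes. It shows why a plain stream instance would not pass a check for the subclass, even though `XRefStream < Stream` holds.

```ruby
# Stand-in classes (hypothetical): only the inheritance relation is modeled,
# not any behavior of the real Origami::Stream / Origami::XRefStream.
module HierarchyDemo
  class Stream; end
  class XRefStream < Stream; end

  PLAIN = Stream.new
  XREF  = XRefStream.new

  # Class#< asks "is the receiver a subclass of the argument?"
  SUBCLASS_RELATION = (XRefStream < Stream)

  # An instance of the parent class is not an instance of the subclass...
  PLAIN_IS_XREF = PLAIN.is_a?(XRefStream)

  # ...but an instance of the subclass is also an instance of the parent.
  XREF_IS_STREAM = XREF.is_a?(Stream)
end

puts HierarchyDemo::SUBCLASS_RELATION
puts HierarchyDemo::PLAIN_IS_XREF
puts HierarchyDemo::XREF_IS_STREAM
```

So if the recovery path hands back a generic stream object, a type check that expects the xref-stream subclass would reject it regardless of the stream's contents.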
I don't know much about PDF files, so I don't know whether this is working as intended. In any case, how can the file be read properly? Is there a more proper solution than the one below?
Ruby: 2.5.1
Origami: 2.1.0
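Since the same files reportedly parse with lazy: false, one possible stopgap (not the snippet this issue refers to, and not a fix for the parser itself) is to attempt the lazy read and retry eagerly on failure. The sketch below is a hypothetical helper: `read_with_fallback`, its parameters, and the use of lambdas are assumptions made so the logic can be exercised without the gem.

```ruby
# Hypothetical helper: call the lazy reader first; if it raises the given
# error class, log to stderr and call the eager reader instead.
def read_with_fallback(lazy_reader, eager_reader, error_class: StandardError)
  lazy_reader.call
rescue error_class => e
  warn "lazy read failed (#{e.class}: #{e.message}); retrying eagerly"
  eager_reader.call
end

# Intended use with Origami (untested here; assumes the gem is installed and
# that pdf_content_stream can be rewound between the two attempts):
#
#   pdf = read_with_fallback(
#     -> { Origami::PDF.read(pdf_content_stream, lazy: true) },
#     -> { pdf_content_stream.rewind
#          Origami::PDF.read(pdf_content_stream, lazy: false) },
#     error_class: Origami::Parser::ParsingError
#   )
```

This gives up the benefit of lazy parsing for the affected files, so it only works around the symptom while the xref stream handling is investigated.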