PDF hul Messages 2

Sam Alloing edited this page May 12, 2020 · 91 revisions

PDF-HUL-76

Message

Trailer dictionary Info key is not an indirect reference

Details

The "Info" entry of a trailer dictionary does not contain an indirect object reference (e.g. "1 0 R"). If an "Info" entry exists in a trailer, it should point to the document's information dictionary via an indirect object reference.
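As a rough illustration of what an indirect object reference looks like, the following sketch checks whether a raw value has the shape "object number, generation number, R". The helper name and the regular expression are assumptions made for this example; this is not how JHOVE implements the check.

```python
import re

# Hypothetical helper, not JHOVE code.
# An indirect reference is "<object number> <generation number> R", e.g. "1 0 R".
INDIRECT_REF = re.compile(rb"\s*\d+\s+\d+\s+R\s*")


def is_indirect_reference(value: bytes) -> bool:
    """Return True if the raw value looks like an indirect object reference."""
    return INDIRECT_REF.fullmatch(value) is not None
```

A trailer with `/Info 1 0 R` would pass this check, whereas a trailer that embeds the information dictionary directly, such as `/Info << /Title (x) >>`, would not.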

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-77

Message

Invalid ID in trailer

Details

The "ID" value returned from the trailer dictionary is an array but does not have exactly two elements. The trailer ID is optional but if present it must be an array of two byte strings that constitute a file identifier.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-78

Message

Invalid ID in trailer

Details

An exception occurred while processing the trailer "ID" value, most likely because of an invalid (non byte string) array element. The trailer "ID" is optional but if present it must be an array of two byte strings that constitute a file identifier.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-79

Message

Invalid ID in trailer

Details

The "ID" value returned from the trailer dictionary is not an array. The ID attribute is optional but if present it must be an array of two byte strings that constitute a file identifier.
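The three "Invalid ID in trailer" messages (PDF-HUL-77, PDF-HUL-78 and PDF-HUL-79) can be pictured as successive checks on the same value. The sketch below is only a model: it assumes the parsed "ID" value is represented as a Python list of `bytes` objects, and the function name and return strings are invented for illustration.

```python
def check_trailer_id(value):
    """Classify a parsed trailer ID value (hypothetical model: list of bytes)."""
    if value is None:
        return "ok"                      # the ID entry is optional
    if not isinstance(value, list):
        return "not an array"            # the situation described for PDF-HUL-79
    if len(value) != 2:
        return "wrong element count"     # the situation described for PDF-HUL-77
    if not all(isinstance(item, bytes) for item in value):
        return "non byte string element" # the likely cause of PDF-HUL-78
    return "ok"
```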

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-80

Message

Invalid object number in cross-reference stream

Details

The object number of a cross-reference stream could not be found ("-1"), or is greater than the total number of entries in the document's cross-reference table at the time that stream was written, meaning either the object number or table size is invalid.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-81

Message

Malformed cross-reference stream

Details

This error does not seem to be reachable: nothing in the try block can throw an I/O exception. Needs review.

References

  • PDF 1.6: Needs review
  • PDF 1.7: Needs review

Impact

Needs review

Remediation

Needs review

PDF-HUL-82

Message

Malformed cross-reference table

Details

The offset or object number (for a free entry) for a cross reference table entry wasn't a numeric literal.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-83

Message

Malformed cross-reference table

Details

The final literal keyword that should be "n" for a table entry or "f" for a free entry is not a keyword at all.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-84

Message

Illegal operator in cross-reference table

Details

An unexpected keyword was found in a cross-reference entry. Expected keywords are "f" or "n".
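For orientation, a cross-reference table entry consists of an offset (or next free object number), a generation number, and the keyword "n" or "f". The sketch below classifies a single entry line along those lines. It is a simplified model with invented names and return values, and it does not reproduce JHOVE's token-level distinction between PDF-HUL-82, PDF-HUL-83 and PDF-HUL-84.

```python
def classify_xref_entry(line: bytes) -> str:
    """Classify one cross-reference table entry (hypothetical helper)."""
    parts = line.split()
    if len(parts) != 3:
        return "malformed"
    first, generation, keyword = parts
    # bytes.isdigit() only accepts ASCII digits
    if not (first.isdigit() and generation.isdigit()):
        return "non-numeric"
    if keyword == b"n":
        return "in-use"
    if keyword == b"f":
        return "free"
    return "illegal keyword"
```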

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-85

Message

No document catalog dictionary

Details

The trailer has no document catalogue entry ("Root") or a trailer was not found. <Insert document catalogue explanation here.> JHOVE's approach to the document catalog is a little scattergun. Specifically here the reference to the document catalog is null. It's not clear that this can be reached as similar checks are done when parsing the trailer earlier.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-86

Message

No document catalog dictionary

Details

The trailer contains a document catalogue entry ("Root") but it cannot be resolved. <Insert document catalogue explanation here.>

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-87

Message

File header gives version as ..., but catalog dictionary gives version as ...

Details

The PDF version specified in the header is different from the version specified in the document catalogue dictionary. This is OK by specification and the higher PDF version "wins" in terms of the version of the specification the document conforms to.
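The "higher version wins" rule can be sketched as follows. The function names are assumptions for this example, and versions are modelled as plain strings such as "1.4"; JHOVE's own comparison may differ.

```python
def parse_version(text):
    """Split a version string such as '1.4' into (major, minor) integers."""
    major, minor = text.split(".")
    return int(major), int(minor)


def effective_version(header, catalog=None):
    """Return whichever of the header and catalog versions is higher."""
    if catalog is None:
        return header
    return max(header, catalog, key=parse_version)
```

For a file whose header says 1.4 and whose catalog says 1.6, `effective_version("1.4", "1.6")` gives "1.6".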

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-88

Message

Invalid Version in document catalog

Details

The document's PDF version, taken from either the file header or the document catalog dictionary, cannot be recognised as a number. The message is misleading because the check does not apply to the document catalog alone. Needs review, or at least confirmation that the header version parses properly.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-89

Message

Invalid Names dictionary

Details

The document catalogue dictionary's "Names" value is a reference to a non-dictionary object.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-90

Message

Invalid Names dictionary

Details

An unexpected error occurred while retrieving the document catalogue's Names dictionary ("Names").

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-91

Message

Invalid destinations dictionary

Details

The document catalogue's "Dests" entry references an object which is not a dictionary. The optional "Dests" entry is expected to contain a dictionary of the document's destination objects.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-92

Message

Invalid destinations dictionary

Details

An unexpected error occurred while retrieving the document catalogue's destinations dictionary ("Dests").

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-93

Message

Invalid algorithm value in encryption dictionary

Details

The "V" entry of an encryption dictionary, which specifies the encryption algorithm used, has an invalid value. It must be a number from 0 to 4 inclusive. Note that the PDF 1.7 specification also seems to disallow the value 3.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-94

Message

Unexpected exception ...

Details

Unexpected error while parsing the document information dictionary, most likely a missing (null) object or object of the wrong type encountered while resolving the dictionary object or processing its entries.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-95

Message

Document page tree not found

Details

The document catalogue is missing its mandatory "Pages" entry. The entry must be a reference to the page tree node dictionary that is root of the document's page tree.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-96

Message

Document page tree not found

Details

There was an error parsing the document's page tree.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-97

Message

Invalid page dictionary object

Details

The "Pages" reference from the document catalog was resolved to a non-dictionary object. This must resolve to a dictionary representing the page tree element that is the tree's root node.

References

Impact

Needs review

Remediation

The example PDF we (@BL) have so far turned out to be triggered by a bug in the source code that handles stream objects (https://github.com/openpreserve/jhove/pull/151). With that bug corrected, the file reports PDF-HUL-56 instead.

PDF-HUL-98

Message

Variable message

Details

Unexpected error while parsing the document page tree.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-99

Message

Unexpected exception ...

Details

Unexpected error while parsing the document page label tree.

References

  • PDF 1.6: Needs review
  • PDF 1.7: Needs review

Impact

Needs review

Remediation

Needs review

PDF-HUL-100

Message

Invalid or ill-formed XMP metadata

Details

There was a character encoding issue when parsing the XMP metadata embedded in the PDF. This error is a catch around an initial SAX error that's analysed for an encoding value which is used in a second attempt to open the file. It's not clear how often this error is triggered, so I added an info log statement.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-101

Message

Invalid or ill-formed XMP metadata

Details

An exception was caught while parsing an XMP block embedded in the PDF.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-102

Message

Unexpected exception ...

Details

Unexpected error while parsing a page object's external content streams. This is a single stream or an array of streams that is the value of the optional "Contents" key.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-103

Message

Unexpected exception ...

Details

Unexpected error while parsing and analysing images embedded in the PDF. This is a very general catch and might benefit from being more specific, with more errors and more descriptive messages.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-104

Message

Expected dictionary for font entry in page resource

Details

One of the font entries returned when processing the "Fonts" resource dictionary was resolved but a non-dictionary object was returned.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-105

Message

Fonts exist, but are not displayed; ...

Details

This is just a message to say that font information is available but not reported. The configuration needs to be changed to see the font information (See Remediation)

References

  • PDF 1.6: Not applicable, this is a JHOVE configuration option
  • PDF 1.7: Not applicable, this is a JHOVE configuration option

Impact

No impact. This message appears because a configuration option prevents the reporting of font information.

Remediation

The configuration file can be changed to show fonts. Font reporting is suppressed when the configuration contains <param>f</param>.

PDF-HUL-106

Message

Unexpected error in findFonts

Details

Some fonts in the document are missing / unreadable in the file. Needs review.

References

Impact

The missing fonts are typically replaced by similar fonts that are found on the client's computer. These replacements can be imperfect and may cause letters or symbols to be substituted by incorrect glyphs, leading to spelling errors and missing or misleading iconography.

Remediation

Create the original document with embedded fonts, as in PDF/A-compliant files. If this is not possible, one may be able to acquire the correct font and append it to the original PDF.

PDF-HUL-107

Message

Improper nesting of object streams

Details

This occurs when an arbitrary recursion limit of thirty levels is reached while searching for an object in a stream.
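A depth guard of this kind can be sketched as below. The limit of thirty comes from the description above; everything else (the function name, and the dictionary that maps an object number to the object stream containing it) is a hypothetical model rather than JHOVE's data structure.

```python
MAX_DEPTH = 30  # the limit named above; the data model is hypothetical


def resolve_depth(obj_num, container_of, depth=0):
    """Follow object -> containing object stream links.

    Returns the nesting depth, or raises once MAX_DEPTH is reached.
    """
    if depth >= MAX_DEPTH:
        raise ValueError("Improper nesting of object streams")
    parent = container_of.get(obj_num)
    if parent is None:
        return depth  # the object is not stored inside an object stream
    return resolve_depth(parent, container_of, depth + 1)
```

A fixed limit like this also stops circular nesting, for example two object streams that each claim to contain the other.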

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-108

Message

Invalid object number or object stream

Details

An object stream dictionary has failed JHOVE's validity tests:

  • must have a "Type" entry which is the name: "ObjStm";
  • must have a count "N" entry that's an integer value; and
  • must have a first offset "First" entry that's an integer value.

This error is probably never shown, as the error is caught by PDF-HUL-110.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-109

Message

Compression method is invalid or unknown to JHOVE

Details

An error ("ZipException") occurred while decompressing an object stream. Object streams are stream objects (a dictionary followed by a stream of bytes) which contain other indirect objects. Placing objects in a stream allows them to be compressed with one or more filters, optimizing file sizes. As of October 2016, this module only supports decompressing object streams with "FlateDecode" filters, although this exception can be thrown even when FlateDecode filters are being used. Needs further investigation. Beware that this error is also shown when encryption is used. All examples are encrypted documents.

References

Impact

For error messages due to encryption the ETH Data Archive makes sure the files can be opened, as some DRM rights expire.

Remediation

In some cases, one could ask the producer for the password, or remove certain kinds of PDF security using software tools.

PDF-HUL-110

Message

Invalid object number or object stream

Details

An object stream dictionary has failed JHOVE's validity tests:

  • must have a "Type" entry which is the name: "ObjStm";
  • must have a count "N" entry that's an integer value; and
  • must have a first offset "First" entry that's an integer value.
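The three validity tests listed above can be sketched as follows. This assumes the object stream dictionary has already been parsed into a Python `dict` with names represented as strings; the function name and messages are invented for illustration.

```python
def objstm_problems(dictionary):
    """Return a list of problems found in an object stream dictionary."""
    problems = []
    if dictionary.get("Type") != "ObjStm":
        problems.append("Type must be the name ObjStm")
    for key in ("N", "First"):
        value = dictionary.get(key)
        # bool is a subclass of int in Python, so exclude it explicitly
        if not isinstance(value, int) or isinstance(value, bool):
            problems.append(f"{key} must be an integer")
    return problems
```

An empty list means the dictionary passes all three tests.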

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-111

Message

Bad page labels

Details

The document catalog dictionary has a "PageLabels" entry but there are no children in the number tree structure. JHOVE conflates PDF's page label and number tree concepts, making this error trickier to interpret.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-112

Message

Page information is not displayed; ...

Details

This just informs the user that JHOVE has skipped storing and reporting the page level properties and that reporting can be re-enabled in the configuration. Pages are ignored when the parameter p is added to the configuration: <param>p</param>.

References

  • PDF 1.6: Not applicable
  • PDF 1.7: Not applicable

Impact

Not applicable, this is a configuration option

Remediation

Not applicable, this is a configuration option

PDF-HUL-113

Message

Invalid page label info

Details

A general exception was caught when parsing a document's page labels to build JHOVE's page properties.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-114

Message

Annotation object is not a dictionary

Details

An item in a page's annotations array ("Annots") does not point to a dictionary. Each item in an annotation array should point to an annotation dictionary containing that annotation's details.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-115

Message

Annotations exist, but are not displayed; ...

Details

This just informs the user that JHOVE has skipped storing and reporting the annotation level properties and that reporting can be re-enabled in the configuration. This is not an error, but a configuration option: annotations are not displayed when <param>a</param> is added to the JHOVE configuration.

References

  • PDF 1.6: Needs review
  • PDF 1.7: Needs review

Impact

Not applicable, this is a configuration option

Remediation

Not applicable, this is a configuration option

PDF-HUL-116

Message

Invalid Annotation list

Details

Unexpected error while parsing a page's annotations. This is a general catch with multiple potential causes including: an I/O exception reading an object or encountering a missing (null) object or an object of an unexpected type.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-117

Message

Invalid page dictionary

Details

Unexpected exception while parsing a page object's dictionary. This is a general catch with multiple potential causes including: an I/O exception reading an object or encountering a missing (null) object or an object of an unexpected type.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-118

Message

Invalid page label sequence

Details

JHOVE has calculated a page position value of less than 1. PDF's page numbering consists of a number tree whose elements are labelling ranges stored as PDF dictionaries. JHOVE's logic around page label sequences is a little confusing; this seems to be an effort to track a "natural" sequence number that is checked against pages accumulated in other ranges. I don't believe that this error can be thrown, as it's caught by the next general catch and changed to PDF-HUL-119.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-119

Message

Problem with page label structure

Details

Unexpected error while parsing the page label structure. This is a general catch with multiple potential causes including: an I/O exception reading an object or encountering a missing (null) object or an object of an unexpected type. This error is also shown when the actual problem is PDF-HUL-118. All the examples for this error are of that kind: the problem in these files is PDF-HUL-118, but PDF-HUL-119 is reported instead.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-120

Message

Annotation dictionary missing required type (S) entry

Details

An annotation dictionary contains an action dictionary ("A") which is missing its subtype entry ("S"). The subtype entry is necessary for determining which kind of action to perform when the annotation is activated.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-121

Message

Invalid Annotation property

Details

Unexpected error while parsing an annotation dictionary. This is a general catch with multiple potential causes including: an I/O exception reading an object or encountering a missing (null) object or an object of an unexpected type.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-122

Message

Variable message

Details

This needs review. It's a horrible kludge that eats any PDFExceptions thrown while processing destination objects and always sets the invalid flag, which seems dubious behaviour. For example, it reports the error "Invalid indirect destination - referenced object ' ' cannot be found". This error comes from PDF-HUL-149.

  • Type: ErrorMessage, An Exception for all the messages coming from adding Destination to Property list
  • Source location: PdfModule.java L3378-L3383
  • Examples: 1

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-123

Message

Outlines contain recursive references

Details

An outline dictionary's "Next" entry points to itself. This would cause a recursive loop so JHOVE warns and breaks out. The PDF 1.6 specification doesn't explicitly disallow this.
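The kind of guard described here can be sketched as a walk along the "Next" chain that remembers which items it has already visited. The names and the dictionary-based model are assumptions for this example. Note that the sketch detects any cycle in the chain, which is broader than the self-reference case described above.

```python
def walk_siblings(first, next_of):
    """Walk outline items via their Next links (hypothetical model).

    Returns (visited items in order, True if a recursive reference was found).
    """
    visited = []
    seen = set()
    current = first
    while current is not None:
        if current in seen:
            return visited, True  # recursive reference: stop instead of looping
        seen.add(current)
        visited.append(current)
        current = next_of.get(current)
    return visited, False
```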

References

Impact

This is an info message warning about potential infinite loops. It does not indicate a violation of the PDF specification.

Remediation

Needs review

PDF-HUL-124

Message

Malformed outline dictionary

Details

Unexpected error while parsing the document outline. This is a general catch with multiple potential causes including: an I/O exception reading an object or encountering a missing (null) object or an object of an unexpected type.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-125

Message

Invalid outline dictionary item

Details

An outline item dictionary has no "Title" value.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-126

Message

Invalid outline dictionary item

Details

An outline item dictionary has no "Parent" entry. This must be an indirect reference to the parent dictionary in the outline hierarchy.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-127

Message

Invalid outline dictionary item

Details

An outline item dictionary has a "Count" value but it's not an integer or is not a Simple Object. This is required if the outline item has children, but JHOVE doesn't check whether child elements are available.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-128

Message

Outlines contain recursive references

Details

An outline dictionary's "Next" entry points to itself. This would cause a recursive loop so JHOVE warns and breaks out. The PDF 1.6 specification doesn't explicitly disallow this.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-129

Message

Outlines contain recursive references

Details

An outline dictionary's "Next" entry points to itself. This would cause a recursive loop so JHOVE warns and breaks out. The PDF 1.6 specification doesn't explicitly disallow this.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-130

Message

Invalid outline dictionary item

Details

An unexpected object type was encountered while parsing an outline item. Possible causes include unexpected "Prev", "Next", "First", or "Last" values.

  • Type: PdfInvalidException, ClassCastException
  • Source location: PdfModule.java L4101
  • Examples: Needed

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-131

Message

Invalid outline dictionary item

Details

Unexpected error while parsing an outline item. This is a general catch with multiple potential causes including: an I/O exception reading an object or encountering a missing (null) object.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-132

Message

Outlines exist, but are not displayed; ...

Details

This is just a message to say that outline information isn't being reported. It is information about the [jHove configuration](http://jhove.openpreservation.org/modules/pdf/). If the parameter "o" is added, the document outline is suppressed: <param>o</param>.

  • Type: InfoMessage
  • Source location: PdfModule.java L3975
  • Examples: This is a configuration option in JHOVE.

References

  • PDF 1.6: Needs review
  • PDF 1.7: Needs review

Impact

No impact, this is a configuration option

Remediation

No remediation needed as this is a configuration option.

PDF-HUL-133

Message

Improperly formed date

Details

A date found in a dictionary does not conform to the expected format. Dates specified in dictionaries should follow the format: (D:YYYYMMDDHHmmSSOHH'mm') (PDF 1.4 Spec page 100, section 3.8.2 "Dates")
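A pattern check for this date format could look like the sketch below. It assumes, following the PDF reference's description of date strings, that the fields after the year may be omitted from the right. It only checks the shape of the string, not whether the month, day or time values are in range, and it is not the check JHOVE performs.

```python
import re

# D:YYYY, then up to five two-digit fields (MM DD HH mm SS),
# then an optional time zone: Z, or +HH'mm' / -HH'mm'
PDF_DATE = re.compile(r"D:\d{4}(?:\d{2}){0,5}(?:Z|[+-]\d{2}'\d{2}')?")


def is_pdf_date(text: str) -> bool:
    """Return True if the string has the shape of a PDF date (hypothetical helper)."""
    return PDF_DATE.fullmatch(text) is not None
```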

References

Impact

Needs review

Remediation

After a "cure", information about the creation date may be lost if the original PDF has no XMP metadata. The date may be written so poorly that some tools cannot recognize it and so do not carry it over into the new, corrected PDF.

PDF-HUL-134

Message

Cross-reference tables are broken

Details

Another check to prevent an endless loop when processing the cross references. This is flagged when the current cross reference is the same as the previous one. The program logic is a little obscure here, involving state spread across a few member variables.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-135

Message

Unexpected error in parsing font property

Details

A Java null pointer exception was caught, roughly equivalent to a missing and expected PDF object, when building the font property list. Show Fonts or Maximum Verbosity needs to be configured to show this information.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-136

Message

Too many fonts to report; some fonts omitted

Details

Really an application level warning, there's nothing wrong with the PDF, just JHOVE's ability to report all of the details.

References

  • PDF 1.6: Needs review
  • PDF 1.7: Needs review

Impact

No impact, this is an InfoMessage about JHOVE functionality

Remediation

No remediation needed, this is an InfoMessage about JHOVE functionality

PDF-HUL-137

Message

No PDF header

Details

The PDF header could not be found within the file's first 1024 bytes. This can also appear when there are certain kinds of junk data before the header, even if the header exists within the first 1024 bytes. Should a file be classified as malformed if there is any non-zero data before the header, instead of only certain kinds? The implementation notes in the second accompanying PDF 1.6 reference have more to say about headers. The PDF 1.7 spec has no equivalent note.
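A simple version of the "header within the first 1024 bytes" check is sketched below. The function name is an assumption, and the sketch deliberately does not reproduce JHOVE's behaviour of rejecting some files that have junk data before the header.

```python
import re

HEADER = re.compile(rb"%PDF-(\d)\.(\d)")


def find_header(data: bytes, window: int = 1024):
    """Return (offset, version) of the first %PDF-x.y marker in the window, or None."""
    match = HEADER.search(data[:window])
    if match is None:
        return None
    version = f"{match.group(1).decode()}.{match.group(2).decode()}"
    return match.start(), version
```

A non-zero offset in the result shows how many bytes of extra data precede the header.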

References

Impact

The file can't be identified as PDF and the version is unknown

Remediation

If extra information is added to the header, that extra information can be removed.

PDF-HUL-138

Message

No PDF trailer

Details

An end-of-file marker ("%%EOF") could not be found within the file's last 1024 bytes. This indicates truncation and can often be due to a PDF file being incompletely uploaded or downloaded.
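The corresponding check at the end of the file can be sketched in a few lines. The function name and the default window are assumptions for this example, with the window size taken from the description above.

```python
def has_eof_marker(data: bytes, window: int = 1024) -> bool:
    """Return True if %%EOF appears within the last `window` bytes of the file."""
    return b"%%EOF" in data[-window:]
```

A truncated download fails this check, and so does a file with more than `window` bytes of extra data appended after the marker.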

References

Impact

The file is incomplete or extra information is added after the last EOF

Remediation

The file needs to be supplied again, or the extra information after the last EOF marker can be removed.

PDF-HUL-139

Message

Missing startxref keyword or value

Details

The "startxref" keyword marking the reference to a cross-reference stream couldn't be found OR the following line wasn't a numeric offset to a cross reference dictionary.
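As an illustration, the sketch below looks for the "startxref" keyword followed by a numeric offset and the end-of-file marker near the end of the file. The function name, the regular expression and the 1024-byte window are assumptions for this example rather than JHOVE's implementation.

```python
import re

STARTXREF = re.compile(rb"startxref\s+(\d+)\s+%%EOF")


def find_startxref(data: bytes, window: int = 1024):
    """Return the last startxref offset found near the end of the file, or None."""
    matches = STARTXREF.findall(data[-window:])
    if not matches:
        return None  # keyword missing, or not followed by a numeric offset
    return int(matches[-1])
```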

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-140

Message

Document catalog dictionary object number and trailer root ref number are inconsistent.

Details

The object retrieved as the document catalog dictionary from the cross-reference table does not have the same ID as the reference used to retrieve it. An object's ID and its position in the cross-reference table should be the same, i.e. object ID 1 is found at index 1 in the cross-reference table. This may be indicative of a broken cross-reference table. This needs review as it's really a problem with the cross-reference table / JHOVE's parsing of it as readers are more forgiving.

References

Impact

Needs review

Remediation

Needs review

PDF-HUL-141

Message

Document catalog Type key must have value Catalog

Details

The document catalog dictionary object must have a key called Type with the value Catalog. This error is related to PDF-HUL-142 and PDF-HUL-143.

References

  • PDF 1.6: Needs review
  • PDF 1.7: Needs review

Impact

Needs review

Remediation

Needs review

PDF-HUL-142

Message

Document catalog has no Type key or it has a null value.

Details

The document catalog dictionary object must have a key called Type. In this error the Type key does not exist or is null. This error is related to PDF-HUL-141 and PDF-HUL-143.

Impact

Needs review

Remediation

Needs review

PDF-HUL-143

Message

Document catalog Type key does not have a simple String value.

Details

The document catalog dictionary object has a key called Type. In this error the Type is not a Simple Object. This error is related to PDF-HUL-141 and PDF-HUL-142.
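The three related catalog checks (PDF-HUL-141, PDF-HUL-142 and PDF-HUL-143) can be pictured as one sequence of tests on the Type entry. The sketch assumes the catalog has been parsed into a Python `dict` in which a name value is a plain string; the function name and return strings are invented for illustration.

```python
def check_catalog_type(catalog):
    """Classify the Type entry of a document catalog (hypothetical model)."""
    value = catalog.get("Type")
    if value is None:
        return "missing or null"      # the PDF-HUL-142 situation
    if not isinstance(value, str):
        return "not a simple value"   # the PDF-HUL-143 situation
    if value != "Catalog":
        return "wrong value"          # the PDF-HUL-141 situation
    return "ok"
```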

Impact

Needs review

Remediation

Needs review

PDF-HUL-144

Message

Pages dictionary has no Type key or it has a null value.

Details

This error message and the next two (PDF-HUL-145 and PDF-HUL-146) are related. They check that the Pages dictionary has the correct properties. PDF-HUL-144 is reported when the Type key is missing or null.

Impact

Needs review

Remediation

Needs review

PDF-HUL-145

Message

Pages dictionary Type key does not have a simple String value.

Details

This error message is related to PDF-HUL-144 and PDF-HUL-146. The Type of the Pages dictionary is not a Simple Object.

Impact

Needs review

Remediation

Needs review

PDF-HUL-146

Message

Pages dictionary Type key must have value /Pages.

Details

This error message is related to PDF-HUL-144 and PDF-HUL-145. The value of Type is not Pages.

Impact

Needs review

Remediation

Needs review

PDF-HUL-147

Message

Page tree node not found.

Details

This error occurs when the page tree is built.

  • Type: ArrayIndexOutOfBoundsException, PdfInvalidException
  • Source location: PageTreeNode L128
  • Examples: Needed

Impact

Needs review

Remediation

Needs review

PDF-HUL-148

Message

PDF minor version number is greater than 7.

Details

The latest PDF 1.x version is 1.7, so the highest valid minor version number is 7. The limit is defined in the MAX_VALID_MAJOR_VERSION constant.

Impact

Needs review

Remediation

Needs review

PDF-HUL-149

Message

Invalid indirect destination - referenced object ' ' cannot be found

Details

The destination was not found for an annotation that is referenced in the document. This error is never reported in the output, because the message is included with PDF-HUL-122.

Impact

Needs review

Remediation

Needs review

PDF-HUL-150

Message

Cross-reference stream must be a stream

Details

The retrieved object must be a stream. This error can occur when the Trailer is parsed or when the Cross references are parsed.

Impact

Needs review

Remediation

Needs review
