Cleanup & version bump.

Preparing for the next version release, with some breaking changes in the API.
dipietrantonio · Apr 19, 2020 · 4dfeb1c
1 parent eec5dae
Showing 7 changed files with 28 additions and 35 deletions.
diff --git a/README.md b/README.md
@@ -12,8 +12,8 @@ of a PDF document and uses its entries to give the user the ability to locate PD
 the file and parse them into suitable Python objects.
 
 **DISCLAIMER**: this package hasn't reached a stable version (>= 1.0.0) yet. Although the parser
-API is quite simple it may change suddenly from one release to anther. All breaking changes will
-be properly notified in the release notes.
+API is quite simple it may change suddenly from one release to the next one. All breaking changes
+will be properly notified in the release notes.
 
 
 ## Quick example
@@ -76,7 +76,7 @@ a better way to understand the PDF than writing a parser for it?
 
 ## Documentation
 
-You can read the documentation for this package on [readthedocs.io](https://pdf4py.readthedocs.io/en/latest/).
+You can read the documentation on [readthedocs.io](https://pdf4py.readthedocs.io/en/latest/).
 
 
 ## Contributing
@@ -92,8 +92,7 @@ Contributions are more than welcome! Please, when writing code or documentation
 - to adopt as much as possible a test-driven development process. Each contribution must be accompanied by a 
   test addition/modification.
 
-If you are wondering in which way you can help, check the [TO-DO list](todo.md). For now it will do as a
-simple "road map".  
+If you are wondering in which way you can help, check the [TODO list](https://github.com/Halolegend94/pdf4py/blob/master/TODO.md). For now it will do as a simple "road map".  
 
-If you found a bug, please file a new issue here on GitHub. Proposing fixes, changes and additions can
+If you have found a bug, please file a new issue here on GitHub. Proposing fixes, changes and additions can
 be done through a pull request.
diff --git a/todo.md → TODO.md b/todo.md → TODO.md
@@ -16,9 +16,7 @@ and can assume the value `LOW`, `MEDIUM` or `HIGH`.
 - [HIGH] (TO DO) To implement tests for some of the stream filters.
 - [MEDIUM] (TO DO) To analyze performances and to compare them with other libraries.
 - [LOW] (TO DO) To go through the 2.0 standard and see if there are major changes.
-- [LOW] (TO DO) To implement support for the 'Extends' keyword in a object stream.
-- [HIGH] (TO DO) Not to decrypt string in a cross reference dictionary.
+- [HIGH] (TO DO) Not to decrypt strings in a cross reference dictionary.
 - [HIGH] (TO DO) High some information about the cross reference table or about Cross Reference
   Streams (their identifiers).
-- [MEDIUM] (TO DO) Better handling of Compressed Object Streams (parse them only once and save them
-  in a Python object)
+- [MEDIUM] (TO DO) Better handling of Compressed Object Streams.
diff --git a/docs/source/conf.py b/docs/source/conf.py
@@ -22,7 +22,7 @@
 author = 'Cristian Di Pietrantonio'
 
 # The full version, including alpha/beta/rc tags
-release = '0.0.2'
+release = '0.1.0'
 
 
 # -- General configuration ---------------------------------------------------

diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -1,3 +1,11 @@
+.. toctree::
+    :maxdepth: 3
+    :hidden:
+
+    tutorials
+    modules/index
+    standard_coverage
+
 pdf4py's documentation
 ==================================
 
@@ -8,7 +16,6 @@ extraction). In particular, it defines the class `Parser` that reads the *Cross
 of a PDF document and uses its entries to give the user the ability to locate PDF objects within
 the file and parse them into suitable Python objects.
 
-
 .. image:: https://travis-ci.org/Halolegend94/pdf4py.svg?branch=master
     :target: https://travis-ci.org/Halolegend94/pdf4py
     :alt: Build Status
@@ -20,6 +27,11 @@ the file and parse them into suitable Python objects.
 .. image:: https://img.shields.io/pypi/dm/pdf4py?color=brightgreen
     :target: https://pypi.org/project/pdf4py/
 
+**DISCLAIMER**: this package hasn't reached a stable version (>= 1.0.0) yet. Although the parser
+API is quite simple it may change suddenly from one release to the next one. All breaking changes
+will be properly notified in the release notes.
+
+
 Quick example
 -------------
 
@@ -81,14 +93,3 @@ there was not an established Python module to easily parse a PDF document. In or
 why I delved into the PDF 1.7 specification: since that moment I've got interested more and more
 in the inner workings of one of the most important and ubiquitous file format. And what's
 a better way to understand the PDF than writing a parser for it?
-
-
-Table of Contents
------------------
-
-.. toctree::
-    :maxdepth: 3
-
-    tutorials
-    modules/index
-    standard_coverage
diff --git a/docs/source/standard_coverage.rst b/docs/source/standard_coverage.rst
@@ -6,7 +6,7 @@ PDF 1.7 standard coverage
 In this file the progress in implementing all the features in the `PDF 1.7 standard <http://wwwimages.adobe.com/www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf>`_ is tracked.
 Chapters 1 to 6 of the standard are devoted to give a general introduction to the standard whereas Chapter 7 is where the PDF syntax is
 defined. It follows that the best way to keep track of the progress is to specify for each section whether the illustrated features have
-been implemented or not. As the development goes on, the various sections decribing features that have been supported will be marked with
+been implemented or not. As the development goes on, the various sections describing features that have been supported will be marked with
 an check symbol (✓) in the following table. Moreover, the tilde symbol (~) means almost every aspect is supported or that the implementation
 seems to work but more testing is necessary. Finally, the cross symbol (✗) informs that there is no support at this stage for the associated
 feature.
@@ -68,15 +68,15 @@ feature.
 +-------------------+---------------------------------+----------------------------------------+
 | 7.5.6             | Incremental updates             | ✓                                      |
 +-------------------+---------------------------------+----------------------------------------+
-| 7.5.7             | Object streams                  | ~ (`Extend` option support missing)    |
+| 7.5.7             | Object streams                  | ✓                                      |
 +-------------------+---------------------------------+----------------------------------------+
 | 7.5.8             | Cross Reference Streams         | ✓                                      |
 +-------------------+---------------------------------+----------------------------------------+
-| *7.6*             | *Encription*                    | ~ (no embedded files for now)          |
+| *7.6*             | *Encryption*                    | ~ (no File Specs and Public Key Crypto)|
 +-------------------+---------------------------------+----------------------------------------+
 | 7.6.1             | General                         | ✓                                      |
 +-------------------+---------------------------------+----------------------------------------+
-| 7.6.2             | General Encription Algorithm    | ✓                                      |
+| 7.6.2             | General Encryption Algorithm    | ✓                                      |
 +-------------------+---------------------------------+----------------------------------------+
 | 7.6.3             | Standard Security Handler       | ~ (permission bits ignored)            |
 +-------------------+---------------------------------+----------------------------------------+
@@ -144,7 +144,7 @@ feature.
 +-------------------+---------------------------------+----------------------------------------+
 
 Subsequent chapters describe higher level aspects that are built on top of the PDF syntax and elementary objects.
-As of now there is no support for those features, as exmplained in the landing page of the documentation.
+As of now there is no support for those features, as explained in the landing page of the documentation.
 
 In addition, the AESV3 encryption method specified in the 
 `PDF 1.7 Extension 3 document <https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/adobe_supplement_iso32000.pdf>`_

diff --git a/setup.py b/setup.py
@@ -5,7 +5,7 @@
 
 setuptools.setup(
     name="pdf4py",
-    version="0.0.2",
+    version="0.1.0",
     author="Cristian Di Pietrantonio",
     author_email="[email protected]",
     description="A PDF parser written in Python3 with no external dependencies.",

diff --git a/tests/functional_tests.py b/tests/functional_tests.py
@@ -3,7 +3,7 @@
 import logging
 from binascii import unhexlify
 
-KEYWORDS_OF_INTEREST = ['Extends', 'F']
+
 
 def parse_object(parser, obj, visited):
     if isinstance(obj, parpkg.PDFStream):
@@ -15,16 +15,11 @@ def parse_object(parser, obj, visited):
         for x in obj:
             parse_object(parser, x, visited)
     elif isinstance(obj, dict):
-        interesting_keys = set(obj.keys()).intersection(KEYWORDS_OF_INTEREST)
-        if len(interesting_keys) > 0:
-            raise Exception('Found keyword(s) {} in dictionary {}'.format(interesting_keys, obj))
         for k in obj:
             parse_object(parser, obj[k], visited)
     elif isinstance(obj, parpkg.PDFReference) and obj not in visited:
         visited.add(obj)
         x = parser.parse_reference(obj)
-        if isinstance(x, parpkg.PDFIndirectObject):
-            x = x.value
         parse_object(parser, x, visited)