Commit

v0.2.14
yh202109 committed Jul 7, 2024
1 parent b398086 commit 4304a9a
Showing 2 changed files with 13 additions and 10 deletions.
16 changes: 9 additions & 7 deletions docs/std_iso_pdf.ipynb
@@ -23,13 +23,13 @@
 " - ISO 19005-2:2011/PDF/A-2 (based on PDF v1.7)\n",
 " - ISO 19005-3:2012/PDF/A-3 (add file)\n",
 " - Not allow: audio, video, 3d objects, JS, certain actions, encryption, non-standard metadata\n",
-" - Require: embed font with proper license \n",
+" - Require: embedding font with proper license \n",
 "- ISO 24517/PDF/E for representing engineering documents.\n",
 "\n",
-"For regulatory submission, FDA currently support \"PDF versions 1.4 through 1.7, PDF/A-1 and PDF/A-2\"[^5].\n",
-"Steps for **creating and validating** PDF/A files can be found in reference [^4].\n",
+"For regulatory submission, FDA currently support \"PDF versions 1.4 through 1.7, PDF/A-1 and PDF/A-2\"[^4].\n",
+"Steps for **creating and validating** PDF/A files can be found in reference [^5][^6].\n",
 "\n",
-"The module `stdiso.pdfsummary` depends on package `pypdf` [^6].\n",
+"The module `stdiso.pdfsummary` depends on package `pypdf` [^7].\n",
 "The module `stdiso.pdfsummary` include functions for creating summaries about a specified PDF file.\n"
 ]
 },
@@ -58,6 +58,7 @@
 "\n",
 "pfr = pdfSummary(path=\"\")\n",
 "print(\"File size:\", pfr.file_size, \" bytes\")\n",
+"print(\"Creation date:\", pfr.meta.creation_date)\n",
 "print(\"Number of pages:\", pfr.n_page)\n",
 "print(\"Number of figures within individual pages:\", pfr.n_image_in_page)"
 ]
@@ -78,9 +79,10 @@
 "[^1]: Adobe. (2024). Everything you need to know about the PDF. ([web page](https://www.adobe.com/acrobat/about-adobe-pdf.html))\n",
 "[^2]: ISO. (2021). The standard for PDF is revised. ([web page](https://www.iso.org/news/ref2608.html))\n",
 "[^3]: pdfa.org. (2013). PDF/A in a Nutshell 2.0. ([web page](https://pdfa.org/resource/pdfa-in-a-nutshell-2-0/))\n",
-"[^4]: Adobe. (2023). PDF/X-, PDF/A-, and PDF/E-compliant files (Acrobat Pro). ([web page](https://helpx.adobe.com/acrobat/using/pdf-x-pdf-a-pdf.htm))\n",
-"[^5]: FDA. (2016). Portable Document Format (PDF) Specifications. ([pdf](https://www.fda.gov/files/drugs/published/Portable-Document-Format-Specifications.pdf))\n",
-"[^6]: pypdf Contributors. (2024). pypdf. ([web page](https://pypdf.readthedocs.io/en/stable/index.html))\n",
+"[^4]: FDA. (2016). Portable Document Format (PDF) Specifications. ([pdf](https://www.fda.gov/files/drugs/published/Portable-Document-Format-Specifications.pdf))\n",
+"[^5]: Adobe. (2023). PDF/X-, PDF/A-, and PDF/E-compliant files (Acrobat Pro). ([web page](https://helpx.adobe.com/acrobat/using/pdf-x-pdf-a-pdf.html))\n",
+"[^6]: pypdf Contributors. (2024). PDF/A Compliance. ([web page](https://pypdf.readthedocs.io/en/stable/user/pdfa-compliance.html))\n",
+"[^7]: pypdf Contributors. (2024). pypdf. ([web page](https://pypdf.readthedocs.io/en/stable/index.html))\n",
 "\n",
 "\n",
 "\n",
7 changes: 4 additions & 3 deletions mtbp3/stdiso/pdfsummary.py
@@ -36,10 +36,11 @@ def __init__(self, path = None):
         self.n_page = self.pp.get_num_pages()
         self.file_size = os.path.getsize(self.pdf_path)
         self.n_image_in_page = [len(self.pp.pages[i].images) for i in range(self.n_page)]
-
+        self.n_image = sum(self.n_image_in_page)
+        self.meta = self.pp.metadata

 if __name__ == "__main__":

-    pdf_obj = pdfSummary()
-    print("Pages:", pdf_obj.pp.pages)
+    pdf_obj = pdfSummary("/Users/yh2020/dt2/proj/mtbp3/mtbp3/data/attention.pdf")
+    print(pdf_obj.meta.creation_date)
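
For readers who want to try the new fields outside the class, the following is a minimal sketch of the same bookkeeping, not the module's actual implementation. The names `PdfSummarySketch`, `summarize_reader`, and `summarize_file` are hypothetical and do not appear in `mtbp3`. The sketch assumes the reader object behaves like `pypdf.PdfReader` (a `pages` sequence whose items expose `images`, and a `metadata` attribute), and it additionally guards against `metadata` being `None`, which the committed code does not do.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class PdfSummarySketch:
    """Hypothetical container mirroring the fields touched by this commit."""
    n_page: int
    n_image_in_page: List[int]
    n_image: int
    creation_date: Optional[Any]


def summarize_reader(reader: Any) -> PdfSummarySketch:
    """Summarize any object that looks like a pypdf.PdfReader.

    Assumption: `reader.pages` is iterable, each page has an `images`
    sequence, and `reader.metadata` is either None or an object with a
    `creation_date` attribute.
    """
    # One image count per page, in page order.
    n_image_in_page = [len(page.images) for page in reader.pages]

    # Some PDFs carry no document information, so metadata may be None.
    meta = reader.metadata
    creation_date = meta.creation_date if meta is not None else None

    return PdfSummarySketch(
        n_page=len(n_image_in_page),
        n_image_in_page=n_image_in_page,
        n_image=sum(n_image_in_page),
        creation_date=creation_date,
    )


def summarize_file(path: str) -> PdfSummarySketch:
    """Open a PDF with pypdf (third-party dependency) and summarize it."""
    from pypdf import PdfReader  # imported lazily so the helper above has no hard dependency

    return summarize_reader(PdfReader(path))
```

Keeping the counting logic in `summarize_reader` separate from file opening means it can be exercised with a stand-in object, without a PDF on disk or a machine-specific path.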
