[Enhancement] End-to-end support for images (as well as PDFs) #5

Open
1 of 4 tasks
athewsey opened this issue Nov 17, 2021 · 0 comments
Labels
enhancement (New feature or request), good first issue (Good for newcomers)

athewsey commented Nov 17, 2021

While this sample was originally created for multi-page documents in PDF, other related use-cases (such as ID document or receipt extraction) may operate on single-page images/photographs/scans instead.

Today some parts of the pipeline support images, but others assume PDF. It would be great to round out support for images as source documents, particularly the common JPEG and PNG formats, which have good native support in Amazon Textract, SageMaker Ground Truth, and web browsers.

  • 1. (Believe so but need to double-check) Core Textract state machine component supports OCRing image files
  • 2. Notebook entity recognition data prep flow supports image files
  • 3. (Need to check) OCR pipeline trigger and Textract orchestration supports image files
  • 4. (Known gap) A2I human review UI supports image files
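
For anyone picking up the checks above, one way to route source documents is to detect the file type from its leading bytes rather than trusting the extension. The following is only a minimal sketch under that assumption: `sniff_doc_type` and `is_single_page_image` are hypothetical names for illustration, not functions from this sample, and only the PDF, JPEG and PNG formats mentioned in this issue are covered.

```python
# Leading "magic" bytes for the formats named in this issue.
# (Hypothetical helper: not part of the sample's codebase.)
_SIGNATURES = {
    b"%PDF-": "pdf",
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
}


def sniff_doc_type(header: bytes) -> str:
    """Return 'pdf', 'jpeg', 'png', or 'unknown' from a file's first bytes."""
    for magic, kind in _SIGNATURES.items():
        if header.startswith(magic):
            return kind
    return "unknown"


def is_single_page_image(header: bytes) -> bool:
    """True when the document is an image and so has exactly one page."""
    return sniff_doc_type(header) in ("jpeg", "png")
```

Reading the first 8 bytes of an object is enough for all three signatures, so a pipeline step could branch between PDF-specific and image-specific handling without loading the whole file.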
athewsey added the enhancement and good first issue labels Nov 17, 2021
athewsey added a commit that referenced this issue Jul 7, 2022
Fix thumbnailing endpoint and model inference wrapper's logic to
correctly process single image files (as well as PDFs). Fixes #18.
Relates to #5.

Co-authored-by: David <[email protected]>