selective ocr to extract key/value data #15
Comments
Hi @raveslave , Thanks for sharing this idea. It's really interesting and looks really cool. I have a few doubts though about the usability as it seems rather complicated to develop or even to use.
Though this seems hard to provide, we will still take a look at it, as it definitely goes in the direction we are aiming for: importing / generating DocTypes from OCR. For reference, we're currently more invested in a text-based import using simple regular expressions or text processing libraries: https://appliedmachinelearning.blog/2018/06/30/performing-ocr-by-running-parallel-instances-of-tesseract-4-0-python/ We can keep this open to discuss further if you want to.
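To make the "simple regular expressions" approach mentioned above a bit more concrete, here is a minimal sketch, not the maintainers' implementation. It assumes the OCR step has already produced plain text; the field names (`invoice_number`, `date`, `total`), the patterns, and the sample invoice are all hypothetical and would need adapting per supplier layout.

```python
import re

# Hypothetical patterns: each maps an illustrative field name to a regex
# with one capture group holding the value. Real documents will need
# patterns tuned to their actual layout.
FIELD_PATTERNS = {
    "invoice_number": re.compile(
        r"Invoice\s*(?:Number|No\.?)\s*[:#]?\s*(\S+)", re.IGNORECASE
    ),
    "date": re.compile(r"Date\s*:?\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE),
    "total": re.compile(r"Total\s*:?\s*([\d.,]+)", re.IGNORECASE),
}


def extract_fields(ocr_text):
    """Return a dict of field name -> first matching value in the OCR text."""
    result = {}
    for field, pattern in FIELD_PATTERNS.items():
        match = pattern.search(ocr_text)
        if match:
            result[field] = match.group(1)
    return result


# Made-up OCR output, for illustration only.
sample = """ACME Supplies
Invoice No: INV-0042
Date: 2019-03-14
Total: 1,250.00
"""

print(extract_fields(sample))
```

Fields whose pattern does not match are simply left out of the result, so a caller can decide whether to ask the user to fill them in manually.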
pls see comments:
re:
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Has anyone been looking into this lately?
Hello, I am currently looking for something like this to use with ERPNext: converting scanned or emailed PDF purchase invoices to text (or even JSON) and using the extracted data to automatically create a purchase invoice in ERPNext. It would also need to upload the PDF files from the email and attach (or link) them to the relevant purchase invoice.
Hi @raveslave,
Just checking in, anyone willing to co-sponsor?
@raveslave I need to extract key-value pairs from PDF tables.
Any progress with this on ERPNext?
Wouldn't it be cool to offer this feature?
Basically, allow the user to draw an overlay that helps find key/value pairs, which can then be mapped to the relevant document type -> field in ERPNext!
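A rough sketch of how the overlay idea could work, assuming the OCR engine returns each word with a bounding box and the user has drawn one rectangle per target field. Everything here (`Box`, `Word`, `extract_by_overlay`, the field names, the coordinates) is hypothetical and only illustrates the mapping step, not an existing ERPNext or OCR API.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned rectangle in page coordinates (origin at top-left)."""
    x: float
    y: float
    w: float
    h: float

    def contains(self, px, py):
        return self.x <= px <= self.x + self.w and self.y <= py <= self.y + self.h


@dataclass
class Word:
    """One OCR word together with its bounding box."""
    text: str
    box: Box


def extract_by_overlay(words, overlay):
    """Map each field in the overlay to the words whose center lies in its region."""
    result = {}
    for field, region in overlay.items():
        hits = [
            w for w in words
            if region.contains(w.box.x + w.box.w / 2, w.box.y + w.box.h / 2)
        ]
        # Naive reading order: top-to-bottom, then left-to-right.
        hits.sort(key=lambda w: (w.box.y, w.box.x))
        result[field] = " ".join(w.text for w in hits)
    return result


# Made-up OCR words and user-drawn regions, for illustration only.
words = [
    Word("Invoice", Box(10, 10, 60, 12)),
    Word("INV-0042", Box(80, 10, 70, 12)),
    Word("Total", Box(10, 200, 40, 12)),
    Word("1,250.00", Box(80, 200, 60, 12)),
]
overlay = {
    "invoice_number": Box(75, 5, 100, 22),
    "total": Box(75, 195, 100, 22),
}

print(extract_by_overlay(words, overlay))
```

The overlay would be stored once per document layout (for example, per supplier), so the same regions can be reused for every new scan that follows that layout.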