Peabody Notecard Pipeline

The Peabody archeological museum has thousands of typewritten notecards that index every object in its possession. To help with that indexing, I built an automatic OCR (Optical Character Recognition) pipeline that reads the notecards and converts them to an easily searchable CSV file. I then helped create an internal Django website where work-duty students can correct the results. That code is not currently posted, but I can post it if there is interest; a screenshot of the website is in Website_Example.png.

Example

Output:

{"CatNo": "1",
"AccNo": "1",
"OrigNo": "SC/2",
"PhotoNo": "",
"Name": "Butt of arrowhead",
"Site": "Squibnocket Cliff.",
"SiteNo": "M50/1",
"Locality": "Squibnocket Head, southwest side of Martha's Vineyard, Mass.",
"Situation": "On sand under shell just south of stake 1.",
"Remarks": "",
"Figured": ""}

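To illustrate the last step of the pipeline (turning parsed notecard records into a searchable CSV), here is a minimal sketch. It is not the code from the notebook: the function names `records_to_csv` and `search` are hypothetical, the column order is an assumption, and only the field names and the sample record come from the example output above. It uses Python's standard `csv` module and leaves the OCR step out entirely.

```python
import csv
import io

# Field names come from the example output; their order here is an assumption.
FIELDS = ["CatNo", "AccNo", "OrigNo", "PhotoNo", "Name", "Site",
          "SiteNo", "Locality", "Situation", "Remarks", "Figured"]


def records_to_csv(records, handle):
    """Write notecard dicts to an open text handle as CSV (hypothetical helper)."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS, restval="")
    writer.writeheader()
    for record in records:
        writer.writerow(record)


def search(handle, field, term):
    """Return rows whose `field` contains `term`, ignoring case (hypothetical helper)."""
    needle = term.lower()
    return [row for row in csv.DictReader(handle) if needle in row[field].lower()]


# The record shown in the example output.
example = {
    "CatNo": "1",
    "AccNo": "1",
    "OrigNo": "SC/2",
    "PhotoNo": "",
    "Name": "Butt of arrowhead",
    "Site": "Squibnocket Cliff.",
    "SiteNo": "M50/1",
    "Locality": "Squibnocket Head, southwest side of Martha's Vineyard, Mass.",
    "Situation": "On sand under shell just south of stake 1.",
    "Remarks": "",
    "Figured": "",
}

# An in-memory buffer stands in for a real file; newline="" follows the csv docs.
buffer = io.StringIO(newline="")
records_to_csv([example], buffer)
buffer.seek(0)
matches = search(buffer, "Locality", "martha's vineyard")
print(matches[0]["Name"])
```

The `csv` module quotes the `Locality` value automatically because it contains commas, so the record survives a round trip through the file unchanged. For a real run, the buffer would be replaced with `open("notecards.csv", "w", newline="")`, where the file name is again only a placeholder.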