data and code to convert a genealogy pdf document to a searchable database

JattievdLinde/stamboom

Genealogy PDF document data parser

The primary purpose of the Python code in this folder is to parse text data from the genealogy register published on www.vanderlinde.org.za into a pandas DataFrame. A DataFrame is a powerful tabular data structure for manipulating large datasets. Once the data is in a DataFrame, it can be exported to many formats, including Excel, Word and PDF, as well as to many database formats by using SQLAlchemy in conjunction with pandas.
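As a rough illustration of the export step, the sketch below writes a small DataFrame to a database table and queries it back. It is not taken from process.py: the column names (`person_id`, `name`, `birth_year`), the table name `persons` and the sample rows are all hypothetical. To stay self-contained it uses a standard-library `sqlite3` connection, which pandas accepts directly; a SQLAlchemy engine could be passed to `to_sql` in the same position for other databases.

```python
import sqlite3

import pandas as pd

# Hypothetical records; the real columns produced by process.py may differ.
records = [
    {"person_id": 1, "name": "Example Person A", "birth_year": 1801},
    {"person_id": 2, "name": "Example Person B", "birth_year": 1825},
]
df = pd.DataFrame(records)

# In-memory SQLite database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Write the DataFrame to a table named "persons" (assumed name).
df.to_sql("persons", conn, index=False, if_exists="replace")

# Once stored, the data can be searched with ordinary SQL.
result = pd.read_sql_query(
    "SELECT name FROM persons WHERE birth_year > 1810", conn
)
print(result)

# The same DataFrame can also be exported as CSV text.
csv_text = df.to_csv(index=False)

conn.close()
```

Exporting to Excel works in a similar way through `df.to_excel(...)`, which needs an additional Excel writer package such as openpyxl to be installed.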

This is a work in progress, and the extraction is not yet complete. The main Python code for this process is in process.py.
