The primary purpose of the Python code in this folder is to parse text data from the genealogy register published on www.vanderlinde.org.za into a pandas DataFrame. A DataFrame is a powerful tabular data structure from the pandas library, used for manipulating large datasets. Once the data is in a DataFrame, it can be exported to many formats, including Excel and, with the help of additional libraries, Word and PDF, as well as many database formats by using SQLAlchemy in conjunction with pandas.
This is a work in progress, so the extraction is not yet complete. The main Python code for this process is in process.py.
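
To illustrate the general idea of going from text to a DataFrame to an export, here is a minimal sketch. It is not taken from process.py. The sample lines, the regular expression, the column names, and the `parse_register` helper are all invented for illustration, and the real register's layout may well differ. It writes to an in-memory SQLite database through the standard-library `sqlite3` module to stay self-contained; a SQLAlchemy engine could presumably be passed to `to_sql` instead for other databases.

```python
import re
import sqlite3

import pandas as pd

# Hypothetical sample lines; the real register's layout may differ.
SAMPLE = """\
b1 Jan * 1750
c1 Pieter * 1775
c2 Maria * 1778
b2 Anna * 1753
"""

# Assumed line shape: a generation code (letter + number), a name,
# an asterisk, and a four-digit birth year.
LINE_RE = re.compile(
    r"^(?P<code>[a-z]\d+)\s+(?P<name>.+?)\s+\*\s+(?P<born>\d{4})$"
)


def parse_register(text):
    """Return a DataFrame with one row per line that matches LINE_RE."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            rows.append(match.groupdict())
    df = pd.DataFrame(rows, columns=["code", "name", "born"])
    df["born"] = df["born"].astype(int)
    return df


df = parse_register(SAMPLE)

# Export the same data as CSV text and as a table in an in-memory SQLite database.
csv_text = df.to_csv(index=False)
conn = sqlite3.connect(":memory:")
df.to_sql("people", conn, index=False)
row_count = conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
```

Lines that do not match the assumed pattern are skipped silently in this sketch; a real parser would likely want to log or collect them for review.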