Mimic Text Generator

Baseline Markov chain text generators built in Python and Spark. The plan is to use them as a point of comparison against deep learning approaches.

Currently supports two n-gram models: one built in Python and one built in PySpark. The API takes a path to text files, applies basic tokenization to the documents, and feeds the resulting tokens into the Markov model.
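To make the pipeline concrete, here is a minimal pure-Python sketch of the idea: read text files, tokenize them, and build an n-gram Markov model. It is an illustration only, not this repository's actual API. The function names (`tokenize`, `read_tokens`, `build_model`, `generate`), the `*.txt` file pattern, and the tokenization regex are all assumptions.

```python
import random
import re
from collections import defaultdict
from pathlib import Path


def tokenize(text):
    # Basic tokenization (assumed scheme): lowercase, then split into
    # word tokens and a few punctuation marks.
    return re.findall(r"[a-z0-9']+|[.,!?;]", text.lower())


def read_tokens(path):
    # Tokenize every .txt file directly under `path`, one token list per file.
    files = sorted(Path(path).glob("*.txt"))
    return [tokenize(f.read_text(encoding="utf-8")) for f in files]


def build_model(documents, n=2):
    # Map each (n-1)-token context to the list of tokens seen right after it.
    # Repeated followers are kept, so sampling follows observed frequencies.
    if n < 2:
        raise ValueError("n must be at least 2")
    model = defaultdict(list)
    for tokens in documents:
        for i in range(len(tokens) - n + 1):
            context = tuple(tokens[i:i + n - 1])
            model[context].append(tokens[i + n - 1])
    return model


def generate(model, length=20, seed=None):
    # Start from a random context, then repeatedly sample a follower
    # of the most recent (n-1) tokens.
    rng = random.Random(seed)
    context = rng.choice(sorted(model))
    output = list(context)
    while len(output) < length:
        followers = model.get(context)
        if not followers:
            break  # dead end: this context was never followed by anything
        output.append(rng.choice(followers))
        context = tuple(output[-len(context):])
    return " ".join(output)
```

Under these assumptions, usage would look something like `generate(build_model(read_tokens("corpus/"), n=3), length=50)`, where `corpus/` is a hypothetical directory of text files. A larger `n` tends to produce more coherent output but copies longer spans from the training text.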

Blog post associated with this work: https://towardsdatascience.com/natural-language-generation-part-1-back-to-basics-2f0b2654624f