Read a full description of the project here.
This project is an attempt to use deep learning to generate haikus that conform to the 5-7-5 syllable pattern. Much previous research into generating haikus doesn't enforce syllable counts[1][2], largely because modern English haikus often don't strictly conform to that pattern either. This makes finding training data difficult. I get around this problem by providing the syllable count of each line as an input to the network along with the text at training time. Then, at generation time, I can choose how many syllables I want for each line. This project is still early, but so far I've gotten some promising results.
Here is an example of 5-7-5 syllable output:
early morning sun
from the carried garden fate
stars at the sunset
And if I use the same network to get a 10-10-10 poem:
just as the street lamp spake the sun is bright
and the soul and the spring are blowing
with every beat of my heart i will love you
The first version of the model is implemented in notebooks/models/v1.
The model is essentially a character-to-character text generation network with a twist. The number of syllables for each line is provided to the network, passed through a dense layer and then added to the LSTM's internal state. This means that by changing the three numbers provided, we can alter the behavior of the network. My hope is that this will still allow the network to learn "English" from the whole corpus even though most of the samples are not 5–7–5 haiku, while still allowing us to generate haiku of that length later.
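To make the conditioning step concrete, here is a minimal sketch in plain Python of the idea described above: the three syllable counts go through a dense layer and the result is added to a recurrent state vector. This is not the code from notebooks/models/v1. The names `dense`, `condition_state` and `STATE_SIZE`, the tanh activation, the tiny state size and the random weights are all assumptions for illustration, and the sketch does not say whether the real model adds the projection once or at every time step.

```python
import math
import random


def dense(inputs, weights, biases):
    """Fully connected layer with a tanh activation (the activation is an assumption)."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def condition_state(state, syllable_counts, weights, biases):
    """Add a dense projection of the syllable counts to a recurrent state vector."""
    projection = dense(syllable_counts, weights, biases)
    return [s + p for s, p in zip(state, projection)]


# Hypothetical sizes and untrained weights, for illustration only.
STATE_SIZE = 4
rng = random.Random(0)
weights = [[rng.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(STATE_SIZE)]
biases = [0.0] * STATE_SIZE
state = [0.0] * STATE_SIZE

# The same weights give a different starting state for different syllable targets.
haiku_state = condition_state(state, [5, 7, 5], weights, biases)
long_state = condition_state(state, [10, 10, 10], weights, biases)
print(haiku_state)
print(long_state)
```

Because the counts only enter through this projection, switching from 5-7-5 to 10-10-10 changes the state the text generator works from without touching any of the learned weights.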
The notebooks directory contains the code, organized into:
data: Jupyter notebooks for working with and preparing the data.
models: Jupyter notebooks and Python files implementing the different models.
input contains the raw input data as well as haikus.csv, which contains the whole corpus, and sources.txt, which describes the sources used to build that corpus. Preprocess Haikus.ipynb constructs the corpus.