v1.2.1

minimaxir committed May 5, 2018
1 parent c46a758 commit 62837b6
Showing 3 changed files with 13 additions and 11 deletions.
10 changes: 5 additions & 5 deletions README.md
@@ -2,16 +2,16 @@

![dank text](/docs/textgenrnn_console.gif)

- Generate text using a pretrained neural network with a few lines of code, or easily train your own text-generating neural network of any size and complexity on any text dataset.
+ Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code, or quickly train on a text using a pretrained model.

textgenrnn is a Python 3 module on top of [Keras](https://github.com/fchollet/keras)/[TensorFlow](https://www.tensorflow.org) for creating [char-rnn](http://karpathy.github.io/2015/05/21/rnn-effectiveness/)s, with many cool features:

* A modern neural network architecture that utilizes new techniques such as attention-weighting and skip-embedding to accelerate training and improve model quality.
* Able to train on and generate text at either the character level or word level.
* Able to configure RNN size, the number of RNN layers, and whether to use bidirectional RNNs.
* Able to train on any generic input text file, including large files.
- * Able to train models on a GPU and then use them with a CPU.
- * Able to utilize a powerful CuDNN implementation of RNNs when trained on the GPU, which massively speeds up training time as opposed to normal LSTM implementations.
+ * Able to train models on a GPU and then use them to generate text with a CPU.
+ * Able to utilize a powerful CuDNN implementation of RNNs when trained on the GPU, which massively speeds up training time as opposed to typical LSTM implementations.
* Able to train the model using contextual labels, allowing it to learn faster and produce better results in some cases.

You can play with textgenrnn and train any text file with a GPU *for free* in this [Colaboratory Notebook](https://drive.google.com/file/d/1mMKGnVxirJnqDViH7BDJxFqWrsXlPSoK/view?usp=sharing)!
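As a minimal sketch of the pretrained workflow described above (the `textgenrnn` class and its `generate` method are the package's documented entry points):

```python
from textgenrnn import textgenrnn

# Instantiate with the included pretrained weights and generate
# text without any additional training.
textgen = textgenrnn()
textgen.generate()
```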
@@ -29,7 +29,7 @@ textgen.generate()
[Spoiler] Anyone else find this post and their person that was a little more than I really like the Star Wars in the fire or health and posting a personal house of the 2016 Letter for the game in a report of my backyard.
```

- The model can easily be trained on new texts, and can generate appropriate text *even after a single pass of the input data*.
+ The included model can easily be trained on new texts, and can generate appropriate text *even after a single pass of the input data*.

```python
textgen.train_from_file('hacker-news-2000.txt', num_epochs=1)
@@ -55,7 +55,7 @@ Urburg to Firefox acquires Nelf Multi Shamn
Kubernetes by Google’s Bern
```

- You can also train a new model, with support for word level embeddings and bidirectional layers.
+ You can also train a new model, with support for word-level embeddings and bidirectional RNN layers.
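A minimal sketch of such a call, assuming `train_from_file` accepts the new-model configuration flags documented in the demo notebook (`corpus.txt` is a placeholder file name):

```python
from textgenrnn import textgenrnn

textgen = textgenrnn()

# Train a fresh model from scratch instead of fine-tuning the
# included pretrained weights; 'corpus.txt' is a placeholder.
textgen.train_from_file(
    'corpus.txt',
    new_model=True,          # build a new model rather than fine-tune
    word_level=True,         # word-level embeddings instead of characters
    rnn_bidirectional=True,  # bidirectional RNN layers
    num_epochs=10,
)
```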

## Usage

5 changes: 3 additions & 2 deletions docs/textgenrnn-demo.ipynb
@@ -354,8 +354,9 @@
"* `num_epochs`: Number of epochs to train for (default: 50)\n",
"* `gen_epochs`: Number of epochs to run between generating sample outputs; good for measuring model progress (default: 1)\n",
"* `batch_size`: Batch size for training; may want to increase if running on a GPU for faster training (default: 128)\n",
- "* `train_size`: Random proportion of sequence samples to keep: good for controlling overfitting. The rest will be used to train as the validation set. (default: 1.0/all)\n",
- "* `dropout`: Random number of tokens to ignore each epoch. Good for controlling overfitting/making more resilient against typos, but setting too high will cause network to converge prematurely. (default: 0.0)"
+ "* `train_size`: Random proportion of sequence samples to keep: good for controlling overfitting. The rest will be used as the validation set. (default: 1.0/all). To skip evaluating on the validation set (for speed), set `validation=False`.\n",
+ "* `dropout`: Random proportion of tokens to ignore each epoch. Good for controlling overfitting/making the model more resilient to typos, but setting it too high will cause the network to converge prematurely. (default: 0.0)\n",
+ "* `is_csv`: Use with `train_from_file` if the source file is a one-column CSV (e.g. an export from BigQuery or Google Sheets) for proper quote/newline escaping."
]
},
{
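Taken together, these options map onto a training call along the following lines; this is a hypothetical invocation for illustration, and `exported.csv` is a placeholder one-column CSV file:

```python
from textgenrnn import textgenrnn

textgen = textgenrnn()

# Keep 80% of sequence samples for training (the remainder becomes
# the validation set), ignore 5% of tokens each epoch, and print
# sample output every 5 epochs.
textgen.train_from_file(
    'exported.csv',   # placeholder: a one-column CSV export
    is_csv=True,
    num_epochs=20,
    gen_epochs=5,
    batch_size=128,
    train_size=0.8,
    dropout=0.05,
)
```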
9 changes: 5 additions & 4 deletions setup.py
@@ -1,9 +1,9 @@
from setuptools import setup, find_packages

long_description = '''
- Generate text using a pretrained neural network with a few lines of code,
- or easily train your own text-generating neural network of any size
- and complexity on any text dataset.
+ Easily train your own text-generating neural network of
+ any size and complexity on any text dataset with a few lines
+ of code, or quickly train on a text using a pretrained model.
* A modern neural network architecture that utilizes new techniques such as
attention-weighting and skip-embedding to accelerate training
@@ -25,10 +25,11 @@
setup(
name='textgenrnn',
packages=['textgenrnn'], # this must be the same as the name above
- version='1.2',
+ version='1.2.1',
description='Pretrained character-based neural network for ' \
'easily generating text.',
long_description=long_description,
+ long_description_content_type='text/markdown',
author='Max Woolf',
author_email='[email protected]',
url='https://github.com/minimaxir/textgenrnn',