forked from keras-team/keras
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request keras-team#5 from fchollet/master
update
- Loading branch information
Showing
60 changed files
with
3,210 additions
and
488 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,106 @@ | ||
'''The example demonstrates how to write custom layers for Keras. | ||
We build a custom activation layer called 'Antirectifier', | ||
which modifies the shape of the tensor that passes through it. | ||
We need to specify two methods: `output_shape` and `get_output`. | ||
Note that the same result can also be achieved via a Lambda layer. | ||
Because our custom layer is written with primitives from the Keras | ||
backend (`K`), our code can run both on TensorFlow and Theano. | ||
''' | ||
|
||
from __future__ import print_function | ||
import numpy as np | ||
from keras.models import Sequential | ||
from keras.layers.core import Dense, Dropout, Layer, Activation | ||
from keras.datasets import mnist | ||
from keras import backend as K | ||
from keras.utils import np_utils | ||
|
||
|
||
class Antirectifier(Layer):
    '''Sample-wise L2 normalization followed by the concatenation of the
    positive part of the input with its negative part.

    The output contains samples twice as large as the input samples, and
    the layer can be used as a drop-in replacement for a ReLU.

    # Input shape
        2D tensor of shape (samples, n)

    # Output shape
        2D tensor of shape (samples, 2*n)

    # Theoretical justification
        A ReLU applied to activations roughly centered around 0 discards
        half of the incoming signal, which is inefficient. Antirectifier
        returns all-positive outputs like ReLU without discarding any
        data; on MNIST this lets a network with half the parameters reach
        comparable classification accuracy to a ReLU-based equivalent.
    '''
    @property
    def output_shape(self):
        # Feature dimension doubles; only 2D inputs are supported.
        in_shape = list(self.input_shape)
        assert len(in_shape) == 2  # only valid for 2D tensors
        return (in_shape[0], in_shape[1] * 2)

    def get_output(self, train):
        inp = self.get_input(train)
        # Center each sample, then L2-normalize it.
        centered = inp - K.mean(inp, axis=1, keepdims=True)
        normalized = K.l2_normalize(centered, axis=1)
        # Concatenate the rectified positive and negative parts along
        # the feature axis.
        return K.concatenate([K.relu(normalized), K.relu(-normalized)],
                             axis=1)
|
||
# global parameters
batch_size = 128
nb_classes = 10
nb_epoch = 40

# the data, shuffled and split between train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# flatten the 28x28 images into 784-dim float vectors scaled to [0, 1]
X_train = X_train.reshape(60000, 784).astype('float32') / 255
X_test = X_test.reshape(10000, 784).astype('float32') / 255
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')

# convert class vectors to one-hot binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)

# build the model: two Dense -> Antirectifier -> Dropout stages,
# followed by a softmax classifier
model = Sequential()
model.add(Dense(256, input_shape=(784,)))
model.add(Antirectifier())
model.add(Dropout(0.1))
model.add(Dense(256))
model.add(Antirectifier())
model.add(Dropout(0.1))
model.add(Dense(10))
model.add(Activation('softmax'))

# compile the model
model.compile(loss='categorical_crossentropy', optimizer='rmsprop')

# train the model
model.fit(X_train, Y_train,
          batch_size=batch_size, nb_epoch=nb_epoch,
          show_accuracy=True, verbose=1,
          validation_data=(X_test, Y_test))

# next, compare with an equivalent network
# with 2x bigger Dense layers and ReLU
Oops, something went wrong.