Python3 compatibility #4
base: master
Conversation
Fix weights and keys initialization

```
Traceback (most recent call last):
  File "fancy_mlp.py", line 74, in <module>
    main()
  File "fancy_mlp.py", line 71, in main
    train(inputs, keys, layer_one_weights, layer_two_weights)
  File "fancy_mlp.py", line 53, in train
    layer_one_weights += np.dot(layer_one_prediction.T, layer_one_change_in_error)
ValueError: operands could not be broadcast together with shapes (3,4) (4,4) (3,4)
```

Increase training steps to 40000
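For reference, the broadcast error is a pure shape mismatch. The repo's actual arrays are not quoted in this thread, so the shapes below are assumptions inferred from the traceback (a `(3, 4)` first weight matrix) and the 4-row outputs quoted later in the conversation; the sketch only illustrates the dimensions involved:

```python
import numpy as np

# Assumed shapes (not the repo's real data): 4 samples, 3 input features, 4 hidden units.
inputs = np.zeros((4, 3))
layer_one_weights = np.zeros((3, 4))
layer_one_prediction = np.zeros((4, 4))       # one row per sample, one column per hidden unit
layer_one_change_in_error = np.zeros((4, 4))  # same shape as layer_one_prediction

# Original line: the product is (4, 4), which cannot be added in place to the (3, 4) weights.
# layer_one_weights += np.dot(layer_one_prediction.T, layer_one_change_in_error)  # ValueError

# Fixed line: inputs.T is (3, 4), so the product is (3, 4) and matches the weight matrix.
layer_one_weights += np.dot(inputs.T, layer_one_change_in_error)
print(layer_one_weights.shape)  # (3, 4)
```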
```diff
-# Figure out how wrong our output for layer_one was by seeing how wrong the layer_two_prediction was
+# Figure out how wrong our output for layer_one was by seeing how wrong the layer_two_prediction was
 layer_one_error = np.dot(layer_two_change_in_error, layer_two_weights.T)
```
Why did the spacing change here?
Whoops, I have been trying out vim and messed up the spacing. Reverting it :)
```diff
-# Just like in simple_mlp.py
-layer_one_change_in_error = layer_one_error * applySigmoid(layer_one_error, True)
+# Just like in simple_mlp.py
+layer_one_change_in_error = layer_one_error * applySigmoid(layer_one_prediction, True)
```
Wait, why did this change? This should be `layer_one_error`.
Again, this is inconsistent with the `simple_mlp.py` implementation and will lead to no convergence. It is even inconsistent with the `layer_two_change_in_error` definition above. `layer_N_change_in_error` should be defined as `layer_N_error * applySigmoid(layer_N_prediction, True)`.
I refer to the Wikipedia article on Backpropagation (see "Phase 2: weight update" under "Algorithm in code"), where for each weight:
- The weight's output delta and input activation are multiplied to find the gradient of the weight.
- A ratio of the weight's gradient is subtracted from the weight.

Give it a try on your own laptop; you will see the difference!
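In case it helps, here is a minimal numpy sketch of that phase-2 update for the two-layer case, written in the naming style of this thread. The `applySigmoid` helper and the `keys - prediction` error convention are assumptions for illustration (neither is quoted in this PR), and that sign convention is why the gradient is added rather than subtracted:

```python
import numpy as np

def applySigmoid(x, derivative=False):
    # Assumed helper: sigmoid of x, or its derivative expressed in terms of
    # an already-sigmoided activation when derivative=True.
    if derivative:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

def train_step(inputs, keys, layer_one_weights, layer_two_weights):
    # Forward pass: activation of each layer.
    layer_one_prediction = applySigmoid(np.dot(inputs, layer_one_weights))
    layer_two_prediction = applySigmoid(np.dot(layer_one_prediction, layer_two_weights))

    # Output delta: error times the derivative of the output activation.
    layer_two_error = keys - layer_two_prediction
    layer_two_change_in_error = layer_two_error * applySigmoid(layer_two_prediction, True)

    # Hidden delta: backpropagated error times the derivative of the hidden activation.
    layer_one_error = np.dot(layer_two_change_in_error, layer_two_weights.T)
    layer_one_change_in_error = layer_one_error * applySigmoid(layer_one_prediction, True)

    # Phase 2: each weight's gradient is its input activation times its output delta.
    # Because the error is (target - prediction), the gradient is added here.
    layer_one_weights += np.dot(inputs.T, layer_one_change_in_error)
    layer_two_weights += np.dot(layer_one_prediction.T, layer_two_change_in_error)
    return layer_one_weights, layer_two_weights
```

With the deltas defined this way, every matrix product lines up with the shape of the weight matrix it updates, which is exactly what the single-layer update in `simple_mlp.py` does.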
To check Python 3 compatibility you can create a `conda` virtual environment with Python 3, but I bet you already know about that ;)
```python
print_data(iter, inputs, keys, weights, prediction)
# adjust your weights accoridngly.
layer_one_weights += np.dot(inputs.T, layer_one_change_in_error)
layer_two_weights += np.dot(layer_one_prediction.T, layer_two_change_in_error)
```
I'll look at this more later, but shouldn't this be

```python
layer_one_weights += np.dot(layer_one_prediction.T, layer_one_change_in_error)
layer_two_weights += np.dot(layer_two_prediction.T, layer_two_change_in_error)
```
Then it's inconsistent with `simple_mlp.py`. In there you update the weights by doing `weights += np.dot(inputs.T, change_in_error)`, which works fine.

Have you tried running your `fancy_mlp` implementation? First off, it throws an error since the shapes do not match (see the PR commit message for details). Now, if I only change the updates to `layer_one_weights += np.dot(layer_one_prediction.T, layer_one_change_in_error)` and `layer_two_weights += np.dot(layer_two_prediction.T, layer_two_change_in_error)` as you suggest, it does not converge.
This is the training output after 40000 steps:

```
Output After Training:
[[0.49949126]
 [0.49944283]
 [0.49888797]
 [0.50120373]]
```
This is why I also change `layer_one_change_in_error`. Again, this is consistent with the `simple_mlp.py` implementation. And here is the result after 40000 steps:

```
Output After Training:
[[0.0052001 ]
 [0.99379807]
 [0.99356872]
 [0.00481841]]
```
Need changes!
I refactored the current code base so that it works with Python 3 as well as Python 2 (mostly adding parentheses to print statements and ditching `xrange` for the built-in `range`). I updated the `README.md` section accordingly.

I tested both using `conda` virtual environments (plain Python 2.7/3.6 with `numpy`).

I fixed the broken `fancy_mlp.py` implementation, as it was throwing a `ValueError`.

I also increased the number of training steps in `fancy_mlp.py` from `20000` to `40000`, as convergence has proven to be slow.
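For anyone reading along, a hypothetical before/after of the kind of change described above (not the repo's actual lines) looks like this; a single-argument `print(...)` call and the built-in `range` behave the same on Python 2.7 and 3.x:

```python
import numpy as np

error = np.array([0.1, -0.2, 0.05])  # made-up values, just to make the snippet runnable

# Python 2 only (what gets removed):
#   print "Error: " + str(np.mean(np.abs(error)))
#   for iter in xrange(40000):
#       ...

# Python 2 and 3 compatible (what it becomes):
print("Error: " + str(np.mean(np.abs(error))))
for iter in range(40000):
    pass
```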