Python3 compatibility #4

Open · azouaoui-cv wants to merge 5 commits into master
Conversation

azouaoui-cv commented:

I refactored the current code base so that it works with Python 3 as well as Python 2 (mostly adding parentheses to print statements and replacing xrange with the built-in range). I updated README.md accordingly.
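
Illustratively, the kind of change involved looks like this (a hypothetical before/after, not an exact diff from this PR):

```python
error = 0.25        # placeholder values for illustration
training_steps = 3

# Python 2 only:
#   print "error:", error
#   for i in xrange(training_steps): ...

# Portable across Python 2 and 3: print as a function, built-in range.
print("error:", error)
for i in range(training_steps):
    print(i)
```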

I tested both using conda virtual environments (plain Python 2.7/3.6 with numpy).

I fixed the broken fancy_mlp.py implementation, which was throwing a ValueError.
I also increased the number of training steps in fancy_mlp.py from 20000 to 40000, since convergence has proven to be slow. Here is the traceback:

  File "fancy_mlp.py", line 74, in <module>
    main()
  File "fancy_mlp.py", line 71, in main
    train(inputs, keys, layer_one_weights, layer_two_weights)
  File "fancy_mlp.py", line 53, in train
    layer_one_weights +=  np.dot(layer_one_prediction.T, layer_one_change_in_error)
ValueError: operands could not be broadcast together with shapes (3,4) (4,4) (3,4)```


Traceback (most recent call last):
  File "fancy_mlp.py", line 74, in <module>
    main()
  File "fancy_mlp.py", line 71, in main
    train(inputs, keys, layer_one_weights, layer_two_weights)
  File "fancy_mlp.py", line 53, in train
    layer_one_weights +=  np.dot(layer_one_prediction.T, layer_one_change_in_error)
ValueError: operands could not be broadcast together with shapes (3,4) (4,4) (3,4)
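
For reference, a minimal shape walk-through of why this line breaks (assuming the usual XOR-style (4,3) input matrix these scripts use; the (3,4) and (4,4) shapes come straight from the error message):

```python
import numpy as np

inputs = np.zeros((4, 3))                 # four samples, three features (assumed)
layer_one_weights = np.zeros((3, 4))      # (3,4), per the error message
layer_one_prediction = np.zeros((4, 4))   # inputs (4,3) @ weights (3,4) -> (4,4)
layer_one_change_in_error = np.zeros((4, 4))

# Broken update: (4,4).T @ (4,4) -> (4,4), which cannot be added in place
# to the (3,4) weight matrix, hence the ValueError above:
# layer_one_weights += np.dot(layer_one_prediction.T, layer_one_change_in_error)

# Working update: inputs.T (3,4) @ delta (4,4) -> (3,4), matching the weights.
layer_one_weights += np.dot(inputs.T, layer_one_change_in_error)
```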

Increase training steps to 40000

```diff
-# Figure out how wrong our output for layer_one was by seeing how wrong the layer_two_prediction was
+# Figure out how wrong our output for layer_one was by seeing how wrong the layer_two_prediction was
 layer_one_error = np.dot(layer_two_change_in_error, layer_two_weights.T)
```
Owner:

Why did the spacing change here?

Author:

Whoops, I have been trying out vim and messed up the spacing. Reverting it :)

```diff
 # Just like in simple_mlp.py
-layer_one_change_in_error = layer_one_error * applySigmoid(layer_one_error, True)
+layer_one_change_in_error = layer_one_error * applySigmoid(layer_one_prediction, True)
```
Owner:

Wait, why did this change? This should be layer_one_error

Author:

Again, this is inconsistent with the simple_mlp.py implementation and will prevent convergence.
It is even inconsistent with the layer_two_change_in_error definition above.
layer_*N*_change_in_error should be defined as layer_*N*_error * applySigmoid(layer_*N*_prediction, True).

I refer to the Wikipedia article on Backpropagation (see Phase 2: weight update, under "Algorithm in code"), where for each weight:

  • The weight's output delta and input activation are multiplied to find the gradient of the weight.
  • A ratio of the weight's gradient is subtracted from the weight.

Give it a try on your own laptop and you will see the difference!
To check Python 3 compatibility you can create a conda virtual environment with Python 3, but I bet you already know about that ;)
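
A minimal end-to-end sketch of the convention described above (illustrative, not the repo's exact code; it assumes applySigmoid(x, True) computes x * (1 - x) on an already-sigmoided activation, and the usual XOR-style toy data):

```python
import numpy as np

def applySigmoid(x, derivative=False):
    # Assumed semantics of the repo's helper: when derivative=True, x is
    # already a sigmoid activation, so sigmoid'(z) simplifies to x * (1 - x).
    if derivative:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))

# XOR-style toy data (assumed; the repo's actual inputs may differ).
inputs = np.array([[0, 0, 1],
                   [0, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]])
keys = np.array([[0], [1], [1], [0]])

np.random.seed(1)
layer_one_weights = 2 * np.random.random((3, 4)) - 1
layer_two_weights = 2 * np.random.random((4, 1)) - 1

for step in range(40000):
    # Forward pass.
    layer_one_prediction = applySigmoid(np.dot(inputs, layer_one_weights))
    layer_two_prediction = applySigmoid(np.dot(layer_one_prediction, layer_two_weights))

    # Backward pass: each layer's delta is its error times the sigmoid
    # derivative of that layer's *prediction* (activation), not of its error.
    layer_two_error = keys - layer_two_prediction
    layer_two_change_in_error = layer_two_error * applySigmoid(layer_two_prediction, True)

    layer_one_error = np.dot(layer_two_change_in_error, layer_two_weights.T)
    layer_one_change_in_error = layer_one_error * applySigmoid(layer_one_prediction, True)

    # Weight update: each layer's *input* activation, transposed, times its delta.
    layer_two_weights += np.dot(layer_one_prediction.T, layer_two_change_in_error)
    layer_one_weights += np.dot(inputs.T, layer_one_change_in_error)

print("Output After Training:")
print(layer_two_prediction)
```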

```python
print_data(iter, inputs, keys, weights, prediction)
# adjust your weights accordingly.
layer_one_weights += np.dot(inputs.T, layer_one_change_in_error)
layer_two_weights += np.dot(layer_one_prediction.T, layer_two_change_in_error)
```
Owner:

I'll look at this more later, but shouldn't this be

```python
layer_one_weights += np.dot(layer_one_prediction.T, layer_one_change_in_error)
layer_two_weights += np.dot(layer_two_prediction.T, layer_two_change_in_error)
```

azouaoui-cv (Author) commented on Jul 31, 2018:

Then it's inconsistent with simple_mlp.py.
There you update the weights with weights += np.dot(inputs.T, change_in_error), which works fine.

Have you tried running your fancy_mlp implementation?

First off, it throws an error because the shapes do not match (see the PR commit message for details).
Now, if I only change the updates to layer_one_weights += np.dot(layer_one_prediction.T, layer_one_change_in_error) and layer_two_weights += np.dot(layer_two_prediction.T, layer_two_change_in_error), it does not converge.
This is the training output after 40000 steps:

```
Output After Training:
[[0.49949126]
 [0.49944283]
 [0.49888797]
 [0.50120373]]
```

This is why I also changed layer_one_change_in_error. Again, this is consistent with the simple_mlp.py implementation.
And here is the result after 40000 steps:

```
Output After Training:
[[0.0052001 ]
 [0.99379807]
 [0.99356872]
 [0.00481841]]
```

farzaa (Owner) left a review:

Need changes!
