Learning matches Nengo learning more closely #139

Merged: 5 commits, Jan 16, 2019
Conversation

@hunse (Collaborator) commented Nov 5, 2018

This improves both the match between the emulator and Loihi learning, and between the emulator and Nengo learning. The result is that everything matches Nengo better.

Based on #137. Replaces #98.

TODO: I haven't tested this in extreme cases where we actually see overflow in the weights. I think some of these cases were referenced in #98; I (or someone) should look at them.

@hunse force-pushed the learn-nengo branch 2 times, most recently from 1bc9372 to 57ba8df on November 6, 2018 17:39
@hunse (Collaborator, Author) commented Nov 6, 2018

Okay, I fixed the overflow problem from #89 (triggered when initializing `test_multiple_pes` with `lambda x: 0`) and added a test for it. That test doesn't actually exercise weight overflow; what we were hitting were other overflows (specifically in `q0`, the input accumulator). To get the weights themselves to overflow, I think the easiest way is to lower the `wgtExp` for learning connections (which is always 4) to e.g. 0. However, we currently have no way to set that locally for a specific test, so we can't add a unit test for it.
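
For reference, here is a minimal sketch of the kind of network that hits this case: a PES-learned connection whose decoded function (and hence whose initial weights) is zero. This is an illustrative core-Nengo network, not the actual `test_multiple_pes` code.

```python
import nengo

with nengo.Network() as net:
    stim = nengo.Node(0.5)
    pre = nengo.Ensemble(100, dimensions=1)
    post = nengo.Ensemble(100, dimensions=1)
    nengo.Connection(stim, pre)

    # function=lambda x: 0 makes the initial decoders (and thus the
    # initial connection weights) all zero
    conn = nengo.Connection(
        pre, post, function=lambda x: 0,
        learning_rule_type=nengo.PES(learning_rate=1e-4),
    )

    # error = post - stim, fed back into the learning rule
    error = nengo.Ensemble(100, dimensions=1)
    nengo.Connection(post, error)
    nengo.Connection(stim, error, transform=-1)
    nengo.Connection(error, conn.learning_rule)
```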

@hunse (Collaborator, Author) commented Jan 9, 2019

I tried using the config system to set the weight exponent for learning connections. It wasn't pretty, and I finally gave up when core Nengo tests started failing: they add SPA-specific parameters, which I then tried to copy, but which didn't exist in the new config objects because they hadn't been initialized. You can see the progress here.

So now I've switched to just having an attribute on the Model to set this (see the new test). We can revisit the config approach later if we wish. Fingers crossed that everything passes now.
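
A hedged sketch of what that looks like in use; the attribute name `pes_wgt_exp`, its default of 4, and the `model` keyword are assumptions based on this discussion rather than a documented API:

```python
import nengo
import nengo_loihi
from nengo_loihi.builder import Model

# A tiny network with a PES learning connection
with nengo.Network() as net:
    pre = nengo.Ensemble(50, dimensions=1)
    post = nengo.Ensemble(50, dimensions=1)
    conn = nengo.Connection(pre, post, learning_rule_type=nengo.PES())
    error = nengo.Ensemble(50, dimensions=1)
    nengo.Connection(post, error)
    nengo.Connection(error, conn.learning_rule)

# Build with a custom model so the learning weight exponent can be
# lowered, making weight overflow reachable in a test.
model = Model()
model.pes_wgt_exp = 0  # assumed attribute name; the default would be 4

with nengo_loihi.Simulator(net, model=model, target="sim") as sim:
    sim.run(0.1)
```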

@drasmuss (Member) left a comment

Looks good overall, mainly just documentation/testing questions! It'll definitely be nice to have the learning matching more closely (and nice to have the math worked out explicitly rather than relying on some approximations we worked out heuristically).

Review comments on:
nengo_loihi/loihi_interface.py
nengo_loihi/builder.py
nengo_loihi/loihi_api.py
nengo_loihi/loihi_cx.py
nengo_loihi/tests/test_learning.py
@hunse (Collaborator, Author) commented Jan 10, 2019

Ok, I think I've addressed all of Dan's comments.

@drasmuss (Member) commented:

Everything looks good on the emulator, but test_pes_comm_channel is failing for me on hardware.

@hunse (Collaborator, Author) commented Jan 11, 2019

Ok, I've fixed test_pes_comm_channel on hardware.

As part of that, I removed the fudge factor in the learning overflow. It had originally been added to make the emulator behave more like the chip, but I don't think it actually did: it made the chip learn significantly more slowly than the emulator. With it removed, the chip learns slightly faster. I'm still not sure why; it may have to do with how stochastic rounding happens on the chip. In any case, the emulator and chip are closer now, so overall I think it's a win.
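
For readers unfamiliar with the term, stochastic rounding rounds a value up or down at random, with the probability of rounding up equal to the fractional part, so the result is unbiased in expectation. A generic sketch of the idea (not the chip's actual implementation):

```python
import numpy as np

def stochastic_round(x, rng):
    """Round each element of x up or down at random.

    The probability of rounding up equals the fractional part,
    so E[stochastic_round(x)] == x.
    """
    floor = np.floor(x)
    return floor + (rng.uniform(size=np.shape(x)) < (x - floor))

rng = np.random.RandomState(0)
print(stochastic_round(np.array([0.25, 0.5, 2.75]), rng))
```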

The two new commits can probably get squashed into existing ones if we want. I just kept them separate for now.

Finally, as part of our testing for this, we should probably look at the adaptive control notebook and check it on the emulator and the chip.

@drasmuss (Member) left a comment

I added a commit to fix the adaptive control notebook, and added a changelog entry.

I can't connect to the hardware at the moment, but I trust you that tests are passing there now 😄 . For the same reason I couldn't test the adaptive control notebook on hardware.

I just had one more question up above about the change to wgtExp. Not blocking or anything; that's just for my own curiosity.

Assuming that all looks good to you, this looks good to me.

@drasmuss removed their assignment Jan 11, 2019
@drasmuss (Member) commented:

Confirmed that the tests pass and notebook looks good on hardware 👌

@drasmuss self-assigned this Jan 11, 2019
@drasmuss removed their assignment Jan 11, 2019
@hunse force-pushed the learn-nengo branch 2 times, most recently from 673bd67 to 9cb8614 on January 14, 2019 14:51
@hunse (Collaborator, Author) commented Jan 14, 2019

Ok, I think I've tested almost all of the newly added lines.

There's one emulator error, for spikes being dropped in the trace, that is not being tested. I'm not quite sure how to test it: I think we'd need input that spikes really fast (possibly just having a number of neurons converge on the same input axon in the post population would do it). I'd definitely want a test that makes sure the emulator matches the chip in this respect, so for now I'd recommend just leaving this untested.

There's also a line in `discretize_weights` for `lossy_shift=False` that isn't tested. I mostly added it to allow quick comparisons of what things would be like if the chip didn't do the lossy shift. The current chip does do it, though, so the `lossy_shift=False` path isn't exercised anywhere. Again, I'm inclined to say that's fine.
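
To illustrate what "lossy" means here (a generic sketch, not the `discretize_weights` implementation): when an integer value is shifted right before being used, its low-order bits are discarded, so shifting back left does not recover the original value.

```python
import numpy as np

w = np.array([5, 7, 9])

# Right-shifting drops the low-order bit, so the round trip is lossy...
lossy = (w >> 1) << 1    # -> [4, 6, 8]

# ...whereas an exact (floating-point) scaling round trip is not.
exact = (w / 2) * 2      # -> [5., 7., 9.]
print(lossy, exact)
```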

@drasmuss (Member) left a comment

This all looks good to me now! I'll clean up the history and re-run the tests on hardware, then this should be good to merge.

@drasmuss self-assigned this Jan 15, 2019
@drasmuss force-pushed the learn-nengo branch 2 times, most recently from bd27bcd to d66394d on January 15, 2019 15:50
@drasmuss removed their assignment Jan 15, 2019
@tbekolay self-assigned and then unassigned this Jan 15, 2019
hunse and others added 5 commits January 15, 2019 18:46
Before, we were checking that the number of learning axons was
greater than the first learning index, which doesn't make sense.
By reducing `vth_max` for these connections.

This is most common on learning connections with zeroed
initial weights, so we'll test in `test_multiple_pes`.
This commit ensures that both the emulator and chip match Nengo
in terms of basic PES learning. The emulator now does learning
much more accurately, which allows us to better map PES learning
parameters (i.e., learning rate, synapse decay) to the chip.

This commit also makes several changes to learning tests:

- The PES learning test has been simplified and explicitly
  compares PES learning on Nengo Loihi with core Nengo.
- `test_multiple_pes` was fixed. It was not passing when using
  Nengo as the simulator, because it had been designed with the old
  Loihi learning rates in mind; it needed fixing now that Loihi
  learning is closer to Nengo learning.
- Several test tolerances have been adjusted.
  Note that some tolerances were changed for the chip; in general,
  the chip seems to learn slightly faster than the emulator
  (I'm not quite sure why), but this difference seems to be less
  apparent for more dimensions/neurons.

And adds additional tests:

- Add test for PES error clipping.
- Add test for learning overflow.
- Add test for PES learning with 'simreal' emulator.
- Add test for trace increment clip warning.
- Add test for discretize_weights.lossy_shift.
- Add test for learning trace dropping.

Co-authored-by: Daniel Rasmussen <[email protected]>
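
As a rough illustration of the Nengo-vs-Loihi comparison described in the final commit message, a hedged sketch: the network follows the usual nengo-loihi PES example pattern, and its details (sizes, durations, synapses) are assumptions, not the actual `test_pes_comm_channel` code.

```python
import numpy as np
import nengo
import nengo_loihi

def build_net(seed=0):
    # Illustrative PES communication channel with zeroed initial decoders
    with nengo.Network(seed=seed) as net:
        stim = nengo.Node(lambda t: np.sin(2 * np.pi * t))
        pre = nengo.Ensemble(100, dimensions=1)
        post = nengo.Ensemble(100, dimensions=1)
        nengo.Connection(stim, pre)
        conn = nengo.Connection(
            pre, post, function=lambda x: 0,
            learning_rule_type=nengo.PES(learning_rate=1e-4))
        # error = post - stim, fed into the learning rule
        error = nengo.Ensemble(100, dimensions=1)
        nengo.Connection(post, error)
        nengo.Connection(stim, error, transform=-1)
        nengo.Connection(error, conn.learning_rule)
        net.probe = nengo.Probe(post, synapse=0.02)
    return net

# Run the same network in core Nengo and in the Nengo Loihi emulator
net1 = build_net()
with nengo.Simulator(net1) as ref:
    ref.run(2.0)

net2 = build_net()
with nengo_loihi.Simulator(net2, target="sim") as emu:
    emu.run(2.0)

# After this PR, the learned outputs should track each other closely;
# print the mean absolute difference over the final 100 timesteps
print(np.abs(ref.data[net1.probe][-100:]
             - emu.data[net2.probe][-100:]).mean())
```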