Add support for batching to AutoDiffComposition in PyTorch mode. #3204
Open
davidt0x wants to merge 58 commits into devel from pytorch_batch
+551
−194
Conversation
When PyTorch vmap is applied to a function, the function is called to trace it with tensors passed as inputs that are not normal PyTorch tensors. They are in fact BatchedImpl tensors with no underlying storage. These cannot be converted with double() or to numpy arrays, so I needed to remove those conversion calls in this special case.
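For context, a minimal standalone sketch of the behaviour described above (not code from this PR): inside torch.vmap the traced function receives batched wrapper tensors, so conversions like `.numpy()` have to be avoided on that path.

```python
import torch

def forward(x):
    # Inside vmap tracing, x is a BatchedTensor wrapper with no underlying
    # storage; calls such as x.double().numpy() would raise here, which is
    # why the conversion calls are skipped in this special case.
    return x * 2.0

xs = torch.randn(4, 3)
ys = torch.vmap(forward)(xs)  # forward sees per-example tensors of shape (3,)
```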
_batch_inputs was not actually returning batches of size batch_size.
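To illustrate the intended behaviour (the helper below is hypothetical, not the project's `_batch_inputs`): each batch should contain exactly batch_size trials, with only the final batch allowed to be smaller.

```python
import numpy as np

def batch_inputs(trials, batch_size):
    # Hypothetical helper: yield consecutive chunks of batch_size trials;
    # only the last chunk may be shorter.
    for start in range(0, len(trials), batch_size):
        yield trials[start:start + batch_size]

trials = np.arange(10)
assert [len(b) for b in batch_inputs(trials, 4)] == [4, 4, 2]
```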
Not sure this works in general to be honest.
I added a test that implements an equivalent network in PyTorch and PNL, initializes both with the same weights, trains them, and then compares the losses. It is similar to some other tests in the file, but this one uses a batch size > 1 and actually trains the PyTorch network as well.
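The PyTorch half of that comparison looks roughly like the sketch below (the PsyNeuLink half and the exact network are omitted; the names and sizes here are illustrative only):

```python
import torch

# Build a small torch network with a fixed seed, train it in minibatches,
# and record the per-minibatch losses; these would then be compared against
# the losses reported by the equivalent AutoDiffComposition initialized
# with the same weights and trained with the same batch size.
torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Linear(2, 10), torch.nn.Sigmoid(),
                            torch.nn.Linear(10, 1))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = torch.nn.MSELoss()

x = torch.rand(32, 2)
y = torch.rand(32, 1)
torch_losses = []
for start in range(0, 32, 8):            # minibatch size of 8
    xb, yb = x[start:start + 8], y[start:start + 8]
    opt.zero_grad()
    loss = loss_fn(model(xb), yb)
    loss.backward()
    opt.step()
    torch_losses.append(loss.item())
```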
Now that all_output_values contains the batch dimension as its first dimension, it causes an error when passed to output_CIM.execute, because it has the incorrect number of input ports (unless the batch size happens to match the number of input ports). I'm not sure what the best fix is, but disabling this line seems to work and all tests pass. Need to ask Jon.
A set of fixes for issues with ragged-array processing.
Since we have a batch dimension now, all the default variables need to be forced to 2D.
Not sure this is the correct fix; I am getting errors in the results, but I don't know how else to implement this.
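Forcing a default to 2D, as described above, amounts to something like this (illustrative only, not the project's actual code path):

```python
import numpy as np

# A 1D default variable such as [0., 0., 0.] gains an explicit outer
# dimension so it lines up with the new leading batch dimension.
default_variable = np.array([0.0, 0.0, 0.0])
default_variable_2d = np.atleast_2d(default_variable)  # shape (1, 3)
```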
I think I made this change in error; it was before I fixed the issues with handling ragged structures. Let's see if the tests pass.
Since we now need torch to set up inputs (before execute), I moved the check for it into learn.
Looks like old versions of numpy didn't throw ValueError.
Looks like the weights_only argument wasn't supported until that version. That is a pretty ancient version of PyTorch anyway. Also raised the upper version pin.
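For reference, a minimal illustration of the call in question (assuming the argument being referred to is torch.load's weights_only, which only newer PyTorch releases accept):

```python
import torch

# Older PyTorch versions raise TypeError on the weights_only keyword.
torch.save({"w": torch.zeros(3)}, "checkpoint.pt")
state = torch.load("checkpoint.pt", weights_only=True)
```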
This PR causes the following changes to the html docs (ubuntu-latest-3.11):
See CI logs for the full diff.
The implementation of batching in AutoDiffComposition's PyTorch mode previously did not take advantage of batch-level parallelism within PyTorch operations: the batch dimension was iterated over trial by trial. I have relaxed this constraint so that when minibatch_size > 1 is passed to learn, batches of tensors are processed in parallel by the PyTorch ops. In preliminary testing on a simple MNIST example with a multilayer perceptron, this gives an 11x speedup over devel (on CPU, with a batch size of 32).
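The parallelism being exploited is the standard batched-tensor behaviour of torch ops; a minimal sketch of the difference (not the PsyNeuLink code itself):

```python
import torch

W = torch.randn(100, 100)
batch = torch.randn(32, 100)  # a minibatch of 32 trials

# Old behaviour: the batch dimension iterated trial by trial.
per_trial = torch.stack([x @ W for x in batch])

# New behaviour: one batched matmul, letting PyTorch parallelize over the batch.
batched = batch @ W

assert torch.allclose(per_trial, batched)
```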
The core change behind most of the edits was to insert an explicit batch dimension at the front of all tensors, even when minibatch_size is 1. Most of the torch ops we were using already support a batch dimension, so they did not require many changes; some needed the second dimension specified as the operating dimension instead of the first. There are also some changes to how results are collected for runs: there is now a batched_results argument to AutoDiffComposition.run that unrolls the batch dimension when set to False (its default, to match the old behaviour). Beyond that, there was a fair amount of trial and error in cleaning up issues with ragged arrays.
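A hedged usage sketch of the interface described above; minibatch_size and batched_results come from this PR's description, but the construction details (mechanism arguments, input dictionary format) are illustrative and may not match the library exactly.

```python
import numpy as np
import psyneulink as pnl

in_mech = pnl.ProcessingMechanism(name="in", default_variable=[0.0, 0.0])
out_mech = pnl.ProcessingMechanism(name="out", default_variable=[0.0])
autodiff = pnl.AutoDiffComposition(pathways=[in_mech, out_mech])

inputs = {in_mech: np.random.rand(32, 2)}
targets = {out_mech: np.random.rand(32, 1)}

# minibatch_size > 1 now sends whole batches through the torch ops at once.
autodiff.learn(inputs={"inputs": inputs, "targets": targets, "epochs": 5},
               minibatch_size=8)

# batched_results=False (the default) unrolls the batch dimension in results.
results = autodiff.run(inputs=inputs, batched_results=False)
```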
I also added tests that train a network in both PyTorch and PsyNeuLink and compare their results end to end for different batch sizes.