A neural net is a graph (a generic one: it can have cycles, even self edges). Synapses are edges, neurons are vertices. In a densely connected graph the number of edges grows roughly as O(V^2), where V is the number of vertices. Most of the time is spent in FmCa, because that work scales with the number of synapses (edges).
-
Just writing down my thoughts so far, in case anyone has related ideas.
Since machines nowadays come with multi-core CPUs as well as GPUs, the idea here is to try to utilize them in order to gain speed improvements.
The strategy is to first make use of CPU cores, because that is a smaller step from what we currently have (all serial), but it also forces us to organize the code into functions that can run in parallel and to think about the relations between those functions (what needs to happen before the next thing is OK to run). Utilizing cores in Go is pretty easy once you have pure functions, using goroutines: it literally just takes the keyword "go" in front of the call. The goroutine scheduler is also pretty well thought out and tries to reuse as much as possible, so we don't have to worry about that part (at least not yet). Another obvious advantage is that the code is all still in Go, so it is more portable than a GPU-specific language.
There are 3 main parts. I am still trying to figure out the order in which things need to happen, but roughly these are the main parts, which can be found in axon/layer.go:
Long term, the plan is to move to parallelizing using a GPU compute shader. For that to be efficient we will probably need to keep the entire network in GPU memory; otherwise the time spent moving data around would be too high.