Improve Batched GEMV Speed #113

Open
wants to merge 20 commits into
base: master

Contributor

@shaunren shaunren commented Jul 10, 2016

Using column-major instead of row-major matrices is found to improve the speed of GEMV in nearly all cases, including practical scenarios such as Spaun 2.0.

Also, the LIF kernel is updated to use the more accurate LIF model in Nengo 2.1.1. This is found to have negligible performance penalties.

The current, non-autotuned column-major kernel reduces total simulation time on Spaun 2.0 by 31 s (202 s -> 171 s) on a GTX 970.
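To illustrate the layout change described above, here is a minimal pure-Python sketch (not the OpenCL kernel from this PR). The helper names `gemv_row_major`, `gemv_col_major`, and `to_col_major` are hypothetical. It assumes a kernel that maps neighbouring work-items to neighbouring output rows, in which case column-major storage puts the elements they read at adjacent addresses; that is a commonly cited reason for such a speedup, not something the PR text confirms.

```python
def gemv_row_major(a, m, n, x):
    """y = A @ x where the m x n matrix A is stored row-major in flat list a."""
    return [sum(a[i * n + j] * x[j] for j in range(n)) for i in range(m)]


def gemv_col_major(a, m, n, x):
    """y = A @ x where the m x n matrix A is stored column-major in flat list a."""
    y = [0.0] * m
    for j in range(n):        # walk one column at a time
        for i in range(m):    # consecutive rows i sit at consecutive addresses
            y[i] += a[j * m + i] * x[j]
    return y


def to_col_major(a, m, n):
    """Reorder a row-major flat buffer into column-major order."""
    return [a[i * n + j] for j in range(n) for i in range(m)]


m, n = 3, 4
a_row = [float(v) for v in range(m * n)]  # rows: [0..3], [4..7], [8..11]
a_col = to_col_major(a_row, m, n)
x = [1.0, 2.0, 3.0, 4.0]

y_row = gemv_row_major(a_row, m, n, x)
y_col = gemv_col_major(a_col, m, n, x)

# Address distance between rows i and i+1 for a fixed column j (here j = 0).
row_major_stride = (1 * n + 0) - (0 * n + 0)  # equals n
col_major_stride = (0 * m + 1) - (0 * m + 0)  # equals 1
```

Both layouts give the same product; only the memory access pattern differs, which is why the other kernels that consume these matrices must also be made layout-aware.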

@mention-bot

@shaunren, thanks for your PR! By analyzing the annotation information on this pull request, we identified @tbekolay to be a potential reviewer

@shaunren shaunren changed the title Improve Batched GEMV Speed [WIP] Improve Batched GEMV Speed Jul 10, 2016
def __init__(self, arrays, names=None, dtype=None, align=False):
    arrays = [np.asarray(a) for a in arrays]
def __init__(self, arrays, names=None, dtype=None, align=False, order='C'):
    assert order in 'CF'
Collaborator

@hunse hunse Jul 13, 2016

I think `assert order in ('C', 'F')` would be better, because `'CF'` would also be accepted with what you have above, right?

Contributor Author

You're right, I overlooked this bug. I'll change it in another commit.
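The issue raised in this exchange can be reproduced with a short sketch. The function names `check_order_substring` and `check_order_tuple` are hypothetical and exist only for illustration. With a string on the right-hand side, `in` performs a substring test, so anything that is a substring of `'CF'` passes, including the empty string.

```python
def check_order_substring(order):
    # Check as written in the diff: substring test against the string 'CF'.
    return order in 'CF'


def check_order_tuple(order):
    # Suggested check: membership test against the two valid values.
    return order in ('C', 'F')


candidates = ['C', 'F', 'CF', '', 'A']
substring_accepts = [o for o in candidates if check_order_substring(o)]
tuple_accepts = [o for o in candidates if check_order_tuple(o)]
```

Here `substring_accepts` also contains `'CF'` and `''`, while `tuple_accepts` contains only `'C'` and `'F'`.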

@hunse
Collaborator

hunse commented Jul 13, 2016

Very cool! I'm looking forward to trying this out when I get back.

What still needs to be done before this is ready for full review/merge?

@shaunren
Contributor Author

Some kernels still need to be modified to work with column-major matrices, and an autotuner should be added.

@hunse
Collaborator

hunse commented Jul 16, 2016

The autotuner is a bigger job, and I think it should be a separate PR. Getting the other kernels to work with column-major sounds like the main work needed for this PR.

This commit implements the more accurate LIF model, implemented in Nengo
2.1.1, in OpenCL.

A new boolean argument `fastlif' is also added in plan_lif, which
defaults to False.

See <nengo/nengo#975> for details regarding the
new LIF model.

Signed-off-by: Shaun Ren <[email protected]>
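The commit message does not spell out the model, so the following is only a scalar Python sketch of one way an LIF update can be made more accurate: integrate the membrane equation exactly over the non-refractory part of the step, and interpolate the spike time so the refractory period starts at the threshold crossing instead of at the step boundary. This is an assumption about the approach, not a transcription of the OpenCL kernel in this PR. The names `lif_step` and `count_spikes` and the default parameter values are hypothetical, and the `fastlif` switch is not modelled.

```python
import math


def lif_step(dt, J, voltage, refractory_time,
             tau_rc=0.02, tau_ref=0.002, min_voltage=0.0):
    """Advance one LIF neuron by dt; returns (spiked, voltage, refractory_time)."""
    refractory_time -= dt
    # Portion of this step during which the neuron is not refractory.
    delta_t = min(max(dt - refractory_time, 0.0), dt)
    # Exact solution of dV/dt = (J - V) / tau_rc over delta_t for constant J.
    voltage -= (J - voltage) * math.expm1(-delta_t / tau_rc)
    spiked = voltage > 1.0
    if spiked:
        # Solve V(t) = 1 for the crossing time, measured from the step start.
        t_spike = dt + tau_rc * math.log1p(-(voltage - 1.0) / (J - 1.0))
        voltage = 0.0
        refractory_time = tau_ref + t_spike
    elif voltage < min_voltage:
        voltage = min_voltage
    return spiked, voltage, refractory_time


def count_spikes(J, t_end=1.0, dt=0.001):
    """Count spikes of one neuron driven by a constant input current J."""
    voltage, refractory_time, n = 0.0, 0.0, 0
    for _ in range(int(round(t_end / dt))):
        spiked, voltage, refractory_time = lif_step(dt, J, voltage, refractory_time)
        n += spiked
    return n


# Analytic steady-state rate for constant J > 1, for comparison.
J = 2.0
analytic_rate = 1.0 / (0.002 + 0.02 * math.log1p(1.0 / (J - 1.0)))
simulated = count_spikes(J)
```

For a constant input the simulated spike count over one second should track `analytic_rate` closely even at a 1 ms step, which is the kind of accuracy gain such a scheme is meant to deliver.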
@shaunren shaunren force-pushed the fastgemv branch 2 times, most recently from 3295671 to 41e9e9d Compare August 1, 2016 21:24
@shaunren shaunren changed the title [WIP] Improve Batched GEMV Speed Improve Batched GEMV Speed Aug 1, 2016
@shaunren
Contributor Author

shaunren commented Aug 1, 2016

This patchset is now ready for review/merge.

@drasmuss drasmuss mentioned this pull request Nov 26, 2019

3 participants