Improve Batched GEMV Speed #113

shaunren · 2016-07-10T07:14:42Z

Using column-major instead of row-major matrices was found to improve GEMV speed in nearly all cases, including practical scenarios such as Spaun 2.0.

Also, the LIF kernel is updated to use the more accurate LIF model from Nengo 2.1.1. This was found to have a negligible performance penalty.

The current, non-autotuned column-major kernel reduces simulation time on Spaun 2.0 by 31 s (202 s -> 171 s) on a GTX 970.
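For readers unfamiliar with the two layouts, the NumPy sketch below shows how the same matrix is laid out in row-major (`order='C'`) and column-major (`order='F'`) storage. The matrix `a` and its shape are made up for illustration and do not come from this PR. One plausible reason for the speedup (an assumption, not something measured here) is that when each thread handles one row, neighbouring threads read neighbouring addresses only in the column-major case.

```python
import numpy as np

# Hypothetical 3x4 matrix, used only to illustrate the two layouts.
a = np.arange(12, dtype=np.float32).reshape(3, 4)

a_c = np.ascontiguousarray(a)  # row-major: each row is contiguous
a_f = np.asfortranarray(a)     # column-major: each column is contiguous

# Strides are in bytes: (step to the next row, step to the next column).
print(a_c.strides)  # (16, 4)
print(a_f.strides)  # (4, 12)

# Element order as stored in memory for each layout.
flat_c = a_c.ravel(order="C")  # 0, 1, 2, 3, 4, ...
flat_f = a_f.ravel(order="F")  # 0, 4, 8, 1, 5, ...

# The layout changes the memory order only, not the GEMV result.
x = np.ones(4, dtype=np.float32)
y_c = a_c.dot(x)
y_f = a_f.dot(x)
```

In the column-major array, moving down a column (from one row to the next) is a 4-byte step, so consecutive rows of the same column sit next to each other in memory.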

mention-bot · 2016-07-10T07:14:44Z

@shaunren, thanks for your PR! By analyzing the annotation information on this pull request, we identified @tbekolay to be a potential reviewer

hunse · 2016-07-13T21:38:14Z

nengo_ocl/raggedarray.py

-    def __init__(self, arrays, names=None, dtype=None, align=False):
-        arrays = [np.asarray(a) for a in arrays]
+    def __init__(self, arrays, names=None, dtype=None, align=False, order='C'):
+        assert order in 'CF'


I think `assert order in ('C', 'F')` would be better, because `'CF'` would also be accepted with what you have above, right?

You're right, I overlooked this bug. I'll change it in another commit.
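The difference between the two checks can be seen in a small standalone sketch. The helper names `check_order_buggy` and `check_order_fixed` are hypothetical and exist only for this illustration; they are not part of `RaggedArray`.

```python
def check_order_buggy(order):
    # `in` on a string is a substring test, so 'CF' and '' also pass.
    return order in 'CF'


def check_order_fixed(order):
    # `in` on a tuple is an element test, so only 'C' or 'F' pass.
    return order in ('C', 'F')


for candidate in ('C', 'F', 'CF', ''):
    print(repr(candidate),
          check_order_buggy(candidate),
          check_order_fixed(candidate))
```

Note that the string version also accepts the empty string, since an empty string is a substring of every string.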

hunse · 2016-07-13T21:42:37Z

Very cool! I'm looking forward to trying this out when I get back.

What still needs to be done before this is ready for full review/merge?

shaunren · 2016-07-16T20:46:56Z

There are still some kernels that need to be modified to work with column-major matrices, and an autotuner should be added.

hunse · 2016-07-16T21:52:33Z

The autotuner is a bigger job, and I think it should be a separate PR. Getting the other kernels to work with column-major sounds like the main work needed for this PR.

This commit implements in OpenCL the more accurate LIF model introduced in Nengo 2.1.1. It also adds a new boolean argument `fastlif` to `plan_lif`, which defaults to False. See <nengo/nengo#975> for details regarding the new LIF model. Signed-off-by: Shaun Ren <[email protected]>
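As a rough illustration of what a more accurate LIF update can look like, here is a NumPy sketch in which the neuron integrates only over the non-refractory part of each step and the spike time inside the step is solved for analytically. This is an assumption-laden sketch and not the OpenCL kernel from this PR: the function name `lif_step`, the parameter values, and the clamping of the voltage at zero are all chosen for illustration. See nengo/nengo#975 for the actual model.

```python
import numpy as np


def lif_step(dt, J, voltage, refractory_time, tau_rc=0.02, tau_ref=0.002):
    """Advance the LIF state in place by one step; return a spike mask."""
    refractory_time -= dt
    # Integrate only over the part of the step outside the refractory period.
    delta_t = np.clip(dt - refractory_time, 0, dt)
    # Exact solution of the membrane equation for a constant input J.
    voltage -= (J - voltage) * np.expm1(-delta_t / tau_rc)
    spiked = voltage > 1
    # Solve for when, within the step, the voltage crossed the threshold.
    t_spike = dt + tau_rc * np.log1p(
        -(voltage[spiked] - 1) / (J[spiked] - 1))
    voltage[voltage < 0] = 0  # illustrative lower bound on the voltage
    voltage[spiked] = 0
    # The remaining refractory time accounts for the sub-step spike time.
    refractory_time[spiked] = tau_ref + t_spike
    return spiked


# Simulate one neuron with constant input for 1 s at dt = 1 ms.
dt = 0.001
J = np.array([2.0])
voltage = np.zeros(1)
refractory_time = np.zeros(1)
n_spikes = 0
for _ in range(1000):
    n_spikes += int(lif_step(dt, J, voltage, refractory_time)[0])

# Analytic steady-state LIF rate for the same parameters, for comparison.
expected_rate = 1.0 / (0.002 + 0.02 * np.log1p(1.0 / (J[0] - 1)))
print(n_spikes, expected_rate)
```

Because the spike time is resolved within the step, the simulated spike count stays close to the analytic rate even at a 1 ms step.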

shaunren · 2016-08-01T22:04:55Z

This patchset is now ready for review/merge.

shaunren changed the title ~~Improve Batched GEMV Speed~~ [WIP] Improve Batched GEMV Speed Jul 10, 2016

shaunren force-pushed the fastgemv branch from 0a1e530 to b7b792f Compare July 11, 2016 20:50

shaunren added 11 commits July 11, 2016 16:51

Refactor gemv_prog Geometry into a separate class

74fb5c2

Signed-off-by: Shaun Ren <[email protected]>

Compute bandwidth and FLOPS in test_clra_gemv

e5e445d

In addition, run GEMV multiple times to improve the accuracy of the result. Signed-off-by: Shaun Ren <[email protected]>

Print profiling stats when profiling is not 0

d4f5aa1

Signed-off-by: Shaun Ren <[email protected]>

Show profiling runtime to five decimal places

a496cf3

Signed-off-by: Shaun Ren <[email protected]>

Use list instead of map when generating profiling columns

a982813

Signed-off-by: Shaun Ren <[email protected]>

Support both row and column-major RaggedArray

e8c0759

Signed-off-by: Shaun Ren <[email protected]>

Descriptive block_impl NotImplementedError messages

162d7eb

Signed-off-by: Shaun Ren <[email protected]>

Add some comments to geometry for clarification

621f776

Signed-off-by: Shaun Ren <[email protected]>

Generate a_s0 or a_s1 in cl_geometry_and_textconf

6690d23

Signed-off-by: Shaun Ren <[email protected]>

Make clra_gemv reduce_impl column-major

3e97b6a

Signed-off-by: Shaun Ren <[email protected]>

Make clra_gemv block_impl column-major

7a1b166

Signed-off-by: Shaun Ren <[email protected]>

shaunren force-pushed the fastgemv branch from b7b792f to e996eb2 Compare July 11, 2016 20:52

hunse reviewed Jul 13, 2016

Add clra_gemv one_thread_per_row_impl

1308e04

Signed-off-by: Shaun Ren <[email protected]>

shaunren force-pushed the fastgemv branch from e996eb2 to b98872e Compare July 16, 2016 20:45

Fix RaggedArray stride and order assert

2c8d10f

Signed-off-by: Shaun Ren <[email protected]>

shaunren force-pushed the fastgemv branch from b98872e to 2c8d10f Compare July 16, 2016 20:48

shaunren force-pushed the fastgemv branch 2 times, most recently from 3295671 to 41e9e9d Compare August 1, 2016 21:24

shaunren added 4 commits August 1, 2016 17:55

Modify clra_nonlineralities to accept column-major

161a032

Signed-off-by: Shaun Ren <[email protected]>

Fix CLRaggedArray __setitem__ for column-major

92349e4

Signed-off-by: Shaun Ren <[email protected]>

Use column-major RaggedArray by default

e0c3297

Signed-off-by: Shaun Ren <[email protected]>

Add plan_pretuned_gemv

0c3b028

Signed-off-by: Shaun Ren <[email protected]>

shaunren added 2 commits August 1, 2016 17:59

Fix column-major view strides

2a33c21

Signed-off-by: Shaun Ren <[email protected]>

Use plan_one_thread_per_row_gemv in Simulator

d4fa2a4

Signed-off-by: Shaun Ren <[email protected]>

shaunren force-pushed the fastgemv branch from 41e9e9d to d4fa2a4 Compare August 1, 2016 21:59

shaunren changed the title ~~[WIP] Improve Batched GEMV Speed~~ Improve Batched GEMV Speed Aug 1, 2016

drasmuss mentioned this pull request Nov 26, 2019

kernel #46

Open
