Add C fused kernel for kernels/dense #81

Merged: superlopuh merged 1 commit into main from nazavode/kernels-dense-fused on Nov 6, 2023

nazavode (Collaborator) commented Nov 6, 2023

No description provided.

nazavode requested review from superlopuh and compor on November 6, 2023 12:20
nazavode (Collaborator, Author) commented Nov 6, 2023

Fully connected relu layer:

  • baseline: 3240 cycles
  • fused: 2970 cycles
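
For readers unfamiliar with the idea, the difference between the two variants can be sketched as follows. This is only an illustration under stated assumptions, not the code merged in this PR: the function names, the row-major layout, the f64 element type and the presence of a bias vector are all hypothetical. The unfused version makes three passes over the output (matmul, bias add, ReLU), while the fused version applies bias and ReLU as each output element is produced, which is the kind of change that could plausibly account for the drop from 3240 to 2970 cycles (roughly 8% fewer).

```c
#include <assert.h>
#include <stddef.h>

/* ReLU on a single value. */
static inline double relu(double v) { return v > 0.0 ? v : 0.0; }

/* Unfused reference: matmul, bias add and ReLU as three passes over y.
 * Assumed (hypothetical) layout: x is m-by-k, w is k-by-n, b has n
 * entries, y is m-by-n, all row-major. */
void dense_relu_unfused(size_t m, size_t k, size_t n, const double *x,
                        const double *w, const double *b, double *y) {
    assert(x && w && b && y);
    /* Pass 1: y = x * w */
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double acc = 0.0;
            for (size_t p = 0; p < k; ++p) {
                acc += x[i * k + p] * w[p * n + j];
            }
            y[i * n + j] = acc;
        }
    }
    /* Pass 2: add the bias to every row */
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            y[i * n + j] += b[j];
        }
    }
    /* Pass 3: elementwise ReLU */
    for (size_t i = 0; i < m * n; ++i) {
        y[i] = relu(y[i]);
    }
}

/* Fused variant: bias and ReLU are applied while the accumulator is
 * still live, so y is written exactly once per element. */
void dense_relu_fused(size_t m, size_t k, size_t n, const double *x,
                      const double *w, const double *b, double *y) {
    assert(x && w && b && y);
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            double acc = 0.0;
            for (size_t p = 0; p < k; ++p) {
                acc += x[i * k + p] * w[p * n + j];
            }
            y[i * n + j] = relu(acc + b[j]);
        }
    }
}
```

Both functions perform the same arithmetic in the same order, so they should produce identical results; the fused one simply avoids re-reading and re-writing the output buffer.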

I was wondering whether adding all of the kernels in dense/ would be useful?

github-actions bot commented Nov 6, 2023

kernel  size       version             cycles
relu    16x16xf64  baseline.x            1339
relu    16x16xf64  ssr.x                  846
relu    16x16xf64  ssr_frep_unroll.x      334
relu    16x16xf64  snitch_stream.x        322
relu    16x16xf64  linalg.x              1337
relu    16x16xf64  ssr_frep.x             327
dsum    8x16xf32   baseline.x            1202
dsum    8x16xf32   ssr2d.x                273
dsum    8x16xf32   ssr1d_frep1d.x         187
dsum    8x16xf32   scf.x                 1227
dsum    8x16xf32   ssr1d.x                253
dsum    8x16xf32   linalg.x              1089
dsum    8x16xf32   noalias.x             1202
matmul  8x8xf64    baseline.x            4230
matmul  8x8xf64    linalg.x              6214
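
The cycle counts are easier to compare as ratios against each kernel's baseline.x row. A minimal helper for that arithmetic might look like the sketch below; the function name is hypothetical and is not part of the repository or the bot's output.

```c
#include <assert.h>

/* Speedup of a variant relative to its baseline.
 * Values above 1.0 mean the variant needs fewer cycles. */
double speedup(unsigned baseline_cycles, unsigned variant_cycles) {
    assert(variant_cycles > 0); /* avoid dividing by zero */
    return (double)baseline_cycles / (double)variant_cycles;
}
```

For example, `speedup(1339, 334)` for relu ssr_frep_unroll.x comes out at roughly 4.0, while `speedup(4230, 6214)` for matmul linalg.x is roughly 0.68, i.e. slower than the baseline.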

compor (Contributor) commented Nov 6, 2023

Not sure, it's just for internal organization ATM, right? Unless you have something else in mind.
I'm impartial.

superlopuh merged commit 4eb5a48 into main on Nov 6, 2023
2 checks passed
superlopuh deleted the nazavode/kernels-dense-fused branch on November 6, 2023 13:34
