Add relu C kernels (ssr, frep, unroll) #55

Merged: 5 commits, Nov 2, 2023

@nazavode nazavode commented Nov 1, 2023

No description provided.

@nazavode nazavode requested review from compor and superlopuh November 1, 2023 22:42

github-actions bot commented Nov 1, 2023

| kernel | size | version | cycles |
|---|---|---|---|
| dsum | 8x16xf32 | ssr2d.x | 273 |
| dsum | 8x16xf32 | baseline.x | 1202 |
| dsum | 8x16xf32 | scf.x | 3143 |
| dsum | 8x16xf32 | ssr1d_frep1d.x | 187 |
| dsum | 8x16xf32 | noalias.x | 1202 |
| dsum | 8x16xf32 | ssr1d.x | 253 |
| dsum | 8x16xf32 | linalg.x | 1089 |
| matmul | 8x8xf64 | baseline.x | 4230 |
| relu | 16x16xf64 | baseline.x | 1339 |
| relu | 16x16xf64 | ssr_frep_unroll.x | 334 |
| relu | 16x16xf64 | ssr.x | 846 |
| relu | 16x16xf64 | ssr_frep.x | 327 |
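
To read the relu rows above per element, the cycle counts can be divided by the 256 elements of a 16x16 tensor and compared against the baseline. The following is a small sketch in plain C; the helper names `cycles_per_element`, `speedup`, and `print_relu_summary` are made up for illustration and are not part of this PR.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helpers (not from the PR): plain arithmetic on the
   cycle counts reported by the bot. */
double cycles_per_element(unsigned cycles, unsigned rows, unsigned cols) {
    return (double)cycles / (double)(rows * cols);
}

double speedup(unsigned baseline_cycles, unsigned variant_cycles) {
    return (double)baseline_cycles / (double)variant_cycles;
}

/* Prints one summary line per relu variant (16x16xf64 rows of the table). */
void print_relu_summary(void) {
    static const struct { const char *version; unsigned cycles; } rows[] = {
        {"baseline.x", 1339},
        {"ssr.x", 846},
        {"ssr_frep.x", 327},
        {"ssr_frep_unroll.x", 334},
    };
    const unsigned baseline = rows[0].cycles;
    for (size_t i = 0; i < sizeof rows / sizeof rows[0]; ++i) {
        printf("%-18s %.2f cycles/element, %.2fx vs baseline\n",
               rows[i].version,
               cycles_per_element(rows[i].cycles, 16, 16),
               speedup(baseline, rows[i].cycles));
    }
}
```

Calling `print_relu_summary()` from a `main` puts the four relu variants side by side; `ssr_frep.x` comes out a bit over 4x faster than `baseline.x`.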

nazavode commented Nov 1, 2023

Here is a case where unrolling is absolutely useless, since the only instruction in the loop takes 1 cycle: ssr_frep_unroll.x takes 334 cycles against 327 for ssr_frep.x (this results table is amazing).
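
The point about unrolling can be pictured with a portable C sketch. This is not the kernel code from this PR: the function names are hypothetical and the SSR/FREP machinery is deliberately left out. It only shows that a relu loop unrolled by 4 still performs one independent operation per element, so there is no extra per-element work for unrolling to remove once loop overhead is already gone.

```c
#include <assert.h>
#include <stddef.h>

/* Plain elementwise relu: y[i] = max(x[i], 0). Hypothetical name. */
void relu_plain(const double *x, double *y, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        y[i] = x[i] > 0.0 ? x[i] : 0.0;
    }
}

/* Same computation, manually unrolled by 4.
   Assumption for this sketch: n is a multiple of 4. */
void relu_unroll4(const double *x, double *y, size_t n) {
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        y[i + 0] = x[i + 0] > 0.0 ? x[i + 0] : 0.0;
        y[i + 1] = x[i + 1] > 0.0 ? x[i + 1] : 0.0;
        y[i + 2] = x[i + 2] > 0.0 ? x[i + 2] : 0.0;
        y[i + 3] = x[i + 3] > 0.0 ? x[i + 3] : 0.0;
    }
}
```

Both functions produce identical output for the same input; the unrolled body only trades branch and counter overhead for code size, which is the part a hardware loop would already be expected to cover.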

snrt_ssr_read(SNRT_SSR_DM0, SNRT_SSR_1D, x); // ft0
snrt_ssr_write(SNRT_SSR_DM1, SNRT_SSR_1D, y); // ft1

// BEWARE: even if we are using only 2 streams, all stream-mapped registers become

Thank you for this comment.


We definitely need to account for this in our lowering.


github-actions bot commented Nov 2, 2023

| kernel | size | version | cycles |
|---|---|---|---|
| relu | 16x16xf64 | baseline.x | 1339 |
| relu | 16x16xf64 | ssr.x | 846 |
| relu | 16x16xf64 | ssr_frep_unroll.x | 334 |
| relu | 16x16xf64 | linalg.x | 1337 |
| relu | 16x16xf64 | ssr_frep.x | 327 |
| dsum | 8x16xf32 | baseline.x | 1202 |
| dsum | 8x16xf32 | ssr2d.x | 273 |
| dsum | 8x16xf32 | ssr1d_frep1d.x | 187 |
| dsum | 8x16xf32 | scf.x | 3143 |
| dsum | 8x16xf32 | ssr1d.x | 253 |
| dsum | 8x16xf32 | linalg.x | 1089 |
| dsum | 8x16xf32 | noalias.x | 1202 |
| matmul | 8x8xf64 | baseline.x | 4230 |

@compor compor merged commit c35d6fb into main Nov 2, 2023
@compor compor deleted the nazavode/ew-relu-ssr-frep branch November 2, 2023 10:15

github-actions bot commented Nov 2, 2023

| kernel | size | version | cycles |
|---|---|---|---|
| relu | 16x16xf64 | baseline.x | 1339 |
| relu | 16x16xf64 | ssr.x | 846 |
| relu | 16x16xf64 | ssr_frep_unroll.x | 334 |
| relu | 16x16xf64 | linalg.x | 1337 |
| relu | 16x16xf64 | ssr_frep.x | 327 |
| dsum | 8x16xf32 | baseline.x | 1202 |
| dsum | 8x16xf32 | ssr2d.x | 273 |
| dsum | 8x16xf32 | ssr1d_frep1d.x | 187 |
| dsum | 8x16xf32 | scf.x | 3143 |
| dsum | 8x16xf32 | ssr1d.x | 253 |
| dsum | 8x16xf32 | linalg.x | 1089 |
| dsum | 8x16xf32 | noalias.x | 1202 |
| matmul | 8x8xf64 | baseline.x | 4230 |
