Add simple dgemm baseline #43

Merged 1 commit on Oct 30, 2023

compor (Contributor) commented on Oct 30, 2023

This PR:

  • Adds a simple double-precision GEMM (dgemm) baseline.
  • Adds a gendata.py script that takes the floating-point precision as a command-line option (currently only 32-bit and 64-bit floats are supported).
  • Adds this kernel to the run.sh script.
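
As a rough illustration of the second bullet, the sketch below shows one way a data generator could accept the precision on the command line and produce matrix inputs together with a reference product. It is not the gendata.py from this PR: the option names (`--precision`, `--size`, `--seed`), the helper names, and the use of the `struct` module to round values to the storage width are all assumptions made for the example.

```python
import argparse
import random
import struct

# Hypothetical mapping from the precision option (in bits) to a struct format code.
FORMATS = {32: "f", 64: "d"}


def parse_args(argv):
    """Parse the (assumed) command-line options of the generator."""
    parser = argparse.ArgumentParser(description="Generate matmul test data (sketch).")
    parser.add_argument("--precision", type=int, choices=sorted(FORMATS), default=64)
    parser.add_argument("--size", type=int, default=16)
    parser.add_argument("--seed", type=int, default=0)
    return parser.parse_args(argv)


def round_to_precision(value, bits):
    """Round a Python float (binary64) to the requested storage precision."""
    code = FORMATS[bits]
    return struct.unpack(code, struct.pack(code, value))[0]


def random_matrix(n, bits, rng):
    """Build an n x n matrix whose entries are representable at the given width."""
    return [
        [round_to_precision(rng.uniform(-1.0, 1.0), bits) for _ in range(n)]
        for _ in range(n)
    ]


def naive_gemm(a, b):
    """Reference C = A * B using the textbook triple loop.

    Accumulation happens in Python floats (binary64) regardless of the
    precision the inputs were rounded to.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    c = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for p in range(inner):
                acc += a[i][p] * b[p][j]
            c[i][j] = acc
    return c


def generate(argv):
    """Return (A, B, A*B) for the options given in argv."""
    args = parse_args(argv)
    rng = random.Random(args.seed)
    a = random_matrix(args.size, args.precision, rng)
    b = random_matrix(args.size, args.precision, rng)
    return a, b, naive_gemm(a, b)
```

Under these assumptions, `generate(["--precision", "64", "--size", "16"])` would yield inputs matching the 16x16xf64 shape reported for the matmul kernel; how the real script serialises its data is not shown in this PR description.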

compor added the enhancement (New feature or request) label on Oct 30, 2023
compor self-assigned this on Oct 30, 2023

github-actions commented:

| kernel | size | version | cycles |
|--------|------|---------|-------:|
| ssum | 14x26xf32 | ssr2d.x | 354 |
| ssum | 14x26xf32 | baseline.x | 6153 |
| ssum | 14x26xf32 | ssr1d_frep1d_unroll.x | 249 |
| ssum | 14x26xf32 | scf.x | 8804 |
| ssum | 14x26xf32 | ssr1d_frep1d.x | 247 |
| ssum | 14x26xf32 | noalias.x | 6153 |
| ssum | 14x26xf32 | ssr1d.x | 608 |
| ssum | 14x26xf32 | linalg.x | 2975 |
| ssum | 8x16xf32 | vector.x | 881 |
| ssum | 8x16xf32 | pres_0_llvm.x | 1189 |
| ssum | 8x16xf32 | ssr2d.x | 169 |
| ssum | 8x16xf32 | baseline.x | 1189 |
| ssum | 8x16xf32 | pres_1_llvm_clean.x | 1189 |
| ssum | 8x16xf32 | scf.x | 12340 |
| ssum | 8x16xf32 | pres_2_vectorized.x | 613 |
| ssum | 8x16xf32 | pres_3_ssr_loop.x | 248 |
| ssum | 8x16xf32 | ssr1d_frep1d.x | 122 |
| ssum | 8x16xf32 | noalias.x | 1189 |
| ssum | 8x16xf32 | ssr1d.x | 150 |
| ssum | 8x16xf32 | pres_4_ssr_frep.x | 121 |
| ssum | 8x16xf32 | linalg.x | 1077 |
| dsum | 8x16xf32 | pres_0_llvm.x | 1202 |
| dsum | 8x16xf32 | pres_2_ssr_loop.x | 441 |
| dsum | 8x16xf32 | ssr2d.x | 273 |
| dsum | 8x16xf32 | baseline.x | 1202 |
| dsum | 8x16xf32 | pres_1_llvm_clean.x | 1202 |
| dsum | 8x16xf32 | scf.x | 3143 |
| dsum | 8x16xf32 | ssr1d_frep1d.x | 187 |
| dsum | 8x16xf32 | noalias.x | 1202 |
| dsum | 8x16xf32 | ssr1d.x | 253 |
| dsum | 8x16xf32 | linalg.x | 1089 |
| dsum | 8x16xf32 | pres_3_ssr_frep.x | 183 |
| matmul | 16x16xf64 | baseline.x | 33305 |
| saxpy | 64xf32 | baseline.x | 646 |
| saxpy | 64xf32 | ssr_frep_unroll.x | 149 |
| saxpy | 64xf32 | ssr.x | 223 |
| saxpy | 64xf32 | noalias.x | 646 |
| saxpy | 64xf32 | linalg.x | 764 |
| saxpy | 64xf32 | ssr_frep.x | 242 |
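
The cycle counts above are easier to compare as speedups over each kernel's baseline.x entry. The snippet below is a minimal sketch of that arithmetic, using a handful of rows copied from the table. The function names and the choice of baseline.x as the reference are assumptions for illustration, not part of the repository's tooling.

```python
from collections import defaultdict

# A few rows copied from the table above: kernel, size, version, cycles.
ROWS = """\
ssum 8x16xf32 baseline.x 1189
ssum 8x16xf32 pres_4_ssr_frep.x 121
ssum 8x16xf32 scf.x 12340
saxpy 64xf32 baseline.x 646
saxpy 64xf32 ssr_frep_unroll.x 149
"""


def parse_rows(text):
    """Group cycle counts by (kernel, size), keyed by version name."""
    groups = defaultdict(dict)
    for line in text.splitlines():
        if not line.strip():
            continue
        kernel, size, version, cycles = line.split()
        groups[(kernel, size)][version] = int(cycles)
    return groups


def speedups(groups, reference="baseline.x"):
    """Compute reference_cycles / version_cycles for every version in a group.

    Groups without the reference version are skipped.
    """
    result = {}
    for key, versions in groups.items():
        if reference not in versions:
            continue
        base = versions[reference]
        result[key] = {name: base / cycles for name, cycles in versions.items()}
    return result


if __name__ == "__main__":
    for (kernel, size), versions in speedups(parse_rows(ROWS)).items():
        for name, factor in sorted(versions.items(), key=lambda item: -item[1]):
            print(f"{kernel} {size} {name}: {factor:.2f}x")
```

For these rows, pres_4_ssr_frep.x comes out at roughly 9.8x over the ssum 8x16xf32 baseline (1189 / 121) and ssr_frep_unroll.x at roughly 4.3x over the saxpy baseline (646 / 149), while scf.x is slower than the baseline (a factor below 1). The new matmul kernel has only a baseline entry so far, so there is nothing yet to compare it against.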

compor merged commit 71b5eca into main on Oct 30, 2023 (2 checks passed)
compor deleted the christos/add-dgemm-baseline branch on Oct 30, 2023, 17:07
