
Temporarily update Dockerfile to run python/comps.py #237

Closed

jeancochrane wants to merge 8 commits
Conversation


@jeancochrane commented Apr 30, 2024

This PR is a companion to #236, intended to benchmark the current performance of the comps algorithm using numba. I don't plan to merge it; I'll close it once benchmarking is complete.

Findings

  • CUDA doesn't seem to make much of a difference, and is counterproductive if anything. This makes me wonder whether the algorithm needs to be redesigned to make better use of the GPU, but I'm considering that question out of scope for now.
  • There are big performance gains to be had by simply bumping the instance type with the existing numba code. If the numbers below hold, we could speed up the comps code by 2x by switching to c5.24xlarge instances. Those are about twice as expensive as the m4.10xlarge instances we use now, so we'd probably break even on the change.
  • At small scales (20k observations/10k comparisons), taichi appears to outperform numba, but this improvement disappears if we scale up the size of the data. At a large scale (100k observations/50k comparisons), they perform about the same.

20k observations, 10k comparisons

| framework | instance type | arch | time   | logs |
|-----------|---------------|------|--------|------|
| taichi    | g5.12xlarge   | x86  | 2.36s  | link |
| taichi    | g5.12xlarge   | CUDA | 4.33s  | link |
| taichi    | m4.10xlarge   | x86  | 4.44s  | link |
| numba     | g5.12xlarge   | x86  | 6.07s  | link |
| numba     | m4.10xlarge   | x86  | 10.52s | link |

100k observations, 50k comparisons

| framework | instance type | arch | time   | logs |
|-----------|---------------|------|--------|------|
| numba     | g5.12xlarge   | x86  | 31.87s | link |
| taichi    | c5.24xlarge   | x86  | 31.93s | link |
| taichi    | m4.10xlarge   | x86  | 34.09s | link |
| numba     | c5.24xlarge   | x86  | 37.31s | link |
| taichi    | g5.12xlarge   | x86  | 37.75s | link |
| taichi    | g5.12xlarge   | CUDA | 43.58s | link |
| numba     | m4.10xlarge   | x86  | 64.19s | link |

@jeancochrane

Closing, see #236 for full results.
