
JHucker/candle_mask_minrep


candle_mask_minrep

Timings of masked-fill operations in tch (Rust bindings for libtorch) versus candle (Hugging Face's Rust ML framework).
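
For readers unfamiliar with the operation being timed, here is a minimal plain-Rust sketch of masked-fill semantics together with a simple averaging timer. It is an illustration only: `masked_fill` and `average_micros` are hypothetical helpers written for this example, not the tch or candle API, and the tensor shape (rows of 1024 elements) is an assumption, since the benchmark's actual shapes are not stated here.

```rust
use std::time::Instant;

/// Reference semantics of masked fill: wherever `mask` is true the output
/// takes `fill`; elsewhere it keeps the input value.
/// Hypothetical helper for illustration; not the tch or candle API.
fn masked_fill(input: &[f32], mask: &[bool], fill: f32) -> Vec<f32> {
    assert_eq!(input.len(), mask.len(), "input and mask must have equal length");
    input
        .iter()
        .zip(mask)
        .map(|(&x, &m)| if m { fill } else { x })
        .collect()
}

/// Average wall-clock time of `f` in microseconds over `iters` runs.
fn average_micros<F: FnMut()>(iters: u32, mut f: F) -> f64 {
    let start = Instant::now();
    for _ in 0..iters {
        f();
    }
    start.elapsed().as_secs_f64() * 1e6 / f64::from(iters)
}

fn main() {
    // Small example: positions 1 and 3 are masked and replaced by -inf.
    let input = vec![1.0_f32, 2.0, 3.0, 4.0];
    let mask = vec![false, true, false, true];
    let out = masked_fill(&input, &mask, f32::NEG_INFINITY);
    println!("{out:?}");

    // Assumed shape for illustration: `batch_size` rows of 1024 elements.
    for batch_size in [1usize, 8, 64] {
        let n = batch_size * 1024;
        let input = vec![0.5_f32; n];
        let mask: Vec<bool> = (0..n).map(|i| i % 2 == 0).collect();
        let avg = average_micros(100, || {
            // black_box keeps the compiler from optimizing the call away.
            std::hint::black_box(masked_fill(&input, &mask, f32::NEG_INFINITY));
        });
        println!("batch_size={batch_size:>3} avg={avg:.1} us");
    }
}
```

This sketch runs on the CPU. When timing GPU code, kernel launches are asynchronous, so a benchmark generally needs to synchronize the device before reading the clock; whether and how this repository does so is not shown here.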

Results

GPU: NVIDIA GeForce RTX 4090, compute capability 8.9, driver version 560.35.03.

| batch_size | tch_μs_average | cdl_μs_average |
|------------|----------------|----------------|
|          1 |             14 |             10 |
|          2 |             13 |             10 |
|          4 |             13 |             10 |
|          8 |             13 |             13 |
|         16 |             16 |             20 |
|         32 |             18 |             30 |
|         64 |             21 |             60 |
|        128 |             29 |             91 |
|        256 |             51 |            447 |
|        512 |            201 |            911 |
|       1024 |            515 |           1278 |
|       2048 |           1027 |           2007 |
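
To make the relative cost easier to read, the following sketch computes the candle-to-tch time ratio for each row. The numbers are copied from the table above; `RESULTS` and `candle_over_tch` are names introduced for this example only.

```rust
/// (batch_size, tch average in microseconds, candle average in microseconds),
/// copied from the results table.
const RESULTS: [(u32, u32, u32); 12] = [
    (1, 14, 10),
    (2, 13, 10),
    (4, 13, 10),
    (8, 13, 13),
    (16, 16, 20),
    (32, 18, 30),
    (64, 21, 60),
    (128, 29, 91),
    (256, 51, 447),
    (512, 201, 911),
    (1024, 515, 1278),
    (2048, 1027, 2007),
];

/// Ratio of candle time to tch time; values above 1.0 mean candle was slower.
fn candle_over_tch(tch_us: u32, cdl_us: u32) -> f64 {
    f64::from(cdl_us) / f64::from(tch_us)
}

fn main() {
    println!("| batch_size | cdl/tch |");
    println!("|------------|---------|");
    for (batch, tch, cdl) in RESULTS {
        println!("| {batch:>10} | {:>7.2} |", candle_over_tch(tch, cdl));
    }
}
```

In these measurements candle is faster at batch sizes 1 to 4, equal at 8, and slower from 16 upward.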
