
hf-bench

Setup

Create a new conda environment:

conda env create -f environment.yml

Activate the environment:

conda activate hf-bench-env

Create a .env file in the root directory with the following variables. Models and datasets are downloaded to the HF_HOME directory, unless they are already cached there.

HF_ACCESS_TOKEN=hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
HF_HOME=/path/to/hf_home
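How hf-bench reads these variables is not shown here; as an illustration only, a minimal loader using just the standard library could look like the following (the helper name load_env is hypothetical, not part of this repo):

```python
import os


def load_env(path=".env"):
    """Parse simple KEY=VALUE lines from a .env file into os.environ.

    Illustrative sketch: real projects often use python-dotenv instead.
    """
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Skip blank lines and comments
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("=")
            # Overwrites any existing value for the key
            os.environ[key.strip()] = value.strip()
```

After calling load_env(), libraries that consult HF_HOME (such as the Hugging Face cache machinery) will pick up the configured path from the environment.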

To update the environment after changing the environment.yml file:

conda env update -f environment.yml --prune

CLI Usage

Benchmark

[!NOTE] To run the benchmark, you must be logged in to Weights & Biases, and often to Hugging Face as well.

Below are commands for running the benchmark on:

  1. A machine with GPUs

    python -m hf_bench.benchmark --num_of_examples=1
  2. A cluster (via LSF)

    ./hf_bench/submit/lsf.sh --num_of_examples=1

After running the above sanity check with one example and the default experiment config, you can run the benchmark with 30 examples and a custom experiment config:

python -m hf_bench.benchmark --experiment_config deepseek-r1-qwen-32b

or

./hf_bench/submit/lsf.sh --experiment_config deepseek-r1-qwen-32b

To adjust the hardware requested from the cluster, edit the submit script.

Monitoring

Once the models have loaded, you can monitor the benchmark's progress here: https://wandb.ai/generating-faster/hf-bench.

Results

The results are stored in the results branch.

To add new results, add your results CSV to the benchmark_results directory. GitHub Actions will then automatically update the results_all.csv, results_summary.csv, and results_max_speedup.csv files.
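The actual aggregation is performed by the repo's GitHub Actions workflow, which is not reproduced here. As a rough sketch of the concatenation step (the function name and the assumption that all CSVs share a header are mine, not the repo's), combining every CSV in benchmark_results into a single results_all.csv might look like:

```python
import csv
from pathlib import Path


def concat_results(results_dir="benchmark_results", out_path="results_all.csv"):
    """Concatenate all result CSVs (assumed to share one header) into one file.

    Returns the number of data rows written. Illustrative sketch only.
    """
    rows, header = [], None
    for csv_file in sorted(Path(results_dir).glob("*.csv")):
        with open(csv_file, newline="") as f:
            reader = csv.reader(f)
            file_header = next(reader)
            if header is None:
                header = file_header  # Take the header from the first file
            rows.extend(reader)
    if header is None:  # No input files found
        return 0
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)
```

A similar pass over the combined rows could then compute the summary and max-speedup tables.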

Citation

If you use our algorithms (or the code in this repo), please cite our paper (https://arxiv.org/abs/2502.05202):

@article{timor2025accelerating,
  title={Accelerating LLM Inference with Lossless Speculative Decoding Algorithms for Heterogeneous Vocabularies},
  author={Timor, Nadav and Mamou, Jonathan and Korat, Daniel and Berchansky, Moshe and Pereg, Oren and Jain, Gaurav and Schwartz, Roy and Wasserblat, Moshe and Harel, David},
  journal={arXiv preprint arXiv:2502.05202},
  year={2025}
}

About

Benchmark TTFT, TPOT, T/s, Speedup
