A dynamic, continuously-updated benchmark to evaluate LLM forecasting capabilities. More at www.forecastbench.org.
Leaderboards and datasets are updated nightly and available at github.com/forecastingresearch/forecastbench-datasets.
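For example, a local copy of those nightly-updated files can be pulled with a standard clone (HTTPS URL shown; adjust if you prefer SSH):

git clone https://github.com/forecastingresearch/forecastbench-datasets.git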
Instructions for submitting your model to the benchmark can be found here: How-to-submit-to-ForecastBench.
Dig into the details of ForecastBench on the wiki.
@inproceedings{karger2025forecastbench,
  title={ForecastBench: A Dynamic Benchmark of AI Forecasting Capabilities},
  author={Ezra Karger and Houtan Bastani and Chen Yueh-Han and Zachary Jacobs and Danny Halawi and Fred Zhang and Philip E. Tetlock},
  year={2025},
  booktitle={International Conference on Learning Representations (ICLR)},
  url={https://iclr.cc/virtual/2025/poster/28507}
}
git clone --recurse-submodules <repo-url>.git
cd forecastbench
cp variables.example.mk variables.mk
and set the values accordingly.
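As a minimal sketch only, assuming variables.mk holds simple shell-compatible NAME=value lines (which is what the eval … | xargs invocation below relies on); the variable names here are hypothetical, so use the ones defined in variables.example.mk:

# Hypothetical contents of variables.mk -- real variable names come from variables.example.mk
CLOUD_PROJECT=my-gcp-project
CLOUD_DEPLOY_REGION=us-central1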
- Set up your Python virtual environment:
make setup-python-env
source .venv/bin/activate
- To run a cloud function locally:
cd directory/containing/cloud/function
eval $(cat path/to/variables.mk | xargs) python main.py
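To make that last line concrete: the command substitution flattens variables.mk into a single line of NAME=value assignments (assuming the file uses that syntax), and eval then runs python main.py with those assignments applied to just that invocation. With two hypothetical variables it is roughly equivalent to:

# Rough equivalent of the eval line above; FOO and BAR are hypothetical
# stand-ins for whatever variables.mk actually defines.
FOO=1 BAR=2 python main.py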
Before creating a pull request:
- run make lint and fix any errors and warnings
- ensure code has been deployed to Google Cloud Platform and tested (only for our devs; for others, we're happy you're contributing and we'll test this on our end)
- fork the repo
- reference the issue number (if one exists) in the commit message
- push to the fork on a branch other than main
- create a pull request
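A sketch of that workflow, where the fork URL, branch name, and issue number are placeholders:

# Illustrative only -- substitute your own fork URL, branch name, and issue number
git clone --recurse-submodules <your-fork-url>.git
cd forecastbench
git checkout -b my-feature-branch     # any branch other than main
# ...make and test your changes...
make lint                             # fix any errors and warnings it reports
git commit -am "Short description of the change (#<issue-number>)"
git push origin my-feature-branch
# then open a pull request from that branch of the fork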