GenLM Backend is a high-performance backend for language model probabilistic programs in the GenLM ecosystem. It provides essential tools and functions that serve as building blocks for more complex applications. See our documentation.
Key Features:
- Asynchronous LLM Interfaces: Asynchronous computation of next-token probabilities with `vllm` and `transformers` language models (see the example below).
- Tokenizer Vocabulary Decoding: Decode Hugging Face tokenizer vocabularies into their byte and string representations.
- Token-Character Tries: Efficient conversion from token distributions to byte-level distributions using a trie data structure.
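As a quick orientation, the sketch below shows what asynchronous next-token computation might look like. The module path (`genlm_backend.llm`), the `from_name` constructor, the `tokenizer` attribute, and the `next_token_logprobs` method are assumptions inferred from the class names mentioned in this README, not a verified API; consult the documentation for the exact interface.

```python
import asyncio

# Assumed module path and method names; check the API docs for the exact interface.
from genlm_backend.llm import AsyncTransformer

async def main():
    # AsyncTransformer wraps a Hugging Face model behind an async interface
    # (AsyncVirtualLM is the vLLM-backed, GPU-accelerated counterpart).
    llm = AsyncTransformer.from_name("gpt2")

    # Token IDs of the conditioning context, in the model's own tokenizer.
    context = llm.tokenizer.encode("The capital of France is")

    # Asynchronously compute log-probabilities over the entire vocabulary.
    logprobs = await llm.next_token_logprobs(context)
    print(logprobs.shape)  # one entry per token in the vocabulary

asyncio.run(main())
```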
Installation:

Clone the repository:

```bash
git clone git@github.com:probcomp/genlm-backend.git
cd genlm-backend
```

and install with pip:

```bash
pip install .
```
This installs the package without development dependencies. For development, install in editable mode with:

```bash
pip install -e ".[test,docs]"
```

which also installs the dependencies needed for testing (`test`) and documentation (`docs`).
Requirements:
- Python >= 3.10
- The core dependencies listed in the `setup.py` file of the repository.
Note: vLLM is not supported on macOS. On macOS systems, only CPU-based functionality (`AsyncTransformer`) will be available; GPU-accelerated features requiring vLLM (`AsyncVirtualLM`) will not work.
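On a mixed fleet of machines, one way to handle this is to select the backend by platform. The sketch below assumes the class names above live in a `genlm_backend.llm` module with a `from_name` constructor, which is not verified here.

```python
import sys

# Hypothetical module path; the class names come from the note above.
if sys.platform == "darwin":
    # macOS: vLLM is unavailable, fall back to the CPU-based backend.
    from genlm_backend.llm import AsyncTransformer as Backend
else:
    # Linux with a GPU: use the vLLM-backed backend.
    from genlm_backend.llm import AsyncVirtualLM as Backend

llm = Backend.from_name("gpt2")  # from_name is assumed; see the API docs
```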
When test dependencies are installed, the test suite can be run via:
```bash
pytest tests
```
Documentation is generated using mkdocs and hosted on GitHub Pages. To build the documentation, run:
```bash
mkdocs build
```
To serve the documentation locally, run:
```bash
mkdocs serve
```
Performance benchmarks comparing different configurations can be found in our benchmarks directory.