Non-Persistent Pipeline Examples

The pipeline.py script runs any of the supported models. Pass the Hugging Face model name, the maximum number of tokens to generate, and one or more prompts; the generated responses are printed to the terminal:

$ python pipeline.py --model "mistralai/Mistral-7B-v0.1" --max-new-tokens 128 --prompts "DeepSpeed is" "Seattle is"
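For readers who want to adapt the script, here is a minimal sketch of how a script accepting these flags could be structured. It is not the actual contents of pipeline.py: the function names `build_parser` and `generate` and the default values are assumptions made for illustration. Only `mii.pipeline(...)` and the `max_new_tokens` keyword come from the MII pipeline API.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a CLI with the three flags used in the command above.

    The defaults are illustrative assumptions, not taken from pipeline.py.
    """
    parser = argparse.ArgumentParser(description="Run a non-persistent MII pipeline")
    parser.add_argument("--model", type=str, default="mistralai/Mistral-7B-v0.1")
    parser.add_argument("--max-new-tokens", type=int, default=128)
    parser.add_argument("--prompts", type=str, nargs="+",
                        default=["DeepSpeed is", "Seattle is"])
    return parser


def generate(args: argparse.Namespace):
    """Load the model into a non-persistent pipeline and generate one response per prompt."""
    # Imported lazily so the parser can be exercised on a machine without MII installed.
    import mii

    pipe = mii.pipeline(args.model)
    return pipe(args.prompts, max_new_tokens=args.max_new_tokens)
```

Printing the value returned by `generate(build_parser().parse_args())` at the bottom of such a script would give terminal output of the kind described above.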

Tensor parallelism is controlled by launching the script with the deepspeed launcher and setting --num_gpus to the number of GPUs to use:

$ deepspeed --num_gpus 2 pipeline.py
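When launched this way, one process is started per GPU and each process runs the whole script, so an unguarded print may appear once per process. The sketch below shows one way to restrict output to a single process. It assumes the launcher exports a `RANK` environment variable to each process, as distributed PyTorch launchers conventionally do; the helper name `is_rank_zero` is hypothetical.

```python
import os
from typing import Mapping, Optional


def is_rank_zero(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True for the process that should print results.

    Assumes the launcher sets RANK for each process; when the variable is
    absent (a plain `python` launch), the single process counts as rank 0.
    """
    env = os.environ if env is None else env
    return int(env.get("RANK", "0")) == 0
```

A script would then wrap its output in `if is_rank_zero(): print(...)`, which behaves the same under a plain `python` launch and under `deepspeed --num_gpus 2`.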

Model-Specific Examples

For convenience, we also provide a set of scripts to quickly test the MII Pipeline with some popular text-generation models:

| Model | Launch command |
| --- | --- |
| meta-llama/Llama-2-7b-hf | $ python llama2.py |
| tiiuae/falcon-7b | $ python falcon.py |
| mistralai/Mixtral-8x7B-v0.1 | $ deepspeed --num_gpus 2 mixtral.py |