Run:ai Model Streamer

Overview

The Run:ai Model Streamer is a Python SDK for streaming tensors from safetensors files into GPU memory with high concurrency. It provides an API for loading SafeTensors files while building AI models, so models can be loaded quickly and seamlessly.

For documentation, click here.

For benchmarks, click here.

Usage

Install the package:

pip install runai_model_streamer

Then stream tensors:

from runai_model_streamer import SafetensorsStreamer

file_path = "model.safetensors"

with SafetensorsStreamer() as streamer:
    streamer.stream_file(file_path)
    for name, tensor in streamer.get_tensors():
        gpu_tensor = tensor.to('cuda:0')
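
The streamed tensors are ordinary torch tensors, so they can, for instance, be collected into a state dict and assigned to a model. Below is a minimal sketch; the torch.nn.Linear model and the matching safetensors file are hypothetical stand-ins, and the tensor names in the file must match the model's parameter names.

import torch
from runai_model_streamer import SafetensorsStreamer

# Hypothetical model; its parameter names ("weight", "bias") must match
# the tensor names stored in the safetensors file.
model = torch.nn.Linear(4096, 4096).to("cuda:0")

state_dict = {}
with SafetensorsStreamer() as streamer:
    streamer.stream_file("model.safetensors")
    for name, tensor in streamer.get_tensors():
        # Move each tensor to the GPU as the streamer yields it.
        state_dict[name] = tensor.to("cuda:0")

model.load_state_dict(state_dict)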

Development

Our repository is built using a devcontainer (Further reading).

The following commands should be run inside the dev container.

Note

You can install the devcontainer-cli tool (Further reading) and run every command from the host by prefixing it with devcontainer exec --workspace-folder . [COMMAND]
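
For example, to run the build target from the host:

devcontainer exec --workspace-folder . make build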

Build

make build

Note

We build libstreamers3.so and statically link it against libssl, libcrypto, and libcurl. If you would prefer to link dynamically against your system libraries, run:

USE_SYSTEM_LIBS=1 make build

Note

On a successful build, the .whl files will be located at py/runai_model_streamer/dist/<PACKAGE_FILE> and py/runai_model_streamer_s3/dist/<PACKAGE_FILE>
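
The built wheels can be installed directly with pip; for example (the exact file names depend on the version built):

pip3 install py/runai_model_streamer/dist/*.whl py/runai_model_streamer_s3/dist/*.whl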

Run tests

make test

Install locally

pip3 install py/runai_model_streamer py/runai_model_streamer_s3
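
As a quick sanity check after installation, importing the package should succeed:

python3 -c "import runai_model_streamer"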

Important

For the C++ layer to run, you need to install libcurl4 and libssl1.1
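
On Debian-based systems, for example, these can be installed with apt (package names may vary across distributions and releases):

apt-get update && apt-get install -y libcurl4 libssl1.1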