- Git clone the repository.
git clone repo_url
- Optional: download uv and install it in your environment. You can also use pip to install the requirements.
pip install uv
- Create a virtual environment and activate it.
uv venv # or python -m venv .venv
source .venv/bin/activate
- Compile requirements based on your environment.
uv pip compile builder/requirements.in -o builder/requirements.txt # uv is optional, but recommended
- Install the requirements.
uv pip sync builder/requirements.txt
- Download the models from Hugging Face and save them in the models_hub directory before building. See src/download_models.py for more details.
- Run the service locally using the following command.
python3 src/handler.py --rp_serve_api
- The embedding service is now running on http://localhost:8000/. You can test it using the following command.
curl --request POST \
--url http://localhost:8000/runsync \
--header 'Content-Type: application/json' \
--data '{"input": {"task": "query","input_data": ["hello"]}}'
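The same test request can also be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_request` and `run_sync` are hypothetical and not part of this repository, and the payload simply mirrors the curl command above. The shape of the response is not assumed here, so the parsed JSON is returned as-is.

```python
import json
import urllib.request

# Local address used by the curl example above.
BASE_URL = "http://localhost:8000"


def build_request(texts, task="query", base_url=BASE_URL):
    """Build the POST request for /runsync without sending it (hypothetical helper)."""
    payload = {"input": {"task": task, "input_data": list(texts)}}
    return urllib.request.Request(
        url=f"{base_url}/runsync",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sync(texts, task="query", base_url=BASE_URL, timeout=60):
    """Send the request and return the parsed JSON body (hypothetical helper)."""
    request = build_request(texts, task=task, base_url=base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the service running locally, `run_sync(["hello"])` sends the same request as the curl command.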
The service is deployed via Docker. The Dockerfile is located in the root directory. Due to GPU requirements, the service works well either on Apple silicon Macs with the MPS backend (local development) or on a Linux machine with CUDA installed (production).
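To check which of these backends a machine offers, a small helper along the following lines may be useful. This is a sketch, not the service's own device-selection code: `pick_device` and `detect_device` are hypothetical names, the CPU fallback is an assumption, and PyTorch is assumed to be among the installed requirements.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Prefer CUDA (production), then MPS (local Mac); CPU fallback is an assumption."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch for the available backends (hypothetical helper)."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so no GPU backend can be used.
        return "cpu"
    return pick_device(
        torch.cuda.is_available(),
        torch.backends.mps.is_available(),
    )
```

Calling `detect_device()` in the activated virtual environment reports which backend PyTorch can see on that machine.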
- Test locally without Docker (recommended on macOS)
python src/handler.py --rp_serve_api
- Build and publish the image. The version is usually a date, e.g. 20240930.
./build_publish.sh {org} {repo}
- Push the image to Docker Hub manually (instead of using build_publish.sh, e.g. when you want a specific version)
docker push {org}/{repo}:version
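The date-style version can be generated rather than typed by hand. A minimal sketch, assuming the tag is simply the date formatted as YYYYMMDD; `image_ref` is a hypothetical helper and not part of build_publish.sh.

```python
from datetime import date


def image_ref(org: str, repo: str, day: date) -> str:
    """Return an image reference with a YYYYMMDD tag (hypothetical helper)."""
    return f"{org}/{repo}:{day:%Y%m%d}"


# Fixed date shown for illustration; date.today() would be used in practice.
example = image_ref("myorg", "myrepo", date(2024, 9, 30))
# example == "myorg/myrepo:20240930"
```

The resulting string can be passed to `docker push` in place of `{org}/{repo}:version`.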
We use a RunPod serverless endpoint. The Docker image is hosted on Docker Hub under: tjmlabs/colivare