A sandbox repository for studying Triton Inference Server features.
All stages of the algorithm are implemented as Triton models:
- Detection Preprocessing (Python backend);
- Detection (onnxruntime backend);
- Detection Postprocessing (Python backend);
- Tracking (Python backend).
The Detection model is based on the YOLOv9-c detector.
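Three of the four stages run on Triton's Python backend, so each ships as a `model.py` implementing the `TritonPythonModel` interface. Below is a minimal sketch of such a stage; the tensor names `INPUT_IMAGE` and `PREPROCESSED_IMAGE` and the scaling transform are illustrative, not the repository's actual ones.

```python
# model.py -- minimal Python-backend stage (tensor names are illustrative)
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # Called once when the model is loaded; args carries the model config.
        pass

    def execute(self, requests):
        # Triton may batch client requests; return one response per request.
        responses = []
        for request in requests:
            image = pb_utils.get_input_tensor_by_name(
                request, "INPUT_IMAGE").as_numpy()
            # Example transform: scale pixels to [0, 1] as float32.
            preprocessed = image.astype(np.float32) / 255.0
            out_tensor = pb_utils.Tensor("PREPROCESSED_IMAGE", preprocessed)
            responses.append(
                pb_utils.InferenceResponse(output_tensors=[out_tensor]))
        return responses

    def finalize(self):
        # Called once at unload time; release any resources here.
        pass
```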
The Tracking model is stateful, which allows it to differentiate inference requests from multiple clients using the provided CORRELATION ID.
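On the client side, a sequence for a stateful model is addressed by sending every request of that sequence with the same correlation ID. A sketch of how this might look with `tritonclient.grpc` follows; the model name `tracking` and the tensor names `FRAME` and `TRACKS` are assumptions for illustration, and the actual protocol lives in `client.py`.

```python
# Sketch: sending a sequence of frames under one correlation ID.
# Model/tensor names ("tracking", "FRAME", "TRACKS") are assumptions.
import numpy as np
import tritonclient.grpc as grpcclient

client = grpcclient.InferenceServerClient(url="localhost:8001")
correlation_id = 42  # any nonzero ID unique to this client's sequence

frames = [np.zeros((1, 3, 640, 640), dtype=np.float32) for _ in range(3)]
for i, frame in enumerate(frames):
    inp = grpcclient.InferInput("FRAME", list(frame.shape), "FP32")
    inp.set_data_from_numpy(frame)
    result = client.infer(
        model_name="tracking",
        inputs=[inp],
        sequence_id=correlation_id,
        sequence_start=(i == 0),             # first frame opens the sequence
        sequence_end=(i == len(frames) - 1)  # last frame closes it
    )
    tracks = result.as_numpy("TRACKS")
```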
Navigate to the `tracking-by-detection` folder.

```bash
cd tracking-by-detection
```
Launch a `tritonserver` Docker container. Ports 8000, 8001, and 8002 expose Triton's HTTP, gRPC, and metrics endpoints, respectively; `--shm-size` is raised because the Python backend communicates with the server over shared memory.

```bash
docker run --gpus=all -it --shm-size=256m --rm -p8000:8000 -p8001:8001 -p8002:8002 -v ${PWD}:/workspace/ -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:24.01-py3
```
Install dependencies for our Python backend scripts.

```bash
pip install -r /models/tracking/1/ocsort/requirements.txt
```
Launch Triton.

```bash
tritonserver --model-repository=/models
```
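Before running the client, you can confirm that the server and its models came up. A quick readiness check over the HTTP endpoint, assuming the model name `tracking` from the repository layout above:

```python
# Poll Triton's HTTP endpoint until the server and the tracking model are ready.
import time
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")
for _ in range(30):
    if client.is_server_ready() and client.is_model_ready("tracking"):
        print("Triton is up and the tracking model is ready.")
        break
    time.sleep(1)
```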
Run the client application.

```bash
python client.py --video test_data/MOT17-04-SDP-raw.webm
```