A template for serving zipformer on Triton Inference Server.

ZQuang2202/Zipformer_Triton


Inference Serving Best Practice for Zipformer Transducer ASR

I provide a template Triton config for the zipformer model and a client API to evaluate the performance of the serving pipeline.
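For orientation, a Triton model is described by a `config.pbtxt` file in its model repository directory. The fragment below is only an illustrative sketch of what such a config can look like for a streaming encoder; the model name, backend, tensor names, and shapes here are assumptions, not the actual values shipped in this repository's template.

```
# Hypothetical config.pbtxt sketch -- see the template in this repo for real values.
name: "encoder"
backend: "onnxruntime"
max_batch_size: 16

input [
  {
    name: "x"               # assumed input tensor name
    data_type: TYPE_FP32
    dims: [ -1, 80 ]        # variable-length frames x 80 fbank features (assumed)
  }
]
output [
  {
    name: "encoder_out"     # assumed output tensor name
    data_type: TYPE_FP32
    dims: [ -1, 512 ]       # assumed encoder dimension
  }
]

# Dynamic batching lets Triton group concurrent requests for throughput.
dynamic_batching {
  max_queue_delay_microseconds: 1000
}
```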

Prepare Environment

Build the server docker image:

cd triton
docker build . -f Dockerfile/Dockerfile.server -t sherpa_triton_server:latest --network host

Start the docker container:

docker run --gpus all --rm -v $PWD:/workspace/sherpa --name sherpa_server --net host --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 -it sherpa_triton_server

You should now be inside the container. From there, start the server:

bash run_server.sh
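Once the server is running, Triton exposes the standard KServe HTTP health endpoints. A small helper to poll readiness before sending requests (the port 8000 here is Triton's default HTTP port and is an assumption; `run_server.sh` may bind a different one):

```python
import urllib.request
import urllib.error


def server_ready(url: str = "http://localhost:8000/v2/health/ready",
                 timeout: float = 2.0) -> bool:
    """Return True if the Triton HTTP endpoint reports ready (HTTP 200)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Server not up yet, connection refused, or timed out.
        return False
```

This is handy in client scripts or CI to wait for the server instead of racing it.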

Test client

cd triton/client/Triton-ASR-client
bash run_client.sh
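ASR client benchmarks typically report word error rate over the returned transcripts. As a minimal sketch of how WER is computed (this helper is illustrative and is not the repository's actual client code), it is the word-level Levenshtein distance divided by the reference length:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance over reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```

For example, `wer("hello world", "hello word")` is 0.5: one substitution over a two-word reference.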
