From ff7771c325b4f41cb5779fb6b85fb7fb588f159b Mon Sep 17 00:00:00 2001
From: Ean Garvey <87458719+monorimet@users.noreply.github.com>
Date: Mon, 18 Nov 2024 16:07:49 -0600
Subject: [PATCH] (docs) Add section to user guide regarding load balancing. (#564)

---
 docs/user_guide.md | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/docs/user_guide.md b/docs/user_guide.md
index aedc9f546..13caa2acc 100644
--- a/docs/user_guide.md
+++ b/docs/user_guide.md
@@ -77,6 +77,10 @@ python -m shortfin_apps.sd.simple_client --interactive
 
 Congratulations!!! At this point you can play around with the server and client based on your usage.
 
+### Note: Server implementation scope
+
+The SDXL server's implementation does not account for extremely large client batches. Normally, for heavy workloads, services would be composed under a load balancer to ensure each service is fed with requests optimally. For most cases outside of large-scale deployments, the server's internal batching/load balancing is sufficient.
+
 ### Update flags
 
 Please see --help for both the server and client for usage instructions. Here's a quick snapshot.