Releases: distantmagic/paddler
v0.7.0
Requires at least b3606 llama.cpp release.
Breaking Changes
- Adjusted to handle breaking changes in llama.cpp /health endpoint: ggml-org/llama.cpp#9056
  Instead of using the /health endpoint to monitor slot statuses, starting from this version, Paddler uses the /slots endpoint to monitor llama.cpp instances.
  Paddler's /health endpoint remains unchanged.
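
As a rough illustration of slot monitoring through /slots, the sketch below polls one llama.cpp instance and counts idle and busy slots. It is not Paddler's actual implementation. The helper names `count_slots` and `fetch_slots` are hypothetical, and the slot fields it reads ("is_processing", falling back to "state") are assumptions that depend on the llama.cpp release, so compare them with the output of your own server.

```python
import json
from urllib.request import urlopen


def count_slots(slots):
    """Return (idle, processing) counts for a parsed /slots response.

    Assumption: the response is a JSON array of slot objects, and a busy
    slot is marked either by a truthy "is_processing" flag or by a non-zero
    "state" value. Slots with neither field are counted as idle.
    """
    processing = 0
    for slot in slots:
        busy = slot.get("is_processing", slot.get("state", 0) != 0)
        if busy:
            processing += 1
    return len(slots) - processing, processing


def fetch_slots(base_url, timeout=3.0):
    """Fetch and decode the /slots response of one llama.cpp instance."""
    with urlopen(f"{base_url}/slots", timeout=timeout) as response:
        return json.load(response)
```

For example, `count_slots(fetch_slots("http://127.0.0.1:8080"))` would return the idle and busy slot counts for an instance listening on that (assumed) address.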
v0.6.0
Latest supported llama.cpp release: b3604
Features
- Assign names to Paddler agents (https://github.com/distantmagic/paddler/discussions/12)
Fixes
- Agent host formatting in dashboard
v0.6.0-rc1
Features
- Assign names to Paddler agents (#15)
v0.5.0
Fixes
- Management server crashed in some scenarios due to concurrency issues
v0.4.0
Thank you, @ScottMcNaught, for the help with debugging the issues! :)
Fixes
- OpenAI compatible endpoint is now properly balanced (/v1/chat/completions)
- Balancer's reverse proxy panicked in some scenarios when the underlying llama.cpp instance was abruptly closed during the generation of completion tokens
- Added mutex in the targets collection for better internal slots data integrity
v0.3.0
v0.1.0
Aggregated Health Status Responses
Paddler aggregates the health statuses of all the underlying llama.cpp instances. When you check the /health endpoint, it reports the aggregated results, making Paddler a drop-in replacement for the llama.cpp server itself (in the sense that you can start making requests to Paddler instead of llama.cpp and things will work the same way).
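
The aggregation can be pictured with a minimal sketch like the one below. It is not Paddler's code: the function name `aggregate_health` is hypothetical, and the field names ("status", "slots_idle", "slots_processing") and the "no slot available" status are assumptions modeled on the health responses that llama.cpp returned at the time.

```python
def aggregate_health(statuses):
    """Combine per-instance health reports into a single response.

    Assumption: each report is a dict with integer "slots_idle" and
    "slots_processing" counts. Missing counts are treated as zero.
    """
    total = {"status": "ok", "slots_idle": 0, "slots_processing": 0}
    for status in statuses:
        total["slots_idle"] += status.get("slots_idle", 0)
        total["slots_processing"] += status.get("slots_processing", 0)
    # Assumed policy: report a failure status when no instance has a free slot.
    if total["slots_idle"] == 0:
        total["status"] = "no slot available"
    return total
```

With this shape, a client that only reads the summed slot counts cannot tell whether it is talking to a single llama.cpp server or to a balancer in front of several.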