feat: implement slim openai api fallback #169
base: master
Conversation
@@ -20,6 +21,7 @@
def initialize():
    if not use_vllm:
        app.include_router(slim_router)
Do we also want to do it if there is a local llama? I'd only add it if use_oci is true.
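For illustration, a minimal sketch of that suggested gating, assuming a use_oci flag is exposed from the same config module as use_vllm (the import path below is hypothetical):

```python
# Sketch of the suggested gating, not the change in this PR.
# The use_oci flag and its import path are assumptions.
from skynet.env import use_oci  # hypothetical location, alongside use_vllm


def initialize():
    if use_oci:
        app.include_router(slim_router)
```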
Having it doesn't prevent you from trying to test the completions API exposed by ollama, but it comes with the advantage of being able to test locally what we're exposing on our /openai route.
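As a usage note, testing that locally could look roughly like this; the host, port, and /openai prefix are assumptions based on the route mentioned above, not confirmed by this PR:

```python
# Hypothetical local smoke test of the fallback route; host, port and
# the /openai prefix are assumptions, not verified against this PR.
import requests

resp = requests.post(
    'http://localhost:8000/openai/v1/chat/completions',
    json={'messages': [{'role': 'user', 'content': 'hello'}]},
)
print(resp.json())
```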
skynet/modules/ttt/openai_api/app.py (outdated)
__all__ = ['app', 'initialize', 'is_ready']
Weird indent
@router.post('/v1/chat/completions')
async def create_chat_completion(chat_request: ChatCompletionRequest, request: Request):
Does this work with streaming requests?
No, and I don't think our ainvoke does either, since langchain exposes a separate astream method; I'm planning that for a future change.
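Not part of this PR, but a rough sketch of what that future streaming path could look like using langchain's astream and FastAPI's StreamingResponse; the model wiring (llm), the request schema, and the OpenAI-style chunk payload below are illustrative assumptions, not skynet's actual code:

```python
# Hypothetical streaming variant; all names and wiring here are assumptions.
import json

from fastapi import APIRouter, Request
from fastapi.responses import StreamingResponse
from langchain_openai import ChatOpenAI
from pydantic import BaseModel

router = APIRouter()
llm = ChatOpenAI()  # assumed to be configured elsewhere in the app


class ChatCompletionRequest(BaseModel):
    messages: list[dict]
    stream: bool = False


@router.post('/v1/chat/completions')
async def create_chat_completion(chat_request: ChatCompletionRequest, request: Request):
    async def event_stream():
        # langchain's astream yields message chunks as they arrive
        async for chunk in llm.astream(chat_request.messages):
            payload = {'choices': [{'delta': {'content': chunk.content}}]}
            yield f'data: {json.dumps(payload)}\n\n'
        yield 'data: [DONE]\n\n'

    if chat_request.stream:
        return StreamingResponse(event_stream(), media_type='text/event-stream')

    # non-streaming path, roughly what the PR does today via ainvoke
    result = await llm.ainvoke(chat_request.messages)
    return {'choices': [{'message': {'role': 'assistant', 'content': result.content}}]}
```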
My worry is that if the consumers of this are already using it, this will break for them, won't it?
We only have one consumer and I already studied how they're using it, hence the small scope for now.