Make all methods `async def` again; add completion() for meta-reference #270

ashwinb · 2024-10-19T03:05:34Z

PR #201 made several changes while fixing the stream=False branches of the inference and agents APIs. One of those changes was slightly gratuitous: it made chat_completion() and related methods `def` instead of `async def`.

The rationale was that callers within llama-stack could then write:

async for chunk in api.chat_completion(params)

However, this caused unnecessary confusion for several people. Since clients (e.g., llama-stack-apps) use the SDK methods anyway (which are completely isolated), this choice was not ideal. This PR reverts it, so the call now looks like:

async for chunk in await api.chat_completion(params)
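
For reviewers who want to see the two calling conventions side by side, here is a minimal, self-contained sketch. The class names `SyncStyleApi` and `AsyncStyleApi` and the helper `_stream` are hypothetical stand-ins, not the actual llama-stack classes; they only illustrate why the `async def` form needs the extra `await` before iteration.

```python
import asyncio
from typing import AsyncIterator


async def _stream(params: str) -> AsyncIterator[str]:
    # Hypothetical chunk source: yields one whitespace-separated token at a time.
    for token in params.split():
        yield token


class SyncStyleApi:
    # Stand-in for the PR #201 style: a plain `def` that returns an
    # async generator directly, so the caller iterates without awaiting.
    def chat_completion(self, params: str) -> AsyncIterator[str]:
        return _stream(params)


class AsyncStyleApi:
    # Stand-in for the style this PR restores: an `async def` whose
    # coroutine must be awaited to obtain the async generator.
    async def chat_completion(self, params: str) -> AsyncIterator[str]:
        return _stream(params)


async def collect_both(params: str) -> tuple[list[str], list[str]]:
    # Old call shape: async for chunk in api.chat_completion(params)
    old_style = [chunk async for chunk in SyncStyleApi().chat_completion(params)]
    # New call shape: async for chunk in await api.chat_completion(params)
    new_style = [
        chunk async for chunk in await AsyncStyleApi().chat_completion(params)
    ]
    return old_style, new_style


if __name__ == "__main__":
    print(asyncio.run(collect_both("hello from the sketch")))
```

Both shapes yield the same chunks. The difference is only whether the method call itself produces a coroutine that has to be awaited first.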

Bonus
Added a completion() implementation for the meta-reference provider.

Test Plan
Ran all the tests in providers/tests/ with various inference provider choices. (We will soon add them to CI.)

ashwinb added 2 commits October 18, 2024 19:52

Make all API methods async def again

627edaf

Get the agents method also

bcaf639

ashwinb requested review from yanxi0830, hardikjshah, dltn and raghotham as code owners October 19, 2024 03:05

facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Oct 19, 2024

raghotham approved these changes Oct 19, 2024

ashwinb added 2 commits October 18, 2024 20:41

Add completion() impl for meta-reference

072d1b7

Fix

fedc11b

ashwinb changed the title ~~Make all methods async def again~~ Make all methods async def again; add completion() for meta-reference Oct 19, 2024

Update spec since we changed some types for completion API

9fddbdf

ashwinb merged commit 2089427 into main Oct 19, 2024
4 checks passed
