[API] Deprecate engine.generate() #541

CharlieFRuan · 2024-08-12T19:00:01Z

This PR deprecates generate() from all MLCEngineInterface implementations. Its usage is fully covered by engine.chat.completions.create() for conversation-style generation and engine.completions.create() for raw text completion.
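As a rough migration sketch, a call such as engine.generate(prompt) can be replaced with one of the two OpenAI-style calls. The `EngineLike` and `ChatMessage` types and the two helper functions below are hypothetical names introduced only for illustration. They describe the small subset of the request/response shape this sketch assumes, not the full engine interface.

```typescript
// Hypothetical structural types: only the call shapes used below are assumed.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface EngineLike {
  chat: {
    completions: {
      create(req: { messages: ChatMessage[] }): Promise<{
        choices: { message: { content: string | null } }[];
      }>;
    };
  };
  completions: {
    create(req: { prompt: string }): Promise<{ choices: { text: string }[] }>;
  };
}

// Conversation-style generation: wrap the prompt in a single user message.
async function chatOnce(engine: EngineLike, prompt: string): Promise<string> {
  const res = await engine.chat.completions.create({
    messages: [{ role: "user", content: prompt }],
  });
  return res.choices[0]?.message.content ?? "";
}

// Raw text completion: the prompt is passed through as plain text.
async function completeRaw(engine: EngineLike, prompt: string): Promise<string> {
  const res = await engine.completions.create({ prompt });
  return res.choices[0]?.text ?? "";
}
```

Any object that satisfies `EngineLike` can be passed in, so the same helpers should work whether the engine runs on the main thread or in a worker.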

For multi-round chat with KV cache reuse through these two OpenAI-style APIs, see examples/multi-round-chat.
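A minimal sketch of the multi-round pattern, assuming the caller keeps the message history and resends it on every request. The `Conversation` class and the `ChatEngineLike` and `Turn` types are hypothetical and are not taken from examples/multi-round-chat. Whether the KV cache is actually reused for the shared prefix is decided by the engine, not by this code.

```typescript
// Hypothetical structural types: only the chat call shape is assumed.
type Role = "system" | "user" | "assistant";

interface Turn {
  role: Role;
  content: string;
}

interface ChatEngineLike {
  chat: {
    completions: {
      create(req: { messages: Turn[] }): Promise<{
        choices: { message: { content: string | null } }[];
      }>;
    };
  };
}

class Conversation {
  private engine: ChatEngineLike;
  private history: Turn[] = [];

  constructor(engine: ChatEngineLike, systemPrompt?: string) {
    this.engine = engine;
    if (systemPrompt) {
      this.history.push({ role: "system", content: systemPrompt });
    }
  }

  // Sends the full history plus the new user turn, then records the reply.
  async ask(userText: string): Promise<string> {
    const messages: Turn[] = [...this.history, { role: "user", content: userText }];
    const res = await this.engine.chat.completions.create({ messages });
    const reply = res.choices[0]?.message.content ?? "";
    // Commit the turn only after the request succeeds, so a failed call
    // leaves the stored history unchanged.
    this.history = [...messages, { role: "assistant", content: reply }];
    return reply;
  }

  get turnCount(): number {
    return this.history.length;
  }
}
```

Each request starts with the exact messages of the previous one, so an engine that supports prefix reuse has the chance to skip re-processing earlier turns.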

We deprecate it because future changes to the engine (e.g. allowing multiple models to be loaded in one engine) would break the generate() API and require extra effort to maintain.

Tested with streaming/non-streaming on MLCEngine/WebWorkerMLCEngine to ensure other APIs are not affected.

### Changes
- #541
- #542
  - When single model loaded, no change in behavior
  - When multiple models loaded, some APIs need to specify which model it is targeting
  - For more, see PR description (the user-facing section)
  - Also see `examples/multi-models`

### TVMjs
Still compiled at apache/tvm@1fcb620, no change

[Deprecate] Deprecate engine.generate()

df6266b

CharlieFRuan changed the title ~~[Deprecate] Deprecate engine.generate()~~ [API] Deprecate engine.generate() Aug 12, 2024

CharlieFRuan merged commit 4e018b9 into mlc-ai:main Aug 12, 2024
1 check passed

CharlieFRuan mentioned this pull request Aug 13, 2024

[Version] Bump version to 0.2.59 #543

Merged
