Description
With asynchronous flows, a user might ask two questions in succession before the model has had time to answer the first one. This results in the following chat pattern:
user: multiply 2 by 3
user: then divide by -1
assistant: 6
assistant: -6
Unfortunately, for some reason ChatSession prevents this, while LLaMA 3 Instruct models handle such a scenario without any issues.
Reproduction Steps
```csharp
var history = new ChatHistory();
history.AddMessage(AuthorRole.User, "multiply 2 by 3");
history.AddMessage(AuthorRole.User, "then divide by -1");
history.AddMessage(AuthorRole.Assistant, "6");
history.AddMessage(AuthorRole.Assistant, "-6");
```
Throws an ArgumentException twice, once for each consecutive same-role message.
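For context, one way a caller could sidestep the alternation check is to merge consecutive same-role messages before building the history. This is a hypothetical client-side sketch, not a LLamaSharp feature; the `AuthorRole` enum below is a local stand-in mirroring the library's, so the snippet runs without the library.

```csharp
using System;
using System.Collections.Generic;

// Local stand-in for LLamaSharp's AuthorRole, so this sketch is self-contained.
enum AuthorRole { User, Assistant }

static class HistoryMerge
{
    // Collapse consecutive messages from the same role into a single message,
    // so the resulting sequence strictly alternates User/Assistant and can be
    // fed to AddMessage without tripping the same-role check.
    public static List<(AuthorRole Role, string Content)> Merge(
        IEnumerable<(AuthorRole Role, string Content)> turns)
    {
        var merged = new List<(AuthorRole, string)>();
        foreach (var (role, content) in turns)
        {
            if (merged.Count > 0 && merged[^1].Item1 == role)
                merged[^1] = (role, merged[^1].Item2 + "\n" + content); // same role: concatenate
            else
                merged.Add((role, content));
        }
        return merged;
    }
}
```

Applied to the repro above, the four messages collapse into one User turn ("multiply 2 by 3\nthen divide by -1") and one Assistant turn ("6\n-6"). Whether that is acceptable depends on the application; it changes the message boundaries the model sees.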
Environment & Configuration
- Operating system: Win 11
- .NET runtime version: 8
- LLamaSharp version: 0.14
- CUDA version (if you are using cuda backend): 12
- CPU & GPU device: N/A
Known Workarounds
No response