Hello! I would like to know whether there are any hidden errors or limitations when prompting a model on mobile compared to on a PC/laptop.
Specifically, we are testing a RAG system where we provide the model with context and ask it to generate a response based on that context. Our goal is to list profiles of selected lawyers and a user-supplied legal issue, then ask the model to justify why these lawyers are suitable for the user's legal problem.
We tested several smaller models optimized for mobile (e.g., Phi 1.5/2/3, RedPajama, TinyLlama). These models work well with simple prompts (e.g., "List three states in the USA"). However, with more complex prompts (like the one described above), PCs/laptops produce coherent responses, while mobile devices produce gibberish, even with the same prompts and settings. Could this be related to the requested maxStorageBufferBindingSize exceeding the 128 MB limit reported on mobile?
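One way to narrow down the maxStorageBufferBindingSize question is to read the limit the device actually reports and compare it against what the model needs. The sketch below is illustrative and not WebLLM-specific; the helper name `bindingSizeSufficient` and the 256 MiB requirement are assumptions for the example (the WebGPU default for this limit is 128 MiB, and many mobile GPUs do not raise it).

```typescript
const MIB = 1024 * 1024;

// Pure helper so the decision can be unit-tested outside the browser.
function bindingSizeSufficient(deviceLimit: number, modelRequired: number): boolean {
  return deviceLimit >= modelRequired;
}

// In the browser (illustrative usage; 256 MiB is a placeholder requirement):
// const adapter = await navigator.gpu?.requestAdapter();
// if (adapter && !bindingSizeSufficient(adapter.limits.maxStorageBufferBindingSize, 256 * MIB)) {
//   console.warn("GPU cannot satisfy the model's storage-buffer requirement");
// }
```

If the reported limit on the phone is smaller than what the model requests, that points to a capability gap rather than a prompting problem.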
Below is an example using the RedPajama-INCITE-Chat-3B-v1-q4f32_1-MLC-1k model, showing the model's response on different platforms to the same prompt, along with the prompt and code itself. As you can see, the response on PC/laptop is much better and is what we are hoping to achieve.
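Since the model variant here is a "-1k" build (a 1024-token context window), another possibility is that the long RAG prompt (lawyer profiles plus the legal issue) silently overflows the context on mobile. Below is a hedged sketch of trimming the retrieved profiles to a token budget before prompting; `estimateTokens` and `fitProfilesToBudget` are illustrative names, not part of the WebLLM API, and the ~4-characters-per-token heuristic is a rough assumption.

```typescript
const CONTEXT_WINDOW = 1024;   // tokens available in a "-1k" model build
const RESPONSE_BUDGET = 256;   // tokens reserved for the model's answer

// Rough heuristic: ~4 characters per token for English text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Keep only as many lawyer profiles as fit in the remaining prompt budget.
function fitProfilesToBudget(issue: string, profiles: string[]): string[] {
  let budget = CONTEXT_WINDOW - RESPONSE_BUDGET - estimateTokens(issue);
  const kept: string[] = [];
  for (const profile of profiles) {
    const cost = estimateTokens(profile);
    if (cost > budget) break;  // stop before overflowing the context window
    kept.push(profile);
    budget -= cost;
  }
  return kept;
}
```

If the desktop build you compare against uses a larger-context variant of the same model, the identical prompt could fit there but overflow on the -1k mobile build, which would match the coherent-versus-gibberish split you describe.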
I would appreciate any advice on improving the output on mobile! Thank you :)
JohnReginaldShutler changed the title from "Inconsistent and unreliable outputs on mobile as opposed to on pc/laptop" to "Inconsistent and unreliable outputs on mobile as opposed to on pc/laptop for -1k models" on Jun 20, 2024.
@customautosys