
Inconsistent and unreliable outputs on mobile compared to PC/laptop for -1k models #485

Open
Description

@JohnReginaldShutler

Hello! I would like to know whether there are any hidden errors or limitations when prompting a model on mobile compared to on a PC/laptop.

Specifically, we are testing a RAG system in which we provide the model with context and ask it to generate a response grounded in that context. Our goal is to give the model the profiles of selected lawyers together with a user-supplied legal issue, then ask it to justify why these lawyers are suitable for the user's legal problem, roughly as sketched below.
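
For concreteness, here is a simplified sketch of how we assemble that prompt (the interface and helper names are placeholders, not our exact code):

```ts
// Hypothetical shape of the retrieved profiles; our real fields differ.
interface LawyerProfile {
  name: string;
  practiceAreas: string[];
  summary: string;
}

// Build the RAG prompt: list the selected lawyer profiles as context,
// then ask the model to justify each recommendation for the user's issue.
function buildJustificationPrompt(profiles: LawyerProfile[], legalIssue: string): string {
  const context = profiles
    .map((p, i) => `${i + 1}. ${p.name} (${p.practiceAreas.join(", ")}): ${p.summary}`)
    .join("\n");
  return [
    "You are given profiles of lawyers and a client's legal issue.",
    "For each lawyer, briefly justify why they are suitable for this issue.",
    "",
    "Lawyer profiles:",
    context,
    "",
    `Legal issue: ${legalIssue}`,
  ].join("\n");
}
```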

We tested several smaller models optimized for mobile (e.g., Phi 1.5/2/3, RedPajama, TinyLlama). These models work well with simple prompts (e.g., "List three states in the USA"). However, with more complex prompts like the one described above, PCs/laptops produce coherent responses while mobile devices produce gibberish, even with the same prompts and settings. Could it perhaps be related to the requested maxStorageBufferBindingSize exceeding 128MB?
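
To see whether that limit is the difference between the devices, something like the following can log the WebGPU adapter limits on each platform (standard WebGPU API; in TypeScript it needs @webgpu/types). This is a sketch, not our actual code:

```ts
// Log the WebGPU adapter limits so desktop and mobile can be compared.
// On many phones maxStorageBufferBindingSize is capped at 128 MiB.
async function logWebGpuLimits(): Promise<void> {
  const adapter = await navigator.gpu?.requestAdapter();
  if (!adapter) {
    console.log("WebGPU is not available in this browser/device.");
    return;
  }
  console.log("maxStorageBufferBindingSize:", adapter.limits.maxStorageBufferBindingSize);
  console.log("maxBufferSize:", adapter.limits.maxBufferSize);
}
```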

Below is an example using the RedPajama-INCITE-Chat-3B-v1-q4f32_1-MLC-1k model, showing the model's response to the same prompt on different platforms, along with the prompt and code. As you can see, the response on PC/laptop is much better and is what we are hoping to achieve. (A simplified sketch of the call is included after the screenshots.)

I would appreciate any advice on improving the output on mobile! Thank you :)

1. Output on Windows

   (screenshot)

2. Output on Android

   (screenshot)

3. Screenshot of code and prompt

   (screenshot)
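
In case the code screenshot does not come through, this is roughly the call we make (simplified; it assumes the newer MLCEngine-style web-llm API, and only the model id is exactly as in our setup):

```ts
import { CreateMLCEngine } from "@mlc-ai/web-llm";

// Load the 1k-context RedPajama model and ask it to justify the lawyer picks.
async function askModel(prompt: string): Promise<string> {
  const engine = await CreateMLCEngine("RedPajama-INCITE-Chat-3B-v1-q4f32_1-MLC-1k");
  const reply = await engine.chat.completions.create({
    messages: [{ role: "user", content: prompt }],
    temperature: 0.7,
    max_tokens: 512,
  });
  return reply.choices[0].message.content ?? "";
}
```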
