[Feat]: Estimate tokens per second before downloading #193

ChristopherKing42 · 2025-01-30T05:21:29Z

Description
Okay, this is very ambitious, but it would be pretty cool if there were a way to estimate tokens per second before downloading a model. For example, when searching Hugging Face, each search result would show a tokens-per-second estimate.

This will probably require a statistical model or even a neural network. It would take the phone's specs into account.

a-ghorbani · 2025-01-30T05:39:38Z

This dataset is expanding: https://huggingface.co/spaces/a-ghorbani/ai-phone-leaderboard
We may be able to use it to estimate tg/s, memory usage, etc.

ChristopherKing42 · 2025-01-30T16:44:57Z

Yeah, I'm thinking of a small model (basically just a statistical model) that takes phone specs and LLM specs as input and outputs a normal distribution (i.e. a mean and a variance). You would train it by minimizing the cross entropy, which for a normal output is the negative log-likelihood of the measured speeds.
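A minimal sketch of that idea, under stated assumptions: the features (a bias term, standardized RAM, standardized model size), the choice of log(tokens/s) as the target, and all function names here are hypothetical, and the data is synthetic rather than taken from the leaderboard. It fits two linear heads, one for the mean and one for the log-variance, by gradient descent on the Gaussian negative log-likelihood.

```python
import math
import random


def gaussian_nll(y, mean, log_var):
    """Negative log-likelihood of y under Normal(mean, exp(log_var))."""
    return 0.5 * (math.log(2 * math.pi) + log_var
                  + (y - mean) ** 2 * math.exp(-log_var))


def predict(params, x):
    """Two linear heads over the same features: mean and log-variance."""
    w_mean, w_logvar = params
    mean = sum(w * xi for w, xi in zip(w_mean, x))
    log_var = sum(w * xi for w, xi in zip(w_logvar, x))
    return mean, log_var


def mean_nll(params, rows):
    """Average NLL over (features, target) rows; this is the training loss."""
    return sum(gaussian_nll(y, *predict(params, x)) for x, y in rows) / len(rows)


def train(rows, lr=0.05, epochs=2000):
    """Full-batch gradient descent on the average Gaussian NLL."""
    n = len(rows[0][0])
    w_mean, w_logvar = [0.0] * n, [0.0] * n
    for _ in range(epochs):
        g_mean, g_logvar = [0.0] * n, [0.0] * n
        for x, y in rows:
            mean, log_var = predict((w_mean, w_logvar), x)
            inv_var = math.exp(-log_var)
            d_mean = (mean - y) * inv_var                       # dNLL/dmean
            d_logvar = 0.5 * (1.0 - (y - mean) ** 2 * inv_var)  # dNLL/dlog_var
            for j, xj in enumerate(x):
                g_mean[j] += d_mean * xj
                g_logvar[j] += d_logvar * xj
        w_mean = [w - lr * g / len(rows) for w, g in zip(w_mean, g_mean)]
        w_logvar = [w - lr * g / len(rows) for w, g in zip(w_logvar, g_logvar)]
    return w_mean, w_logvar


def make_toy_rows(n=200, seed=0):
    """Synthetic stand-in data, NOT real benchmark results.

    x = [bias, standardized RAM, standardized model size]; y = log(tokens/s).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        ram, size = rng.uniform(-1, 1), rng.uniform(-1, 1)
        y = 2.0 + 1.0 * ram - 1.5 * size + rng.gauss(0.0, 0.3)
        rows.append(([1.0, ram, size], y))
    return rows


if __name__ == "__main__":
    rows = make_toy_rows()
    params = train(rows)
    # Hypothetical device/model pair: above-average RAM, below-average model size.
    mean, log_var = predict(params, [1.0, 0.5, -0.5])
    print("median tokens/s estimate:", math.exp(mean))
    print("std of log(tokens/s):", math.exp(0.5 * log_var))
```

Predicting the log-variance instead of the variance keeps the variance positive without constraints, and modeling log(tokens/s) keeps the predicted speed positive. A real version would presumably replace `make_toy_rows` with rows built from actual benchmark submissions.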

ChristopherKing42 added the enhancement (New feature or request) label on Jan 30, 2025
