Improve model download speed, progress display, etc #58

Open
0xdevalias opened this issue Aug 26, 2024 · 5 comments

I was reading the node-llama-cpp docs, and they mention that the ipull package can be useful for improved model download speeds:


I can see that the current download command calls downloadModel:

import { downloadModel, MODELS } from "../local-models.js";
import { cli } from "../cli.js";
export function download() {
  const command = cli()
    .name("download")

Which is defined here, and currently seems to just use fetch, as well as implementing its own download progress tracking in showProgress:

export async function downloadModel(model: string) {
  await ensureModelDirectory();
  const url = MODELS[model].url;
  if (url === undefined) {
    err(`Model ${model} not found`);
  }
  const path = getModelPath(model);
  if (existsSync(path)) {
    console.log(`Model "${model}" already downloaded`);
    return;
  }
  const response = await fetch(url);
  if (!response.ok || !response.body) {
    err(`Failed to download model ${model}`);
  }
  const tmpPath = `${path}.part`;
  const fileStream = createWriteStream(tmpPath);
  const readStream = Readable.fromWeb(response.body);
  showProgress(readStream);
  await finished(readStream.pipe(fileStream));
  await fs.rename(tmpPath, path);
  process.stdout.clearLine?.(0);
  console.log(`Model "${model}" downloaded to ${path}`);
}

export function showProgress(stream: Readable) {
  let bytes = 0;
  let i = 0;
  stream.on("data", (data) => {
    bytes += data.length;
    if (i++ % 1000 !== 0) return;
    process.stdout.clearLine?.(0);
    process.stdout.write(`\rDownloaded ${formatBytes(bytes)}`);
  });
}


I wonder if using ipull might make sense here, both for increased download speed and for better download progress visibility, etc.
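
For illustration, here's roughly what that could look like, going by the downloadFile API shown in ipull's README (url/savePath/cliProgress options and a download() method; the exact option names may differ, and the other helpers are the existing ones from local-models.ts):

// Rough sketch only, not a drop-in change: swap the fetch-based download
// for ipull, which handles parallel connections, resuming and a CLI
// progress bar, replacing the manual showProgress() tracking.
import { downloadFile } from "ipull";

export async function downloadModel(model: string) {
  await ensureModelDirectory();
  const url = MODELS[model]?.url;
  if (url === undefined) {
    err(`Model ${model} not found`);
  }
  const path = getModelPath(model);
  if (existsSync(path)) {
    console.log(`Model "${model}" already downloaded`);
    return;
  }

  // Assumed ipull options based on its README; verify before relying on them.
  const downloader = await downloadFile({
    url: url.toString(),
    savePath: path,
    cliProgress: true
  });
  await downloader.download();

  console.log(`Model "${model}" downloaded to ${path}`);
}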

@0xdevalias (Author) commented Aug 26, 2024

Partially related context:

I can see that the instructions are in the README here:

Which suggests I need to run humanify download 2b first.

I wonder if it might make more sense to have the local model download as a sub-command of humanify local, as that's where I was first looking for help on how to download the models. It didn't even occur to me to check the root-level command, since local things seemed to be 'scoped' under the local command (a rough sketch of what that could look like follows the help output below):

⇒ npx humanifyjs local -h
(node:97623) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
Usage: humanify local [options] <input>

Use a local LLM to unminify code

Arguments:
  input                     The input minified Javascript file

Options:
  -m, --model <model>       The model to use (default: "2b")
  -o, --outputDir <output>  The output directory (default: "output")
  -s, --seed <seed>         Seed for the model to get reproduceable results (leave out for random seed)
  --disableGpu              Disable GPU acceleration
  --verbose                 Show verbose output
  -h, --help                display help for command
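
For illustration, if the cli() helper wraps commander (which the .name() chaining suggests, though I haven't checked), a nested subcommand might look roughly like this; it's only a sketch, and the exact wiring would depend on how the existing local command is built:

// Rough sketch only: expose `humanify local download <model>` alongside
// the existing root-level `humanify download <model>`.
// Assumes cli() returns a commander Command; the real helper may differ.
import { cli } from "../cli.js";
import { downloadModel } from "../local-models.js";

export function local() {
  const command = cli()
    .name("local")
    .description("Use a local LLM to unminify code");

  command
    .command("download <model>")
    .description("Download a model for local inference")
    .action(async (model: string) => {
      await downloadModel(model);
    });

  return command;
}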

There also seems to be very minimal information output during the download. It might be nice to know a bit more about which model is being downloaded, from where, where it's being saved, how large it is, etc:

 ⇒ npx humanifyjs download 2b
(node:97932) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
Downloaded 1.63 GB
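
For example, the fetch response already exposes the total size via the Content-Length header (when the server sends it), so the command could presumably print the model name, source URL, destination path and size up front. A rough sketch building on the existing downloadModel (model, url, path, err and formatBytes are the names from the code quoted above):

// Sketch: print more context before streaming the download.
const response = await fetch(url);
if (!response.ok || !response.body) {
  err(`Failed to download model ${model}`);
}

const totalBytes = Number(response.headers.get("content-length") ?? 0);
console.log(`Downloading model "${model}"`);
console.log(`  from: ${url}`);
console.log(`  to:   ${path}`);
if (totalBytes > 0) {
  console.log(`  size: ${formatBytes(totalBytes)}`);
}

With the total size known, showProgress could also report a percentage instead of just the running byte count.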

I guess it does provide slightly more info when the download is completed:

⇒ npx humanifyjs download 2b
(node:97932) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
                  Model "2b" downloaded to /Users/devalias/.humanifyjs/models/Phi-3.1-mini-4k-instruct-Q4_K_M.gguf

I can see it's downloaded here:

⇒ ls ~/.humanifyjs/models
Phi-3.1-mini-4k-instruct-Q4_K_M.gguf

And the code for that is here:

const MODEL_DIRECTORY = join(homedir(), ".humanifyjs", "models");
type ModelDefinition = { url: URL; wrapper?: ChatWrapper };
export const MODELS: { [modelName: string]: ModelDefinition } = {
  "2b": {
    url: url`https://huggingface.co/bartowski/Phi-3.1-mini-4k-instruct-GGUF/resolve/main/Phi-3.1-mini-4k-instruct-Q4_K_M.gguf?download=true`
  },
  "8b": {
    url: url`https://huggingface.co/lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF/resolve/main/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf?download=true`,
    wrapper: new Llama3_1ChatWrapper()
  }
};

I also notice that MODEL_DIRECTORY is currently hardcoded. I wonder if it would be useful to be able to specify/customize it via a CLI arg, env variable, etc.
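
For example, an environment-variable override might look something like this (HUMANIFY_MODEL_DIRECTORY is just an illustrative name, not an existing option):

// Hypothetical sketch: allow overriding the hardcoded model directory.
// HUMANIFY_MODEL_DIRECTORY is a made-up variable name for illustration.
import { homedir } from "node:os";
import { join } from "node:path";

const MODEL_DIRECTORY =
  process.env.HUMANIFY_MODEL_DIRECTORY ??
  join(homedir(), ".humanifyjs", "models");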

It seems the humanify local command uses getModelPath:

const model = await llama.loadModel({
  modelPath: getModelPath(opts?.model),
  gpuLayers: (opts?.disableGPU ?? IS_CI) ? 0 : undefined
});

Which only seems to work for model aliases defined in MODELS:

export function getModelPath(model: string) {
  if (!(model in MODELS)) {
    err(`Model ${model} not found`);
  }
  const filename = basename(MODELS[model].url.pathname);
  return `${MODEL_DIRECTORY}/${filename}`;
}

Even though the error text for humanify download sounds as though it would be capable of downloading any named model:

export function getEnsuredModelPath(model: string) {
  const path = getModelPath(model);
  if (!existsSync(path)) {
    err(
      `Model "${model}" not found. Run "humanify download ${model}" to download the model.`
    );
  }
  return path;
}

And usually for LLM apps, the --model param would let us specify arbitrary models from huggingface or similar.
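
For illustration, getModelPath could fall back to treating an unrecognised --model value as a direct download URL, deriving the local filename from the URL path. This is only a sketch of the idea, not how the current code behaves (MODELS, MODEL_DIRECTORY and err are from the existing code), and the download side would also need to accept the raw URL rather than only looking up aliases:

// Hypothetical sketch: accept either a known alias or an arbitrary URL.
import { basename } from "node:path";

export function getModelPath(model: string) {
  if (model in MODELS) {
    return `${MODEL_DIRECTORY}/${basename(MODELS[model].url.pathname)}`;
  }
  // Treat anything that parses as a URL as a direct model download link.
  try {
    return `${MODEL_DIRECTORY}/${basename(new URL(model).pathname)}`;
  } catch {
    err(`Model ${model} not found`);
  }
}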

@neoOpus commented Sep 4, 2024

I've been downloading the 8b model for almost an hour now, so yeah, any improvement in this regard would be great. Or maybe allow just downloading the models manually and then placing them (I don't know where the model will be located, as I didn't analyse the source code yet). Having them shared with other software like Ollama would also be great (symbolic link).

@0xdevalias (Author)

I don't know where the model will be located as I didn't analyse the source code yet

~/.humanifyjs/models

const MODEL_DIRECTORY = join(homedir(), ".humanifyjs", "models");
type ModelDefinition = { url: URL; wrapper?: ChatWrapper };
export const MODELS: { [modelName: string]: ModelDefinition } = {
  "2b": {
    url: url`https://huggingface.co/bartowski/Phi-3.1-mini-4k-instruct-GGUF/resolve/main/Phi-3.1-mini-4k-instruct-Q4_K_M.gguf?download=true`
  },
  "8b": {
    url: url`https://huggingface.co/lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF/resolve/main/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf?download=true`,
    wrapper: new Llama3_1ChatWrapper()
  }
};

@neoOpus commented Sep 12, 2024

I don't know where the model will be located as I didn't analyse the source code yet

~/.humanifyjs/models

const MODEL_DIRECTORY = join(homedir(), ".humanifyjs", "models");
type ModelDefinition = { url: URL; wrapper?: ChatWrapper };
export const MODELS: { [modelName: string]: ModelDefinition } = {
  "2b": {
    url: url`https://huggingface.co/bartowski/Phi-3.1-mini-4k-instruct-GGUF/resolve/main/Phi-3.1-mini-4k-instruct-Q4_K_M.gguf?download=true`
  },
  "8b": {
    url: url`https://huggingface.co/lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF/resolve/main/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf?download=true`,
    wrapper: new Llama3_1ChatWrapper()
  }
};

Yeah, I figured that out and updated to Phi 3.5, but the PR shows an error (unrelated, I guess, as many Dependabot PRs are rejected as well).

My mistake (I am extremely exhausted lately and cannot focus)... Still, I want to make some progress with this and create a workflow that would let me pursue some new avenues in the near future: learning from some extensions how they operate internally, altering some, and analysing others for malware.

This is the error from the PR. I thought it would be a straight drop-in from 3.1 to 3.5, but I guess I have to learn more about the differences between the tokenization of the two.

# [2024-09-12 04:24:09]  Loading model with options {
#   modelPath: '/Users/runner/.humanifyjs/models/Phi-3.5-mini-instruct-Q4_K_M.gguf',
#   gpuLayers: 0
# }
# [node-llama-cpp] Using this model ("~/.humanifyjs/models/Phi-3.5-mini-instruct-Q4_K_M.gguf") to tokenize text with special tokens and then detokenize it resulted in a different text. There might be an issue with the model or the tokenizer implementation. Using this model may not work as intended
# Subtest: /Users/runner/work/humanify/humanify/src/test/e2e.geminitest.ts
not ok 1 - /Users/runner/work/humanify/humanify/src/test/e2e.geminitest.ts
  ---

@0xdevalias (Author) commented Sep 30, 2024

Some more relevant links/functions/etc that could be used here:
