Improve model download speed, progress display, etc #58

Open
0xdevalias opened this issue Aug 26, 2024 · 5 comments

I was reading the node-llama-cpp docs, and they mention that the ipull package can be useful for improved model download speeds:


I can see that the current download command calls downloadModel:

import { downloadModel, MODELS } from "../local-models.js";
import { cli } from "../cli.js";
export function download() {
  const command = cli()
    .name("download")

Which is defined here, and currently seems to just use fetch, as well as implementing its own download progress tracking in showProgress:

export async function downloadModel(model: string) {
  await ensureModelDirectory();
  const url = MODELS[model].url;
  if (url === undefined) {
    err(`Model ${model} not found`);
  }
  const path = getModelPath(model);
  if (existsSync(path)) {
    console.log(`Model "${model}" already downloaded`);
    return;
  }
  const response = await fetch(url);
  if (!response.ok || !response.body) {
    err(`Failed to download model ${model}`);
  }
  const tmpPath = `${path}.part`;
  const fileStream = createWriteStream(tmpPath);
  const readStream = Readable.fromWeb(response.body);
  showProgress(readStream);
  await finished(readStream.pipe(fileStream));
  await fs.rename(tmpPath, path);
  process.stdout.clearLine?.(0);
  console.log(`Model "${model}" downloaded to ${path}`);
}

export function showProgress(stream: Readable) {
  let bytes = 0;
  let i = 0;
  stream.on("data", (data) => {
    bytes += data.length;
    if (i++ % 1000 !== 0) return;
    process.stdout.clearLine?.(0);
    process.stdout.write(`\rDownloaded ${formatBytes(bytes)}`);
  });
}


I wonder if using ipull might make sense here, both for increased download speed and for better download progress visibility, etc.
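
For illustration, here's roughly what that could look like, going by the downloadFile API shown in ipull's README (url/savePath/cliProgress options and a download() method; the exact option names may differ, and the other helpers are the existing ones from local-models.ts):

// Rough sketch only, not a drop-in change: swap the fetch-based download
// for ipull, which handles parallel connections, resuming and a CLI
// progress bar, replacing the manual showProgress() tracking.
import { downloadFile } from "ipull";

export async function downloadModel(model: string) {
  await ensureModelDirectory();
  const url = MODELS[model]?.url;
  if (url === undefined) {
    err(`Model ${model} not found`);
  }
  const path = getModelPath(model);
  if (existsSync(path)) {
    console.log(`Model "${model}" already downloaded`);
    return;
  }

  // Assumed ipull options based on its README; verify before relying on them.
  const downloader = await downloadFile({
    url: url.toString(),
    savePath: path,
    cliProgress: true
  });
  await downloader.download();

  console.log(`Model "${model}" downloaded to ${path}`);
}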

@0xdevalias (Author) commented Aug 26, 2024

Partially related context:

I can see that the instructions are in the README here:

Which suggests I need to run humanify download 2b first.

I wonder if it might make more sense to have the local model download as a sub-command of humanify local, as that's where I was first looking for help on how to download the models. It didn't even occur to me to check the root-level command, since local things seemed to be 'scoped' under the local command (a rough sketch of what that could look like follows the help output below):

⇒ npx humanifyjs local -h
(node:97623) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
Usage: humanify local [options] <input>

Use a local LLM to unminify code

Arguments:
  input                     The input minified Javascript file

Options:
  -m, --model <model>       The model to use (default: "2b")
  -o, --outputDir <output>  The output directory (default: "output")
  -s, --seed <seed>         Seed for the model to get reproduceable results (leave out for random seed)
  --disableGpu              Disable GPU acceleration
  --verbose                 Show verbose output
  -h, --help                display help for command
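
For illustration, if the cli() helper wraps commander (which the .name() chaining suggests, though I haven't checked), a nested subcommand might look roughly like this; it's only a sketch, and the exact wiring would depend on how the existing local command is built:

// Rough sketch only: expose `humanify local download <model>` alongside
// the existing root-level `humanify download <model>`.
// Assumes cli() returns a commander Command; the real helper may differ.
import { cli } from "../cli.js";
import { downloadModel } from "../local-models.js";

export function local() {
  const command = cli()
    .name("local")
    .description("Use a local LLM to unminify code");

  command
    .command("download <model>")
    .description("Download a model for local inference")
    .action(async (model: string) => {
      await downloadModel(model);
    });

  return command;
}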

There also seems to be very minimal information output during the download. It might be nice to know a bit more about which model is being downloaded, from where, where it's being saved, how large it is, etc:

 ⇒ npx humanifyjs download 2b
(node:97932) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
Downloaded 1.63 GB
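
For example, the fetch response already exposes the total size via the Content-Length header (when the server sends it), so the command could presumably print the model name, source URL, destination path and size up front. A rough sketch building on the existing downloadModel (model, url, path, err and formatBytes are the names from the code quoted above):

// Sketch: print more context before streaming the download.
const response = await fetch(url);
if (!response.ok || !response.body) {
  err(`Failed to download model ${model}`);
}

const totalBytes = Number(response.headers.get("content-length") ?? 0);
console.log(`Downloading model "${model}"`);
console.log(`  from: ${url}`);
console.log(`  to:   ${path}`);
if (totalBytes > 0) {
  console.log(`  size: ${formatBytes(totalBytes)}`);
}

With the total size known, showProgress could also report a percentage instead of just the running byte count.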

I guess it does provide slightly more info when the download is completed:

⇒ npx humanifyjs download 2b
(node:97932) [DEP0040] DeprecationWarning: The `punycode` module is deprecated. Please use a userland alternative instead.
(Use `node --trace-deprecation ...` to show where the warning was created)
                  Model "2b" downloaded to /Users/devalias/.humanifyjs/models/Phi-3.1-mini-4k-instruct-Q4_K_M.gguf

I can see it's downloaded here:

⇒ ls ~/.humanifyjs/models
Phi-3.1-mini-4k-instruct-Q4_K_M.gguf

And the code for that is here:

const MODEL_DIRECTORY = join(homedir(), ".humanifyjs", "models");
type ModelDefinition = { url: URL; wrapper?: ChatWrapper };
export const MODELS: { [modelName: string]: ModelDefinition } = {
  "2b": {
    url: url`https://huggingface.co/bartowski/Phi-3.1-mini-4k-instruct-GGUF/resolve/main/Phi-3.1-mini-4k-instruct-Q4_K_M.gguf?download=true`
  },
  "8b": {
    url: url`https://huggingface.co/lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF/resolve/main/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf?download=true`,
    wrapper: new Llama3_1ChatWrapper()
  }
};

I also notice that MODEL_DIRECTORY is currently hardcoded. I wonder if it would be useful to be able to specify/customize it via a CLI arg, env variable, etc.
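
For example, an environment-variable override might look something like this (HUMANIFY_MODEL_DIRECTORY is just an illustrative name, not an existing option):

// Hypothetical sketch: allow overriding the hardcoded model directory.
// HUMANIFY_MODEL_DIRECTORY is a made-up variable name for illustration.
import { homedir } from "node:os";
import { join } from "node:path";

const MODEL_DIRECTORY =
  process.env.HUMANIFY_MODEL_DIRECTORY ??
  join(homedir(), ".humanifyjs", "models");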

It seems the humanify local command uses getModelPath:

const model = await llama.loadModel({
  modelPath: getModelPath(opts?.model),
  gpuLayers: (opts?.disableGPU ?? IS_CI) ? 0 : undefined
});

Which only seems to work for model aliases defined in MODELS:

export function getModelPath(model: string) {
  if (!(model in MODELS)) {
    err(`Model ${model} not found`);
  }
  const filename = basename(MODELS[model].url.pathname);
  return `${MODEL_DIRECTORY}/${filename}`;
}

Even though the error text for humanify download sounds as though it would be capable of downloading any named model:

export function getEnsuredModelPath(model: string) {
  const path = getModelPath(model);
  if (!existsSync(path)) {
    err(
      `Model "${model}" not found. Run "humanify download ${model}" to download the model.`
    );
  }
  return path;
}

And usually for LLM apps, the --model param would let us specify arbitrary models from huggingface or similar.
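
For illustration, getModelPath could fall back to treating an unrecognised --model value as a direct download URL, deriving the local filename from the URL path. This is only a sketch of the idea, not how the current code behaves (MODELS, MODEL_DIRECTORY and err are from the existing code), and the download side would also need to accept the raw URL rather than only looking up aliases:

// Hypothetical sketch: accept either a known alias or an arbitrary URL.
import { basename } from "node:path";

export function getModelPath(model: string) {
  if (model in MODELS) {
    return `${MODEL_DIRECTORY}/${basename(MODELS[model].url.pathname)}`;
  }
  // Treat anything that parses as a URL as a direct model download link.
  try {
    return `${MODEL_DIRECTORY}/${basename(new URL(model).pathname)}`;
  } catch {
    err(`Model ${model} not found`);
  }
}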

@neoOpus commented Sep 4, 2024

I've been downloading the 8b model for almost an hour now, so yeah, any improvement in this regard would be great. Or maybe allow just downloading the models manually and then placing them (I don't know where the model will be located, as I didn't analyse the source code yet). Having them shared with other software like Ollama would also be great (symbolic link).

@0xdevalias (Author)

I don't know where the model will be located as I didn't analyse the source code yet

~/.humanifyjs/models

const MODEL_DIRECTORY = join(homedir(), ".humanifyjs", "models");
type ModelDefinition = { url: URL; wrapper?: ChatWrapper };
export const MODELS: { [modelName: string]: ModelDefinition } = {
  "2b": {
    url: url`https://huggingface.co/bartowski/Phi-3.1-mini-4k-instruct-GGUF/resolve/main/Phi-3.1-mini-4k-instruct-Q4_K_M.gguf?download=true`
  },
  "8b": {
    url: url`https://huggingface.co/lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF/resolve/main/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf?download=true`,
    wrapper: new Llama3_1ChatWrapper()
  }
};

@neoOpus commented Sep 12, 2024

I don't know where the model will be located as I didn't analyse the source code yet

~/.humanifyjs/models

const MODEL_DIRECTORY = join(homedir(), ".humanifyjs", "models");
type ModelDefinition = { url: URL; wrapper?: ChatWrapper };
export const MODELS: { [modelName: string]: ModelDefinition } = {
  "2b": {
    url: url`https://huggingface.co/bartowski/Phi-3.1-mini-4k-instruct-GGUF/resolve/main/Phi-3.1-mini-4k-instruct-Q4_K_M.gguf?download=true`
  },
  "8b": {
    url: url`https://huggingface.co/lmstudio-community/Meta-Llama-3.1-8B-Instruct-GGUF/resolve/main/Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf?download=true`,
    wrapper: new Llama3_1ChatWrapper()
  }
};

Yeah, I figured that out and updated to Phi 3.5, but the PR shows an error (unrelated, I guess, as many Dependabot PRs are rejected as well).

My mistake (I am extremely exhausted lately and cannot focus)... Still, I want to make some progress with this and create a workflow that would let me pursue some new avenues in the near future: learning from some extensions how they operate internally, altering some, and analysing others for malware.

This is the error from the PR. I thought it would be a straight drop-in from 3.1 to 3.5, but I guess I have to learn more about the differences between the tokenization of the two.

# [2024-09-12 04:24:09]  Loading model with options {
#   modelPath: '/Users/runner/.humanifyjs/models/Phi-3.5-mini-instruct-Q4_K_M.gguf',
#   gpuLayers: 0
# }
# [node-llama-cpp] Using this model ("~/.humanifyjs/models/Phi-3.5-mini-instruct-Q4_K_M.gguf") to tokenize text with special tokens and then detokenize it resulted in a different text. There might be an issue with the model or the tokenizer implementation. Using this model may not work as intended
# Subtest: /Users/runner/work/humanify/humanify/src/test/e2e.geminitest.ts
not ok 1 - /Users/runner/work/humanify/humanify/src/test/e2e.geminitest.ts
  ---

@0xdevalias (Author) commented Sep 30, 2024

Some more relevant links/functions/etc that could be used here:
