llm.js

Node.js module providing inference APIs for large language models, with simple CLI.

Powered by node-mlx, a machine learning framework for Node.js.

Supported platforms

GPU support:

  • Macs with Apple Silicon

CPU support:

  • x64 Macs
  • x64/arm64 Linux

Note: Models using data types other than float32 require GPU support.

Supported models

You can also find quantized versions of the models at MLX Community.

APIs

import { core as mx, nn } from '@frost-beta/mlx';

/**
 * Wraps language models with or without vision.
 */
export class LLM {
    /**
     * Encode text with images into embeddings.
     */
    async encode(text?: string): Promise<mx.array>;
    /**
     * Convert the messages to embeddings, with images parsed.
     */
    async applyChatTemplate(messages: Message[], options?: ChatTemplateOptions): Promise<mx.array>;
    /**
     * Predict next tokens using the embeddings of the prompt.
     */
    async *generate(promptEmbeds: mx.array, options?: LLMGenerateOptions): AsyncGenerator<string[], void, unknown>;
}

/**
 * Create an LLM instance by loading from a directory.
 */
export async function loadLLM(dir: string): Promise<LLM>;

/**
 * Options for chat template.
 */
export interface ChatTemplateOptions {
    trimSystemPrompt?: boolean;
}

/**
 * Options for the LLM.generate method.
 */
export interface LLMGenerateOptions {
    maxTokens?: number;
    topP?: number;
    temperature?: number;
}

Check chat.ts and generate.ts for examples.
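
Putting the pieces together, a minimal chat sketch might look like the following (the { role, content } message shape and the ./weights path are assumptions for illustration, not part of the declared API):

import { loadLLM } from '@frost-beta/llm';

// Load model weights from a local directory.
const llm = await loadLLM('./weights');

// Convert chat messages into prompt embeddings.
const promptEmbeds = await llm.applyChatTemplate([
  { role: 'user', content: 'Who are you?' },
]);

// Stream generated text chunks until the model stops or maxTokens is reached.
for await (const tokens of llm.generate(promptEmbeds, { maxTokens: 512 })) {
  process.stdout.write(tokens.join(''));
}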

CLI

First download weights with any tool you like:

$ npm install -g @frost-beta/huggingface
$ huggingface download --to weights mlx-community/Meta-Llama-3-8B-Instruct-8bit

Then start chatting:

$ npm install -g @frost-beta/llm
$ llm-chat ./weights
You> Who are you?
Assistant> I am Llama 3, an AI assistant trained by Meta.

Or do text generation:

$ llm-generate ./weights 'Write a short story'
In a small village, there lived a girl named Eliza.
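
Conceptually, llm-generate maps onto the encode and generate APIs above; roughly (a sketch, not the actual CLI source):

import { loadLLM } from '@frost-beta/llm';

const llm = await loadLLM('./weights');

// Encode a raw text prompt directly, without a chat template.
const promptEmbeds = await llm.encode('Write a short story');

for await (const tokens of llm.generate(promptEmbeds)) {
  process.stdout.write(tokens.join(''));
}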

For vision models, reference images in the prompt using the format <image:pathOrUrl>:

$ huggingface download mlx-community/llava-1.5-7b-4bit
$ llm-chat llava-1.5-7b-4bit --temperature=0
You> What is in this image? <image:https://www.techno-edge.net/imgs/zoom/20089.jpg>
Assistant> The image features a man wearing glasses, holding a book in his hands.
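
The same <image:pathOrUrl> syntax applies when driving a vision model through the API, since applyChatTemplate parses images out of the messages. A sketch, with the message shape again assumed:

import { loadLLM } from '@frost-beta/llm';

const llm = await loadLLM('./llava-1.5-7b-4bit');
const promptEmbeds = await llm.applyChatTemplate([{
  role: 'user',
  content: 'What is in this image? <image:https://www.techno-edge.net/imgs/zoom/20089.jpg>',
}]);
for await (const tokens of llm.generate(promptEmbeds, { temperature: 0 })) {
  process.stdout.write(tokens.join(''));
}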

License

MIT
