add intro
ngxson committed Nov 30, 2024
1 parent 3a2d70e commit 9f8cdb3
Showing 5 changed files with 113 additions and 14 deletions.
3 changes: 1 addition & 2 deletions .github/workflows/verify-generated-code.yml
@@ -7,8 +7,7 @@ on:
   workflow_dispatch:

 jobs:
-  # Build job
-  build:
+  verify:
     runs-on: ubuntu-latest
     steps:
       - name: Checkout
96 changes: 96 additions & 0 deletions guides/intro-v2.md
@@ -0,0 +1,96 @@
# Introducing Wllama V2.0

## What's new

V2.0 introduces significant improvements in model management and caching. Key features include:

- A completely rewritten model downloader that uses a service worker
- New `ModelManager` class providing comprehensive model handling and caching capabilities
- Enhanced testing system built on the `vitest` framework

## Added `ModelManager`

The new `ModelManager` class provides a robust interface for handling model files:

```typescript
// Example usage
const modelManager = new ModelManager();

// List all models in cache
const cachedModels = await modelManager.getModels();

// Download a model and add it to the cache
const model = await modelManager.downloadModel('https://example.com/model.gguf');

// Check that the model is valid (i.e. it is not corrupted)
// If status === ModelValidationStatus.VALID, you can use the model
const status = await model.validate();
if (status !== ModelValidationStatus.VALID) {
  // Otherwise, re-download it
  // (refresh() is also useful when the remote model file has changed)
  await model.refresh();
}

// Get the model files as Blobs,
// then load the blobs into a Wllama instance
const blobs = await model.open();
const wllama = new Wllama(CONFIG_PATHS);
await wllama.loadModel(blobs);

// Remove the model from cache once it is no longer needed
await model.remove();
```

Key features of `ModelManager`:
- Automatic handling of split GGUF models
- Built-in model validation
- Parallel downloads of model shards
- Cache management with refresh and removal options
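
The validate-then-refresh flow above can be wrapped in a small helper. This is a minimal sketch and not part of wllama: `CachedModelLike`, `nextAction` and `ensureUsable` are hypothetical names, and the local `Status` enum only mirrors the string values that `ModelValidationStatus` has as of this commit, so that the snippet stays self-contained.

```typescript
// Local mirror of ModelValidationStatus (string values as of this commit).
enum Status {
  VALID = 'valid',
  INVALID = 'invalid',
  DELETED = 'deleted',
}

// Hypothetical minimal shape of a cached model: only the methods used here.
interface CachedModelLike<T> {
  validate(): Promise<Status>;
  refresh(): Promise<unknown>;
  open(): Promise<T>;
}

// Decide what to do with a model based on its validation status.
// Anything other than VALID is treated as "needs a re-download".
function nextAction(status: Status): 'use' | 'refresh' {
  return status === Status.VALID ? 'use' : 'refresh';
}

// Validate the model, re-download it if needed, then open it.
async function ensureUsable<T>(model: CachedModelLike<T>): Promise<T> {
  const status = await model.validate();
  if (nextAction(status) === 'refresh') {
    await model.refresh();
  }
  return model.open();
}
```

Keeping the decision in a pure function such as `nextAction` makes it easy to check without touching the network or the cache.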

## Migration to v2.0

### Simplified `new Wllama()` constructor

In v2.0, the configuration paths have been simplified. You now only need to specify the `*.wasm` files, as the `*.js` files are no longer required.

Previously in v1.x:

```js
const CONFIG_PATHS = {
  'single-thread/wllama.js'       : '../../esm/single-thread/wllama.js',
  'single-thread/wllama.wasm'     : '../../esm/single-thread/wllama.wasm',
  'multi-thread/wllama.js'        : '../../esm/multi-thread/wllama.js',
  'multi-thread/wllama.wasm'      : '../../esm/multi-thread/wllama.wasm',
  'multi-thread/wllama.worker.mjs': '../../esm/multi-thread/wllama.worker.mjs',
};
const wllama = new Wllama(CONFIG_PATHS);
```

From v2.0:

```js
// You only need to specify 2 files
const CONFIG_PATHS = {
  'single-thread/wllama.wasm': '../../esm/single-thread/wllama.wasm',
  'multi-thread/wllama.wasm' : '../../esm/multi-thread/wllama.wasm',
};
const wllama = new Wllama(CONFIG_PATHS);
```
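
If an existing project already has a v1.x path map, the v2.0 map can be derived from it rather than retyped. The following is a minimal sketch under that assumption; `toV2ConfigPaths` is a hypothetical helper, not something wllama provides.

```typescript
// Keep only the entries that v2.0 still needs (the `*.wasm` files).
function toV2ConfigPaths(v1Paths: Record<string, string>): Record<string, string> {
  const v2Paths: Record<string, string> = {};
  for (const key of Object.keys(v1Paths)) {
    if (key.endsWith('.wasm')) {
      v2Paths[key] = v1Paths[key];
    }
  }
  return v2Paths;
}

// The v1.x map from the example above.
const v1Paths: Record<string, string> = {
  'single-thread/wllama.js': '../../esm/single-thread/wllama.js',
  'single-thread/wllama.wasm': '../../esm/single-thread/wllama.wasm',
  'multi-thread/wllama.js': '../../esm/multi-thread/wllama.js',
  'multi-thread/wllama.wasm': '../../esm/multi-thread/wllama.wasm',
  'multi-thread/wllama.worker.mjs': '../../esm/multi-thread/wllama.worker.mjs',
};

// Only the two `*.wasm` entries survive the filter.
const v2Paths = toV2ConfigPaths(v1Paths);
```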

In addition, the `Wllama` constructor accepts an optional second parameter of type `WllamaConfig`.

NOTE: Most of these parameters were moved from `DownloadModelConfig`, which is used by `wllama.loadModelFromUrl`.

Example:

```js
const wllama = new Wllama(CONFIG_PATHS, {
  parallelDownloads: 5, // maximum files to be downloaded at the same time
  allowOffline: false,
});
```

### `Wllama.loadModelFromUrl`

As mentioned earlier, some options have moved to the `Wllama` constructor, including:
- `parallelDownloads`
- `allowOffline`
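
When migrating call sites, one way to handle this is to split an old options object into the part that now goes to the constructor and the part that stays with the load call. This is a minimal sketch: `splitLegacyOptions` and `someOtherOption` are hypothetical names used for illustration, and only the two option names listed above are taken from this guide.

```typescript
// Option names that, per this guide, now belong to the `Wllama` constructor.
const CONSTRUCTOR_KEYS: string[] = ['parallelDownloads', 'allowOffline'];

type Options = Record<string, unknown>;

// Split a v1.x-style options object into constructor config
// and the remaining per-call options.
function splitLegacyOptions(legacy: Options): {
  constructorConfig: Options;
  loadOptions: Options;
} {
  const constructorConfig: Options = {};
  const loadOptions: Options = {};
  for (const key of Object.keys(legacy)) {
    if (CONSTRUCTOR_KEYS.indexOf(key) !== -1) {
      constructorConfig[key] = legacy[key];
    } else {
      loadOptions[key] = legacy[key];
    }
  }
  return { constructorConfig, loadOptions };
}

// `someOtherOption` is a placeholder for any option that did not move.
const { constructorConfig, loadOptions } = splitLegacyOptions({
  parallelDownloads: 5,
  allowOffline: false,
  someOtherOption: true,
});
```

Here `constructorConfig` would be passed as the second argument to `new Wllama(...)`, and `loadOptions` would stay with the `loadModelFromUrl` call.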
6 changes: 3 additions & 3 deletions src/model-manager.ts
@@ -10,9 +10,9 @@ export type DownloadProgressCallback = (opts: {
 }) => any;

 export enum ModelValidationStatus {
-  VALID,
-  INVALID,
-  DELETED,
+  VALID = 'valid',
+  INVALID = 'invalid',
+  DELETED = 'deleted',
 }

 export interface ModelManagerParams {
6 changes: 3 additions & 3 deletions src/wllama.ts
@@ -10,9 +10,7 @@ import {
   padDigits,
 } from './utils';
 import CacheManager, { DownloadOptions } from './cache-manager';
-import { DownloadProgressCallback, ModelManager } from './model-manager';
-
-const noop = () => {};
+import { ModelManager } from './model-manager';

 export interface WllamaLogger {
   debug: typeof console.debug;
@@ -33,6 +31,8 @@ export interface WllamaConfig {
   logger?: WllamaLogger;
   /**
    * Maximum number of parallel files to be downloaded
+   *
+   * Default: parallelDownloads = 3
    */
   parallelDownloads?: number;
   /**
