Commit 896b012

[Config] Enhance ModelRecord (#435)
There are three changes to `ModelRecord` in this PR:

### 1. Update model ids to match HF repo names

We rename each `modelId` in `webllm.prebuiltAppConfig` to be exactly the same as its HF repo name. For most models, that means simply appending `-MLC` to the `modelId`. For the low-context version of a model, we use `{HF-repo}-1k`, indicating a 1k context length. As a result, we also rename the Phi-2 and Phi-1.5 models, since their `modelId` did not match the repo name:

- `Phi2-q4f32_1` → `phi-2-q4f32_1-MLC`
- `Phi1.5-q4f16_1` → `phi-1_5-q4f16_1-MLC`

### 2. Rename `model_url` and `model_lib_url` to `model` and `model_lib`

To better match the other platforms of MLC-LLM (e.g. iOS, Android), we rename these `ModelRecord` fields.

### 3. Remove `resolve/main` from the `model` URL

Instead of `"https://huggingface.co/mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC/resolve/main/"`, we now use `"https://huggingface.co/mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC/"`; note that the trailing `/` will be appended by us if it is not there.

### Example

As an example, we would have:

```typescript
{
  model: "https://huggingface.co/mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC",
  model_id: "Llama-3-8B-Instruct-q4f16_1-MLC",
  model_lib: "path/to/Llama-3-8B-Instruct-q4f16_1-ctx1k_cs1k-webgpu.wasm",
},
```

instead of

```typescript
{
  model_url: "https://huggingface.co/mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC/resolve/main/",
  model_id: "Llama-3-8B-Instruct-q4f16_1",
  model_lib_url: "path/to/Llama-3-8B-Instruct-q4f16_1-ctx4k_cs1k-webgpu.wasm",
},
```

---------

Co-authored-by: Nestor Qin <[email protected]>
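For illustration, the trailing-slash rule in change 3 amounts to a one-line normalization. This is a hypothetical sketch of the idea, not the library's actual internal code; `normalizeModelUrl` is a made-up name:

```typescript
// Hypothetical helper (illustration only): ensure the `model` URL
// ends with "/" before artifact paths are joined onto it.
function normalizeModelUrl(model: string): string {
  return model.endsWith("/") ? model : model + "/";
}

// Both spellings resolve to the same base URL:
normalizeModelUrl("https://huggingface.co/mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC");
normalizeModelUrl("https://huggingface.co/mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC/");
// => "https://huggingface.co/mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC/"
```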
1 parent c995caa commit 896b012

File tree

24 files changed: +788 −718 lines


README.md

Lines changed: 6 additions & 6 deletions
```diff
@@ -52,7 +52,7 @@ async function main() {
   const label = document.getElementById("init-label");
   label.innerText = report.text;
 };
-const selectedModel = "Llama-3-8B-Instruct-q4f32_1";
+const selectedModel = "Llama-3-8B-Instruct-q4f32_1-MLC";
 const engine: webllm.MLCEngineInterface = await webllm.CreateMLCEngine(
   selectedModel,
   /*engineConfig=*/ { initProgressCallback: initProgressCallback },
@@ -96,7 +96,7 @@ async function main() {
 const initProgressCallback = (report) => {
   console.log(report.text);
 };
-const selectedModel = "TinyLlama-1.1B-Chat-v0.4-q4f16_1-1k";
+const selectedModel = "TinyLlama-1.1B-Chat-v0.4-q4f16_1-MLC-1k";
 const engine = await webllm.CreateMLCEngine(selectedModel, {
   initProgressCallback: initProgressCallback,
 });
@@ -247,8 +247,8 @@ on how to add new model weights and libraries to WebLLM.

 Here, we go over the high-level idea. There are two elements of the WebLLM package that enables new models and weight variants.

-- `model_url`: Contains a URL to model artifacts, such as weights and meta-data.
-- `model_lib_url`: A URL to the web assembly library (i.e. wasm file) that contains the executables to accelerate the model computations.
+- `model`: Contains a URL to model artifacts, such as weights and meta-data.
+- `model_lib`: A URL to the web assembly library (i.e. wasm file) that contains the executables to accelerate the model computations.

 Both are customizable in the WebLLM.

@@ -257,9 +257,9 @@ async main() {
   const appConfig = {
     "model_list": [
       {
-        "model_url": "/url/to/my/llama",
+        "model": "/url/to/my/llama",
         "model_id": "MyLlama-3b-v1-q4f32_0"
-        "model_lib_url": "/url/to/myllama3b.wasm",
+        "model_lib": "/url/to/myllama3b.wasm",
       }
     ],
   };
```
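Putting the renamed fields together, a full custom-model registration under the new schema looks roughly like the following. This is a minimal sketch: the URLs, model id, and wasm path are placeholders, and only the field names (`model`, `model_id`, `model_lib`) are prescribed by this commit.

```typescript
import * as webllm from "@mlc-ai/web-llm";

// Minimal sketch of a custom model entry using the renamed fields.
// All URLs and ids below are placeholders, not real artifacts.
const appConfig: webllm.AppConfig = {
  model_list: [
    {
      // URL to model artifacts (weights and metadata); the trailing "/" is
      // optional, since it is appended automatically when missing.
      model: "https://huggingface.co/my-org/MyLlama-3b-v1-q4f32_0-MLC",
      model_id: "MyLlama-3b-v1-q4f32_0-MLC",
      // URL to the compiled WebGPU wasm library for this model.
      model_lib: "https://example.com/libs/myllama3b-webgpu.wasm",
    },
  ],
};

const engine = await webllm.CreateMLCEngine("MyLlama-3b-v1-q4f32_0-MLC", {
  appConfig: appConfig,
});
```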

examples/cache-usage/src/cache_usage.ts

Lines changed: 10 additions & 5 deletions
```diff
@@ -24,16 +24,19 @@ async function main() {
   }

   // 1. This triggers downloading and caching the model with either Cache or IndexedDB Cache
-  const selectedModel = "Phi2-q4f16_1"
+  const selectedModel = "phi-2-q4f16_1-MLC";
   const engine: webllm.MLCEngineInterface = await webllm.CreateMLCEngine(
-    "Phi2-q4f16_1",
-    { initProgressCallback: initProgressCallback, appConfig: appConfig }
+    selectedModel,
+    { initProgressCallback: initProgressCallback, appConfig: appConfig },
   );

   const request: webllm.ChatCompletionRequest = {
     stream: false,
     messages: [
-      { "role": "user", "content": "Write an analogy between mathematics and a lighthouse." },
+      {
+        role: "user",
+        content: "Write an analogy between mathematics and a lighthouse.",
+      },
     ],
     n: 1,
   };
@@ -60,7 +63,9 @@ async function main() {
   modelCached = await webllm.hasModelInCache(selectedModel, appConfig);
   console.log("After deletion, hasModelInCache: ", modelCached);
   if (modelCached) {
-    throw Error("Expect hasModelInCache() to be false, but got: " + modelCached);
+    throw Error(
+      "Expect hasModelInCache() to be false, but got: " + modelCached,
+    );
   }

   // 5. If we reload, we should expect the model to start downloading again
```
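The "Cache or IndexedDB Cache" choice in step 1 is driven by the app config rather than by the engine call itself. A minimal sketch, assuming the `useIndexedDBCache` flag on `AppConfig` and reusing the prebuilt model list:

```typescript
import * as webllm from "@mlc-ai/web-llm";

// Sketch: reuse the prebuilt model list but opt into IndexedDB
// instead of the default Cache API (assumes the `useIndexedDBCache` flag).
const appConfig: webllm.AppConfig = {
  ...webllm.prebuiltAppConfig,
  useIndexedDBCache: true,
};

const selectedModel = "phi-2-q4f16_1-MLC";
// False before the first load; true once CreateMLCEngine has cached the model.
console.log(await webllm.hasModelInCache(selectedModel, appConfig));
```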

examples/chrome-extension-webgpu-service-worker/src/popup.ts

Lines changed: 3 additions & 3 deletions
```diff
@@ -47,8 +47,8 @@ const initProgressCallback = (report: InitProgressReport) => {
 };

 const engine: MLCEngineInterface = await CreateExtensionServiceWorkerMLCEngine(
-  "Mistral-7B-Instruct-v0.2-q4f16_1",
-  { initProgressCallback: initProgressCallback }
+  "Mistral-7B-Instruct-v0.2-q4f16_1-MLC",
+  { initProgressCallback: initProgressCallback },
 );
 const chatHistory: ChatCompletionMessageParam[] = [];

@@ -150,7 +150,7 @@ function updateAnswer(answer: string) {
 function fetchPageContents() {
   chrome.tabs.query({ currentWindow: true, active: true }, function (tabs) {
     if (tabs[0]?.id) {
-      var port = chrome.tabs.connect(tabs[0].id, { name: "channelName" });
+      const port = chrome.tabs.connect(tabs[0].id, { name: "channelName" });
       port.postMessage({});
       port.onMessage.addListener(function (msg) {
         console.log("Page contents:", msg.contents);
```
Lines changed: 118 additions & 99 deletions
```diff
@@ -1,12 +1,17 @@
 /* eslint-disable @typescript-eslint/no-non-null-assertion */
-'use strict';
+"use strict";

 // This code is partially adapted from the openai-chatgpt-chrome-extension repo:
 // https://github.com/jessedi0n/openai-chatgpt-chrome-extension

-import './popup.css';
+import "./popup.css";

-import { MLCEngineInterface, InitProgressReport, CreateMLCEngine, ChatCompletionMessageParam } from "@mlc-ai/web-llm";
+import {
+  MLCEngineInterface,
+  InitProgressReport,
+  CreateMLCEngine,
+  ChatCompletionMessageParam,
+} from "@mlc-ai/web-llm";
 import { ProgressBar, Line } from "progressbar.js";

 const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));
@@ -21,135 +26,149 @@ fetchPageContents();

 (<HTMLButtonElement>submitButton).disabled = true;

-const progressBar: ProgressBar = new Line('#loadingContainer', {
-    strokeWidth: 4,
-    easing: 'easeInOut',
-    duration: 1400,
-    color: '#ffd166',
-    trailColor: '#eee',
-    trailWidth: 1,
-    svgStyle: { width: '100%', height: '100%' }
+const progressBar: ProgressBar = new Line("#loadingContainer", {
+  strokeWidth: 4,
+  easing: "easeInOut",
+  duration: 1400,
+  color: "#ffd166",
+  trailColor: "#eee",
+  trailWidth: 1,
+  svgStyle: { width: "100%", height: "100%" },
 });

 const initProgressCallback = (report: InitProgressReport) => {
-    console.log(report.text, report.progress);
-    progressBar.animate(report.progress, {
-        duration: 50
-    });
-    if (report.progress == 1.0) {
-        enableInputs();
-    }
+  console.log(report.text, report.progress);
+  progressBar.animate(report.progress, {
+    duration: 50,
+  });
+  if (report.progress == 1.0) {
+    enableInputs();
+  }
 };

-// const selectedModel = "TinyLlama-1.1B-Chat-v0.4-q4f16_1-1k";
-const selectedModel = "Mistral-7B-Instruct-v0.2-q4f16_1";
-const engine: MLCEngineInterface = await CreateMLCEngine(
-    selectedModel,
-    { initProgressCallback: initProgressCallback }
-);
+// const selectedModel = "TinyLlama-1.1B-Chat-v0.4-q4f16_1-MLC-1k";
+const selectedModel = "Mistral-7B-Instruct-v0.2-q4f16_1-MLC";
+const engine: MLCEngineInterface = await CreateMLCEngine(selectedModel, {
+  initProgressCallback: initProgressCallback,
+});
 const chatHistory: ChatCompletionMessageParam[] = [];

 isLoadingParams = true;

 function enableInputs() {
-    if (isLoadingParams) {
-        sleep(500);
-        (<HTMLButtonElement>submitButton).disabled = false;
-        const loadingBarContainer = document.getElementById("loadingContainer")!;
-        loadingBarContainer.remove();
-        queryInput.focus();
-        isLoadingParams = false;
-    }
+  if (isLoadingParams) {
+    sleep(500);
+    (<HTMLButtonElement>submitButton).disabled = false;
+    const loadingBarContainer = document.getElementById("loadingContainer")!;
+    loadingBarContainer.remove();
+    queryInput.focus();
+    isLoadingParams = false;
+  }
 }

 // Disable submit button if input field is empty
 queryInput.addEventListener("keyup", () => {
-    if ((<HTMLInputElement>queryInput).value === "") {
-        (<HTMLButtonElement>submitButton).disabled = true;
-    } else {
-        (<HTMLButtonElement>submitButton).disabled = false;
-    }
+  if ((<HTMLInputElement>queryInput).value === "") {
+    (<HTMLButtonElement>submitButton).disabled = true;
+  } else {
+    (<HTMLButtonElement>submitButton).disabled = false;
+  }
 });

 // If user presses enter, click submit button
 queryInput.addEventListener("keyup", (event) => {
-    if (event.code === "Enter") {
-        event.preventDefault();
-        submitButton.click();
-    }
+  if (event.code === "Enter") {
+    event.preventDefault();
+    submitButton.click();
+  }
 });

 // Listen for clicks on submit button
 async function handleClick() {
-    // Get the message from the input field
-    const message = (<HTMLInputElement>queryInput).value;
-    console.log("message", message);
-    // Clear the answer
-    document.getElementById("answer")!.innerHTML = "";
-    // Hide the answer
-    document.getElementById("answerWrapper")!.style.display = "none";
-    // Show the loading indicator
-    document.getElementById("loading-indicator")!.style.display = "block";
-
-    // Generate response
-    let inp = message;
-    if (context.length > 0) {
-        inp = "Use only the following context when answering the question at the end. Don't use any other knowledge.\n" + context + "\n\nQuestion: " + message + "\n\nHelpful Answer: ";
-    }
-    console.log("Input:", inp);
-    chatHistory.push({ "role": "user", "content": inp });
-
-    let curMessage = "";
-    const completion = await engine.chat.completions.create({ stream: true, messages: chatHistory });
-    for await (const chunk of completion) {
-        const curDelta = chunk.choices[0].delta.content;
-        if (curDelta) {
-            curMessage += curDelta;
-        }
-        updateAnswer(curMessage);
+  // Get the message from the input field
+  const message = (<HTMLInputElement>queryInput).value;
+  console.log("message", message);
+  // Clear the answer
+  document.getElementById("answer")!.innerHTML = "";
+  // Hide the answer
+  document.getElementById("answerWrapper")!.style.display = "none";
+  // Show the loading indicator
+  document.getElementById("loading-indicator")!.style.display = "block";

+  // Generate response
+  let inp = message;
+  if (context.length > 0) {
+    inp =
+      "Use only the following context when answering the question at the end. Don't use any other knowledge.\n" +
+      context +
+      "\n\nQuestion: " +
+      message +
+      "\n\nHelpful Answer: ";
+  }
+  console.log("Input:", inp);
+  chatHistory.push({ role: "user", content: inp });

+  let curMessage = "";
+  const completion = await engine.chat.completions.create({
+    stream: true,
+    messages: chatHistory,
+  });
+  for await (const chunk of completion) {
+    const curDelta = chunk.choices[0].delta.content;
+    if (curDelta) {
+      curMessage += curDelta;
     }
-    const response = await engine.getMessage();
-    chatHistory.push({ "role": "assistant", "content": await engine.getMessage() });
-    console.log("response", response);
+    updateAnswer(curMessage);
+  }
+  const response = await engine.getMessage();
+  chatHistory.push({ role: "assistant", content: await engine.getMessage() });
+  console.log("response", response);
 }
 submitButton.addEventListener("click", handleClick);

 // Listen for messages from the background script
 chrome.runtime.onMessage.addListener(({ answer, error }) => {
-    if (answer) {
-        updateAnswer(answer);
-    }
+  if (answer) {
+    updateAnswer(answer);
+  }
 });

 function updateAnswer(answer: string) {
-    // Show answer
-    document.getElementById("answerWrapper")!.style.display = "block";
-    const answerWithBreaks = answer.replace(/\n/g, '<br>');
-    document.getElementById("answer")!.innerHTML = answerWithBreaks;
-    // Add event listener to copy button
-    document.getElementById("copyAnswer")!.addEventListener("click", () => {
-        // Get the answer text
-        const answerText = answer;
-        // Copy the answer text to the clipboard
-        navigator.clipboard.writeText(answerText)
-            .then(() => console.log("Answer text copied to clipboard"))
-            .catch((err) => console.error("Could not copy text: ", err));
-    });
-    const options: Intl.DateTimeFormatOptions = { month: 'short', day: '2-digit', hour: '2-digit', minute: '2-digit', second: '2-digit' };
-    const time = new Date().toLocaleString('en-US', options);
-    // Update timestamp
-    document.getElementById("timestamp")!.innerText = time;
-    // Hide loading indicator
-    document.getElementById("loading-indicator")!.style.display = "none";
+  // Show answer
+  document.getElementById("answerWrapper")!.style.display = "block";
+  const answerWithBreaks = answer.replace(/\n/g, "<br>");
+  document.getElementById("answer")!.innerHTML = answerWithBreaks;
+  // Add event listener to copy button
+  document.getElementById("copyAnswer")!.addEventListener("click", () => {
+    // Get the answer text
+    const answerText = answer;
+    // Copy the answer text to the clipboard
+    navigator.clipboard
+      .writeText(answerText)
+      .then(() => console.log("Answer text copied to clipboard"))
+      .catch((err) => console.error("Could not copy text: ", err));
+  });
+  const options: Intl.DateTimeFormatOptions = {
+    month: "short",
+    day: "2-digit",
+    hour: "2-digit",
+    minute: "2-digit",
+    second: "2-digit",
+  };
+  const time = new Date().toLocaleString("en-US", options);
+  // Update timestamp
+  document.getElementById("timestamp")!.innerText = time;
+  // Hide loading indicator
+  document.getElementById("loading-indicator")!.style.display = "none";
 }

 function fetchPageContents() {
-    chrome.tabs.query({ currentWindow: true, active: true }, function (tabs) {
-        var port = chrome.tabs.connect(tabs[0].id, { name: "channelName" });
-        port.postMessage({});
-        port.onMessage.addListener(function (msg) {
-            console.log("Page contents:", msg.contents);
-            context = msg.contents
-        });
+  chrome.tabs.query({ currentWindow: true, active: true }, function (tabs) {
+    const port = chrome.tabs.connect(tabs[0].id, { name: "channelName" });
+    port.postMessage({});
+    port.onMessage.addListener(function (msg) {
+      console.log("Page contents:", msg.contents);
+      context = msg.contents;
     });
+  });
 }
```

0 commit comments
