
wasm optimization? #494

Open
137591 opened this issue Jul 2, 2024 · 1 comment

Comments

@137591 commented Jul 2, 2024

When I use the web-llm instance (path: /web-llm/examples/simple-chat) and read the source file (@mlc-ai/web-llm/lib/index.js), I notice a great deal of interaction with wasm files, which makes the source code somewhat difficult to follow. I would be very grateful if you could explain what logic those wasm files contain. Additionally, there seems to be room for optimization in how the model files are specified (for example: "model_lib_url": modelLibURLPrefix + modelVersion + "/Llama-3-8B-Instruct-q4f32_1-ctx1k_cs1k-webgpu.wasm"). Should I optimize by modifying the TVM compilation process?
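For context, here is a minimal sketch of how a simple-chat style app wires a model library wasm into its config. The field names follow the snippet quoted above; the prefix URL and version tag are placeholder assumptions, and the exact AppConfig shape varies across @mlc-ai/web-llm versions:

```typescript
// Sketch only: field names (model_lib_url, model_list, ...) follow the
// issue text; the URLs and version tag below are illustrative assumptions.
const modelLibURLPrefix = "https://example.com/web-llm-model-libs/"; // assumption
const modelVersion = "v0_2_0"; // assumption: placeholder version tag

const appConfig = {
  model_list: [
    {
      model_id: "Llama-3-8B-Instruct-q4f32_1",
      // The model weights and the compiled model library (the .wasm file
      // asked about above) are fetched as separate artifacts:
      model_lib_url:
        modelLibURLPrefix +
        modelVersion +
        "/Llama-3-8B-Instruct-q4f32_1-ctx1k_cs1k-webgpu.wasm",
    },
  ],
};
```

The `ctx1k_cs1k` suffix in the filename suggests the library was compiled with a 1k context window and 1k chunk size, which is one of the knobs a recompile could change.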

@CharlieFRuan
Contributor

Thanks for the question! The wasm is composed of several parts, including the model's compute kernels (in WGSL) and runtime support (C++ code compiled into WASM).
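Since the question asks about modifying the TVM compilation process: MLC LLM's model libraries are produced by its compile step, so regenerating the wasm with different options is the usual way to change things like context size. A sketch of such a build command, assuming the `mlc_llm` CLI is installed; the exact flags and paths are assumptions and differ across versions:

```shell
# Sketch, not a verified invocation: recompile the model library for WebGPU.
# Paths and flag names are assumptions; consult the MLC LLM docs for your version.
mlc_llm compile ./dist/Llama-3-8B-Instruct-q4f32_1-MLC/mlc-chat-config.json \
  --device webgpu \
  -o ./dist/libs/Llama-3-8B-Instruct-q4f32_1-webgpu.wasm
```

The resulting wasm bundles the WGSL kernels and the C++ runtime support described above, and its URL is what the `model_lib_url` field points at.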
