When I use the web-llm simple-chat example (path: /web-llm/examples/simple-chat) and read the source file (@mlc-ai/web-llm/lib/index.js), I notice a lot of interaction with wasm files, which makes the source code somewhat difficult to follow. Could you explain what these wasm files logically contain? Additionally, there seems to be room for optimization in how the model files are specified (for example: "model_lib_url": modelLibURLPrefix + modelVersion + "/Llama-3-8B-Instruct-q4f32_1-ctx1k_cs1k-webgpu.wasm"). Should I optimize this by modifying the TVM compilation process?
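For reference, this is roughly how the simple-chat app config assembles that URL. The prefix and version strings below are placeholders I filled in for illustration; only the composition pattern comes from the snippet above:

```ts
// Illustrative sketch of the simple-chat app config.
// The prefix/version values are placeholders; only the model_lib_url
// composition pattern is taken from the snippet quoted above.
const modelLibURLPrefix =
  "https://raw.githubusercontent.com/mlc-ai/binary-mlc-llm-libs/main/web-llm-models/";
const modelVersion = "v0_2_48";

const appConfig = {
  model_list: [
    {
      model_id: "Llama-3-8B-Instruct-q4f32_1-MLC",
      // The compiled model library: WGSL kernels plus the runtime support,
      // packaged together as a single .wasm file.
      model_lib_url:
        modelLibURLPrefix +
        modelVersion +
        "/Llama-3-8B-Instruct-q4f32_1-ctx1k_cs1k-webgpu.wasm",
    },
  ],
};
```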
Thanks for the question! The wasm is composed of several parts, including the model's kernels (in WGSL) and runtime support (C++ code compiled into WASM).
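If you do want to change what goes into that file (for example, the context window or prefill chunk size reflected in the ctx1k_cs1k suffix), the usual route is to recompile the model library with MLC-LLM rather than modifying the JS side. Below is only a rough sketch; the exact paths, model names, and flags depend on your MLC-LLM version, so please check the docs for your setup:

```sh
# Sketch only; paths, model names, and exact flags depend on your MLC-LLM version.

# 1. Generate the chat config (quantization, conversation template, etc.)
mlc_llm gen_config ./dist/models/Llama-3-8B-Instruct \
  --quantization q4f32_1 --conv-template llama-3 \
  -o ./dist/Llama-3-8B-Instruct-q4f32_1-MLC

# 2. Compile the model library for WebGPU; this is the step where TVM generates
#    the WGSL kernels and bundles them with the runtime into a single .wasm.
mlc_llm compile ./dist/Llama-3-8B-Instruct-q4f32_1-MLC/mlc-chat-config.json \
  --device webgpu \
  --overrides "context_window_size=1024;prefill_chunk_size=1024" \
  -o ./dist/libs/Llama-3-8B-Instruct-q4f32_1-ctx1k_cs1k-webgpu.wasm
```

You can then point the model_lib_url in your app config at the resulting .wasm instead of the prebuilt one.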