Default device if no gpu? #116
Hi there,

I read that WONNX can use the GPU through graphics APIs like Metal and Vulkan. Just wondering: does it default to CPU inference if there is no GPU?

Thanks

Comments
WONNX itself requires a GPU for inference, and does not implement CPU inference itself. However, on some systems without a hardware GPU, a 'software emulated GPU' may be available (look up Lavapipe/LLVMpipe) which WONNX can use (and that would effectively be CPU inference). Additionally, the WONNX CLI can fall back to CPU inference using the tract crate.
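To make the "is there a GPU?" question concrete, here is a minimal sketch (not WONNX API) of probing for an adapter with wgpu, the GPU abstraction WONNX builds on. It assumes a wgpu 0.19-era API plus the pollster crate; names like InstanceDescriptor and RequestAdapterOptions shift slightly between wgpu releases.

```rust
// Minimal probe for a usable wgpu adapter. Assumes wgpu ~0.19 and pollster.
fn main() {
    let instance = wgpu::Instance::new(wgpu::InstanceDescriptor::default());
    let adapter = pollster::block_on(instance.request_adapter(
        &wgpu::RequestAdapterOptions {
            power_preference: wgpu::PowerPreference::HighPerformance,
            // Set to true to explicitly request a software adapter
            // such as Lavapipe/LLVMpipe, if one is installed.
            force_fallback_adapter: false,
            compatible_surface: None,
        },
    ));
    match adapter {
        Some(adapter) => {
            let info = adapter.get_info();
            // DeviceType::Cpu indicates a software-emulated adapter, i.e.
            // inference through WONNX would effectively run on the CPU.
            println!("adapter: {} ({:?})", info.name, info.device_type);
        }
        None => println!("no usable adapter; WONNX cannot run on this system"),
    }
}
```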
Hi there, I just came across this crate while looking for a Rust-native solution for inference on the web. I completely understand the focus on wgpu-driven inference here, but do you have any plans for / know of anyone who is offering a wasm/web-targeted wrapper that provides a consistent interface, defaulting to inference with wonnx if available and to CPU inference (presumably with tract) otherwise?
I just want to clarify that most modern laptops have an integrated graphics card. It might not be a powerful NVIDIA card, but there is often some integrated Intel GPU.
Yep, agreed on GPU availability - I'm not too concerned about that; it's more WebGPU itself (which has, in my experience, been pretty hard to actually use on the web due to its instability). I assumed that wonnx wouldn't work on the WebGL2 backend because of the lack of compute shader functionality - if it does actually work, then I don't need the CPU fallback.
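For reference, wgpu exposes this limitation through its downlevel capability flags; a small sketch (same wgpu-era API assumption as above) of checking whether an adapter can run compute shaders at all:

```rust
// Returns true only if the adapter supports compute shaders; the
// WebGL2 backend reports false here, which is why wonnx cannot use it.
fn supports_compute(adapter: &wgpu::Adapter) -> bool {
    adapter
        .get_downlevel_capabilities()
        .flags
        .contains(wgpu::DownlevelFlags::COMPUTE_SHADERS)
}
```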
I see! So, yes, even if a fallback to WebGL2 were to exist, it would probably not work, as that backend lacks compute shader support. I had a big discussion on another tract thread (#104) where we discussed integrations of wonnx and tract. I genuinely don't know when WebGPU will be broadly available in browsers.
As far as I am aware, WebGPU in browsers is still stuck on security hesitations / implementation of proper sandboxing. The spec (esp. regarding WGSL syntax) has matured over the past few months and is actually usable in the browser when turned on... Using tract instead of wonnx for inference is actually quite easily done (see wonnx-cli, which can fall back to tract). Further integration would be useful, e.g. for being able to execute one op in wonnx and fall back to tract for another (in case it hasn't been implemented). Such refactoring could also allow other backends (ORT comes to mind, but it could also make sense to implement a WebGL2 fallback for some often-used ops if there is demand).
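As an illustration of the tract route, a minimal sketch assuming a recent tract-onnx release; the model path and input shape are placeholders, and exact tensor types vary between tract versions:

```rust
use tract_onnx::prelude::*;

fn main() -> TractResult<()> {
    // Load, optimize, and make the ONNX model runnable on the CPU.
    let model = tract_onnx::onnx()
        .model_for_path("model.onnx")?
        .into_optimized()?
        .into_runnable()?;

    // Placeholder input: a 1x3x224x224 f32 tensor of zeros.
    let input: Tensor =
        tract_ndarray::Array4::<f32>::zeros((1, 3, 224, 224)).into();

    let outputs = model.run(tvec!(input.into()))?;
    println!("output: {:?}", outputs[0]);
    Ok(())
}
```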
NB, I am curious whether browsers will implement a software-emulated WebGPU in the absence of a hardware GPU (e.g. based on Lavapipe?). In that case wonnx would run universally - albeit a lot slower, though I wonder how efficient GPU emulation of wonnx ops is when compared to CPU code running in e.g. WASM. It might be quite efficient!
Thought about this some more, and I'm going to walk this back: I think it's better if the fallback lives in a separate wrapper crate over wonnx and tract, rather than inside wonnx itself. That'd also make it easier to tackle WebNN and vendor-specific NN accelerators in future, too - new backends could sit behind the same interface without touching wonnx.
That's a good idea. I was planning to create some sort of wrapper anyway for the two crates before I used them in my project. |
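For what it's worth, the kind of wrapper being discussed might look something like the following hypothetical sketch; the Inferer trait, backend structs, and selection function are all made up for illustration and match neither crate's real API:

```rust
use std::collections::HashMap;

// Hypothetical common interface over the two backends (illustration only;
// neither wonnx nor tract exposes these exact types).
pub trait Inferer {
    fn infer(
        &self,
        inputs: &HashMap<String, Vec<f32>>,
    ) -> Result<HashMap<String, Vec<f32>>, Box<dyn std::error::Error>>;
}

struct WonnxBackend; // would wrap a wonnx GPU session
struct TractBackend; // would wrap a tract runnable model

impl Inferer for WonnxBackend {
    fn infer(
        &self,
        inputs: &HashMap<String, Vec<f32>>,
    ) -> Result<HashMap<String, Vec<f32>>, Box<dyn std::error::Error>> {
        // Stub: a real implementation would submit the inputs to the GPU.
        Ok(inputs.clone())
    }
}

impl Inferer for TractBackend {
    fn infer(
        &self,
        inputs: &HashMap<String, Vec<f32>>,
    ) -> Result<HashMap<String, Vec<f32>>, Box<dyn std::error::Error>> {
        // Stub: a real implementation would run the tract model on the CPU.
        Ok(inputs.clone())
    }
}

// Prefer the GPU backend when an adapter is available, else fall back.
fn select_backend(gpu_available: bool) -> Box<dyn Inferer> {
    if gpu_available {
        Box::new(WonnxBackend)
    } else {
        Box::new(TractBackend)
    }
}
```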
The wonnx-cli crate already contains a small abstraction over the wonnx and tract backends - that is how it implements the CPU fallback mentioned above.
Yeah, I saw that - that's awesome, that's already getting us quite far. Are there any plans to extract that into a crate of its own? |
No but it shouldn’t be too difficult. I’d be happy to review a PR! |