Open
Description
https://huggingface.co/blog/optimum-nvidia
The linked blog post reports up to 28x faster inference when using Optimum-NVIDIA. We should also check whether it can be used in tandem with BetterTransformer.
Metadata
Status
📋 Backlog