Hello,
The web page for Nemotron-4-340B states:
https://research.nvidia.com/publication/2024-06_nemotron-4-340b
"These models perform competitively to open access models on a wide range of evaluation benchmarks, and were sized to fit on a single DGX H100 with 8 GPUs when deployed in FP8 precision."

However, I wasn't able to deploy it. Is there a guide for weight conversion / sample configuration, for example to serve the reward model on a single 8x80GB node using FP8 precision?
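For context, a back-of-the-envelope estimate shows why FP8 is needed for a single 8x80GB node (a rough sketch that counts only the weights; KV cache, activations, and runtime overheads are ignored, so the real headroom is tighter than this suggests):

```python
# Rough weight-memory estimate for serving a 340B-parameter model.
# Assumption: only parameter storage is counted; KV cache and
# activation memory are not included in this sketch.
PARAMS = 340e9          # Nemotron-4-340B parameter count
NODE_MEMORY_GB = 8 * 80 # one node with 8x 80 GB GPUs = 640 GB total

def weight_gb(bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in GB."""
    return PARAMS * bytes_per_param / 1e9

fp8_gb = weight_gb(1.0)   # FP8:  1 byte per parameter
bf16_gb = weight_gb(2.0)  # BF16: 2 bytes per parameter

print(f"FP8 weights:  {fp8_gb:.0f} GB, fits in {NODE_MEMORY_GB} GB: {fp8_gb < NODE_MEMORY_GB}")
print(f"BF16 weights: {bf16_gb:.0f} GB, fits in {NODE_MEMORY_GB} GB: {bf16_gb < NODE_MEMORY_GB}")
```

So the weights alone are about 340 GB in FP8 (under the 640 GB node total) but about 680 GB in BF16 (over it), which matches the page's claim that FP8 is what makes single-node deployment possible.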