diff --git a/inference-dgx-cloud.md b/inference-dgx-cloud.md
index fbe7bb1e15..ca0091524d 100644
--- a/inference-dgx-cloud.md
+++ b/inference-dgx-cloud.md
@@ -8,6 +8,8 @@ authors:

 # Serverless Inference with Hugging Face and NVIDIA NIM

+> **Update:** This service is deprecated and no longer available as of April 10th, 2025. For an alternative, you should consider [Inference Providers](https://huggingface.co/docs/inference-providers/en/index)
+
 Today, we are thrilled to announce the launch of **Hugging Face** **NVIDIA NIM API (serverless)**, a new service on the Hugging Face Hub, available to Enterprise Hub organizations. This new service makes it easy to use open models with the accelerated compute platform, of [NVIDIA DGX Cloud](https://www.nvidia.com/en-us/data-center/dgx-cloud) accelerated compute platform for inference serving.

 We built this solution so that Enterprise Hub users can easily access the latest NVIDIA AI technology in a serverless way to run inference on popular Generative AI models including Llama and Mistral, using standardized APIs and a few lines of code within the[ Hugging Face Hub](https://huggingface.co/models).
@@ -25,7 +27,7 @@ NVIDIA NIM API (serverless) complements [Train on DGX Cloud](https://huggingface

 ## How it works

-Running serverless inference with Hugging Face models has never been easier. Here’s a step-by-step guide to get you started:
+Running serverless inference with Hugging Face models has never been easier. Here's a step-by-step guide to get you started:

 _Note: You need access to an Organization with a [Hugging Face Enterprise Hub](https://huggingface.co/enterprise) subscription to run Inference._

@@ -36,7 +38,7 @@ Before you begin, ensure you meet the following requirements:

 ### Create a Fine-Grained Token

-Fine-grained tokens allow users to create tokens with specific permissions for precise access control to resources and namespaces. First, go to[ Hugging Face Access Tokens](https://huggingface.co/settings/tokens) and click on “Create new Token” and select “fine-grained”.
+Fine-grained tokens allow users to create tokens with specific permissions for precise access control to resources and namespaces. First, go to[ Hugging Face Access Tokens](https://huggingface.co/settings/tokens) and click on "Create new Token" and select "fine-grained".