Commit 136da43

Add inventory of Gen AI model examples to the README (#109)
Signed-off-by: Shashank Verma <[email protected]>
1 parent bdbafce · commit 136da43

File tree

1 file changed: +13 −0 lines changed

README.md

Lines changed: 13 additions & 0 deletions
@@ -56,6 +56,19 @@ Enterprise RAG examples also support local and remote inference with [TensorRT-L
 | ------- | ----------- | ---------- | -------------------------------------------------------------------------- | --------- | ---------- | ------- | ---------------- | ------ | --------------- |
 | llama-2 | NV-Embed-QA | LlamaIndex | Chat bot, Kubernetes deployment [[README](./docs/developer-llm-operator/)] | No | No | Yes | No | Yes | Milvus |
 
+
+### Generative AI Model Examples
+
+The generative AI model examples include end-to-end steps for pre-training, customizing, aligning, and running inference on state-of-the-art generative AI models using the [NVIDIA NeMo Framework](https://github.com/NVIDIA/NeMo).
+
+| Model | Resource(s) | Framework | Description |
+| ------- | ----------- | ----------- | ----------- |
+| gemma | [Docs](./models/Gemma/), [LoRA](./models/Gemma/lora.ipynb), [SFT](./models/Gemma/sft.ipynb) | NeMo | Aligning and customizing Gemma, and exporting to TensorRT-LLM format for inference |
+| codegemma | [Docs](./models/Codegemma/), [LoRA](./models/Codegemma/lora.ipynb) | NeMo | Customizing CodeGemma, and exporting to TensorRT-LLM format for inference |
+| starcoder-2 | [LoRA](./models/StarCoder2/lora.ipynb), [Inference](./models/StarCoder2/inference.ipynb) | NeMo | Customizing StarCoder2 with the NeMo Framework, optimizing with NVIDIA TensorRT-LLM, and deploying with NVIDIA Triton Inference Server |
+| small language models (SLMs) | [Docs](./models/NeMo/slm/), [Pre-training and SFT](./models/NeMo/slm/slm_pretraining_sft.ipynb), [Eval](./models/NeMo/slm/megatron_gpt_eval_server.ipynb) | NeMo | Training, aligning, and evaluating SLMs using various techniques |
+
+
 ## Tools
 
 Example tools and tutorials to enhance LLM development and productivity when using NVIDIA RAG pipelines.
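The Gemma, CodeGemma, and StarCoder2 rows added above all describe the same end state: customize the model with NeMo, then export the resulting checkpoint to TensorRT-LLM format for inference. A minimal sketch of that export step follows. It assumes the `nemo.export.TensorRTLLM` helper available in recent NeMo Framework containers; the checkpoint path, engine directory, and `model_type` value are placeholders rather than values taken from this commit, and the exact API surface can differ between NeMo releases.

```python
# Minimal sketch (not part of this commit): export a customized NeMo checkpoint
# to a TensorRT-LLM engine, the final step the model notebooks above describe.
# Paths and model_type are placeholders; the API may vary by NeMo release.
from nemo.export import TensorRTLLM

# Directory where the built TensorRT-LLM engine will be written.
exporter = TensorRTLLM(model_dir="/workspace/gemma_trtllm_engine")

exporter.export(
    nemo_checkpoint_path="/workspace/gemma_sft.nemo",  # checkpoint produced by a LoRA/SFT notebook
    model_type="gemma",
    n_gpus=1,
)

# The engine directory can then be loaded for local inference or served with
# NVIDIA Triton Inference Server, as in the StarCoder2 inference notebook.
```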
