From 445d23f81eb1e872764e2dc3b1cb9aa56ae55ce9 Mon Sep 17 00:00:00 2001
From: Chitoku YATO <cyato@nvidia.com>
Date: Wed, 29 Nov 2023 08:59:26 -0800
Subject: [PATCH] Have a proper tutorial introduction

---
 docs/tutorial-intro.md | 63 +++++++++++++++++++++++++++++++++++++++++-
 mkdocs.yml             |  4 +--
 2 files changed, 64 insertions(+), 3 deletions(-)

diff --git a/docs/tutorial-intro.md b/docs/tutorial-intro.md
index 36931a20..8e176b18 100644
--- a/docs/tutorial-intro.md
+++ b/docs/tutorial-intro.md
@@ -1,4 +1,65 @@
-# Tutorial - Intro
+# Tutorial - Introduction
+
+## Overview
+
+Our tutorials are divided into categories based roughly on model modality: the type of data to be processed or generated.
+
+
+### Text (LLM)
+
+|      |                     |
+| :---------- | :----------------------------------- |
+| **[text-generation-webui](./tutorial_text-generation.md)** | Interact with a local AI assistant by running an LLM with oobabooga's text-generation-webui |
+| **[llamaspeak](./tutorial_llamaspeak.md)** | Talk live with Llama using Riva ASR/TTS, and chat about images with LLaVA! |
+
+### Text + Vision (VLM)
+
+Give your locally running LLM access to vision!
+
+|      |                     |
+| :---------- | :----------------------------------- |
+| **[Mini-GPT4](./tutorial_minigpt4.md)** | [Mini-GPT4](https://minigpt-4.github.io/), an open-source model that demonstrates vision-language capabilities |
+| **[LLaVA](./tutorial_llava.md)** | [Large Language and Vision Assistant](https://llava-vl.github.io/), a multimodal model that combines a vision encoder and the Vicuna LLM for general-purpose visual and language understanding |
+
+### Image Generation
+
+|      |                     |
+| :---------- | :----------------------------------- |
+| **[Stable Diffusion](./tutorial_stable-diffusion.md)** | Run AUTOMATIC1111's [`stable-diffusion-webui`](https://github.com/AUTOMATIC1111/stable-diffusion-webui) to generate images from prompts |
+| **[Stable Diffusion XL](./tutorial_stable-diffusion-xl.md)** | A newer ensemble pipeline consisting of a base model and a refiner, resulting in significantly enhanced and more detailed image generation |
+
+### Vision Transformers (ViT)
+
+|      |                     |
+| :---------- | :----------------------------------- |
+| **[EfficientViT](./tutorial_efficientvit.md)** | MIT Han Lab's [EfficientViT](https://github.com/mit-han-lab/efficientvit), Multi-Scale Linear Attention for High-Resolution Dense Prediction |
+| **[NanoSAM](./tutorial_nanosam.md)** | [NanoSAM](https://github.com/NVIDIA-AI-IOT/nanosam), a SAM model variant capable of running in real time on Jetson |
+| **[NanoOWL](./tutorial_nanoowl.md)** | [OWL-ViT](https://huggingface.co/docs/transformers/model_doc/owlvit) optimized to run in real time on Jetson with NVIDIA TensorRT |
+| **[SAM](./tutorial_sam.md)** | Meta's [SAM](https://github.com/facebookresearch/segment-anything), the Segment Anything model |
+| **[TAM](./tutorial_tam.md)** | [TAM](https://github.com/gaomingqi/Track-Anything), the Track-Anything model, an interactive tool for video object tracking and segmentation |
+
+### Vector Database
+
+|      |                     |
+| :---------- | :----------------------------------- |
+| **[NanoDB](./tutorial_nanodb.md)** | Interactive demo showcasing the impact of a vector database that handles multimodal data |
+
+
+### Audio
+
+|      |                     |
+| :---------- | :----------------------------------- |
+| **[AudioCraft](./tutorial_audiocraft.md)** | Meta's [AudioCraft](https://github.com/facebookresearch/audiocraft), for producing high-quality audio and music |
+| **[Whisper](./tutorial_whisper.md)** | OpenAI's [Whisper](https://github.com/openai/whisper), a pre-trained model for automatic speech recognition (ASR) |
+
+## Tips
+
+|      |                     |
+| :---------- | :----------------------------------- |
+| Knowledge Distillation | |
+| SSD + Docker | |
+| Memory optimization | |
+
 
 ## About NVIDIA Jetson
 
diff --git a/mkdocs.yml b/mkdocs.yml
index 33f7b48c..fb529a87 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -66,7 +66,7 @@ extra_css:
 nav:
   - Home: index.md
   - Tutorials:
-    - About NVIDIA Jetson: tutorial-intro.md
+    - Introduction: tutorial-intro.md
     - Text (LLM):
       - text-generation-webui: tutorial_text-generation.md 
       - llamaspeak 🆕: tutorial_llamaspeak.md
@@ -87,7 +87,7 @@ nav:
     - Vector Database:
       - NanoDB: tutorial_nanodb.md
     - Audio:
-      - Audiocraft 🆕: tutorial_audiocraft.md
+      - AudioCraft 🆕: tutorial_audiocraft.md
       - Whisper 🆕: tutorial_whisper.md
     # - Tools:
     #   - LangChain: tutorial_distillation.md