All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [0.6.0] - 2024-05-07
### Added
- Ability to switch between [API Catalog](https://build.nvidia.com/explore/discover) models and on-prem models deployed with [NIM-LLM](https://docs.nvidia.com/ai-enterprise/nim-llm/latest/index.html).
- New API endpoint
  - `/health` - Provides a health check for the chain server.
- Containerized [evaluation application](./tools/evaluation/) for RAG pipeline accuracy measurement.
- Observability support for LangChain-based examples.
- New Notebooks
  - Added [Chat with NVIDIA financial data](./notebooks/12_Chat_wtih_nvidia_financial_reports.ipynb) notebook.
  - A [simple RAG example template](https://nvidia.github.io/GenerativeAIExamples/latest/simple-examples.html) showcasing how to build an example from scratch.

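The new `/health` endpoint can be probed with a few lines of Python. This is a minimal sketch: the changelog names only the `/health` path, so the host, port, and success criterion below are assumptions, not the chain server's documented contract.

```python
import urllib.request

# Hypothetical chain-server address; only the /health path comes from the changelog.
CHAIN_SERVER_URL = "http://localhost:8081"

def is_healthy(base_url: str = CHAIN_SERVER_URL, timeout: float = 5.0) -> bool:
    """Return True if the chain server answers /health with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, DNS failure, timeout, etc. all count as unhealthy.
        return False
```

A deploy script could poll `is_healthy()` until it returns `True` before sending the first chat request.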
### Changed
- Renamed example `csv_rag` to [structured_data_rag](./RetrievalAugmentedGeneration/examples/structured_data_rag/).
- Model engine name update
  - `nv-ai-foundation` and `nv-api-catalog` LLM engines are renamed to `nvidia-ai-endpoints`.
  - `nv-ai-foundation` embedding engine is renamed to `nvidia-ai-endpoints`.
- Embedding model update
  - `developer_rag` example uses the [UAE-Large-V1](https://huggingface.co/WhereIsAI/UAE-Large-V1) embedding model.
  - API Catalog examples use `ai-embed-qa-4` instead of `nvolveqa_40k` as the embedding model.
- Ingested data now persists across multiple sessions.
- Updated `langchain-nvidia-ai-endpoints` to version 0.0.11, enabling support for models like llama3.
- File-extension-based validation now throws an error for unsupported files.
- The default output token length in the UI has been increased from 250 to 1024 for more comprehensive responses.
- Stricter chain-server API validation to enhance API security.
- Updated versions of `llama-index` and `pymilvus`.
- Updated pgvector container to `pgvector/pgvector:pg16`.
- LLM Model Updates
  - [Multiturn Chatbot](./RetrievalAugmentedGeneration/examples/multi_turn_rag/) now uses the `ai-mixtral-8x7b-instruct` model for response generation.
  - [Structured data RAG](./RetrievalAugmentedGeneration/examples/structured_data_rag/) now uses `ai-llama3-70b` for response and code generation.

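The file-extension validation change above can be sketched as follows; the allow-list and function name are hypothetical illustrations, not the chain server's actual implementation.

```python
from pathlib import Path

# Hypothetical allow-list; the real supported set is defined by the chain server.
SUPPORTED_EXTENSIONS = {".pdf", ".txt", ".md"}

def validate_upload(filename: str) -> None:
    """Raise ValueError when the file's extension is not supported."""
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported file type: {ext or 'no extension'}")
```

Rejecting files up front, before ingestion, keeps unsupported formats from producing confusing downstream parsing errors.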
## [0.5.0] - 2024-03-19
This release adds new dedicated RAG examples showcasing state-of-the-art use cases, switches to the latest [API catalog endpoints from NVIDIA](https://build.nvidia.com/explore/discover), and refactors the API interface of the chain-server. It also improves the developer experience by adding GitHub Pages-based documentation and streamlining the example deployment flow using dedicated compose files.
| mixtral_8x7b |ai-embed-qa-4| LangChain | NVIDIA API Catalog endpoints chat bot [[code](./RetrievalAugmentedGeneration/examples/nvidia_api_catalog/), [docs](https://nvidia.github.io/GenerativeAIExamples/latest/api-catalog.html)]| No | No | Yes | Yes | Milvus or pgvector |
| llama-2 |UAE-Large-V1| LlamaIndex | Canonical QA Chatbot [[code](./RetrievalAugmentedGeneration/examples/developer_rag/), [docs](https://nvidia.github.io/GenerativeAIExamples/latest/local-gpu.html)]|[Yes](https://nvidia.github.io/GenerativeAIExamples/latest/multi-gpu.html)| Yes | No | Yes | Milvus or pgvector |
| llama-2 | all-MiniLM-L6-v2 | LlamaIndex | Chat bot, GeForce, Windows [[repo](https://github.com/NVIDIA/trt-llm-rag-windows/tree/release/1.0)]| No | Yes | No | No | FAISS |
| llama-2 |ai-embed-qa-4| LangChain | Chat bot with query decomposition agent [[code](./RetrievalAugmentedGeneration/examples/query_decomposition_rag/), [docs](https://nvidia.github.io/GenerativeAIExamples/latest/query-decomposition.html)]| No | No | Yes | Yes | Milvus or pgvector |
| mixtral_8x7b |ai-embed-qa-4| LangChain | Minimalistic example: RAG with NVIDIA AI Foundation Models [[code](./examples/5_mins_rag_no_gpu/), [README](./examples/README.md#rag-in-5-minutes-example)]| No | No | Yes | Yes | FAISS |
| mixtral_8x7b<br>Deplot<br>Neva-22b |ai-embed-qa-4| Custom | Chat bot with multimodal data [[code](./RetrievalAugmentedGeneration/examples/multimodal_rag/), [docs](https://nvidia.github.io/GenerativeAIExamples/latest/multimodal-data.html)]| No | No | Yes | No | Milvus or pgvector |
| llama-2 |UAE-Large-V1| LlamaIndex | Chat bot with quantized LLM model [[docs](https://nvidia.github.io/GenerativeAIExamples/latest/quantized-llm-model.html)]| Yes | Yes | No | Yes | Milvus or pgvector |
|llama3-70b| none | PandasAI | Chat bot with structured data [[code](./RetrievalAugmentedGeneration/examples/structured_data_rag/), [docs](https://nvidia.github.io/GenerativeAIExamples/latest/structured-data.html)]| No | No | Yes | No | none |
| llama-2 |ai-embed-qa-4| LangChain | Chat bot with multi-turn conversation [[code](./RetrievalAugmentedGeneration/examples/multi_turn_rag/), [docs](https://nvidia.github.io/GenerativeAIExamples/latest/multi-turn.html)]| No | No | Yes | No | Milvus or pgvector |