A chat application with an advanced, agentic RAG system. The RAG architecture implements the Adaptive-RAG framework with multi-query translation, intelligent workflow routing, self-reflection capabilities, and web search—all while keeping your data local and private.
[Video: rag_chat_demo.mp4]
Video 1: Demonstration of the local RAG chat and how queries are routed in the adaptive workflow. The first query uses the advanced RAG pipeline, the second uses web search, and the final query is answered directly without retrieval.
Unlike a basic RAG implementation, this system features:
- Adaptive Retrieval-Augmented Generation: Dynamically chooses between vectorstore, web search, or direct generation based on query type
- Multi-Query Translation: Creates multiple semantic variations of your question for better retrieval (see the sketch after this list)
- Agentic Self-Reflection: Agents grade the system's retrievals and generations, discarding irrelevant documents and regenerating hallucinated responses
- Self-Correction: Reformulates the original query into a better one when the retrieved knowledge is insufficient
- Web Search: Searches the internet for up-to-date sources on current news
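As an illustration of the multi-query translation step, here is a minimal sketch that asks a locally served Mistral model for several rephrasings of the user's question before retrieval. The prompt wording and the `generate_query_variations` helper are illustrative assumptions, not the repository's actual code.

```python
# Minimal sketch of multi-query translation (illustrative, not the repo's exact code).
from langchain_ollama import ChatOllama
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

llm = ChatOllama(model="mistral", temperature=0)

# Ask the LLM for alternative phrasings of the same question.
prompt = ChatPromptTemplate.from_template(
    "Generate {n} different rephrasings of the following question, "
    "one per line, without numbering:\n\n{question}"
)

def generate_query_variations(question: str, n: int = 3) -> list[str]:
    """Return the original question plus up to n LLM-generated variants."""
    chain = prompt | llm | StrOutputParser()
    raw = chain.invoke({"question": question, "n": n})
    variants = [line.strip() for line in raw.splitlines() if line.strip()]
    return [question] + variants[:n]

# Each variation is sent to the retriever and the combined results are de-duplicated.
```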

Figure 1: The advanced RAG workflow schematic. The adaptive RAG system can route a question to direct generation, web search, or retrieval, and it applies agentic self-reflection for better generation.
This system implements a full LangGraph workflow with decision nodes, conditional routing, and self-correction capabilities that significantly improve over standard RAG patterns.
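To make the routing concrete, the sketch below builds a simplified LangGraph graph: a conditional entry point sends the question to web search or vectorstore retrieval, a grading node can loop back through query rewriting, and generation ends the run. The node names, stub bodies, and routing heuristic are assumptions for illustration, not the project's actual graph.

```python
# Simplified sketch of an adaptive-RAG LangGraph workflow (illustrative stubs).
from typing import TypedDict
from langgraph.graph import StateGraph, END

class GraphState(TypedDict):
    question: str
    documents: list
    generation: str

def route_question(state: GraphState) -> str:
    # The real system uses an LLM router; this stub keys off the question text.
    return "web_search" if "latest" in state["question"].lower() else "retrieve"

def retrieve(state):        return {"documents": ["<retrieved chunk>"]}  # vectorstore lookup (stub)
def web_search(state):      return {"documents": ["<web result>"]}       # Tavily search (stub)
def grade_documents(state): return {"documents": state["documents"]}     # drop irrelevant chunks (stub)
def generate(state):        return {"generation": "<answer>"}            # answer from kept context (stub)
def transform_query(state): return {"question": state["question"]}       # rewrite the question (stub)

workflow = StateGraph(GraphState)
for name, node in [("retrieve", retrieve), ("web_search", web_search),
                   ("grade_documents", grade_documents), ("generate", generate),
                   ("transform_query", transform_query)]:
    workflow.add_node(name, node)

# Entry point: route to web search or vectorstore retrieval.
workflow.set_conditional_entry_point(
    route_question, {"web_search": "web_search", "retrieve": "retrieve"}
)
workflow.add_edge("web_search", "generate")
workflow.add_edge("retrieve", "grade_documents")
# Self-correction: if no relevant documents survive grading, rewrite and retry.
workflow.add_conditional_edges(
    "grade_documents",
    lambda s: "generate" if s["documents"] else "transform_query",
    {"generate": "generate", "transform_query": "transform_query"},
)
workflow.add_edge("transform_query", "retrieve")
workflow.add_edge("generate", END)

app = workflow.compile()
# app.invoke({"question": "What is adaptive RAG?", "documents": [], "generation": ""})
```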
- Local LLM: Uses Ollama to run models like DeepSeek-R1 and Mistral locally on your machine.
- Document Quality Assessment: Automatically evaluates and filters retrieved documents for relevance (a grader sketch follows this list)
- Question Reformulation: Rewrites questions that don't yield good results to improve retrieval
- Hallucination Detection: Verifies generated responses against source documents
- Web Search Integration: Falls back to the Tavily API for real-time information
- Source Citations: Retrieval-based answers cite the retrieved document chunks for better transparency
- Document Upload: Supports PDF, TXT, and DOCX files with automatic chunking and embedding
- Persistent Conversation History: Complete chat history with SQLite backend
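To show how relevance grading and hallucination checks can work, here is a hedged LLM-as-a-judge sketch; the prompts, the JSON schema, and the `is_relevant` helper are assumptions rather than the project's actual graders.

```python
# Sketch of LLM-as-a-judge grading (illustrative prompts, not the repo's own).
import json
from langchain_ollama import ChatOllama
from langchain_core.prompts import ChatPromptTemplate

grader_llm = ChatOllama(model="mistral", temperature=0, format="json")

relevance_prompt = ChatPromptTemplate.from_template(
    "You are grading whether a document is relevant to a question.\n"
    "Document:\n{document}\n\nQuestion: {question}\n"
    'Answer with JSON: {{"relevant": "yes" or "no"}}'
)

def is_relevant(document: str, question: str) -> bool:
    """Keep only documents the grader marks as relevant."""
    reply = (relevance_prompt | grader_llm).invoke(
        {"document": document, "question": question}
    )
    return json.loads(reply.content).get("relevant") == "yes"

# A hallucination check follows the same pattern: a grader compares the generated
# answer against the retrieved chunks, and the workflow regenerates the answer
# when the grader reports it is not grounded.
```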
- Backend: FastAPI, LangChain, LangGraph, ChromaDB, SQLite
- Frontend: React + TypeScript, Tailwind CSS
- LLM Serving: Ollama (DeepSeek-R1 for chat, Mistral for RAG)
- Vector Database: ChromaDB with local persistence
- Embedding Model: Nomic AI's text embeddings running locally (see the ingestion sketch after this list)
- Web Search: Tavily API integration
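As context for how the vector database and embedding model fit together, below is a hedged ingestion sketch assuming the nomic-embed-text model is pulled through Ollama; the collection name, chunk sizes, and persistence path are illustrative, not the project's actual configuration (PDF loading also assumes pypdf is installed).

```python
# Sketch of local ingestion into ChromaDB with Nomic embeddings (illustrative values).
from langchain_ollama import OllamaEmbeddings
from langchain_chroma import Chroma
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import PyPDFLoader

# nomic-embed-text runs locally through Ollama (ollama pull nomic-embed-text).
embeddings = OllamaEmbeddings(model="nomic-embed-text")

vectorstore = Chroma(
    collection_name="rag_chat_docs",   # assumed collection name
    embedding_function=embeddings,
    persist_directory="./chroma_db",   # assumed persistence path
)

def ingest_pdf(path: str) -> None:
    """Load a PDF, split it into overlapping chunks, and embed them locally."""
    docs = PyPDFLoader(path).load()
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    vectorstore.add_documents(splitter.split_documents(docs))

retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
```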
- Python 3.9+
- Node.js 16+
- Ollama installed
- Tavily API key
- Clone the repository:
git clone https://github.com/vulong2505/RAG-Chat.git
cd rag-chat
- Create and activate a virtual environment:
cd backend
py -m venv venv  # use python3 -m venv venv on macOS/Linux
venv\Scripts\activate # On Windows
source venv/bin/activate # On macOS/Linux
- Install dependencies:
pip install -r requirements.txt
- Initialize the database:
py -m app.database.init_db
- Add your Tavily API key to .env:
# backend/.env
echo "TAVILY_API_KEY=your-api-key-here" > .env
- Start Ollama service and pull the required models:
ollama serve & # runs Ollama as a background process
ollama pull deepseek-r1:7b
ollama pull mistral
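As a quick sanity check that the pulled models are reachable before starting the backend, a small script like the following can be run; it is a hedged sketch and not part of the repository.

```python
# Verify that Ollama serves the pulled models (illustrative check).
from langchain_ollama import ChatOllama

for model in ("deepseek-r1:7b", "mistral"):
    reply = ChatOllama(model=model).invoke("Reply with the single word: ready")
    print(f"{model}: {reply.content[:40]}")
```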
- Start the backend server:
# Run backend/run.py
py run.py
- Open a new terminal and navigate to the frontend directory:
# From RAG_Chat/
cd frontend
- Install dependencies:
npm install
- Start the development server:
npm run dev
- Open your browser and visit:
http://localhost:5173
To reset the ChromaDB vectorstore and SQLite database:
# from RAG_Chat/backend/
py -m app.database.reset_db