# API Installation Options and Documentation Improvements (#507)

Merged: 4 commits, Dec 26, 2024.

Changed files:

- `.gitignore` (2 additions, 0 deletions): adds `inputs` and `rag_storage` to the ignore list.
- `README.md` (288 additions, 0 deletions): appends the sections below ("Install with API Support", "API Endpoints", "Development") to the existing README.

## Install with API Support

LightRAG provides optional API support through FastAPI servers that add RAG capabilities to existing LLM services. You can install LightRAG with API support in two ways:

### 1. Installation from PyPI

```bash
pip install "lightrag-hku[api]"
```

### 2. Installation from Source (Development)

```bash
# Clone the repository
git clone https://github.com/HKUDS/LightRAG.git

# Change to the repository directory
cd LightRAG

# Install in editable mode with API support
pip install -e ".[api]"
```
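
After either installation method, you can quickly confirm that the package and its server entry points are available. A minimal check (the `--help` flag is covered later in this section):

```bash
# Confirm the package is installed and the server entry points are on your PATH
pip show lightrag-hku
lollms-lightrag-server --help
ollama-lightrag-server --help
openai-lightrag-server --help
```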

### Prerequisites

Before running any of the servers, ensure you have the corresponding backend service running:

#### For LoLLMs Server
- LoLLMs must be running and accessible
- Default connection: http://localhost:9600
- Use `--lollms-host` if LoLLMs runs on a different host or port

#### For Ollama Server
- Ollama must be running and accessible
- Default connection: http://localhost:11434
- Use `--ollama-host` if Ollama runs on a different host or port

#### For OpenAI Server
- Requires valid OpenAI API credentials: the `OPENAI_API_KEY` environment variable must be set
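
Before starting a RAG server, it can help to confirm that the chosen backend is actually reachable. A minimal sketch, assuming the default ports above and that `curl` is available:

```bash
# LoLLMs backend (default http://localhost:9600)
curl -s -o /dev/null -w "LoLLMs HTTP status: %{http_code}\n" http://localhost:9600

# Ollama backend (default http://localhost:11434)
curl -s -o /dev/null -w "Ollama HTTP status: %{http_code}\n" http://localhost:11434

# OpenAI: the API key must be present in the environment
export OPENAI_API_KEY="sk-..."   # placeholder, use your real key
[ -n "$OPENAI_API_KEY" ] && echo "OPENAI_API_KEY is set"
```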

### Configuration Options

Each server has its own specific configuration options:

#### LoLLMs Server Options

| Parameter | Default | Description |
|-----------|---------|-------------|
| --host | 0.0.0.0 | RAG server host |
| --port | 9621 | RAG server port |
| --model | mistral-nemo:latest | LLM model name |
| --embedding-model | bge-m3:latest | Embedding model name |
| --lollms-host | http://localhost:9600 | LoLLMS backend URL |
| --working-dir | ./rag_storage | Working directory for RAG |
| --max-async | 4 | Maximum async operations |
| --max-tokens | 32768 | Maximum token size |
| --embedding-dim | 1024 | Embedding dimensions |
| --max-embed-tokens | 8192 | Maximum embedding token size |
| --input-file | ./book.txt | Initial input file |
| --log-level | INFO | Logging level |

#### Ollama Server Options

| Parameter | Default | Description |
|-----------|---------|-------------|
| --host | 0.0.0.0 | RAG server host |
| --port | 9621 | RAG server port |
| --model | mistral-nemo:latest | LLM model name |
| --embedding-model | bge-m3:latest | Embedding model name |
| --ollama-host | http://localhost:11434 | Ollama backend URL |
| --working-dir | ./rag_storage | Working directory for RAG |
| --max-async | 4 | Maximum async operations |
| --max-tokens | 32768 | Maximum token size |
| --embedding-dim | 1024 | Embedding dimensions |
| --max-embed-tokens | 8192 | Maximum embedding token size |
| --input-file | ./book.txt | Initial input file |
| --log-level | INFO | Logging level |

#### OpenAI Server Options

| Parameter | Default | Description |
|-----------|---------|-------------|
| --host | 0.0.0.0 | RAG server host |
| --port | 9621 | RAG server port |
| --model | gpt-4 | OpenAI model name |
| --embedding-model | text-embedding-3-large | OpenAI embedding model |
| --working-dir | ./rag_storage | Working directory for RAG |
| --max-tokens | 32768 | Maximum token size |
| --max-embed-tokens | 8192 | Maximum embedding token size |
| --input-dir | ./inputs | Input directory for documents |
| --log-level | INFO | Logging level |

### Example Usage

#### LoLLMs RAG Server

```bash
# Custom configuration with specific model and working directory
lollms-lightrag-server --model mistral-nemo --port 8080 --working-dir ./custom_rag

# Using specific models (ensure they are installed in your LoLLMs instance)
lollms-lightrag-server --model mistral-nemo:latest --embedding-model bge-m3 --embedding-dim 1024
```

#### Ollama RAG Server

```bash
# Custom configuration with specific model and working directory
ollama-lightrag-server --model mistral-nemo:latest --port 8080 --working-dir ./custom_rag

# Using specific models (ensure they are installed in your Ollama instance)
ollama-lightrag-server --model mistral-nemo:latest --embedding-model bge-m3 --embedding-dim 1024
```

#### OpenAI RAG Server

```bash
# Using GPT-4 with text-embedding-3-large
openai-lightrag-server --port 9624 --model gpt-4 --embedding-model text-embedding-3-large
```

**Important Notes:**
- For LoLLMs: Make sure the specified models are installed in your LoLLMs instance
- For Ollama: Make sure the specified models are installed in your Ollama instance
- For OpenAI: Ensure the `OPENAI_API_KEY` environment variable is set

For help on any server, use the `--help` flag:
```bash
lollms-lightrag-server --help
ollama-lightrag-server --help
openai-lightrag-server --help
```

Note: If you don't need the API functionality, you can install the base package without API support using:
```bash
pip install lightrag-hku
```

## API Endpoints

All servers (LoLLMs, Ollama, and OpenAI) provide the same REST API endpoints for RAG functionality.

### Query Endpoints

#### POST /query
Query the RAG system with options for different search modes.

```bash
curl -X POST "http://localhost:9621/query" \
-H "Content-Type: application/json" \
-d '{"query": "Your question here", "mode": "hybrid"}'
```
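
The `mode` field selects the retrieval strategy. LightRAG's core supports `naive`, `local`, `global`, and `hybrid` modes, so, assuming the server passes the field through unchanged, a local-only query would look like this:

```bash
# Same endpoint, different retrieval mode
curl -X POST "http://localhost:9621/query" \
     -H "Content-Type: application/json" \
     -d '{"query": "Your question here", "mode": "local"}'
```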

#### POST /query/stream
Stream responses from the RAG system.

```bash
curl -X POST "http://localhost:9621/query/stream" \
-H "Content-Type: application/json" \
-d '{"query": "Your question here", "mode": "hybrid"}'
```
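
When reading the stream from a terminal, curl's output buffering can delay the chunks; adding `-N` (`--no-buffer`) prints them as they arrive:

```bash
# Print streamed chunks as soon as they arrive
curl -N -X POST "http://localhost:9621/query/stream" \
     -H "Content-Type: application/json" \
     -d '{"query": "Your question here", "mode": "hybrid"}'
```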

### Document Management Endpoints

#### POST /documents/text
Insert text directly into the RAG system.

```bash
curl -X POST "http://localhost:9621/documents/text" \
-H "Content-Type: application/json" \
-d '{"text": "Your text content here", "description": "Optional description"}'
```
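
If the text you want to insert lives in a local file, you can let `jq` build the JSON payload instead of escaping it by hand. A sketch, assuming `jq` is installed and `notes.txt` is a plain-text file:

```bash
# Wrap the contents of notes.txt into the expected JSON body and POST it
jq -Rs '{text: ., description: "Imported from notes.txt"}' notes.txt | \
  curl -X POST "http://localhost:9621/documents/text" \
       -H "Content-Type: application/json" \
       -d @-
```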

#### POST /documents/file
Upload a single file to the RAG system.

```bash
curl -X POST "http://localhost:9621/documents/file" \
-F "file=@/path/to/your/document.txt" \
-F "description=Optional description"
```

#### POST /documents/batch
Upload multiple files at once.

```bash
curl -X POST "http://localhost:9621/documents/batch" \
-F "files=@/path/to/doc1.txt" \
-F "files=@/path/to/doc2.txt"
```
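
To batch-upload everything in a directory without typing one `-F` flag per file, you can build the argument list with a small shell loop. A sketch, assuming the documents are `.txt` files under `./docs`:

```bash
# Collect one -F flag per file, then send a single batch request
args=()
for f in ./docs/*.txt; do
  args+=(-F "files=@$f")
done
curl -X POST "http://localhost:9621/documents/batch" "${args[@]}"
```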

#### DELETE /documents
Clear all documents from the RAG system.

```bash
curl -X DELETE "http://localhost:9621/documents"
```

### Utility Endpoints

#### GET /health
Check server health and configuration.

```bash
curl "http://localhost:9621/health"
```
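
In scripts, the health endpoint is useful as a readiness gate before sending queries. A small sketch that polls until the server answers:

```bash
# Poll /health once a second until the server responds successfully
until curl -sf "http://localhost:9621/health" > /dev/null; do
  sleep 1
done
echo "RAG server is up"
```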

## Development

### Running in Development Mode

For LoLLMs:
```bash
uvicorn lollms_lightrag_server:app --reload --port 9621
```

For Ollama:
```bash
uvicorn ollama_lightrag_server:app --reload --port 9621
```

For OpenAI:
```bash
uvicorn openai_lightrag_server:app --reload --port 9621
```

### API Documentation

When any server is running, visit:
- Swagger UI: http://localhost:9621/docs
- ReDoc: http://localhost:9621/redoc
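
Because the servers are built on FastAPI, the machine-readable OpenAPI schema is also available at the framework's default path (unless the application overrides it), which is handy for generating clients:

```bash
# Fetch the raw OpenAPI schema
curl "http://localhost:9621/openapi.json"
```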

### Testing API Endpoints

You can test the API endpoints with the curl commands above or through the Swagger UI (a combined smoke test follows this list). Make sure to:
1. Start the appropriate backend service (LoLLMs, Ollama, or OpenAI)
2. Start the RAG server
3. Upload some documents using the document management endpoints
4. Query the system using the query endpoints
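
Steps 3 and 4 can be combined into a quick smoke test once the backend and the RAG server are running. A sketch, assuming the default port and a hypothetical test document `./sample.txt`:

```bash
# 1. Check the server is healthy
curl -sf "http://localhost:9621/health"

# 2. Upload a test document
curl -s -X POST "http://localhost:9621/documents/file" \
     -F "file=@./sample.txt"

# 3. Ask a question about it
curl -s -X POST "http://localhost:9621/query" \
     -H "Content-Type: application/json" \
     -d '{"query": "Summarize sample.txt", "mode": "hybrid"}'
```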

### Important Features

#### Automatic Document Vectorization
When starting any of the servers with the `--input-dir` parameter, the system will automatically:
1. Scan the specified directory for documents
2. Check for existing vectorized content in the database
3. Only vectorize new documents that aren't already in the database
4. Make all content immediately available for RAG queries

This intelligent caching mechanism:
- Prevents unnecessary re-vectorization of existing documents
- Reduces startup time for subsequent runs
- Preserves system resources
- Maintains consistency across restarts
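
Concretely, the source documents and the cached vectorized data live in separate directories on disk. A sketch of where to look after a run, assuming the default `./inputs` and `./rag_storage` paths from the tables above:

```bash
# Source documents scanned at startup (the directory passed with --input-dir)
ls ./inputs

# Vectorized data and indexes maintained by the server (--working-dir, default ./rag_storage)
ls ./rag_storage
```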

### Example Usage with `--input-dir`

#### LoLLMs RAG Server

```bash
# Start server with automatic document vectorization
# Only new documents will be vectorized, existing ones will be loaded from cache
lollms-lightrag-server --input-dir ./my_documents --port 8080
```

#### Ollama RAG Server

```bash
# Start server with automatic document vectorization
# Previously vectorized documents will be loaded from the database
ollama-lightrag-server --input-dir ./my_documents --port 8080
```

#### OpenAI RAG Server

```bash
# Start server with automatic document vectorization
# Existing documents are retrieved from cache, only new ones are processed
openai-lightrag-server --input-dir ./my_documents --port 9624
```

**Important Notes:**
- The `--input-dir` parameter enables automatic document processing at startup
- Documents already in the database are not re-vectorized
- Only new documents in the input directory will be processed
- This optimization significantly reduces startup time for subsequent runs
- The working directory (`--working-dir`) stores the vectorized documents database
