This directory contains example files for using MySQL 9's native vector capabilities with LLM Chain.
MySQL 9 introduces built-in vector support through the VECTOR data type and functions like VECTOR_COSINE_DISTANCE for similarity operations. This allows using MySQL as a vector store for RAG (Retrieval-Augmented Generation) applications without requiring a separate vector database service.
Requirements:
- PHP 8.2 or higher
- MySQL 9.0.0 or higher
- PDO PHP extension with MySQL driver
- llm-chain library
Setup with Docker:
- Ensure you have Docker and Docker Compose installed
- Start MySQL 9:
docker-compose up -d
- Wait for the MySQL server to be ready (check with docker-compose logs -f)
- Configure your .env file:
MYSQL_DSN=mysql:host=localhost;port=3306;dbname=llm_chain;charset=utf8mb4
MYSQL_USERNAME=root
MYSQL_PASSWORD=password
OPENAI_API_KEY=sk-your-openai-key
- Run the example:
php ../store-mysql-similarity-search.php
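The connection settings from the .env file end up as arguments to a PDO constructor. A minimal sketch of that step follows, assuming the variables are already loaded into an array such as $_ENV. The helper name mysqlPdoArgs is hypothetical and not part of llm-chain, and the example script may read its configuration differently.

```php
<?php

declare(strict_types=1);

/**
 * Collects PDO constructor arguments from an environment array.
 * Hypothetical helper for illustration; not part of llm-chain.
 *
 * @param array<string, string> $env e.g. $_ENV after the .env file is loaded
 * @return array{0: string, 1: string, 2: string} DSN, username, password
 */
function mysqlPdoArgs(array $env): array
{
    foreach (['MYSQL_DSN', 'MYSQL_USERNAME', 'MYSQL_PASSWORD'] as $key) {
        // This sketch treats an unset or empty value as a configuration error.
        if (!isset($env[$key]) || $env[$key] === '') {
            throw new InvalidArgumentException("Missing environment variable: {$key}");
        }
    }

    return [$env['MYSQL_DSN'], $env['MYSQL_USERNAME'], $env['MYSQL_PASSWORD']];
}

// Usage (needs a running MySQL server, so it is left commented out):
// [$dsn, $user, $pass] = mysqlPdoArgs($_ENV);
// $pdo = new PDO($dsn, $user, $pass, [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);
```

Failing early on a missing variable tends to give a clearer message than the connection error PDO would raise later.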
If you're not using Docker, you'll need to:
- Install MySQL 9
- Create a database:
CREATE DATABASE llm_chain;
- The example will automatically create the necessary table with a VECTOR column
The MySQL 9 Store implementation:
- Automatically creates a table with the necessary structure when first used
- Converts vector data from JSON to MySQL's native VECTOR type during storage
- Uses MySQL's VECTOR_COSINE_DISTANCE function for similarity search
- Converts distance scores to similarity scores (1 - distance) for compatibility with other vector stores
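To make the distance-to-similarity conversion concrete, here is a plain PHP sketch of cosine distance and the 1 - distance mapping. In the Store the distance is computed by MySQL, so the functions below (cosineDistance, distanceToSimilarity) are hypothetical illustrations of the arithmetic, not the library's code.

```php
<?php

declare(strict_types=1);

/**
 * Cosine distance between two equal-length vectors: 1 - cos(angle).
 * Illustrative only; the Store delegates this computation to MySQL.
 *
 * @param list<float> $a
 * @param list<float> $b
 */
function cosineDistance(array $a, array $b): float
{
    if ($a === [] || count($a) !== count($b)) {
        throw new InvalidArgumentException('Vectors must be non-empty and of equal length.');
    }

    $dot = 0.0;
    $normA = 0.0;
    $normB = 0.0;
    foreach ($a as $i => $value) {
        $dot += $value * $b[$i];
        $normA += $value ** 2;
        $normB += $b[$i] ** 2;
    }

    if ($normA === 0.0 || $normB === 0.0) {
        throw new InvalidArgumentException('Cosine distance is undefined for a zero vector.');
    }

    return 1.0 - $dot / (sqrt($normA) * sqrt($normB));
}

/**
 * Maps a cosine distance in [0, 2] to a similarity in [-1, 1].
 */
function distanceToSimilarity(float $distance): float
{
    return 1.0 - $distance;
}
```

With this mapping, identical directions score 1, orthogonal vectors score 0, and opposite directions score -1, so higher values mean closer matches.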
The automatically created table has the following structure:
CREATE TABLE vector_documents (
    id VARCHAR(36) PRIMARY KEY,
    vector_data JSON NOT NULL,
    metadata JSON,
    VECTOR USING vector_data(1536) -- the dimension count is configurable
);
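Before a row can be written to this table, an embedding held as a PHP array has to be serialized. The sketch below encodes a vector as a JSON array string and decodes it again. The helper names encodeVector and decodeVector are hypothetical, and whether the Store serializes this way is an assumption. MySQL 9 documents STRING_TO_VECTOR, which accepts a bracketed, comma-separated list of this shape.

```php
<?php

declare(strict_types=1);

/**
 * Encodes a numeric vector as a JSON array string such as "[0.1,0.25]".
 * Hypothetical helper for illustration; not part of llm-chain.
 *
 * @param array<int|float> $vector
 */
function encodeVector(array $vector): string
{
    foreach ($vector as $value) {
        if (!is_int($value) && !is_float($value)) {
            throw new InvalidArgumentException('Vector elements must be numeric.');
        }
    }

    // array_values() drops custom keys so the result is a JSON array, not an object.
    // JSON_THROW_ON_ERROR also rejects NAN and INF, which JSON cannot represent.
    return json_encode(array_values($vector), JSON_THROW_ON_ERROR);
}

/**
 * Decodes a JSON array string back into a list of floats.
 *
 * @return list<float>
 */
function decodeVector(string $json): array
{
    $decoded = json_decode($json, true, 512, JSON_THROW_ON_ERROR);
    if (!is_array($decoded)) {
        throw new InvalidArgumentException('Expected a JSON array.');
    }

    return array_map('floatval', array_values($decoded));
}
```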
You can customize the Store behavior through constructor parameters:
$store = new Store(
    $pdo,                 // PDO connection
    'custom_table_name',  // Custom table name (default: vector_documents)
    'embedding_vector',   // Custom vector column name (default: vector_data)
    'document_metadata',  // Custom metadata column name (default: metadata)
    [],                   // Additional options
    768,                  // Vector dimensions (default: 1536 for OpenAI)
    5                     // Default query result limit (default: 3)
);
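The dimensions argument has to match the output size of the embedding model in use. A small guard like the one below can catch a mismatch before a vector reaches the database. The function assertDimensions is a hypothetical helper, and whether the Store performs a similar check itself is not stated here.

```php
<?php

declare(strict_types=1);

/**
 * Throws if a vector does not have the expected number of dimensions.
 * Hypothetical helper for illustration; not part of llm-chain.
 *
 * @param array<int|float> $vector
 */
function assertDimensions(array $vector, int $expected = 1536): void
{
    $actual = count($vector);
    if ($actual !== $expected) {
        throw new InvalidArgumentException(
            sprintf('Expected %d dimensions, got %d.', $expected, $actual)
        );
    }
}

// Usage: a 768-dimension embedding passes when the store is configured for 768.
assertDimensions(array_fill(0, 768, 0.0), 768);
```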
For production use:
- Consider adding indexes based on your specific query patterns
- Monitor memory usage, especially with large vector collections
- Adjust MySQL server configuration for vector operations