Semantic Search is an application that lets users perform semantic searches over textual data using the Ollama language model and Pinecone as a serverless vector database. The application extracts text from a given URL, splits it into chunks, and indexes those chunks in Pinecone for fast, efficient semantic search. Users can then ask questions about the content, and Ollama provides contextualized answers based on the indexed text.
- Extracts text from a URL
- Splits the text into chunks for indexing
- Indexes the text chunks in Pinecone for semantic searching
- Provides a user interface for asking questions related to the content
- Ollama provides contextualized answers based on the indexed text
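The chunking step above can be sketched as a simple overlapping character-window splitter. This is an illustrative sketch, not the project's actual implementation; the function name, chunk size, and overlap are assumptions.

```python
def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into overlapping character chunks for indexing.

    Overlap keeps context that straddles a chunk boundary searchable
    from both neighboring chunks. (Hypothetical helper; main.py may
    chunk by tokens or sentences instead.)
    """
    chunks = []
    start = 0
    step = chunk_size - overlap  # advance less than a full chunk to overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks
```

Each chunk would then be embedded and upserted into the Pinecone index, so a query only has to match a small, focused span of the source page rather than the whole document.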
- Clone the repository:

  git clone https://github.com/yourusername/semantic-search.git
- Install the required dependencies:

  pip install -r requirements.txt
- Set up environment variables:

  - Set your Pinecone API key:

    export PINECONE_API_KEY=your_pinecone_api_key

  - Set your Cohere API key:

    export COHERE_API_KEY=your_cohere_api_key
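A script reading these keys can fail fast with a clear message when one is missing, instead of erroring deep inside an API call. A minimal sketch (the function name is an assumption; main.py may load keys differently):

```python
import os

def load_api_keys():
    """Return the Pinecone and Cohere API keys from the environment.

    Raises RuntimeError listing any missing variable so setup problems
    surface immediately at startup. (Hypothetical helper for illustration.)
    """
    keys = {name: os.environ.get(name)
            for name in ("PINECONE_API_KEY", "COHERE_API_KEY")}
    missing = [name for name, value in keys.items() if not value]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return keys
```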
- Run the main script:

  python main.py --url <URL> --model <ollama_model>

  Replace <URL> with the URL you want to extract text from and <ollama_model> with the Ollama model you want to use.
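A command-line interface like the one above can be built with Python's standard `argparse` module. This is a sketch of what such a parser might look like, not main.py's actual code; the description and help strings are assumptions.

```python
import argparse

# Hypothetical parser mirroring the documented flags: --url and --model.
parser = argparse.ArgumentParser(description="Semantic search over a web page")
parser.add_argument("--url", required=True, help="URL to extract text from")
parser.add_argument("--model", required=True, help="Ollama model to use, e.g. mistral")

# Parsing an explicit argument list here for demonstration;
# a real script would call parser.parse_args() with no arguments
# to read sys.argv.
args = parser.parse_args(["--url", "http://example.com", "--model", "mistral"])
```

Marking both flags `required=True` makes argparse print a usage message and exit if either is omitted, which matches how the script is documented to be invoked.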
- Once the script finishes indexing the text and setting up the server, you can start asking questions. Enter your question when prompted. To exit, type bye.

Example:

  python main.py --url http://example.com --model mistral
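The interactive question loop described above can be sketched as a small function that keeps answering until the user types bye. The function and parameter names are assumptions; `answer_fn` stands in for the actual Ollama-backed answer call, which the source does not detail.

```python
def answer_loop(get_question, answer_fn):
    """Repeatedly read a question and answer it until the user types 'bye'.

    get_question: callable returning the next user input (e.g. input).
    answer_fn: callable mapping a question string to an answer string
               (a stand-in for the Ollama query; hypothetical).
    Returns the list of answers produced, which also makes the loop testable.
    """
    answers = []
    while True:
        question = get_question().strip()
        if question.lower() == "bye":  # case-insensitive exit keyword
            break
        answers.append(answer_fn(question))
    return answers
```

In the real script the loop would print each answer as it is produced; returning the answers here simply keeps the sketch self-contained.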