Chatbot with conversation-history awareness, built on Contextual RAG with Hybrid Search and Reranking, fully open source.
The goal was to build a fully functional, completely open-source Contextual RAG application with Hybrid Search and Reranking.
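The core of the pipeline is combining two rankings (semantic search and BM25) before reranking. A minimal sketch of one common way to fuse them, reciprocal rank fusion (RRF), is shown below; the function name, the toy chunk ids, and the k=60 constant are illustrative, not taken from this codebase.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of chunk ids into one fused ranking.

    Each chunk's fused score is the sum of 1 / (k + rank) over every
    list it appears in; higher fused score means higher final rank.
    """
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)


# Toy example: chunk ids ordered by embedding similarity and by BM25 score.
semantic = ["c3", "c1", "c7", "c2"]
bm25 = ["c1", "c5", "c3", "c7"]

fused = reciprocal_rank_fusion([semantic, bm25])
print(fused[:3])  # chunks ranked highly by both lists rise to the top
```

The fused top-k candidates would then go to the reranker, which rescores them against the query with a cross-encoder-style model.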
Setup
The frontend is rough, but that's intentional; this isn't a frontend project.
- Download and install Ollama.
- Download the Llama 3.1 model: in the command line, type `ollama run llama3.1`
- In the command line, type `ollama serve`
- Download Docker and set it up.
- Open Docker Desktop.
- In the command line, type: `docker run -d --name elasticsearch -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" -e "xpack.security.enabled=false" elasticsearch:8.8.0`
- Good to go! Evaluate the retrieval capabilities by running `evaluate.py`.
- Or run `app.py` and use the chatbot. I suggest trimming the codebase JSON to only a few chunks, creating a database, and uploading your PDF to test the app. Remember to change the name of the database!
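Before running `evaluate.py` or `app.py`, it can help to confirm that both services are reachable. A small sanity-check sketch, assuming the stock default ports (9200 for Elasticsearch, 11434 for Ollama); adjust the URLs if you mapped the ports differently.

```python
import urllib.request


def service_up(url, timeout=3):
    """Return True if an HTTP GET to `url` succeeds, False otherwise."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        return False


if __name__ == "__main__":
    # Default endpoints for a local Elasticsearch container and Ollama server.
    print("Elasticsearch:", service_up("http://localhost:9200"))
    print("Ollama:", service_up("http://localhost:11434"))
```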
The current code can evaluate the approach on data structured like the dataset prepared by Anthropic.
I evaluated retrieval accuracy at each stage. Pass@n measures how often the 'golden chunk' (the most relevant chunk for the query) appears within the top-n retrieved chunks (here n = 5 and n = 20).
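The metric can be sketched as follows; the chunk ids and the shape of `results` are placeholders, not this repo's actual data layout.

```python
def pass_at_n(results, n):
    """Fraction of queries whose golden chunk appears in the top-n retrieved.

    results: list of (golden_chunk_id, ranked_chunk_ids) pairs,
    with ranked_chunk_ids ordered from most to least relevant.
    """
    hits = sum(1 for golden, ranked in results if golden in ranked[:n])
    return hits / len(results)


# Toy example: 3 queries, golden chunk retrieved in the top 5 for 2 of them.
results = [
    ("doc1_c4", ["doc1_c4", "doc2_c1", "doc1_c2"]),
    ("doc3_c0", ["doc2_c5", "doc3_c0", "doc1_c9"]),
    ("doc2_c7", ["doc1_c1", "doc4_c2", "doc3_c3"]),
]
print(f"Pass@5: {pass_at_n(results, 5):.2%}")  # → Pass@5: 66.67%
```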
| Retrieval setup | Pass@5 | Pass@20 |
| --- | --- | --- |
| VectorDB (semantic search only) | 63.76% | 79.66% |
| ContextualVectorDB | 69.84% | 83.13% |
| ContextualVectorDB + BM25 (Hybrid Search) | 76.53% | 87.37% |
| ContextualVectorDB + BM25 + Reranker | 81.32% | 90.83% |
| ContextualVectorDB + BM25 + Reranker (context created by GPT-4o-mini) | 81.99% | 93.75% |
Based on Anthropic's contextual embeddings guide: https://github.com/anthropics/anthropic-cookbook/blob/main/skills/contextual-embeddings/guide.ipynb