ColRAG

ColRAG is a RAG (Retrieval-Augmented Generation) pipeline built on ColBERT via RAGatouille. It handles document indexing and retrieval, with optional reranking, so you can add retrieval-augmented generation to your projects.

🌟 Features

  • 📚 Efficient document indexing
  • 🚀 Fast and accurate retrieval with reranking as an optional parameter
  • 🔗 Seamless integration with ColBERT and RAGatouille
  • 📄 Support for multiple file formats (PDF, CSV, XLSX, DOCX, HTML, JSON, JSONL, TXT)
  • ⚙️ Customizable retrieval parameters
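Before indexing a folder, it can help to check which of its files match the formats listed above. The following is a minimal sketch using only the standard library. `find_supported_files` and `SUPPORTED_EXTENSIONS` are hypothetical names, not part of ColRAG, and the extension set is simply transcribed from the feature list; how ColRAG itself detects file formats may differ.

```python
from pathlib import Path

# Extensions transcribed from the feature list above (assumption: files
# are matched by suffix; ColRAG's own detection logic may differ).
SUPPORTED_EXTENSIONS = {
    ".pdf", ".csv", ".xlsx", ".docx", ".html", ".json", ".jsonl", ".txt",
}


def find_supported_files(directory):
    """Return sorted paths of files under `directory` with a supported suffix."""
    root = Path(directory)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )
```

Calling `find_supported_files("/path/to/your/documents")` returns a list of `Path` objects, searched recursively and matched case-insensitively on the suffix.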

🛠️ Installation

You can install ColRAG using pip:

pip install colrag --upgrade

Using Poetry (recommended)

If you're using Poetry to manage your project dependencies, add ColRAG to your project with:

poetry add colrag

Alternatively, add the following line to your pyproject.toml manually, under [tool.poetry.dependencies]:

colrag = "^0.1.0"  # Replace with the latest version

Then run:

poetry install

🚀 Quick Start

Here's a simple example to get you started:

from colrag import index_documents, retrieve_and_rerank_documents

# Index your documents
index_path = index_documents("/path/to/your/documents", "my_index")

# Retrieve documents
query = "What is the capital of France?"
results = retrieve_and_rerank_documents(index_path, query)

for result in results:
    print(f"Score: {result['score']}, Content: {result['content'][:100]}...")
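The loop above treats each result as a dictionary with `score` and `content` keys. Assuming that shape, and assuming a higher score means a better match, a small post-processing step can keep only the strongest results. `top_results` and its `min_score` and `limit` parameters are hypothetical and not part of ColRAG; the sample data is made up for illustration.

```python
def top_results(results, min_score=0.0, limit=3):
    """Keep results scoring at least `min_score`, best first, at most `limit`."""
    kept = [r for r in results if r["score"] >= min_score]
    kept.sort(key=lambda r: r["score"], reverse=True)
    return kept[:limit]


# Made-up sample data in the same shape as the Quick Start results.
sample = [
    {"score": 12.5, "content": "Paris is the capital of France."},
    {"score": 3.1, "content": "France borders Spain."},
    {"score": 18.0, "content": "The capital of France is Paris."},
]

for result in top_results(sample, min_score=5.0):
    print(f"Score: {result['score']}, Content: {result['content'][:100]}...")
```

In real use, the list returned by `retrieve_and_rerank_documents` would take the place of `sample`, provided it has the same dictionary shape.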

📖 Documentation

For more detailed information about ColRAG's features and usage, please refer to our documentation.

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for more details.

📄 License

ColRAG is released under the MIT License. See the LICENSE file for more details.

📚 Citation

If you use ColRAG in your research, please cite it as follows:

@software{colrag,
  author = {Syed Asad},
  title = {ColRAG: A RAG pipeline using ColBERT via RAGatouille},
  year = {2024},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/syedzaidi-kiwi/ColRAG.git}}
}

📬 Contact

For any questions or feedback, please open an issue on our GitHub repository.

🙏 Acknowledgements


Built with ❤️ by your username