Possibilities to Enhance Vector Store Retrieval to Minimize Token Usage #29

Open · wants to merge 1 commit into base: main
Conversation

OscarAgreda

Pull Request Title:

Enhance Vector Store Retrieval to Minimize Token Usage

Pull Request Description:

Summary

This pull request enhances the current vector store retrieval mechanism by introducing an additional method to minimize token usage when querying the language model. The new approach focuses on pre-processing and filtering relevant data locally, ensuring efficient query processing and reducing the overall token count sent to the LLM.

Key Changes

  1. Added a New Function get_results_minimized_tokens:

    • Uses Neo4jVector's similarity_search method to filter and retrieve only the data most relevant to the user's query (see the sketch after this list).
    • Constructs the context locally, reducing the size of the context passed to the language model.
  2. Updated Retrieval Mechanism:

    • Ensures that the language model receives a concise, precise context, minimizing the token count while maintaining response quality.
    • Adds source citations to the response for clarity and reference.
  3. Retry Mechanism:

    • Includes a retry mechanism to handle transient errors and keep the retrieval process robust.
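For illustration, a minimal sketch of what get_results_minimized_tokens could look like, assuming a LangChain Neo4jVector index and OpenAI embedding/chat models. The connection settings, environment variable names, index name, and prompt wording are assumptions for this example, not code taken from the PR:

# Sketch only: illustrates local filtering via similarity_search, compact context
# construction with source citations, and a simple retry loop for transient errors.
import os
import time

from langchain_community.vectorstores import Neo4jVector
from langchain_openai import ChatOpenAI, OpenAIEmbeddings


def get_results_minimized_tokens(question: str, k: int = 4, max_retries: int = 3) -> str:
    """Retrieve only the top-k relevant chunks locally, then query the LLM with a compact context."""
    vector_store = Neo4jVector.from_existing_index(
        embedding=OpenAIEmbeddings(),
        url=os.environ["NEO4J_URI"],            # assumed environment variables
        username=os.environ["NEO4J_USERNAME"],
        password=os.environ["NEO4J_PASSWORD"],
        index_name="vector",                    # assumed index name
    )

    for attempt in range(max_retries):
        try:
            # Filter locally: keep only the documents most similar to the question.
            docs = vector_store.similarity_search(question, k=k)

            # Build a compact context and collect source citations from document metadata.
            context = "\n\n".join(doc.page_content for doc in docs)
            sources = {doc.metadata.get("source", "unknown") for doc in docs}

            prompt = (
                "Answer the question using only the context below.\n\n"
                f"Context:\n{context}\n\nQuestion: {question}"
            )
            answer = ChatOpenAI(temperature=0).invoke(prompt).content
            return f"{answer}\n\nSources: {', '.join(sorted(sources))}"
        except Exception:
            # Retry with exponential backoff on transient errors (network, rate limits).
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt)

Because only the k most similar chunks are sent to the model, the prompt stays small regardless of how large the underlying vector store is.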

Benefits

  • Reduced Token Usage: Filtering the relevant data locally before calling the model significantly reduces the number of tokens sent to the LLM, making the process more efficient.
  • Improved Performance: Enhances the speed and efficiency of generating responses by minimizing unnecessary context.
  • Robust and Reliable: The retry mechanism ensures the process is robust against transient errors.

Example Usage

# Using the vector store directly with minimized token usage
response = get_results_minimized_tokens("What are the key points in the recent SEC filing?")
print(response)

This enhancement will help in managing token limits effectively while providing accurate and concise responses based on the vector store's context.
