ValueError: Error performing search in AstraDBVectorStore: 'content' #3786
Comments
Hello, @sabatale! I'm here to help you with any bugs, questions, or contributions you have. Let's get to work!

To resolve the "ValueError: Error performing search in AstraDBVectorStore: 'content'" error, ensure that the documents retrieved from the AstraDB collection contain the "content" field. The error suggests that the code is trying to access the "content" field in the documents, but it is not present.

Here's the relevant part of the code that processes the documents:

    def _astradb_collection_to_data(self, collection):
        data = []
        data_dict = collection.find()
        if data_dict and "data" in data_dict:
            data_dict = data_dict["data"].get("documents", [])
            for item in data_dict:
                data.append(Data(content=item["content"]))
        return data

To fix this, verify that the documents in your AstraDB collection have the "content" field. If they do not, either modify the documents to include this field or adjust the code to handle documents without it.

Here is a modified version of that method:

    def _astradb_collection_to_data(self, collection):
        data = []
        data_dict = collection.find()
        if data_dict and "data" in data_dict:
            data_dict = data_dict["data"].get("documents", [])
            for item in data_dict:
                if "content" in item:
                    data.append(Data(content=item["content"]))
                else:
                    logger.warning(f"Document missing 'content' field: {item}")
        return data

This modification logs a warning when a document is missing the "content" field and skips that document instead of adding it to the returned list.
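The skip-and-warn behaviour of the snippet above can be tried in isolation. The following is a minimal sketch, not Langflow code: `Data` is a stand-in dataclass, `FakeCollection` is a hypothetical stub that returns the `{"data": {"documents": [...]}}` response shape the snippet assumes, and `collection_to_data` is a free-function rewrite of the method for illustration only.

```python
import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass
class Data:
    """Stand-in for Langflow's Data class (assumption: only `content` matters here)."""
    content: str


class FakeCollection:
    """Hypothetical stub mimicking the response shape used in the snippet above."""

    def __init__(self, documents):
        self._documents = documents

    def find(self):
        return {"data": {"documents": self._documents}}


def collection_to_data(collection):
    data = []
    response = collection.find()
    if response and "data" in response:
        for item in response["data"].get("documents", []):
            if "content" in item:
                data.append(Data(content=item["content"]))
            else:
                # Skip the document instead of raising KeyError('content').
                logger.warning("Document missing 'content' field: %s", item)
    return data


docs = [
    {"_id": "1", "content": "first chunk"},
    {"_id": "2", "text": "ingested outside Langflow"},  # hypothetical: no "content" key
]
result = collection_to_data(FakeCollection(docs))
```

With these two sample documents, only the first one ends up in `result`; the second triggers the warning and is skipped.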
Hi @sabatale, Thank you for your feedback! It would be great if you could share the ...

Regarding the issue, I tested with Ollama using JINA embeddings and the AstraDB VectorStore, and it works as expected. Could you please provide more details about your data ingestion pipeline? Is the data ingested through Langflow? For reference, I have attached a screenshot of the working flow.

Also, if the ingestion is not done through Langflow, please ensure that the Vector DB has the required fields. A sample data point from AstraDB ingested using Langflow will have the following structure:

Top-Level Keys: _id, content, $vector, metadata
Keys within metadata: source, title, language
Overall JSON Structure:

    {
      "_id": "string",
      "content": "string",
      "$vector": [number, number, ...],
      "metadata": {
        "source": "string",
        "title": "string",
        "language": "string"
      }
    }

Please ensure your data matches this structure so that it can be processed correctly. Let me know if you have any questions or need further assistance.
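A quick way to check a document against the structure above is to list which top-level keys it lacks. This is only a sketch: treating every top-level key from the sample as required is an assumption, `missing_keys` is a hypothetical helper, and the `external_style` document (with its `text` key) is invented for illustration.

```python
# Top-level keys taken from the sample structure above; treating all of them
# as required is an assumption made for this sketch.
EXPECTED_KEYS = ("_id", "content", "$vector", "metadata")


def missing_keys(document):
    """Return the expected top-level keys that are absent from `document`."""
    return [key for key in EXPECTED_KEYS if key not in document]


langflow_style = {
    "_id": "abc",
    "content": "some text",
    "$vector": [0.1, 0.2],
    "metadata": {"source": "file.txt", "title": "Example", "language": "en"},
}
# Hypothetical document written by a pipeline outside Langflow.
external_style = {"_id": "def", "text": "some text", "$vector": [0.1, 0.2]}

print(missing_keys(langflow_style))
print(missing_keys(external_style))
```

An empty list means the document has every key from the sample; a list containing "content" points at the situation reported in this issue.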
Could this information about the required DataStax schema for Langflow please be added to the documentation?
This situation can be managed if the ingest pipeline utilizes the Langflow AstraDB component. I will create a separate issue to add this to the documentation. @cystema Thank you for the feedback.
@edwinjosechittilappilly Thanks for the template! It differs from the one we get when loading data outside of Langflow, which causes the error:
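One possible way to bridge the difference, assuming the externally loaded documents keep their text under some other key, is to copy that value into "content" before the documents are written to the collection. The helper name `with_content_field`, the `text_key` parameter, and the `body` key in the example are all hypothetical; writing the adapted documents back to AstraDB is left out.

```python
def with_content_field(document, text_key):
    """Return a copy of `document` whose text is also available under "content".

    `text_key` is whatever key the external pipeline used for the text;
    it is a caller-supplied assumption, not something Langflow defines.
    """
    if "content" in document:
        return dict(document)
    if text_key not in document:
        raise KeyError(f"neither 'content' nor {text_key!r} found in document")
    adapted = dict(document)  # shallow copy, the input is left untouched
    adapted["content"] = adapted[text_key]
    return adapted


# Hypothetical externally ingested document.
external = {"_id": "1", "body": "chunk text", "$vector": [0.1, 0.2]}
adapted = with_content_field(external, text_key="body")
```

The original key is kept alongside "content" so that anything else reading the collection continues to work.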
Bug Description
Hello,
When trying to set up the RAG template with AstraDB, we get the following:
ValueError: Error performing search in AstraDBVectorStore: 'content'
It looks like it's trying to get a key that doesn't exist.
Traceback (most recent call last):
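The bare 'content' at the end of the message is consistent with a Python `KeyError` whose text was formatted into the `ValueError` message. The sketch below reproduces that pattern with a plain dict; that the component wraps the exception this way is an assumption, since the wrapping code is not shown in this issue, and the sample document is hypothetical.

```python
document = {"_id": "1", "text": "no content key here"}  # hypothetical document

try:
    document["content"]
except KeyError as exc:
    # str(KeyError("content")) is "'content'", quotes included.
    message = f"Error performing search in AstraDBVectorStore: {exc}"

print(message)
```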
Reproduction
Expected behavior
Data should be returned from AstraDB.
Who can help?
No response
Operating System
Windows
Langflow Version
Latest
Python Version
3.11
Screenshot
No response
Flow File
No response