[BUG] Cannot create 'Knowledge' #1859

Open
fbomb111 opened this issue Jan 6, 2025 · 5 comments
Labels
bug Something isn't working

Comments

fbomb111 commented Jan 6, 2025

Description

I am following the documentation here: https://docs.crewai.com/concepts/knowledge#text-file-knowledge-source

Steps to Reproduce

my_knowledge = Knowledge(
    collection_name="my_knowledge",
    sources=[source1, source2, source3]
)

I assume my sources are valid: when I point them at nonexistent files I get a file-not-found error, and when I use real files that error goes away.

I have tried both the Docling and JSON source classes.

Expected behavior

I would expect no error.

Screenshots/Code snippets

See above.

Operating System

Ubuntu 20.04

Python Version

3.11

crewAI Version

0.95.0

crewAI Tools Version

0.25.8

Virtual Environment

Poetry

Evidence

    my_knowledge = Knowledge(
                   ^^^^^^^^^^
  File "/home/myproject/.venv/lib/python3.11/site-packages/crewai/knowledge/knowledge.py", line 46, in __init__
    source.add()
  File "/home/myproject/.venv/lib/python3.11/site-packages/crewai/knowledge/source/crew_docling_source.py", line 88, in add
    self._save_documents()
  File "/home/myproject/.venv/lib/python3.11/site-packages/crewai/knowledge/source/base_knowledge_source.py", line 50, in _save_documents
    self.storage.save(self.chunks)
  File "/home/myproject/.venv/lib/python3.11/site-packages/crewai/knowledge/storage/knowledge_storage.py", line 161, in save
    self.collection.upsert(
  File "/home/myproject/.venv/lib/python3.11/site-packages/chromadb/api/models/Collection.py", line 334, in upsert
    upsert_request = self._validate_and_prepare_upsert_request(
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/myproject/.venv/lib/python3.11/site-packages/chromadb/api/models/CollectionCommon.py", line 93, in wrapper
    raise type(e)(msg).with_traceback(e.__traceback__)
          ^^^^^^^^^^^^
TypeError: APIStatusError.__init__() missing 2 required keyword-only arguments: 'response' and 'body'
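
A note on reading this traceback: the final TypeError is probably not the root cause. The last frame re-raises the caught exception as type(e)(msg), that is, it rebuilds the exception from a message alone. If the original exception's constructor needs more than a message (the error text names the keyword-only arguments 'response' and 'body'), the rebuild itself fails and masks the original error. The sketch below reproduces that pattern under stated assumptions: StatusError, call_embedding_api, rewrap and underlying_error are hypothetical names, not the real chromadb or provider code.

```python
class StatusError(Exception):
    """Hypothetical stand-in for an API error with required keyword-only arguments."""

    def __init__(self, message, *, response, body):
        super().__init__(message)
        self.response = response
        self.body = body


def call_embedding_api():
    """Pretend embedding call that fails with the provider's own error type."""
    raise StatusError("401 Unauthorized", response="<response>", body=None)


def rewrap(func):
    """Mimic the re-raise pattern seen in the traceback: type(e)(msg)."""
    try:
        func()
    except Exception as e:
        msg = f"{e} in upsert."
        # Rebuilding the exception from a message alone raises TypeError
        # when the constructor has required keyword-only arguments.
        raise type(e)(msg).with_traceback(e.__traceback__)


def underlying_error(exc):
    """Return the exception that was being handled when exc was raised, if any."""
    return exc.__context__


try:
    rewrap(call_embedding_api)
except TypeError as masked:
    print("raised:", masked)
    print("original:", underlying_error(masked))
```

In a real run the same information appears in the full traceback, above the line "During handling of the above exception, another exception occurred", so the part of the output above the excerpt posted here is likely to contain the actual API error.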

Possible Solution

None

Additional context

None. Thanks for the help!

fbomb111 added the bug (Something isn't working) label Jan 6, 2025

bhancockio (Collaborator) commented

@lorenzejay - Is this similar to the issue you were talking about yesterday?

fbomb111 (Author) commented Jan 8, 2025

Some more info, in case it helps: my AI provider is Azure OpenAI. The docs say knowledge will use the same provider the agents are configured with, but perhaps I need to pass a specific embedder to the crew? I can try this tomorrow.

If you have any ideas I'd be happy to dig in/test them out and report back. Thanks!

rupakg commented Jan 10, 2025

@fbomb111 See issue #769. You have to add an embedder when using a non-OpenAI provider.

fbomb111 (Author) commented Jan 18, 2025

@rupakg - Thanks for the reply. I added the embedder, but it makes no difference: the code doesn't fail when it gets to the crew, it fails earlier, when creating the Knowledge. Here's some pseudocode to demonstrate:

  1. create embedder config
  2. create knowledge sources, then the Knowledge from them (failure is here, at the Knowledge)
  3. create crew with embedder and knowledge

Here is how I'm creating the embedder:

import os

def _create_embedder(self):
    # Azure OpenAI settings are read from the environment
    endpoint = os.getenv("AZUREAI_API_ENDPOINT")
    api_key = os.getenv("AZUREAI_API_KEY")
    embedder_config = {
        "provider": "azure",
        "config": {
            "endpoint": endpoint,
            "api_key": api_key,
            "model": "text-embedding-3-large",
            "api_version": "2023-03-15-preview"
        }
    }
    return embedder_config
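
One thing worth ruling out at this step: os.getenv returns None when a variable is unset, so a missing AZUREAI_API_ENDPOINT or AZUREAI_API_KEY would only surface later, inside the embedding call. A minimal sketch of a fail-fast check follows. The helper names require_env and build_embedder_config are hypothetical, and the config keys are copied from the snippet above rather than verified against crewAI.

```python
import os


def require_env(name, environ=None):
    """Return the value of an environment variable, or raise if unset or empty.

    Hypothetical helper; `environ` defaults to os.environ and exists for testing.
    """
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


def build_embedder_config(environ=None):
    """Build the same dict as _create_embedder, failing early on missing settings."""
    return {
        "provider": "azure",
        "config": {
            "endpoint": require_env("AZUREAI_API_ENDPOINT", environ),
            "api_key": require_env("AZUREAI_API_KEY", environ),
            "model": "text-embedding-3-large",
            "api_version": "2023-03-15-preview",
        },
    }
```

With this in place, a missing variable produces an error that names it directly instead of an opaque failure further down the stack.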

And the knowledge:

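from crewai.knowledge.knowledge import Knowledge
from crewai.knowledge.source.crew_docling_source import CrewDoclingSource
from crewai.knowledge.source.json_knowledge_source import JSONKnowledgeSource

def _create_knowledge_sources(self):
    mission_source = CrewDoclingSource(
        file_paths=["mission_guidelines.md"]
    )

    theme_source = JSONKnowledgeSource(
        file_paths=["theme.json"]
    )
    return mission_source, theme_source

def _create_knowledge(self):
    mission_source, theme_source = self._create_knowledge_sources()
    research_source = CrewDoclingSource(
        file_paths=["research_results.md"]
    )

    # The error is raised here, when the Knowledge object is constructed
    knowledge = Knowledge(
        collection_name="myknowledge",
        sources=[research_source, mission_source, theme_source]
    )

    return knowledge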

And the crew:

embedder = self._create_embedder()
knowledge = self._create_knowledge()
crew = MyCrew(
    knowledge_sources=knowledge,
    embedder=embedder
)

It creates the sources okay, but errors when creating the Knowledge.

rupakg commented Jan 19, 2025

I have crewAI version 0.95.0.

My knowledge source is defined as follows (I have tried both .txt and .md files):

company_pr_source = TextFileKnowledgeSource(
    file_paths=[
        "test.md"
    ],
)

I have added the embedder_config to my Agent like so:

return Agent(
    config=self.agents_config['pr_specialist'],
    verbose=True,
    embedder_config={
        "provider": "ollama",
        "config": {
            "model": "nomic-embed-text"
        },
    },
    memory=True,
    knowledge_sources=[company_pr_source]  # Agent-specific knowledge
)

I am seeing an exception while creating the Knowledge collection, "Exception: Failed to create or get collection", raised at this line: https://github.com/crewAIInc/crewAI/blob/main/src/crewai/knowledge/storage/knowledge_storage.py#L107

I have tried TextFileKnowledgeSource.

The stack trace is below:

  File "<path>/src/crewai_support_agent/crew.py", line 54, in pr_specialist
    return Agent(
           ^^^^^^
  File "<path>/crewai_support_agent/.venv/lib/python3.12/site-packages/pydantic/main.py", line 214, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<path>/crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/agent.py", line 140, in post_init_setup
    self._set_knowledge()
  File "<path>/crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/agent.py", line 246, in _set_knowledge
    self._knowledge = Knowledge(
                      ^^^^^^^^^^
  File "<path>crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/knowledge/knowledge.py", line 43, in __init__
    self.storage.initialize_knowledge_storage()
  File "<path>/crewai_support_agent/.venv/lib/python3.12/site-packages/crewai/knowledge/storage/knowledge_storage.py", line 107, in initialize_knowledge_storage
    raise Exception("Failed to create or get collection")
Exception: Failed to create or get collection
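
The message "Failed to create or get collection" says nothing about why the collection could not be created. The sketch below illustrates the general pattern and is an assumption, not the actual crewAI source: a broad except that raises a fresh Exception hides the detail, while chaining with "raise ... from" keeps the cause attached. The functions get_or_create_collection, initialize_storage and root_cause are hypothetical stand-ins, and the ConnectionError is an invented example failure.

```python
def get_or_create_collection(name):
    """Hypothetical stand-in for a storage call that fails for a specific reason."""
    raise ConnectionError("embedding backend not reachable")


def initialize_storage(name):
    """Wrap the low-level failure but keep it attached as __cause__."""
    try:
        return get_or_create_collection(name)
    except Exception as exc:
        raise Exception("Failed to create or get collection") from exc


def root_cause(exc):
    """Follow __cause__ / __context__ links to the innermost exception."""
    seen = set()
    while id(exc) not in seen:
        seen.add(id(exc))
        nxt = exc.__cause__ or exc.__context__
        if nxt is None:
            break
        exc = nxt
    return exc


try:
    initialize_storage("knowledge")
except Exception as exc:
    print("reported:", exc)
    print("root cause:", repr(root_cause(exc)))
```

Even without explicit chaining, Python records the exception that was being handled as __context__, so wrapping the Agent(...) call in a try/except and printing root_cause(exc) may reveal the underlying failure.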

PS: If I try to use CrewDoclingSource as knowledge source, then I get a different error as described in #1846

If anyone can help, I would greatly appreciate it.
