Failed at attempting to add a custom data source to 01_create_ingest_documents_test_kb_multi_ds.ipynb #477

Open
jlunn4 opened this issue Feb 6, 2025 · 1 comment

Comments

jlunn4 commented Feb 6, 2025

amazon-bedrock-samples/rag/knowledge-bases/features-examples/01-rag-concepts/01_create_ingest_documents_test_kb_multi_ds.ipynb

I'm attempting to use a custom data source with the existing code. I added the following JSON entry to the data sources list:

```
{"type": "S3", "bucket_name": data_bucket_name},
{"type": "CUSTOM", "name": "jacks_ds"}  # <-- added this
```


When I run this cell:

```
knowledge_base = BedrockKnowledgeBase(
    kb_name=f'{knowledge_base_name}',
    kb_description=knowledge_base_description,
    data_sources=data_sources,
    chunking_strategy="FIXED_SIZE",
    suffix=f'{suffix}-f'
)
```

The execution fails with the error shown below.

I attempted to turn off retries by commenting out the retry annotation. I don't see a reference to ds_name in the code anywhere, so this is confusing.

---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
Cell In[9], line 1
----> 1 knowledge_base = BedrockKnowledgeBase(
      2     kb_name=f'{knowledge_base_name}',
      3     kb_description=knowledge_base_description,
      4     data_sources=data_sources,
      5     chunking_strategy = "FIXED_SIZE", 
      6     suffix = f'{suffix}-f'
      7 )

File ~/code/amazon-bedrock-samples/rag/knowledge-bases/features-examples/utils/knowledge_base.py:113, in BedrockKnowledgeBase.__init__(self, kb_name, kb_description, data_sources, multi_modal, parser, intermediate_bucket_name, lambda_function_name, embedding_model, generation_model, reranking_model, chunking_strategy, suffix)
    110 self.vector_store_name = f'bedrock-sample-rag-{self.suffix}'
    111 self.index_name = f"bedrock-sample-rag-index-{self.suffix}"
--> 113 self._setup_resources()

File ~/code/amazon-bedrock-samples/rag/knowledge-bases/features-examples/utils/knowledge_base.py:166, in BedrockKnowledgeBase._setup_resources(self)
    164 print("========================================================================================")
    165 print(f"Step 7 - Creating Knowledge Base")
--> 166 self.knowledge_base, self.data_source = self.create_knowledge_base(self.data_sources)
    167 print("========================================================================================")

File ~/code/amazon-bedrock-samples/rag/knowledge-bases/features-examples/.venv/lib/python3.10/site-packages/retrying.py:56, in retry.<locals>.wrap.<locals>.wrapped_f(*args, **kw)
     54 @six.wraps(f)
     55 def wrapped_f(*args, **kw):
---> 56     return Retrying(*dargs, **dkw).call(f, *args, **kw)

File ~/code/amazon-bedrock-samples/rag/knowledge-bases/features-examples/.venv/lib/python3.10/site-packages/retrying.py:266, in Retrying.call(self, fn, *args, **kwargs)
    263 if self.stop(attempt_number, delay_since_first_attempt_ms):
    264     if not self._wrap_exception and attempt.has_exception:
    265         # get() on an attempt with an exception should cause it to be raised, but raise just in case
--> 266         raise attempt.get()
    267     else:
    268         raise RetryError(attempt)

File ~/code/amazon-bedrock-samples/rag/knowledge-bases/features-examples/.venv/lib/python3.10/site-packages/retrying.py:301, in Attempt.get(self, wrap_exception)
    299         raise RetryError(self)
    300     else:
--> 301         six.reraise(self.value[0], self.value[1], self.value[2])
    302 else:
    303     return self.value

File ~/code/amazon-bedrock-samples/rag/knowledge-bases/features-examples/.venv/lib/python3.10/site-packages/six.py:724, in reraise(tp, value, tb)
    722     if value.__traceback__ is not tb:
    723         raise value.with_traceback(tb)
--> 724     raise value
    725 finally:
    726     value = None

File ~/code/amazon-bedrock-samples/rag/knowledge-bases/features-examples/.venv/lib/python3.10/site-packages/retrying.py:251, in Retrying.call(self, fn, *args, **kwargs)
    248     self._before_attempts(attempt_number)
    250 try:
--> 251     attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
    252 except:
    253     tb = sys.exc_info()

File ~/code/amazon-bedrock-samples/rag/knowledge-bases/features-examples/utils/knowledge_base.py:708, in BedrockKnowledgeBase.create_knowledge_base(self, data_sources)
    706     ds = get_ds_response["dataSource"]
    707     pp.pprint(ds)
--> 708 return kb, ds_list

UnboundLocalError: local variable 'ds_list' referenced before assignment


saimadib commented Feb 8, 2025

After reviewing the code, specifically the section after line 840, it turns out that the implementation only supports a fixed set of data source types. Currently, the supported types are:

S3
CONFLUENCE
SHAREPOINT
SALESFORCE
WEB

Unfortunately, there isn’t any configuration for a data source of type "CUSTOM", which is why adding it leads to an error. Please update your data source configuration to use one of the supported types listed above.
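
For anyone hitting the same traceback, here is a minimal sketch of how an unrecognized type can surface as an `UnboundLocalError`, plus an up-front check that fails with a clearer message. This is not the code in `utils/knowledge_base.py`: the function names `describe_source` and `validate_data_sources`, the `SUPPORTED_TYPES` set, and the `seed_urls` key are all hypothetical, and the branching structure is only an assumption about the kind of pattern involved.

```python
# Hypothetical illustration only; names below are not from utils/knowledge_base.py.
SUPPORTED_TYPES = {"S3", "CONFLUENCE", "SHAREPOINT", "SALESFORCE", "WEB"}


def describe_source(ds):
    """Binds `config` only inside branches for known types."""
    if ds["type"] == "S3":
        config = {"bucket": ds["bucket_name"]}
    elif ds["type"] == "WEB":
        config = {"seed_urls": ds.get("seed_urls", [])}
    # If no branch matched, `config` was never assigned, so this line
    # raises UnboundLocalError instead of a descriptive error.
    return config


def validate_data_sources(data_sources):
    """Rejects unsupported types before any resources are created."""
    unsupported = [
        ds.get("type") for ds in data_sources
        if ds.get("type") not in SUPPORTED_TYPES
    ]
    if unsupported:
        raise ValueError(
            f"Unsupported data source type(s): {unsupported}. "
            f"Supported types: {sorted(SUPPORTED_TYPES)}"
        )


if __name__ == "__main__":
    data_sources = [
        {"type": "S3", "bucket_name": "my-bucket"},
        {"type": "CUSTOM", "name": "jacks_ds"},
    ]
    try:
        validate_data_sources(data_sources)
    except ValueError as err:
        print(err)
```

Calling a check like `validate_data_sources(data_sources)` before constructing `BedrockKnowledgeBase` would report the unsupported entry immediately rather than partway through resource setup.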
