
[Bug]: Cache Needs to be inited #619

Open

theinhumaneme opened this issue Mar 22, 2024 · 7 comments

Comments

@theinhumaneme

Current Behavior

I get a stack trace (the False and True lines below are the print(cache.has_init) output from the reproduction script):

False
True
Traceback (most recent call last):
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/development/scripts/chatbot-postgres-test.py", line 129, in <module>
    execution_time = timeit.timeit(lambda: llm.invoke("Tell me a joke"), number=1)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/timeit.py", line 237, in timeit
    return Timer(stmt, setup, timer, globals).timeit(number)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/timeit.py", line 180, in timeit
    timing = self.inner(it, self.timer)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<timeit-src>", line 6, in inner
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/development/scripts/chatbot-postgres-test.py", line 129, in <lambda>
    execution_time = timeit.timeit(lambda: llm.invoke("Tell me a joke"), number=1)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 153, in invoke
    self.generate_prompt(
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 546, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 407, in generate
    raise e
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 397, in generate
    self._generate_with_cache(
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 579, in _generate_with_cache
    cache_val = llm_cache.lookup(prompt, llm_string)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_community/cache.py", line 813, in lookup
    res = get(prompt, cache_obj=_gptcache)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/gptcache/adapter/api.py", line 124, in get
    res = adapt(
          ^^^^^^
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/gptcache/adapter/adapter.py", line 33, in adapt
    raise NotInitError()
gptcache.utils.error.NotInitError: The cache should be inited before using

Expected Behavior

I should be able to use the cache normally.

Steps To Reproduce

import os
import timeit

from gptcache import Cache, cache
from gptcache.manager import get_data_manager, CacheBase, VectorBase
from gptcache.similarity_evaluation.distance import SearchDistanceEvaluation
from langchain.globals import set_llm_cache
from langchain_community.cache import GPTCache
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

os.environ["OPENAI_API_KEY"] = ""

def get_content_func(data, **_):
    return data.get("prompt").split("Question")[-1]


openai_embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
cache_base = CacheBase(
    "postgresql",
    sql_url="postgresql+psycopg2://postgres:postgres@127.0.0.1:5432/postgres",
)
vector_base = VectorBase(
    "pgvector",
    host="127.0.0.1",
    port="5432",
    user="postgres",
    password="postgres",
    dimension=1536,
)
data_manager = get_data_manager(cache_base, vector_base)

# cache.init(
#     pre_embedding_func=get_content_func,
#     embedding_func=OpenAIEmbeddings(model="text-embedding-3-small").embed_query,
#     data_manager=data_manager,
#     similarity_evaluation=SearchDistanceEvaluation(),
# )
def init_gptcache(cache_obj: Cache, llm: str):
    print(cache.has_init)
    cache.init(
        pre_embedding_func=get_content_func,
        embedding_func=OpenAIEmbeddings(model="text-embedding-3-small").embed_query,
        data_manager=data_manager,
        similarity_evaluation=SearchDistanceEvaluation(),
    )
    print(cache.has_init)


llm_model = "gpt-3.5-turbo-0125"
llm = ChatOpenAI(temperature=0, model_name=llm_model)

set_llm_cache(GPTCache(init_gptcache))

execution_time = timeit.timeit(lambda: llm.invoke("Tell me a joke"), number=1)
print(f"Execution time: {execution_time} seconds")

execution_time = timeit.timeit(lambda: llm.invoke("Tell me a joke"), number=1)
print(f"Execution time: {execution_time} seconds")

Environment

No response

Anything else?

I get this error when I use set_llm_cache() from LangChain.

It works fine when I use it normally, i.e. by calling cache.init() directly, but it fails when I try to embed my text using the OpenAI embeddings: I get an error stating that to_embeddings doesn't exist, and when I change the code in the function to embed_query I get an unexpected extra_param passed.

Thank you :D

@SimFG (Collaborator) commented Mar 22, 2024

Refer to #585 (comment). You should pass the cache_obj param to the init func, like:

def init_gptcache(cache_obj: Cache, llm: str):
    print(cache.has_init)
    cache.init(
        cache_obj=cache_obj,
        pre_embedding_func=get_content_func,
        embedding_func=OpenAIEmbeddings(model="text-embedding-3-small").embed_query,
        data_manager=data_manager,
        similarity_evaluation=SearchDistanceEvaluation(),
    )
    print(cache.has_init)
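
For context: as far as I can tell from langchain_community/cache.py, LangChain's GPTCache wrapper constructs a fresh Cache object per LLM configuration, passes it to this callback, and then runs all lookups against that object. Initializing only the module-level cache, as in the original snippet, leaves the passed-in object uninitialized, which is exactly what raises NotInitError.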

@theinhumaneme (Author)

Okay, that works, thank you, but now I get this error:

Traceback (most recent call last):
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/development/scripts/chatbot-postgres-test.py", line 129, in <module>
    execution_time = timeit.timeit(lambda: llm.invoke("Tell me a joke"), number=1)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/timeit.py", line 237, in timeit
    return Timer(stmt, setup, timer, globals).timeit(number)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/timeit.py", line 180, in timeit
    timing = self.inner(it, self.timer)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<timeit-src>", line 6, in inner
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/development/scripts/chatbot-postgres-test.py", line 129, in <lambda>
    execution_time = timeit.timeit(lambda: llm.invoke("Tell me a joke"), number=1)
                                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 153, in invoke
    self.generate_prompt(
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 546, in generate_prompt
    return self.generate(prompt_messages, stop=stop, callbacks=callbacks, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 407, in generate
    raise e
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 397, in generate
    self._generate_with_cache(
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_core/language_models/chat_models.py", line 579, in _generate_with_cache
    cache_val = llm_cache.lookup(prompt, llm_string)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/langchain_community/cache.py", line 813, in lookup
    res = get(prompt, cache_obj=_gptcache)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/gptcache/adapter/api.py", line 124, in get
    res = adapt(
          ^^^^^^
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/gptcache/adapter/adapter.py", line 78, in adapt
    embedding_data = time_cal(
                     ^^^^^^^^^
  File "/home/theinhumaneme/Documents/NebuLogic/conversation-bot/venv/lib/python3.11/site-packages/gptcache/utils/time.py", line 9, in inner
    res = func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^
TypeError: OpenAIEmbeddings.embed_query() got an unexpected keyword argument 'extra_param'

Is this because of the new OpenAI endpoints, or am I doing something wrong?

@SimFG (Collaborator) commented Mar 22, 2024

@theinhumaneme This seems to be the wrong format for the custom embedding function.
You can refer to: https://github.com/zilliztech/GPTCache/blob/main/gptcache/embedding/openai.py
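
For reference, the traceback above shows that GPTCache calls the configured embedding_func with extra keyword arguments (extra_param), and GPTCache's own embedding classes expose a to_embeddings(data, **_) method that tolerates them. A minimal, untested sketch of a signature-compatible wrapper around the LangChain client from the snippet above (whether the full integration is supported is a separate question, see below):

import numpy as np
from langchain_openai import OpenAIEmbeddings

openai_embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

def embedding_func(text: str, **_) -> np.ndarray:
    # Swallow the extra kwargs GPTCache passes (e.g. extra_param),
    # which OpenAIEmbeddings.embed_query does not accept.
    return np.array(openai_embeddings.embed_query(text))

This would be passed as embedding_func=embedding_func in cache.init(...); the returned vector's dimension must still match the VectorBase dimension (1536 here).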

@SimFG (Collaborator) commented Mar 22, 2024

@theinhumaneme Or, you can show the embed_query func; maybe I can give you some advice.

@theinhumaneme (Author) commented Mar 22, 2024

> @theinhumaneme Or, you can show the embed_query func; maybe I can give you some advice.

There is no to_embeddings function in the OpenAIEmbeddings class now; we have embed_query and embed_documents.

Here's the link:
https://api.python.langchain.com/en/latest/_modules/langchain_openai/embeddings/base.html#OpenAIEmbeddings.embed_query

@SimFG (Collaborator) commented Mar 22, 2024

@theinhumaneme
You cannot plug LangChain's embedding methods into GPTCache because their interfaces are incompatible, and GPTCache is not taken into account when LangChain is modified.
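
In other words, the supported route is GPTCache's own embedding classes. A sketch using GPTCache's bundled OpenAI embedding (the default ada-002 model here is an assumption; its 1536-dimension output happens to match the VectorBase above):

from gptcache.embedding import OpenAI

# GPTCache's own embedding class exposes to_embeddings(data, **_),
# which accepts the extra kwargs the adapter passes.
openai_embedding = OpenAI(model="text-embedding-ada-002")

def init_gptcache(cache_obj: Cache, llm: str):
    cache.init(
        cache_obj=cache_obj,
        pre_embedding_func=get_content_func,
        embedding_func=openai_embedding.to_embeddings,
        data_manager=data_manager,
        similarity_evaluation=SearchDistanceEvaluation(),
    )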

@theinhumaneme (Author)

Okay, thank you, I will look into the OpenAI library.
