pinecone.core.exceptions.PineconeException with simple examples #154

Open
sergerdn opened this issue Mar 28, 2023 · 14 comments
Labels
question Further information is requested

Comments

@sergerdn

sergerdn commented Mar 28, 2023

I created a simple script from the README to demonstrate that even a basic example does not work on my end, for reasons I cannot determine. I have tested several examples with the same result. A JavaScript version worked perfectly on my end, but I need to work with Python.
I am using Python 3.10.10 with Poetry on a Windows machine. I have spent two days trying to figure out what happened, with no luck.

import logging
import os

logging.basicConfig(level=logging.DEBUG)
from dotenv import load_dotenv

load_dotenv()

import pinecone


def main():

    pinecone.init(
        api_key=os.getenv("PINECONE_API_KEY"),
        environment=os.getenv("PINECONE_ENVIRONMENT")
    )

    index_name = "langchainjsfundamentals"
    print(pinecone.list_indexes())

    # ensure that index exists
    assert index_name in pinecone.list_indexes()

    index = pinecone.Index(index_name)  # or pinecone.GRPCIndex

    ########## ERROR IS HERE ########## 
    upsert_response = index.upsert(
        vectors=[
            ("vec1", [0.1, 0.2, 0.3, 0.4], {"genre": "drama"}),
            ("vec2", [0.2, 0.3, 0.4, 0.5], {"genre": "action"}),
        ],
        namespace="example-namespace"
    )
    ###################################

    print(upsert_response)


if __name__ == '__main__':
    main()

[tool.poetry.dependencies]
python = "^3.10"
click = "^8.1.3"
langchain = "^0.0.123"
python-dotenv = "^1.0.0"
pinecone-client = {extras = ["grpc"], version = "2.2.1"}
openai = "^0.27.2"
pypdf = "^3.7.0"
chromadb = "^0.3.13"
datasets = "^2.10.1"

....
    self._sslobj.do_handshake()
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "\example.py", line 39, in <module>
    main()
  File "example.py", line 27, in main
    upsert_response = index.upsert(
  File "\site-packages\pinecone\core\utils\error_handling.py", line 25, in inner_func
    raise PineconeProtocolError(f'Failed to connect; did you specify the correct index name?') from e
pinecone.core.exceptions.PineconeProtocolError: Failed to connect; did you specify the correct index name?
....
@rajat08
Contributor

rajat08 commented Mar 28, 2023

I would double-check the environment and the API key. Any index you create follows the URL scheme:
https://{index-name}-{project-id}.svc.{environment}.pinecone.io. This error says that the client cannot connect to this index. That happens either because the URL does not exist (the index name or environment name is wrong) or because the API key used to authenticate the connection to this URL is incorrect.

The API keys you use must be associated with the project and environment the index belongs to. If you are confident those bits of information are correct, please let us know. We can try to debug this further.
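
For readers following along, the URL scheme described above can be sketched as a small helper. This is only an illustration of the quoted scheme, not the client's actual code; the function name `index_host` and the sample values are assumptions.

```python
def index_host(index_name: str, project_id: str, environment: str) -> str:
    """Build an index URL following the scheme quoted in the comment above."""
    return f"https://{index_name}-{project_id}.svc.{environment}.pinecone.io"


# Placeholder values; substitute the ones shown in your own Pinecone console.
print(index_host("my-index", "PROJECT_ID", "us-west4-gcp"))
```

Comparing this string with the index URL shown in the console is a quick way to tell whether the index name, project ID, or environment is the part that is wrong.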

@sergerdn
Author

sergerdn commented Mar 28, 2023

You are absolutely right, but part of my script is working correctly, including:

index_name = "langchainjsfundamentals"
# printed ["langchainjsfundamentals"]
print(pinecone.list_indexes()) 

# ensure that index exists: WORKS AS I EXPECTED, print my index name
assert index_name in pinecone.list_indexes()

Therefore, I believe that I provided the correct API key and index name.

If we had some end-to-end functional tests to check the connection to the server, I would be able to determine the issue. However, currently, we don't have any. I have only found a few outdated and basic tests.

It can be difficult to determine what the client should send and how the server should reply, which is why having integration/functional tests is crucial for developers.

@sergerdn
Author

sergerdn commented Mar 28, 2023

I think I have figured out why I got that error. The host was not generated properly. Instead of https://{index-name}-{project-id}.svc.{environment}.pinecone.io, the client generated https://{index-name}-{index-name}.svc.{environment}.pinecone.io.

Code:

index = pinecone.Index(index_name)
print(index.describe_index_stats())
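
A hedged sketch of how the malformed host described above could be detected from a logged URL. It uses only the standard library; `diagnose_host` is a hypothetical helper, not part of the Pinecone client.

```python
from urllib.parse import urlparse


def diagnose_host(url: str, index_name: str, project_id: str) -> str:
    """Classify the first DNS label of an index URL taken from the debug logs."""
    label = urlparse(url).hostname.split(".")[0]
    if label == f"{index_name}-{project_id}".lower():
        return "ok"
    if label == f"{index_name}-{index_name}".lower():
        return "index name used in place of project id"
    return "unexpected host"


# Placeholder index name, project ID, and environment.
bad_url = "https://my-index-my-index.svc.us-west4-gcp.pinecone.io"
print(diagnose_host(bad_url, "my-index", "abc1234"))
```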


@gdj0nes
Contributor

gdj0nes commented Mar 28, 2023

We're happy you found a solution!

@gdj0nes gdj0nes closed this as completed Mar 28, 2023
@gdj0nes gdj0nes added the question Further information is requested label Mar 28, 2023
@sergerdn
Author

sergerdn commented Mar 28, 2023

We're happy you found a solution!

I didn't claim to have found a solution. I only worked out why I got this error; I did not say that I knew how to fix it.

But now I have found a solution. Instead of using:

pinecone.init(
        api_key=os.getenv("PINECONE_API_KEY"),
        environment=os.getenv("PINECONE_ENVIRONMENT")
    )

we should use:

pinecone.init(
    api_key=os.getenv("PINECONE_API_KEY"),
    environment=os.getenv("PINECONE_ENVIRONMENT"),
    project_name="PROJECT_ID",  # should be the project ID
)

I did not expect all of the examples to misrepresent real usage. I believe this is a bug in the library code, not in the docs.
It may happen because https://controller.us-west4-gcp.pinecone.io/actions/whoami replies with {"project_name":"PROJECT_ID","user_label":"default","user_name":"USERNAME_ID"}, which is very confusing: the field is called project_name but its value is the project ID, so index-name and project-id are handled inconsistently.

So, I believe we have a bug.

Please note that the JavaScript version of the library is working as expected and as described in the documentation.
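
To make the naming mismatch concrete, here is a minimal sketch that parses the whoami payload quoted above. The payload is copied from the comment with its placeholder values; `project_id_from_whoami` is a hypothetical helper name.

```python
import json

# Sample payload copied from the comment above; the values are placeholders.
payload = '{"project_name":"PROJECT_ID","user_label":"default","user_name":"USERNAME_ID"}'


def project_id_from_whoami(body: str) -> str:
    """Return the field the API labels `project_name`, which holds the project ID."""
    return json.loads(body)["project_name"]


print(project_id_from_whoami(payload))  # prints PROJECT_ID
```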

@gdj0nes gdj0nes reopened this Mar 28, 2023
@sergerdn
Author

@gdj0nes

I will submit a pull request that doesn't fix the bug but helps you understand what's going on better. This will also assist other people in comprehending the issue. I will do it as soon as possible.

@rajat08
Contributor

rajat08 commented Mar 28, 2023

@sergerdn, thanks for posting more info. We don't mention passing project id in the docs because the client is supposed to infer it from your API key and environment parameter. We make this call in config.py to get this.
Can you see if the response you get from calling pinecone.whoami() (after you call pinecone.init()) has the right project id? You can confirm the project id by looking at the index URL in the console.

(I am sure you have tried this but writing out so that others can follow it in the future)
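
For anyone who wants to reproduce the lookup outside the client, the following is a rough sketch of the whoami request using only the standard library. The URL pattern comes from the endpoint quoted earlier in this thread; the `Api-Key` header name and both function names are assumptions, and this is not the code in config.py.

```python
import json
import urllib.request


def build_whoami_request(api_key: str, environment: str) -> urllib.request.Request:
    """Prepare a GET request to the controller's whoami endpoint."""
    url = f"https://controller.{environment}.pinecone.io/actions/whoami"
    # Assumption: the key is sent in an `Api-Key` header.
    return urllib.request.Request(url, headers={"Api-Key": api_key})


def fetch_project_id(api_key: str, environment: str, timeout: float = 10.0) -> str:
    """Send the request and return the `project_name` field (the project ID)."""
    request = build_whoami_request(api_key, environment)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["project_name"]
```

Calling `fetch_project_id(os.environ["PINECONE_API_KEY"], os.environ["PINECONE_ENVIRONMENT"])` performs a real network request, so it is left out of the block itself.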

@sergerdn
Author

Can you see if the response you get from calling pinecone.whoami() (after you call pinecone.init()) has the right project id?

Yes, I confirm that I have seen it in both cases.

WhoAmIResponse(username='18bd562', user_label='default', projectname='5d63542')
WhoAmIResponse(username='18bd562', user_label='default', projectname='5d63542')

import logging
import os

logging.basicConfig(level=logging.DEBUG)
from dotenv import load_dotenv

load_dotenv()

import pinecone


def example_with_project_name():
    pinecone.init(
        api_key=os.getenv("PINECONE_API_KEY"),
        environment=os.getenv("PINECONE_ENVIRONMENT"),
        project_name="5d63542",  # should be the project ID
    )
    print(pinecone.whoami())


def example_not_with_project_name():
    pinecone.init(
        api_key=os.getenv("PINECONE_API_KEY"),
        environment=os.getenv("PINECONE_ENVIRONMENT"),
    )
    print(pinecone.whoami())


if __name__ == '__main__':
    example_with_project_name()
    example_not_with_project_name()

@rajat08
Contributor

rajat08 commented Mar 28, 2023


Thanks, the project_name and project_id mismatch is problematic, but we have a plan to phase it out. We have yet to do it because of some unfortunate naming mishaps (:)) in internal resources, but it'll be out soon.

As for your initial connection error, the URL should be generated correctly, because whoami seems to give you the correct answer. The only remaining source of error would then be the index name. I'll try to find out if something else is up.

@sergerdn
Author

We don't mention passing project id in the docs because the client is supposed to infer it from your API key and environment parameter.

I agree with you that we don't need to change the documentation. However, I was referring specifically to the code, not the documentation. I believe the naming in the code is confusing, particularly that the project ID is stored in a variable named projectname.

I understand that this happened because the API returns that name, but that is precisely why I opened this issue. None of the examples worked as written without delving deep into the library, and the conflicting naming conventions caused additional confusion.

I believe that, at the very least, the comments in the library code should describe what we are getting from the API. This would make it easier for other developers to understand the naming convention used and reduce confusion.

@rajat08
Contributor

rajat08 commented Mar 28, 2023


Noted, we'll update it. Appreciate your help 🙏

@sergerdn
Author

@rajat08

Using pytest-vcr (https://pytest-vcr.readthedocs.io/en/latest/) with pytest (https://docs.pytest.org/en/7.2.x/) to write functional tests can help prevent bugs efficiently. pytest-vcr lets you record API responses and replay them in later test runs, while pytest provides an excellent framework for writing and executing tests in Python.

Together, they can be very helpful in testing different scenarios without making actual API requests, saving both time and resources.

I believe that writing tests should be easy and very fun.😄
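
As a rough illustration of testing without real API requests, the sketch below replays a canned whoami response. Note that it uses `unittest.mock` from the standard library rather than pytest-vcr, so that it runs without extra dependencies. The `Api-Key` header name and the function names are assumptions, and the payload is the placeholder one quoted earlier in this thread.

```python
import io
import json
import urllib.request
from unittest import mock

# Canned response body standing in for a recorded API reply (placeholder values).
RECORDED_WHOAMI = b'{"project_name":"PROJECT_ID","user_label":"default","user_name":"USERNAME_ID"}'


def fetch_project_id(api_key: str, environment: str) -> str:
    """Call the whoami endpoint and return the `project_name` field."""
    url = f"https://controller.{environment}.pinecone.io/actions/whoami"
    request = urllib.request.Request(url, headers={"Api-Key": api_key})
    with urllib.request.urlopen(request) as response:
        return json.load(response)["project_name"]


def test_project_id_is_read_from_whoami():
    # BytesIO behaves like a response here: it has read() and works in a with-block.
    fake_response = io.BytesIO(RECORDED_WHOAMI)
    with mock.patch("urllib.request.urlopen", return_value=fake_response) as opened:
        assert fetch_project_id("dummy-key", "us-west4-gcp") == "PROJECT_ID"
    # Exactly one HTTP call was attempted, and none reached the network.
    opened.assert_called_once()
```

The test function follows pytest's `test_` naming convention, so pytest would collect it as-is. With pytest-vcr, the manual patching would be replaced by recorded cassettes.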

@rajat08
Contributor

rajat08 commented Mar 28, 2023


Thanks for the suggestion! Most of our tests for the client run in a separate private repo that orchestrates code generation, but we'll add more tests around this.

@sergerdn
Author

Most of our tests for the client run in a separate private repo

If a user is struggling to comprehend how something functions, they may require access to tests in order to gain understanding. However, if these tests are not readily available in the main repository, the only recourse may be to explore the library with a debugger for troubleshooting. Therefore, I believe it is very important to have tests in the main repository rather than keeping them private.
