ReadTimeoutError during chembl upload into Elasticsearch #3

Open
khyurri opened this issue Sep 9, 2020 · 1 comment

khyurri commented Sep 9, 2020

Exception
Uploading the chembl database into Elasticsearch fails with:
ReadTimeoutError(HTTPSConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=10))

How to reproduce

Just run:

python3 pubchem-crawler/crawl.py extract --database=elastic --elastic-no-verify-certs

and wait 50 minutes.

Uninformative stacktrace

[2020-09-09 14:13:55 [WARNING] POST https://localhost:9200/pubchem/_doc [status:N/A request:10.010s]
Traceback (most recent call last):
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 426, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 421, in _make_request
    httplib_response = conn.getresponse()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1332, in getresponse
    response.begin()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 303, in begin
    version, status, reason = self._read_status()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 264, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/socket.py", line 669, in readinto
    return self._sock.recv_into(b)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 1241, in recv_into
    return self.read(nbytes, buffer)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 1099, in read
    return self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/elasticsearch/connection/http_urllib3.py", line 245, in perform_request
    response = self.pool.urlopen(
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 726, in urlopen
    retries = retries.increment(
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/urllib3/util/retry.py", line 379, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/urllib3/packages/six.py", line 735, in reraise
    raise value
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 670, in urlopen
    httplib_response = self._make_request(
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 428, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 335, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=10)
2020-09-09 11:13:55 [INFO] File /Volumes/TOSHIBA EXT/pubchem_full/pubchem/Compound/CURRENT-Full/SDF/Compound_000000001_000500000.sdf.gz uploaded
Traceback (most recent call last):
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 426, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 421, in _make_request
    httplib_response = conn.getresponse()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 1332, in getresponse
    response.begin()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 303, in begin
    version, status, reason = self._read_status()
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/http/client.py", line 264, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/socket.py", line 669, in readinto
    return self._sock.recv_into(b)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 1241, in recv_into
    return self.read(nbytes, buffer)
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/ssl.py", line 1099, in read
    return self._sslobj.read(len, buffer)
socket.timeout: The read operation timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/elasticsearch/connection/http_urllib3.py", line 245, in perform_request
    response = self.pool.urlopen(
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 726, in urlopen
    retries = retries.increment(
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/urllib3/util/retry.py", line 379, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/urllib3/packages/six.py", line 735, in reraise
    raise value
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 670, in urlopen
    httplib_response = self._make_request(
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 428, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/urllib3/connectionpool.py", line 335, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=10)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "pubchem-crawler/crawl.py", line 295, in <module>
    args = parser.parse_args()
  File "pubchem-crawler/crawl.py", line 220, in extract
    elif arg_ns.database == 'elastic':
  File "pubchem-crawler/crawl.py", line 145, in extract
    handler(extracted)
  File "pubchem-crawler/crawl.py", line 196, in handler
    # todo get pubchem id?
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/elasticsearch/client/utils.py", line 152, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/elasticsearch/client/__init__.py", line 391, in index
    return self.transport.perform_request(
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/elasticsearch/transport.py", line 392, in perform_request
    raise e
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/elasticsearch/transport.py", line 358, in perform_request
    status, headers_response, data = connection.perform_request(
  File "/Users/ruslan_khyurri/PycharmProjects/pubchem-finder/venv/lib/python3.8/site-packages/elasticsearch/connection/http_urllib3.py", line 257, in perform_request
    raise ConnectionTimeout("TIMEOUT", str(e), e)
elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPSConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=10))
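
The trace above ends in a ConnectionTimeout raised from a single index call after the 10-second read timeout. One possible mitigation, shown here only as a minimal sketch and not as what crawl.py actually does, is to retry the failing call with exponential backoff. The names `call_with_retry`, `send`, and `flaky_send` are hypothetical; with the real client, `send` would wrap the indexing call and `retry_on` would presumably be set to `elasticsearch.exceptions.ConnectionTimeout`. The sketch uses only the standard library, so the built-in `TimeoutError` stands in for that exception.

```python
import time


def call_with_retry(send, doc, retry_on=(TimeoutError,), retries=3,
                    base_delay=1.0, sleep=time.sleep):
    """Call send(doc); on a timeout, wait and retry with exponential backoff."""
    for attempt in range(retries + 1):
        try:
            return send(doc)
        except retry_on:
            if attempt == retries:
                raise  # out of attempts: surface the original timeout
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky_send(doc):
        # Hypothetical stand-in for the real indexing call:
        # times out twice, then succeeds.
        calls["n"] += 1
        if calls["n"] < 3:
            raise TimeoutError("Read timed out. (read timeout=10)")
        return {"result": "created", "doc": doc}

    # sleep is replaced with a no-op so the demo finishes immediately.
    print(call_with_retry(flaky_send, {"cid": 1}, sleep=lambda s: None))
```

Retrying only hides transient slowness. If Elasticsearch is consistently slower than 10 seconds per request, raising the client timeout or sending documents in batches is likely the better fix.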

khyurri commented Sep 14, 2020

PR #4 should resolve this issue.
