This repository has been archived by the owner on May 23, 2024. It is now read-only.
Describe the bug
I've noticed that the first requests (around 50) to the serving server fail with HTTP 500. Once the model is loaded, subsequent requests are processed normally.
To reproduce
Run an image transform job with batching enabled on an ml.p2.xlarge instance.
Expected behavior
The TF Serving server should start receiving requests only once the model is loaded and ready.
Screenshots or logs
"POST /invocations HTTP/1.1" 500 288 "-" "Go-http-client/1.1"
ERROR:python_service:exception handling request: HTTPConnectionPool(host='localhost', port=10001): Max retries exceeded with url: /v1/models/mdl:predict (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fc2c84cab10>: Failed to establish a new connection: [Errno 111] Connection refused'))
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/urllib3/connection.py", line 160, in _new_conn
    (self._dns_host, self.port), self.timeout, **extra_kw
  File "/usr/local/lib/python3.7/site-packages/urllib3/util/connection.py", line 84, in create_connection
    raise err
  File "/usr/local/lib/python3.7/site-packages/urllib3/util/connection.py", line 74, in create_connection
    sock.connect(sa)
  File "/usr/local/lib/python3.7/site-packages/gevent/_socket3.py", line 335, in connect
    raise error(err, strerror(err))
ConnectionRefusedError: [Errno 111] Connection refused
Once the following log lines appear, requests are processed correctly (200):
Successfully loaded servable version {name: mdl version: 3}
Running gRPC ModelServer at 0.0.0.0:10000 ...
Exporting HTTP/REST API at:localhost:10001 ...
"POST /invocations HTTP/1.1" 200 40669 "-" "Go-http-client/1.1"
System information
763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.3.0-gpu-py37-cu102-ubuntu18.04
Additional context
I use this piece of code to start a job:
Our ping logic doesn't check if the model is loaded correctly. We need to fix the deep ping logic. It waits for a period of time for the model to load. What's the size of your model?
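For illustration, a readiness check along these lines could poll TF Serving's model status REST endpoint (`GET /v1/models/<name>`) and report ready only once a version reaches the `AVAILABLE` state. This is a minimal sketch, not the container's actual ping implementation: the port `10001` and model name `mdl` are taken from the logs above, and the function names, attempt count, and delay are hypothetical.

```python
import json
import time
import urllib.error
import urllib.request

# Assumed from the logs above: REST API on localhost:10001, model named "mdl".
STATUS_URL = "http://localhost:10001/v1/models/mdl"


def model_is_available(status: dict) -> bool:
    """Return True if any version in a TF Serving status payload is AVAILABLE."""
    versions = status.get("model_version_status", [])
    return any(v.get("state") == "AVAILABLE" for v in versions)


def fetch_status(url: str, timeout: float = 2.0) -> dict:
    """GET the model status; return {} while the server is not reachable yet."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, HTTP error, or non-JSON body: treat as "not ready".
        return {}


def wait_until_ready(url, fetch=fetch_status, attempts=60, delay=1.0, sleep=time.sleep):
    """Poll the status endpoint until the model is AVAILABLE or attempts run out."""
    for _ in range(attempts):
        if model_is_available(fetch(url)):
            return True
        sleep(delay)
    return False


if __name__ == "__main__":
    print("ready" if wait_until_ready(STATUS_URL) else "model did not become available")
```

A ping handler built this way would return a failure status until `model_is_available` is true, so that `/invocations` traffic is not forwarded while the REST port still refuses connections. The `fetch` and `sleep` parameters exist only to make the polling loop easy to exercise without a running server.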