-
Notifications
You must be signed in to change notification settings - Fork 104
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DO NOT MERGE] Python binding of Triton server C API #248
Conversation
// forward declaration | ||
class PyServer; | ||
|
||
class PyInferenceRequest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to breakdown this file to one class per file (e.g. PyInferenceRequest
, PyServerOptions
, ...)?
|
||
protected: | ||
TritonStruct* triton_object_{nullptr}; | ||
bool owned_{false}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering whether it is possible to do the same thing with shared_ptr
and custom deleters. For example,
class PyParameter {
...
std::shared_ptr<TRITONSERVER_Parameter, TRITONSERVER_ParameterDelete> pointer_;
};
We would have to allow COPY_AND_ASSIGN for this to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may work for objects that doesn't involve release of ownership, but for request object, user may perform the following and cause TRITONSERVER_InferenceRequest
being deleted:
req = triton_bindings.TRITONSERVER_InferenceRequest(...) # and initialize
req_2 = req
server.infer_async(req)
del req
del req_2
In current implementation of infer_async
, it informs req
that it no longer owns the Triton object, but such information is not passed to req_2
and it will delete the Triton object while it is needed for inference.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I think in Python it is just reference counting the same object so req_2
and req
should be in sync. Although, just because I am less familiar with binding code in general,I still prefer to keep DISALLOW_COPY_AND_ASSIGN
to make sure the ownership is controlled within c++ side.
This PR provides the low level binding for Python user to interact with Triton library within the same process, however, this binding is not intended for Python user to use directly but to enable the development of the Python wrapper. Python user should use the wrapper mentioned because it encapsualtes many concepts that are not intuitive to Python user but inevitably leaked into the binding.
Most part of the binding is straight forward mapping of the C API, however, the
PyResponseAllocator
,PyTrace
,PyInferenceRequest
andPyInferenceResponse
requires extra attention because they are objects that involves callbacks and ownership transfer. Also, below is the snippet from.cc
file to help reviewers navigate things that are different from the C equivalent of the binding.