Python binding of Triton server C API #265

GuanLuo · 2023-09-13T22:51:42Z

Duplicated from #248 and developed on top of it, because the original PR targets the Python wrapper branch, which is not yet finalized or merged into the main branch. Please refer to #248 for context on the discussion below.

Remaining items for this PR:

Set up the build script for the Python binding wheel file and generate the wheel file as part of the CMake build.
(server PR) Include the wheel file in the Docker image so users can install the binding. Set up CI to run the tests.

Most of the comments in #248 have been addressed, but some discussions remain unresolved and will be handled in separate PRs:

Simplify ownership management within PyXXX objects (discussion), because the ownership is fixed for most of the objects.
Split the single large source file into per-object files (discussion). This requires separating declarations from definitions, so it will be done in a separate PR to fit the timeline.
Add a CMake option to control whether the Python package is built.

New issues to be resolved in separate PRs:

Fix callback resource management for request re-use. Currently, callback resources are released as part of the internal PyTritonXXXCallback, which does not account for the fact that a request can be re-used without re-configuring its callback settings. In that case, the callback resource is no longer valid for subsequent inferences.
Explicit response management. In the Triton C API, there is an implicit dependency between the server object and the response object, which requires extra care when using the Python binding (e.g. an explicit del res in testing to make sure the server exits properly). Ideally, this dependency is handled internally, like the dependency between request and server.
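
To illustrate the first issue, here is a minimal sketch of the failure mode using pure-Python stand-ins. FakeRequest, infer, and the release_in_callback flag are hypothetical names invented for this illustration and are not part of the binding; the second half shows only one possible direction (keeping the resource for the lifetime of the request), not the fix chosen for the follow-up PR.

```python
class FakeRequest:
    """Hypothetical stand-in for an inference request wrapper."""

    def __init__(self):
        self.callback_resource = None

    def set_response_callback(self, fn):
        # The resource lives on the request until something releases it.
        self.callback_resource = {"fn": fn}


def infer(request, release_in_callback):
    """Hypothetical stand-in for an inference call that fires the callback."""
    resource = request.callback_resource
    if resource is None:
        raise RuntimeError("callback resource already released")
    resource["fn"]("response")
    if release_in_callback:
        # Models the behavior described above: the resource is freed
        # as part of the internal callback.
        request.callback_resource = None


seen = []

# Re-using a request without re-configuring the callback fails the second time.
request = FakeRequest()
request.set_response_callback(seen.append)
infer(request, release_in_callback=True)
try:
    infer(request, release_in_callback=True)
    reuse_failed = False
except RuntimeError:
    reuse_failed = True

# Assumed alternative: the request keeps the resource, so re-use works.
kept = FakeRequest()
kept.set_response_callback(seen.append)
infer(kept, release_in_callback=False)
infer(kept, release_in_callback=False)
```

In the real binding the stale resource would be a dangling native object rather than a Python None, so the symptom would likely be worse than a clean exception.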

Tabrizian

It mostly looks good. I will take a closer look at the testing/bindings in a follow-up review.


Tabrizian

Looks great overall!

It doesn't have to be part of this PR, but it would be great to have a valgrind test to make sure that the common functions (e.g., async_infer, model_load, tracing) do not leak memory.

Tabrizian · 2023-09-18T15:46:44Z

python/test/test_binding.py

+    if trace.id not in user_object:
+        user_object[trace.id] = []
+
+    # not owning trace, so must read property out


Does this mean that if the user stores the trace object, it will become invalid?

For example, if I used user_object[trace.id] = trace, would it result in a segfault if I accessed trace after returning from the callback?

In the current implementation it will be invalid, because the py::object is passed as a reference to a local variable (the wrapper of the Triton trace).

To make your example valid, we could store the trace wrapper in the trace callback resource (seen_traces) when it is first seen and remove it on release.
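
As a rough illustration of the "read the property out" pattern from the test, the sketch below uses a pure-Python stand-in. FakeTrace, its invalidate method, and activity_callback are hypothetical names invented here; the real wrapper would not raise a Python exception when misused, it would reference freed native memory.

```python
class FakeTrace:
    """Hypothetical stand-in for a non-owning trace wrapper."""

    def __init__(self, trace_id):
        self._id = trace_id
        self._valid = True

    @property
    def id(self):
        if not self._valid:
            raise RuntimeError("trace wrapper used after the callback returned")
        return self._id

    def invalidate(self):
        # Models the underlying trace going away once the callback returns.
        self._valid = False


def activity_callback(trace, activity, user_object):
    # Copy plain values out; do not keep a reference to `trace` itself.
    user_object.setdefault(trace.id, []).append(activity)


user_object = {}
trace = FakeTrace(7)
activity_callback(trace, "REQUEST_START", user_object)
activity_callback(trace, "REQUEST_END", user_object)
trace.invalidate()
```

After the callback returns, user_object holds only an int key and a list of strings, so nothing in it depends on the wrapper staying alive.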

Tabrizian · 2023-09-18T15:46:57Z

python/test/test_binding.py

+        server.infer_async(request, trace)
+
+        # [FIXME] WAR due to trace lifecycle is tied to response in Triton core,
+        # trace reference should drop on response send..


If I understand correctly, this WAR is mainly for testing purposes (i.e., to make sure that the trace object really gets deleted)?

Yes, the coupling is within the core logic, and we need to make a change there to decouple the lifecycles. This is not obvious in direct C API usage, because users typically delete the response in the response-complete callback, which triggers the trace release callback. In Python, the time of deletion is less deterministic, which can cause confusion when a trace release is expected.

Tabrizian · 2023-09-18T15:47:11Z

python/test/test_binding.py

+        self.assertEqual(len(res.output_classification_label(0, 1)), 0)
+        # [FIXME] keep alive behavior is not established between response
+        # and server, so must explicitly handle the destruction order for now.
+        del res


What would happen if the response is not deleted here?

The Triton response holds a shared pointer to the model, which prevents the server from exiting properly (it waits for the model to fully unload). Because keep_alive is not established, Python can garbage-collect the server before the response.

Approving the PR due to time constraints, but I think we should address this since it affects basic usage of the low-level Python bindings.
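
The destruction-order concern can be sketched with pure-Python stand-ins. FakeModel, FakeResponse, FakeServer, and can_exit are hypothetical names made up for this illustration, and the sketch relies on CPython's reference counting to run __del__ as soon as del res drops the last reference.

```python
class FakeModel:
    """Hypothetical stand-in for a model that counts who still holds it."""

    def __init__(self):
        self.holders = 0


class FakeResponse:
    """Hypothetical stand-in for a response holding a reference to the model."""

    def __init__(self, model):
        self._model = model
        model.holders += 1

    def __del__(self):
        # Releasing the response releases its hold on the model.
        self._model.holders -= 1


class FakeServer:
    """Hypothetical stand-in for the server object."""

    def __init__(self):
        self.model = FakeModel()

    def infer(self):
        return FakeResponse(self.model)

    def can_exit(self):
        # The server can only finish unloading once nobody holds the model.
        return self.model.holders == 0


server = FakeServer()
res = server.infer()
blocked = not server.can_exit()  # a live response still holds the model

del res  # explicit destruction order, as in the test above
clean = server.can_exit()
```

With a keep-alive relationship established inside the binding, the explicit del would not be needed, because the response could not outlive the server.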

Tabrizian · 2023-09-18T15:47:49Z

python/setup.py

+    version=VERSION,
+    author="NVIDIA Inc.",
+    author_email="[email protected]",
+    description="Python API of the Triton In-Process Server",


"Python bindings for TritonServer C-API"

I think this describes the Python package tritonserver? The Python binding is meant to be lower level and not directly exposed.


Tabrizian · 2023-09-18T15:48:51Z

python/test/test_binding.py

+        # wrap in lambda function to avoid early evaluation that raises
+        # exception before assert
+        self.assertRaises(triton_bindings.TritonError,
+                          lambda: request.correlation_id)


Is this raising an exception because request.correlation_id_string was assigned before?

Yes, the core checks the type of the stored correlation id and returns an error on mismatch.
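
The lambda is needed because a property access is evaluated before assertRaises is even called. The sketch below reproduces the pattern with stand-ins: FakeTritonError and FakeRequest are hypothetical classes written for this example and only mimic the type check described above.

```python
import unittest


class FakeTritonError(Exception):
    """Hypothetical stand-in for triton_bindings.TritonError."""


class FakeRequest:
    """Hypothetical stand-in that stores one correlation id of either type."""

    def __init__(self):
        self._correlation_id = 0

    @property
    def correlation_id(self):
        if not isinstance(self._correlation_id, int):
            raise FakeTritonError("correlation id is stored as a string")
        return self._correlation_id

    @property
    def correlation_id_string(self):
        if not isinstance(self._correlation_id, str):
            raise FakeTritonError("correlation id is stored as an integer")
        return self._correlation_id

    @correlation_id_string.setter
    def correlation_id_string(self, value):
        self._correlation_id = value


class CorrelationIdTest(unittest.TestCase):
    def test_type_mismatch(self):
        request = FakeRequest()
        request.correlation_id_string = "abc"
        # Deferred evaluation: the property is read inside assertRaises.
        self.assertRaises(FakeTritonError, lambda: request.correlation_id)
        # Equivalent form using the context manager.
        with self.assertRaises(FakeTritonError):
            request.correlation_id


suite = unittest.defaultTestLoader.loadTestsFromTestCase(CorrelationIdTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The context-manager form avoids the lambda entirely and may read more naturally when the expression under test is a plain attribute access.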


* Add Python binding for Triton server API
* Use uintptr_t for pointer. Address exception name. Interact with GIL
* Fix lifetime dependency between request and server
* Better variable name. Pass by reference to avoid extra copy construct
* Add wheel build script. Address comment
* Format
* Avoid core referring to PyBind in backend build
* Using generator expression to build on Windows
* Specify rpath for runtime library lookup. Fix bug
* Address comment. Format

GuanLuo added 4 commits September 13, 2023 15:26

Add Python binding for Triton server API

68c54b8

Use uintptr_t for pointer. Address exception name. Interact with GIL

e56a474

Fix lifetime dependency between request and server

61fed3a

Better variable name. Pass by reference to avoid extra copy construct

a038ce5

GuanLuo requested review from Tabrizian, nnshah1 and tanmayv25 September 13, 2023 22:51

Tabrizian reviewed Sep 14, 2023


GuanLuo added 4 commits September 17, 2023 16:52

Add wheel build script. Address comment

7326a11

Format

e986221

Avoid core referring to PyBind in backend build

d330b0c

Using generator expression to build on Windows

585e6d5

Tabrizian reviewed Sep 18, 2023


GuanLuo added 2 commits September 18, 2023 11:33

Specify rpath for runtime library lookup. Fix bug

65679d0

Address comment. Format

29e4b5a

GuanLuo requested a review from Tabrizian September 18, 2023 19:06

Tabrizian approved these changes Sep 18, 2023


GuanLuo merged commit e3524bb into main Sep 19, 2023

GuanLuo deleted the gluo-binding-main branch September 19, 2023 00:58
