How to access inputs in TRITONSERVER_InferenceRequest #5499
Is your feature request related to a problem? Please describe.
We would like to access the inputs we put on an inference request from inside the inference request release callback function. However, there doesn't appear to be any documentation on how to do this: in tritonserver.h we can see that TRITONSERVER_InferenceRequest is defined on line 56, but we have no idea what the struct members are, so we don't know how to access the inputs. To be specific, we want to perform memory deallocation here: Line 249 in db3d08b

Describe the solution you'd like
Some documentation on how to do the above (i.e. what the members of the TRITONSERVER_InferenceRequest struct are).

Describe alternatives you've considered
In the meantime, we have a structure that holds the locations of the request inputs, and we can perform the deallocation using that.
Replies: 2 comments 2 replies
You typically wouldn't modify that yourself. The structs are defined in the tritonserver.h header file, but they're more like aliases (opaque handles) that you pass to calls in the C API. Those calls are documented in the header file, like the request delete call here that you'd use for that memory deallocation.

You should also feel free to use the more complex servers as references for how to use the API. They are in the same folder, for example multi_server and the GRPC server, and they use that delete call and others.
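To illustrate what "more like aliases" means here, below is a minimal sketch of the opaque-handle pattern in general. Every name in it (DemoRequest, DemoRequestNew, DemoRequestAddInput, DemoRequestInputCount, DemoRequestDelete) is hypothetical and is not part of the Triton API. The point is only that callers see a forward declaration and a set of functions, never the members.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// What a caller sees in the public header: a forward declaration only.
// Without the definition, callers cannot read or write any members.
struct DemoRequest;  // hypothetical stand-in, not a Triton type

// The functions below are the only way to interact with the handle.
DemoRequest* DemoRequestNew(const char* model_name);
void DemoRequestAddInput(DemoRequest* request, const char* input_name);
std::size_t DemoRequestInputCount(const DemoRequest* request);
void DemoRequestDelete(DemoRequest* request);

// What normally lives inside the library, hidden from callers.
struct DemoRequest {
  std::string model_name;
  std::vector<std::string> input_names;
};

DemoRequest* DemoRequestNew(const char* model_name) {
  return new DemoRequest{model_name, {}};
}

void DemoRequestAddInput(DemoRequest* request, const char* input_name) {
  assert(request != nullptr);
  request->input_names.push_back(input_name);
}

std::size_t DemoRequestInputCount(const DemoRequest* request) {
  assert(request != nullptr);
  return request->input_names.size();
}

void DemoRequestDelete(DemoRequest* request) {
  delete request;  // frees only what the library itself allocated
}
```

With this design, anything the library does not expose through a function is simply unavailable to the caller, which is why inspecting the struct definition is not the intended route.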
Thanks for the response @dyastremsky! What you're saying makes total sense and I agree. The only issue is that with TRITONSERVER_InferenceRequestDelete we never define how the memory deallocation is done, so my assumption is that it isn't doing it. Or does it do the memory deallocation for us? For instance, in https://github.com/triton-inference-server/core/blob/24e2d3afd9ceb7f328cbe2b754ba5d4e144622fb/include/triton/core/tritonserver.h#L538 we define how the response memory allocation/deallocation is performed. I didn't see anything like that for the inputs.

Could you clarify whether the TRITONSERVER_InferenceRequestDelete function actually deallocates the input memory? My guess is that it doesn't, just from looking at simple.cc, since you define some custom memory deallocation there.

So my original question remains: when we get back control of the TRITONSERVER_InferenceRequest, I'd love to be able to use it to deallocate the memory allocated for the inputs (before we call TRITONSERVER_InferenceRequestDelete), but I can't do that because I don't know how to access the inputs on the TRITONSERVER_InferenceRequest object (i.e. we don't know what the members are). For reference, I did take a look at the multi_server and GRPC server code, and it's the same issue as far as I could tell.

Thanks!!
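For anyone landing here with the same question, here is a minimal sketch of the workaround mentioned in the original post: keep your own record of the input buffers and hand it to the release callback through its user pointer. It assumes the release callback has the shape (request, flags, userp) and that the user pointer is supplied when the callback is registered with TRITONSERVER_InferenceRequestSetReleaseCallback. To stay compilable without tritonserver.h, it uses hypothetical stand-ins (FakeInferenceRequest, ReleaseFn, InputBuffers, RunDemo) and calls the callback directly instead of going through Triton.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for the opaque TRITONSERVER_InferenceRequest handle.
struct FakeInferenceRequest {};

// Assumed callback shape: (request, flags, userp).
using ReleaseFn = void (*)(FakeInferenceRequest* request, std::uint32_t flags,
                           void* userp);

// Caller-owned bookkeeping for one request (hypothetical): the input buffers
// we allocated ourselves, plus a counter used only by this demo.
struct InputBuffers {
  std::vector<void*> buffers;
  int* freed_count;
};

// Release callback: frees the input buffers recorded in userp, then the
// bookkeeping struct, then the request itself.
void ReleaseCallback(FakeInferenceRequest* request, std::uint32_t /*flags*/,
                     void* userp) {
  auto* inputs = static_cast<InputBuffers*>(userp);
  for (void* buffer : inputs->buffers) {
    std::free(buffer);
    ++*inputs->freed_count;
  }
  delete inputs;
  // In real code this would be TRITONSERVER_InferenceRequestDelete(request).
  delete request;
}

// Simulates the lifecycle: allocate inputs, "register" the callback with the
// bookkeeping pointer, and invoke it as the server would on release.
int RunDemo() {
  int freed = 0;
  auto* request = new FakeInferenceRequest();
  auto* inputs = new InputBuffers{{}, &freed};
  inputs->buffers.push_back(std::malloc(64));
  inputs->buffers.push_back(std::malloc(128));

  // In real code the callback and `inputs` would be passed to
  // TRITONSERVER_InferenceRequestSetReleaseCallback instead.
  ReleaseFn callback = ReleaseCallback;
  callback(request, 0, inputs);
  return freed;
}
```

The design choice is that the request handle never needs to be inspected: ownership of the input memory stays with the caller, and the user pointer carries whatever is needed to release it. Real code should also check the flags argument before freeing anything.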