
[Question] Best practices to track inputs and predictions?  #475

Open
@FernandoDorado

Description

Hello,

I am seeking advice on the best practices for tracking all inputs and predictions made by a model when using Triton Inference Server. Specifically, I would like to track every interaction the model handles, including input data and the corresponding predictions.

I have reviewed the documentation about Triton Server Trace, but it is unclear if this feature can track predictions as well. You can find the documentation here: Triton Server Trace Documentation.

Additionally, I am concerned about the impact of tracking on system latency. I am aware that traditional ML serving platforms (such as Seldon Core) often rely on technologies like Knative and Kafka to store tracking information asynchronously, but it is not clear how these approaches can be integrated with Triton without compromising performance.
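For illustration, the kind of decoupled logging I have in mind looks roughly like the sketch below: the inference path only pays for a non-blocking enqueue, and a background thread persists the records. The `infer_fn` callable is a stand-in for the real Triton client call (e.g. something like `tritonclient.http.InferenceServerClient.infer`); all names here are hypothetical.

```python
import json
import queue
import threading

# Bounded queue so logging can never consume unbounded memory;
# the hot path only does a put_nowait and never blocks on I/O.
log_queue = queue.Queue(maxsize=10000)

def log_writer(sink):
    # Drain the queue and persist records; `None` signals shutdown.
    # `sink` could be a file, Kafka producer, etc.; a list here for demo.
    while True:
        record = log_queue.get()
        if record is None:
            break
        sink.append(json.dumps(record))

def tracked_infer(infer_fn, inputs):
    # `infer_fn` stands in for the actual Triton client call.
    outputs = infer_fn(inputs)
    try:
        log_queue.put_nowait({"inputs": inputs, "outputs": outputs})
    except queue.Full:
        pass  # drop the record rather than block the inference path
    return outputs

# Usage with a stub "model" that doubles its inputs:
sink = []
writer = threading.Thread(target=log_writer, args=(sink,))
writer.start()
result = tracked_infer(lambda x: [v * 2 for v in x], [1, 2, 3])
log_queue.put(None)
writer.join()
print(sink[0])  # {"inputs": [1, 2, 3], "outputs": [2, 4, 6]}
```

The trade-off of `put_nowait` plus a bounded queue is that records are dropped under backpressure instead of slowing inference; whether that is acceptable depends on the tracking requirements.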

I would appreciate recommendations on:

  • How to effectively track inputs and predictions in Triton Inference Server.
  • Whether Triton Server Trace can be utilized for this purpose, and if so, how.
  • Alternative methods or best practices for tracking interactions in Triton while maintaining low latency.
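Regarding the second point, my reading of the trace documentation is that there is a trace level intended to record tensors, in which case I would expect a server invocation along these lines. This is a sketch of a config fragment only: flag names, their syntax, and the `TENSORS` level may differ across Triton versions, and the paths are placeholders.

```shell
# Hypothetical invocation; exact trace flags vary across Triton versions.
tritonserver --model-repository=/models \
    --trace-file=/tmp/trace.json \
    --trace-rate=1 \
    --trace-level=TENSORS
```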

Thank you for your assistance.

Metadata

Labels

question (Further information is requested) · triaged (Issue has been triaged by maintainers)
