-
Notifications
You must be signed in to change notification settings - Fork 1.6k
[ENH]: add execution time to read/write tracing #4616
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Reviewer ChecklistPlease leverage this checklist to ensure your code review is thorough before approving Testing, Bugs, Errors, Logs, Documentation
System Compatibility
Quality
|
@@ -589,6 +597,7 @@ impl ServiceBasedFrontend { | |||
database_name, | |||
source_collection_id, | |||
target_collection_name, | |||
received_at_timestamp_ns: request_received_at_timestamp_ns, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm this feels a bit leaky to expose in all methods we call, I think we ought to think of a better abstraction/design here. Will think on this. cc @sanketkedia @Sicheng-Pan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use thread_local
storage? https://doc.rust-lang.org/std/macro.thread_local.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an async runtime, so I doubt that works since tasks can move between threads. That also feels overly magic. I think we are just in need of some encapsulation or indirection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tokio::task_local
? https://docs.rs/tokio_wasi/latest/tokio/macro.task_local.html https://docs.rs/axum/0.7.7/axum/#using-tokios-task_local-macro
Edit:
I think we are just in need of some encapsulation or indirection.
Makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -762,6 +780,7 @@ impl ServiceBasedFrontend { | |||
documents, | |||
uris, | |||
metadatas, | |||
received_at_timestamp_ns: request_received_at_timestamp_ns, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aside - I suggest the use of instant for this sort of a task - https://users.rust-lang.org/t/why-is-std-instant-opaque/74786
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with the current implementation relying on passing the start time through the request object, we need whatever type we use to be serializable via utoipa. we can opt for chrono::DateTimechrono::Utc as we figure out a better long-term metering solution.
@@ -623,6 +624,14 @@ impl ServiceBasedFrontend { | |||
database: database_name, | |||
collection_id: source_collection_id.0, | |||
latest_collection_logical_size_bytes, | |||
request_execution_time_ns: Some( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One potential option is for service based frontend to accept a partially built MeterEvent, and add the subsequent data to it and submit it. We can follow the https://cliffle.com/blog/rust-typestate/ and have partial types. This is a bit verbose so I don't love it but its an option.
rust/python_bindings/src/bindings.rs
Outdated
@@ -631,6 +676,13 @@ impl Bindings { | |||
database: String, | |||
py: Python<'_>, | |||
) -> ChromaPyResult<QueryResponse> { | |||
let request_received_at_timestamp_ms = SystemTime::now() | |||
.duration_since(UNIX_EPOCH) | |||
// We use a `QueryError` here because by convention in the service implementations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice comment
…per/latency-dashboard-api
…racking instead of u128 and don't propagate timing errors back to the client
…per/latency-dashboard-api
Add Execution Time Tracing to Read/Write Operations This PR introduces tracking of execution time for read and write operations in the Chroma database, adding a request_execution_time_ns field to relevant tracing MeterEvent variants. This field is propagated and emitted on the frontend, using chrono::DateTimechrono::Utc for timestamp tracking, with extensive plumbing to ensure all affected API requests include the request start timestamp. The changes span request/response data types, core logic in the Rust frontend implementation, and corresponding updates to Python bindings, testing infrastructure, and dependencies. Key Changes: Affected Areas: Potential Impact: Functionality: Execution time for read/write operations is now traceable on a per-request basis and included in MeterEvent telemetry, allowing for latency monitoring and eventual dashboards. Performance: Negligible runtime overhead from capturing and storing request timestamps/durations per operation. Security: No direct security impact; only tracking timing information. Scalability: Minimal impact, but adds small metadata to every traced request. No major scalability concern. Review Focus: Testing Needed• Verify that all read/write operations (Add, Update, Upsert, Delete, Fork, Query, Get, Count) correctly record Code Quality Assessmentrust/python_bindings/src/bindings.rs: Python interface updated to pass through current Utc::now, with related plumbing. rust/types/src/u128.rs: New module is clear, tested, and follows Rust/Serde conventions for (de)serializing optional u128 values as hex. rust/tracing/src/meter_event.rs: Expanded for new field, correct serialization attributes, and new tests for serialization cases. rust/frontend/src/impls/service_based_frontend.rs: Touches many methods to accept and forward timestamps; comments and reviewer feedback suggest design concerns but implementation is methodical. rust/types/src/api_types.rs: All major API requests augmented for received_at_timestamp; constructors and validation extended accordingly. Best PracticesObservability: Serialization: Testing: Dependency Management: Potential Issues• Current design requires explicit timestamp argument/field in many places, which can be error-prone or brittle long-term. This summary was automatically generated by @propel-code-bot |
Description of changes
This PR introduces execution time tracing to read and write operations on collections. We introduce a
request_execution_time_ns
field to any tracing event whose request performs a read or write operation on a collection. These are bundled and emitted on the frontend.TODO:
Test plan
- [x]
pytest
for python,- [x]
yarn test
for js- [x]
cargo test
for rustDocumentation Changes
N/A