Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(app): Add hostname label to http metrics #3258

Merged
merged 1 commit into from
Oct 15, 2024

Conversation

cratelyn
Copy link
Collaborator

@cratelyn cratelyn commented Oct 4, 2024

this commit introduces an additional label to HTTP and gRPC route
metrics, containing the DNS host of requests. requests destined for an
ip address of some kind are not recorded with a hostname metric, as a
way to help minimize the impact of time series cardinality.

the core of this change is in this field addition:

// /linkerd/app/outbound/src/http/logical/policy/route/metrics/labels.rs
-pub struct Route(pub ParentRef, pub RouteRef);
+pub struct Route {
+    parent: ParentRef,
+    route: RouteRef,
+    hostname: Option<dns::Name>,
+}

see this part of the change to our MkStreamLabel implementation, used
in our metrics tracking request durtion, and counting response status
codes:

-    fn mk_stream_labeler<B>(&self, _: &::http::Request<B>) -> Option<Self::StreamLabel> {
+    fn mk_stream_labeler<B>(&self, req: &::http::Request<B>) -> Option<Self::StreamLabel> {
         let parent = self.params.parent_ref.clone();
         let route = self.params.route_ref.clone();
-        Some(metrics::LabelHttpRsp::from(metrics::labels::Route::from((
-            parent, route,
-        ))))
+        Some(metrics::LabelHttpRsp::from(metrics::labels::Route::new(
+            parent,
+            route,
+            req.uri(),
+        )))
     }

we now inspect the request, and use the URI to label metrics related to
this traffic by hostname.

a http_request_hostnames() test case is added to exercise this.

some todo comments are left, noting where we would ideally like to
simplify or generalize the machinery related to RetryLabelExtract, the
type that bridges the labels needed based on our NewService<T> target,
and the request type later accepted by the instantiated Service<T>.

@cratelyn cratelyn self-assigned this Oct 4, 2024
@cratelyn cratelyn changed the title feat(app) Add hostname label to http metrics feat(app): Add hostname label to http metrics Oct 4, 2024
@cratelyn cratelyn force-pushed the kate/add-hostname-label-to-http-metrics branch from 471301d to 7ef06e9 Compare October 4, 2024 22:52
Copy link
Collaborator Author

@cratelyn cratelyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💬 some drafting notes...

@cratelyn

This comment was marked as resolved.

@cratelyn

This comment was marked as resolved.

@cratelyn cratelyn force-pushed the kate/add-hostname-label-to-http-metrics branch 5 times, most recently from f471077 to 1396e89 Compare October 9, 2024 17:05
Copy link
Collaborator Author

@cratelyn cratelyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔍 🍞 breadcrumbs and discussion seeds for reviewers...

linkerd/app/outbound/src/http/logical/policy/route.rs Outdated Show resolved Hide resolved
Comment on lines +76 to +77
// TODO(kate): it'd be nice if we could avoid creating a time series if it does not exist,
// so that tests can confirm that certain label sets do not exist within the family.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i've discussed this in air, but it seems like it'd be a very straightforward change to drive upstream.

linkerd/http/retry/src/lib.rs Outdated Show resolved Hide resolved
@cratelyn cratelyn requested review from olix0r and sfleen October 9, 2024 17:25
@cratelyn cratelyn marked this pull request as ready for review October 9, 2024 17:25
@cratelyn cratelyn requested a review from a team as a code owner October 9, 2024 17:25
Copy link
Collaborator

@sfleen sfleen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I like the overall structure and nothing jumps out at me as too complex given what we're already working with.

linkerd/app/outbound/src/http/logical/policy/route.rs Outdated Show resolved Hide resolved
linkerd/http/retry/src/lib.rs Outdated Show resolved Hide resolved
cratelyn added a commit that referenced this pull request Oct 10, 2024
#3258 (comment)

Co-Authored-By: Oliver Gould <[email protected]>
Signed-off-by: katelyn martin <[email protected]>
cratelyn added a commit that referenced this pull request Oct 10, 2024
these were briefly helpful for debugging, but this is a change that can
be reverted.

#3258 (comment)

Signed-off-by: katelyn martin <[email protected]>
cratelyn added a commit that referenced this pull request Oct 10, 2024
cratelyn added a commit that referenced this pull request Oct 10, 2024
#3258 (comment)

Oliver astutely pointed out that, on behalf of the blanket impl that
exists for `Fn(&T) -> P`, we can be more consistent by bounding our `X`
in terms of that, rather than as an `Fn` item.

Signed-off-by: katelyn martin <[email protected]>
@cratelyn
Copy link
Collaborator Author

pushed some commits addressing some of the low-hanging fruit, there is still some feedback to address... 🍎 ✅

cratelyn added a commit that referenced this pull request Oct 11, 2024
#3258 (comment)

this addresses the concerns linked in this thread, and discussed further
in a review conversation.

this refactors the interfaces of `HttpRoute` to make use of the
`linkerd_app_core::dns::Name` type used elsewhere in our networking
stack.

we now accept a `&http::uri::Uri` reference, provided by a request.
then, we inspect the *host* of this request, and retain it for encoding
in our prometheus metrics if it is a valid domain name.

the `Name` type is compatible with SNI, see:
<https://github.com/linkerd/linkerd2-proxy/blob/main/linkerd/dns/name/src/name.rs#L20-L26>

```rust
/// A DNS Name suitable for use in the TLS Server Name Indication (SNI)
/// extension and/or for use as the reference hostname for which to verify a
/// certificate.
///
/// A `Name` is guaranteed to be syntactically valid. The validity rules are
/// specified in [RFC 5280 Section 7.2], except that underscores are also
/// allowed.
```

NB: this is *slightly* different from the approach we'd considered in
#3258 (comment),
which would involve calling `NameAddr::from_authority_with_port`,
and subsequently falling back to
`NameAddr::from_authority_with_default_port` if that failed.

this approach is still phrased around the hyper/http url types, but
moves the call to `host()` into our function, sparing us the slightly
spookier invalidated `Option<&str>` argument.

a `#[cfg(test)]`-gated function that accepts an `Option<Name>` is
provided for the sake of test cases, which may want to inspect the
metrics involving a request whose uri includes an authority we would
*not* like to include in our label values.

Co-Authored-By: Oliver Gould <[email protected]>
Signed-off-by: katelyn martin <[email protected]>
cratelyn added a commit that referenced this pull request Oct 11, 2024
#3258 (comment)

this addresses the concerns linked in this thread, and discussed further
in a review conversation.

this refactors the interfaces of `HttpRoute` to make use of the
`linkerd_app_core::dns::Name` type used elsewhere in our networking
stack.

we now accept a `&http::uri::Uri` reference, provided by a request.
then, we inspect the *host* of this request, and retain it for encoding
in our prometheus metrics if it is a valid domain name.

the `Name` type is compatible with SNI, see:
<https://github.com/linkerd/linkerd2-proxy/blob/main/linkerd/dns/name/src/name.rs#L20-L26>

```rust
/// A DNS Name suitable for use in the TLS Server Name Indication (SNI)
/// extension and/or for use as the reference hostname for which to verify a
/// certificate.
///
/// A `Name` is guaranteed to be syntactically valid. The validity rules are
/// specified in [RFC 5280 Section 7.2], except that underscores are also
/// allowed.
```

NB: this is *slightly* different from the approach we'd considered in
#3258 (comment),
which would involve calling `NameAddr::from_authority_with_port`,
and subsequently falling back to
`NameAddr::from_authority_with_default_port` if that failed.

this approach is still phrased around the hyper/http url types, but
moves the call to `host()` into our function, sparing us the slightly
spookier invalidated `Option<&str>` argument.

a `#[cfg(test)]`-gated function that accepts an `Option<Name>` is
provided for the sake of test cases, which may want to inspect the
metrics involving a request whose uri includes an authority we would
*not* like to include in our label values.

Co-Authored-By: Oliver Gould <[email protected]>
Signed-off-by: katelyn martin <[email protected]>
@cratelyn cratelyn force-pushed the kate/add-hostname-label-to-http-metrics branch from a738325 to aed9d30 Compare October 11, 2024 06:05
cratelyn added a commit that referenced this pull request Oct 14, 2024
@cratelyn cratelyn force-pushed the kate/add-hostname-label-to-http-metrics branch 2 times, most recently from 81e1f0b to 0b8f66a Compare October 14, 2024 21:16
@cratelyn
Copy link
Collaborator Author

note: i have squashed this branch into a single commit, due to the fact that some changes were walked back in the process of review. https://github.com/linkerd/linkerd2-proxy/tree/kate/add-hostname-label-to-http-metrics.pre-squash-24-10-14 this branch has been pushed for posterity, which contains these changes and incremental review commits left intact.

after squashing, i've also rebased this branch on top of the latest state of main.

@cratelyn cratelyn requested a review from olix0r October 14, 2024 21:20
this commit introduces an additional label to HTTP and gRPC route
metrics, containing the DNS host of requests. requests destined for an
ip address of some kind are not recorded with a hostname metric, as a
way to help minimize the impact of time series cardinality.

the core of this change is in this field addition:

```diff
// /linkerd/app/outbound/src/http/logical/policy/route/metrics/labels.rs
-pub struct Route(pub ParentRef, pub RouteRef);
+pub struct Route {
+    parent: ParentRef,
+    route: RouteRef,
+    hostname: Option<dns::Name>,
+}
```

see this part of the change to our `MkStreamLabel` implementation, used
in our metrics tracking request durtion, and counting response status
codes:

```diff
-    fn mk_stream_labeler<B>(&self, _: &::http::Request<B>) -> Option<Self::StreamLabel> {
+    fn mk_stream_labeler<B>(&self, req: &::http::Request<B>) -> Option<Self::StreamLabel> {
         let parent = self.params.parent_ref.clone();
         let route = self.params.route_ref.clone();
-        Some(metrics::LabelHttpRsp::from(metrics::labels::Route::from((
-            parent, route,
-        ))))
+        Some(metrics::LabelHttpRsp::from(metrics::labels::Route::new(
+            parent,
+            route,
+            req.uri(),
+        )))
     }
```

we now inspect the request, and use the URI to label metrics related to
this traffic by hostname.

a `http_request_hostnames()` test case is added to exercise this.

some todo comments are left, noting where we would ideally like to
simplify or generalize the machinery related to `RetryLabelExtract`, the
type that bridges the labels needed based on our `NewService<T>` target,
and the request type later accepted by the instantiated `Service<T>`.

Signed-off-by: katelyn martin <[email protected]>
@cratelyn cratelyn force-pushed the kate/add-hostname-label-to-http-metrics branch from 0b8f66a to dffabfe Compare October 14, 2024 21:33
@cratelyn
Copy link
Collaborator Author

➰ tying up a final loop, updated the PR description above, and the commit message to be up-to-date with what we're landing now that we've gone through review. this should now be ready to ship :shipit:

Copy link
Member

@olix0r olix0r left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🌮

@cratelyn cratelyn merged commit b728ffe into main Oct 15, 2024
15 checks passed
@cratelyn cratelyn deleted the kate/add-hostname-label-to-http-metrics branch October 15, 2024 13:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants