Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SentryHttpMessageHandler issue grouping is too eager #3582

Open
ericsampson opened this issue Aug 30, 2024 · 14 comments
Open

SentryHttpMessageHandler issue grouping is too eager #3582

ericsampson opened this issue Aug 30, 2024 · 14 comments

Comments

@ericsampson
Copy link

ericsampson commented Aug 30, 2024

Problem Statement

When using the SentryHttpMessageHandler, errors from all called URLs get lumped into one Issue. That's not great as the grouping is too eager; it's not very helpful to bundle HTTP errors from 25 different microservices and 15 external SaaS providers into a single Issue.

Solution Brainstorm

I'd love if the grouping could use an auto-parameterized version of the called URL as a grouping key.

@jamescrosswell
Copy link
Collaborator

I'd love if the grouping could use an auto-parameterized version of the called URL as a grouping key.

Hey @ericsampson ,

Could you give an example of some URLs that are being grouped together and how these would ideally be grouped instead?

@bitsandfoxes
Copy link
Contributor

We're always looking at ways to improve the out-of-the-box experience. So some examples would be hugely appreciated.

As one way to unblock yourself, you could look into setting your own custom fingerprint for events.

@ericsampson
Copy link
Author

@bitsandfoxes re the custom fingerprint option;

  • how would I correctly use this with the SentryHttpMessageHandler.cs, i.e. what SentryEvent property should I test against to know in a non-fragile manner that the event source was the SentryHttpMessageHandler ? It sets HttpClientOrigin, but that's an internal prop currently.
  • In the custom fingerprint code, would it be correct for me to use the event.Request.Url property?

@ericsampson
Copy link
Author

ericsampson commented Sep 4, 2024

@jamescrosswell unless I'm misreading something in the dashboard, it currently appears that all called URLs are grouped into one single Issue. Because the default grouping that Sentry is choosing for these events is just the stack trace, which is identical regardless of the called URLs.

In my ideal world, each of these lines would be captured as a separate Issue:

GET foo.com/blah
GET bar.org/blah
POST bar.org/blah
// The handler should autodetect that the following called URL pattern is `foo.com/org/{id}/division/{id}`
// via tokenizing the called URL for patterns of `/{number}/?' and `/{UUID}/?` and '/{bool}/?`
// This isn't 100% perfect because it won't catch textual URL path params, but should cover 95% of cases seen in the real world.
GET foo.com/org/1/division/1, foo.com/org/2/division/1, foo.com/org/2/division/2 

@jamescrosswell
Copy link
Collaborator

  • how would I correctly use this with the SentryHttpMessageHandler.cs, i.e. what SentryEvent property should I test against to know in a non-fragile manner that the event source was the SentryHttpMessageHandler ? It sets HttpClientOrigin, but that's an internal prop currently.

@ericsampson the HttpFailedRequestHandler sets an exception mechanism, so you can do something like this:

    options.SetBeforeSend((evt, hint) =>
    {
        if (evt.SentryExceptions?.Any(x => x.Mechanism is { Type: "SentryHttpFailedRequestHandler" }) is true)
        {
            if (evt.Request.Url != null)
            {
                evt.Fingerprint = [ evt.Request.Url ];
            }
        }
    });

You can get a bunch of other information via the Hint as well (see docs).

@jamescrosswell
Copy link
Collaborator

jamescrosswell commented Sep 5, 2024

@jamescrosswell unless I'm misreading something in the dashboard, it currently appears that all called URLs are grouped into one single Issue. Because the default grouping that Sentry is choosing for these events is just the stack trace, which is identical regardless of the called URLs.

I see... I guess depending on the HttpStatusCode it may or may not make sense to group by Uri. So it might be preferable to group all the 429 Too Many Requests responses together (same for 503 Service Unavailable) but something like 403 Forbidden you probably want to group by URI (since the specific location is relevant to the error).

Various other things like the request content and headers could also be relevant.

I wonder if there is some "default behaviour" Sentry could provide, out of the box, that would provide useful fingerprinting in at least the majority of situations (without being able to know anything specific about the various use cases)... 🤔

@ericsampson
Copy link
Author

ericsampson commented Sep 13, 2024

@jamescrosswell I don't think there should be any status-code-dependent grouping for the default behavior. It should group by method + URL

There's no way to universally classify any HTTP code as "entire server" vs "this specific HTTP call" IMO. For example it's very possible for 429 to be per-endpoint rate limits, and e.g 503 only means that the server is not unavailable to respond to that specific request--it could totally be possible for it to respond successfully to a different URL depending on what's going on behind the scenes.
And the same argument can be made for any other 500 or 400 code that I can think of; HTTP responses are resource-targeted, and so you can't generically classify any HTTP codes as "per resource" vs "entire server"
Anything that involves more detail about specific codes would be only relevant to a given implementation, and so the SDK consumer would have to override the default behavior with their system knowledge, if desired

@jamescrosswell
Copy link
Collaborator

I don't think there should be any status-code-dependent grouping for the default behavior. It should group by method + URL

Yeah, reading through this again I think I agree with you... how HTTP Status codes are used is going to be specific to the server application.

I'm wondering how we could auto-parameterize URL though. For incoming requests, we can use Route data such as Endpoints etc. but in the case of outgoing HttpClient requests there's less structured information available to infer which parts of the URL are parameters and which are not (or whether a parameter should be included or scrubbed for the purpose of grouping).

@ericsampson
Copy link
Author

ericsampson commented Sep 16, 2024

I'm wondering how we could auto-parameterize URL though

For sure it's not going to be 100% perfect, but I think that this is one of those cases where "good enough is good enough" and "don't let perfect be the enemy of good". Start with a proposal like the following and then can adjust as user feedback is gained.

Here's what I would suggest:

  • Tokenize the called URL to replace all patterns of /{number}/? and /{UUID}/? and /{bool}/? with a string like ".../{param}/..."

This isn't perfect because it won't catch textual URL path params, but should cover 95% of cases* seen in the real world
*data 100% fabricated ;)

@jamescrosswell
Copy link
Collaborator

When using the SentryHttpMessageHandler, errors from all called URLs get lumped into one Issue. That's not great as the grouping is too eager; it's not very helpful to bundle HTTP errors from 25 different microservices and 15 external SaaS providers into a single Issue.

btw @ericsampson how is it that you end up with the same stack trace for calls to lots of different external URLs? Are you looking up the actual URL to be called dynamically (rather than specifying this in code) prior to making the Http request?

I'm mainly asking because I'd like to create a simple way to reproduce this problem.

@ericsampson
Copy link
Author

ericsampson commented Oct 2, 2024

@jamescrosswell in our monolith, we have created tons of different external HTTP clients that all derive from a single base class HttpClientBase.cs that exposes methods like public async Task<Result<T>> GetAsync<T>(string url, ...), and the derived classes construct the required urls dynamically call these methods. That seems very normal to me.

The HttpClientBase.cs holds injected Microsoft HttpClient with an inner SentryHttpMessageHandler.

For every http call made by the HttClientBase that results in the SentryHttpFailedRequestHandler > response.EnsureSuccessStatusCode() throwing, the resulting Sentry events ALL get reported as a single sentry Issue with the following exact same grouping details as shown in the attached screenshot. Which is problematic for us, and the motivation for this ticket :)

We have not added any sentry dashboard custom Issue Grouping rules, so what we are seeing is stock Sentry behavior.

Does that help? Thanks!!

Image

@jamescrosswell
Copy link
Collaborator

OK thanks @ericsampson. I think we can put together some code that will ensure the same stack trace when calling multiple different URLs, to simulate your scenario... From there it should be fairly simple to extend the Sentry SDK so that it has some callbacks or something on the options that you could use to customise the fingerprint for those outgoing URLs (and some default implementation of this that we provide out of the box).

@ericsampson
Copy link
Author

That sounds awesome, thanks James!

@bruno-garcia
Copy link
Member

@ericsampson is there a link to an issue in Sentry you you share? I wonder if perhaps the frames InApp flags are not preperly marked.
Are the stack traces not including all frames leading up to the call site? Where your code started the HTTP request?

If not, I wonder if the stack trace is losing frames somehow.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: No status
Status: No status
Development

No branches or pull requests

4 participants