Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sentry::init return Result<T> to prevent sentry panicking during outages #698

Open
benjaminjellis opened this issue Oct 17, 2024 · 1 comment
Labels

Comments

@benjaminjellis
Copy link
Contributor

benjaminjellis commented Oct 17, 2024

Hi,

I think sentry experienced an outage yesterday. We discovered this when one of our production services went down. After some debugging we found that a sentry service was returning a 502 which (I think) made sentry panic. See the logs snippet below where we get the html of a 502 error page from sentry

2024-10-16T19:10:45.950Z
[sentry] enabled sentry client for DSN 
[sentry] dropping client guard -> disposing client
[sentry] client close; request transport to shut down
[sentry] Get response: `
<html><head>
<meta http-equiv="content-type" content="text/html;charset=utf-8">
<title>502 Server Error</title>
</head>
<body text=#000000 bgcolor=#ffffff>
<h1>Error: Server Error</h1>
<h2>The server encountered a temporary error and could not complete your request.<p>Please try again in 30 seconds.</h2>
<h2></h2>
</body></html>
`

I've spent some time digging around the source but haven't yet pinpointed where the http call is made. What I wanted to check is that if sentry::init does panic on getting a 502 (from somwhere) if you'd be open to changing the return type to a Result to make it easier to handle that failure mode?

If this is amenable I'd be more than happy to make a contribution but given this would be a breaking change I wanted to ask before writing any code.

Let me know what you think.

Thanks,

@benjaminjellis
Copy link
Contributor Author

benjaminjellis commented Oct 22, 2024

I spent a few hours digging into this further this afternoon. I don't think my initial diagnosis is correct. As far as I can tell the log message of the 502 (Get response) is fine with the status being a 502 and does not panic.

I don't understand why we were seeing the dropping client guard -> disposing client and client close; request transport to shut down messages however. I think this is the more likely root cause of the issue we experienced.

This probably means that the title and initial issue are no longer relevant but I want to leave this issue open because I'd appreciate some help in ensuring we don't experience this again

@smeubank smeubank added the bug label Dec 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants