feat(issues): Implement suspect flag heuristics for feature flags aggregated inside issue details #93076

ryan953 · 2025-06-06T21:46:02Z

This isn't the final UI for everything; it's still behind its own FF, for employees only right now.

If this set of heuristics turns out to be decent for a range of issues, then we'll have to iterate and move a bunch of logic into the backend directly. Right now it's fun and easy to experiment in the frontend, and to render data related to what's used in the heuristics (i.e. we use distribution, so render that too).

I'm happier with the results of this algorithm in this iteration. What I've found by spot-checking a bunch of issues is that the flags which are surfaced seem related to the issue I'm looking at. More cross-cutting issues are showing more generic flags, like the `trace-view-v1` flag in the example. But I'm happy because it's trace related and the problem happened on the trace page.

The new heuristic focuses on a few types of flags:

1. Flags that are 100% on within the issue. This means that the flag has to be one of the most recently checked flags, so that it's still in the SDK's buffer when the error happens.
2. Flags that are not 100% enabled for every user. We're doing an approximation of this, because we can only see flags when an issue happens (we're only looking at the issue data set right now). There were some examples of flags that I assumed were rolled out to 100%, with something like 18,000 examples where `foo=true` but some single-digit number of cases where `foo=false`. These are tricky!
3. Flags where every error within the issue includes the flag. Similar to the 1st condition above, we only want to focus on flags where each error event includes a check for that flag. If the issue has 100 errors, spread across 100 users, then we want to have seen the flag 100 times as well within the issue.
4. Flags where the definition changed recently. This is poorly implemented right now. In this PR we make a best-effort attempt to load a list of 100 changes for the flags identified in 1 through 3 above. Then we keep the flags where a change was detected. This is another way to filter down flags to expose those which are in flux. It seems to be working well right now for new issues where firstSeen is less than 90d ago. We need to iterate on this and test more for issues that are older than 90d. Or... it could be the case that for older issues we don't need to worry about suspect flags too much.
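
To make the four conditions concrete, here is a minimal sketch of how they could be combined into a single filter. It is not the code in this PR: every name in it (`FlagSummary`, `isSuspectFlag`, the count fields) is hypothetical, and it assumes per-flag value counts are available both for the issue and for some wider baseline set of events.

```typescript
// All names below are hypothetical and exist only for this sketch.
interface FlagSummary {
  flag: string;
  /** Evaluated value -> number of events in this issue that recorded it. */
  issueCounts: Record<string, number>;
  /** Evaluated value -> number of events in the wider baseline that recorded it (assumed available). */
  baselineCounts: Record<string, number>;
  /** Number of definition changes found for the flag (condition 4). */
  recentChanges: number;
}

const sum = (counts: Record<string, number>): number =>
  Object.values(counts).reduce((total, n) => total + n, 0);

function isSuspectFlag(summary: FlagSummary, totalEventsInIssue: number): boolean {
  const seenInIssue = sum(summary.issueCounts);
  const onInIssue = summary.issueCounts['true'] ?? 0;
  const seenInBaseline = sum(summary.baselineCounts);
  const onInBaseline = summary.baselineCounts['true'] ?? 0;

  // 1. The flag is on in every event of the issue that recorded it.
  const alwaysOnInIssue = seenInIssue > 0 && onInIssue === seenInIssue;
  // 2. The flag is not on for everyone. Naive check: a handful of stray
  //    `false` evaluations (the "tricky" case) would still pass it.
  const notOnForEveryone = seenInBaseline > 0 && onInBaseline < seenInBaseline;
  // 3. Every event in the issue recorded the flag.
  const inEveryEvent = seenInIssue === totalEventsInIssue;
  // 4. At least one definition change was detected.
  const changedRecently = summary.recentChanges > 0;

  return alwaysOnInIssue && notOnForEveryone && inEveryEvent && changedRecently;
}

// Made-up data for an issue with 100 events.
const candidates: FlagSummary[] = [
  {flag: 'trace-view-v1', issueCounts: {true: 100}, baselineCounts: {true: 600, false: 400}, recentChanges: 2},
  {flag: 'fully-rolled-out', issueCounts: {true: 100}, baselineCounts: {true: 18000}, recentChanges: 0},
  {flag: 'mixed-in-issue', issueCounts: {true: 60, false: 40}, baselineCounts: {true: 500, false: 500}, recentChanges: 1},
];

const suspects = candidates.filter(c => isSuspectFlag(c, 100)).map(c => c.flag);
console.log(suspects);
```

Requiring all four conditions at once is a choice made for the sketch; the description above only says that each condition narrows the list.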

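| Desc | Img |
| --- | --- |
| Loading | <img width="501" alt="loading suspects" src="https://github.com/user-attachments/assets/ce2dd12e-e658-4337-a6af-1b46fb8054ff" /> |
| Nothing found | <img width="502" alt="no suspects" src="https://github.com/user-attachments/assets/685c2b9c-aed8-444b-9e99-16c8aab53fe9" /> |
| Single Item | <img width="500" alt="single suspect" src="https://github.com/user-attachments/assets/429000a6-6692-4b40-9341-a51f339e63fd" /> |
| Some Items | <img width="908" alt="flag details" src="https://github.com/user-attachments/assets/0f3c053b-e8da-44bd-92fd-25d3726ad37f" /> |
| Flag Details | <img width="502" alt="some suspects" src="https://github.com/user-attachments/assets/da775b29-baf7-4786-a054-800971cce15b" /> |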

github-actions · 2025-06-06T21:46:21Z

🚨 Warning: This pull request contains Frontend and Backend changes!

It's discouraged to make changes to Sentry's Frontend and Backend in a single pull request. The Frontend and Backend are not atomically deployed. If the changes are interdependent of each other, they must be separated into two pull requests and be made forward or backwards compatible, such that the Backend or Frontend can be safely deployed independently.

Have questions? Please ask in the #discuss-dev-infra channel.

ryan953 · 2025-06-06T21:51:26Z

static/app/components/issues/suspect/useGroupSuspectFlagScores.tsx

+  distribution: {
+    baseline: Record<string, number>;
+    outliers: Record<string, number>;
+  };


updated to match #92801
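
As a rough illustration of how a `distribution` object of this shape might be prepared for rendering, the sketch below converts each side into percentages. Only the `baseline`/`outliers` shape comes from the diff above; the `toPercentages` helper is hypothetical, and it assumes the numbers are raw counts (if they are already percentages, this step is unnecessary).

```typescript
// Only the shape of `Distribution` is taken from the diff; the rest is a hypothetical sketch.
type Distribution = {
  baseline: Record<string, number>;
  outliers: Record<string, number>;
};

// Assumes the values are raw counts per evaluated flag value.
function toPercentages(counts: Record<string, number>): Record<string, number> {
  const total = Object.values(counts).reduce((acc, n) => acc + n, 0);
  if (total === 0) {
    return {}; // nothing recorded, so there is nothing to render
  }
  return Object.fromEntries(
    Object.entries(counts).map(([value, n]) => [value, (n / total) * 100])
  );
}

// Made-up numbers for illustration.
const example: Distribution = {
  baseline: {true: 75, false: 25},
  outliers: {true: 100},
};

const baselinePct = toPercentages(example.baseline);
const outliersPct = toPercentages(example.outliers);
console.log(baselinePct, outliersPct);
```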

codecov · 2025-06-06T21:55:01Z

Codecov Report

All modified and coverable lines are covered by tests ✅

✅ All tests successful. No failed tests found.

Additional details and impacted files

@@           Coverage Diff           @@
##           master   #93076   +/-   ##
=======================================
  Coverage   87.82%   87.83%           
=======================================
  Files       10284    10284           
  Lines      590331   590285   -46     
  Branches    22950    22941    -9     
=======================================
- Hits       518468   518457   -11     
+ Misses      71416    71381   -35     
  Partials      447      447

cmanallen · 2025-06-09T15:39:32Z

I approve 👍


ryan953 added 5 commits June 3, 2025 12:57

feat(issues): Expose baseline percentages with suspect tag info

991ab51

WIP - suspect tags

b934cf5

Merge branch 'master' into ryan953/suspect-flags-fe

6d3cc7f

iterate!

0ab142a

prevent infinite loading

5959748

ryan953 requested review from a team as code owners June 6, 2025 21:46

github-actions bot added the Scope: Frontend and Scope: Backend labels Jun 6, 2025

vercel bot deployed to Preview June 6, 2025 21:46 View deployment

ryan953 force-pushed the ryan953/suspect-flags-fe branch from 2a32413 to 6a7527f Compare June 6, 2025 21:48

vercel bot deployed to Preview June 6, 2025 21:51 View deployment

ryan953 commented Jun 6, 2025


revert py changes

3d900bf

ryan953 force-pushed the ryan953/suspect-flags-fe branch from a388e29 to 3d900bf Compare June 6, 2025 21:54

vercel bot deployed to Preview June 6, 2025 21:56 View deployment

ryan953 requested a review from a team June 7, 2025 20:47

scttcper approved these changes Jun 9, 2025


fix tests

42163c5

vercel bot deployed to Preview June 9, 2025 17:32 View deployment

ryan953 merged commit ecd01cb into master Jun 9, 2025
42 checks passed

ryan953 deleted the ryan953/suspect-flags-fe branch June 9, 2025 17:50

github-actions bot locked and limited conversation to collaborators Jun 25, 2025
