-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add weighted mean for retention insights #28302
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR Summary
This PR enhances retention insights by adding weighted mean calculations and fixing cohort size issues. Here's a summary of the key changes:
- Added new
RetentionMeanDropdown
component in/frontend/src/scenes/insights/filters/RetentionMeanDropdown.tsx
replacing the boolean checkbox with a tri-state selector (none/simple/weighted) - Created migration
0660_migrate_retention_show_mean.py
to convert existing booleanshow_mean
values to strings ('simple' or null) - Fixed bug in
/frontend/src/scenes/retention/RetentionTable.tsx
to exclude cohorts with 0 users from mean calculations - Updated type definitions in
/frontend/src/types.ts
and/posthog/schema.py
to support string-based mean types - Modified query handling in
/frontend/src/scenes/insights/utils.tsx
for backward compatibility with old boolean values
💡 (1/5) You can manually trigger the bot by mentioning @greptileai in a comment!
10 file(s) reviewed, 7 comment(s)
Edit PR Review Bot Settings | Greptile
onChange={(showMean) => { | ||
updateInsightFilter({ showMean }) | ||
}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: consider adding type safety for showMean parameter to ensure it matches RetentionMeanType
if (retentionFilter && 'showMean' in retentionFilter && typeof retentionFilter.showMean === 'boolean') { | ||
retentionFilter.showMean = retentionFilter.showMean ? 'simple' : null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: This code is duplicated from parseDraftQueryFromLocalStorage. Consider extracting to a shared helper function
retention_insights = Insight.objects.filter(filters__insight="RETENTION", deleted=False).exclude( | ||
filters__show_mean__isnull=True | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
logic: The filter excludes null show_mean values but doesn't handle cases where show_mean key doesn't exist in filters dict. Consider adding filters__has_key='show_mean' to the query.
if isinstance(insight.filters.get("show_mean"), bool): | ||
# Convert boolean to string - if True, use 'simple' | ||
insight.filters["show_mean"] = "simple" if insight.filters["show_mean"] else None | ||
insight.save() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: No error handling for corrupted data where filters['show_mean'] exists but is neither boolean nor string. Consider adding a try/except block.
if isinstance(insight.filters.get("show_mean"), bool): | |
# Convert boolean to string - if True, use 'simple' | |
insight.filters["show_mean"] = "simple" if insight.filters["show_mean"] else None | |
insight.save() | |
try: | |
if isinstance(insight.filters.get("show_mean"), bool): | |
# Convert boolean to string - if True, use 'simple' | |
insight.filters["show_mean"] = "simple" if insight.filters["show_mean"] else None | |
insight.save() | |
except Exception as e: | |
print(f"Error processing insight {insight.id}: {e}") |
📸 UI snapshots have been updated14 snapshot changes in total. 0 added, 14 modified, 0 deleted:
Triggered by this commit. |
Size Change: 0 B Total Size: 1.18 MB ℹ️ View Unchanged
|
@@ -587,7 +595,15 @@ export function crushDraftQueryForLocalStorage(query: Node<Record<string, any>>, | |||
|
|||
export function parseDraftQueryFromURL(query: string): Node<Record<string, any>> | null { | |||
try { | |||
return JSON.parse(query) | |||
const parsedQuery = JSON.parse(query) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we put this repeated code in a helper function called something like "migrateDraftQuery"?
There will be a lot more of these I think.
label: 'Simple mean', | ||
}, | ||
{ | ||
value: 'weighted', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we could add a tooltip here that just describes what this is in slightly more words?
Not sure - do you think it's more clear if these modes are called
"Average by week"
"Average by person"
Or do you think the current "Mean" and "weight mean" are more clear?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think simple mean and weighted mean are more clear. Even if someone doesn't know what a weighted mean is, it's something they can look up. I do think it might be helpful to have a concise tooltip though. Not sure what the copy should be, I'll sleep on it and add this in the AM.
Insight = apps.get_model("posthog", "Insight") | ||
|
||
# Get all retention insights | ||
retention_insights = Insight.objects.filter( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're going to want to chunk these to avoid potential memory issues (although I think it would be okay as is):
see 0459_convert_personsnode_insights_to_actorsquery.py for an example
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple things to address but this looks (and works) fantastic! Amazing 🎉
Great work again, much needed feature! |
ee2d752
to
1b92bf3
Compare
📸 UI snapshots have been updated1 snapshot changes in total. 0 added, 1 modified, 0 deleted:
Triggered by this commit. |
📸 UI snapshots have been updated1 snapshot changes in total. 0 added, 1 modified, 0 deleted:
Triggered by this commit. |
6a0f9fe
to
fdd5b67
Compare
… bool values for showMean
fdd5b67
to
7c24c03
Compare
📸 UI snapshots have been updated12 snapshot changes in total. 0 added, 12 modified, 0 deleted:
Triggered by this commit. |
@Sriram-bk is it possible this PR broke existing retention insights? I've just submitted help ticket #24857 with more details. We have two retention charts that are now not loading, and the dashboard they were on shows an error message. All was working fine with these yesterday before this PR. Also, when trying to recreate the insights to fix our dashboard, I get this error when toggling on the "Show mean across cohorts" option. If I toggle it back off, the error remains and I can't recover from it without having to start all over again. ![]() |
Problem
This PR adds functionality to allow our customers to get a weighted mean for their retention insights. It also fixes a bug where cohort sizes of 0 were being included in the mean calculations.
Closes #25998
Closes #26217
Changes
No mean
Simple Mean
Weighted Mean
👉 Stay up-to-date with PostHog coding conventions for a smoother review.
Does this work well for both Cloud and self-hosted?
How did you test this code?
Tested locally.