Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add weighted mean for retention insights #28302

Merged
merged 16 commits into from
Feb 6, 2025

Conversation

Sriram-bk
Copy link
Contributor

@Sriram-bk Sriram-bk commented Feb 4, 2025

Problem

This PR adds functionality to allow our customers to get a weighted mean for their retention insights. It also fixes a bug where cohort sizes of 0 were being included in the mean calculations.

Closes #25998
Closes #26217

Changes

No mean

Screenshot 2025-02-04 at 1 06 55 PM

Simple Mean

Screenshot 2025-02-04 at 12 56 18 PM 6-bea6-19c71efdaad4" />

Weighted Mean

Screenshot 2025-02-04 at 12 56 39 PM

👉 Stay up-to-date with PostHog coding conventions for a smoother review.

Does this work well for both Cloud and self-hosted?

How did you test this code?

Tested locally.

@Sriram-bk Sriram-bk requested a review from a team February 4, 2025 21:06
@Sriram-bk Sriram-bk changed the title Add weighted mean for retention insights feat: Add weighted mean for retention insights Feb 4, 2025
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Summary

This PR enhances retention insights by adding weighted mean calculations and fixing cohort size issues. Here's a summary of the key changes:

  • Added new RetentionMeanDropdown component in /frontend/src/scenes/insights/filters/RetentionMeanDropdown.tsx replacing the boolean checkbox with a tri-state selector (none/simple/weighted)
  • Created migration 0660_migrate_retention_show_mean.py to convert existing boolean show_mean values to strings ('simple' or null)
  • Fixed bug in /frontend/src/scenes/retention/RetentionTable.tsx to exclude cohorts with 0 users from mean calculations
  • Updated type definitions in /frontend/src/types.ts and /posthog/schema.py to support string-based mean types
  • Modified query handling in /frontend/src/scenes/insights/utils.tsx for backward compatibility with old boolean values

💡 (1/5) You can manually trigger the bot by mentioning @greptileai in a comment!

10 file(s) reviewed, 7 comment(s)
Edit PR Review Bot Settings | Greptile

Comment on lines +25 to +28
onChange={(showMean) => {
updateInsightFilter({ showMean })
}}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: consider adding type safety for showMean parameter to ensure it matches RetentionMeanType

Comment on lines 603 to 604
if (retentionFilter && 'showMean' in retentionFilter && typeof retentionFilter.showMean === 'boolean') {
retentionFilter.showMean = retentionFilter.showMean ? 'simple' : null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: This code is duplicated from parseDraftQueryFromLocalStorage. Consider extracting to a shared helper function

Comment on lines 8 to 12
retention_insights = Insight.objects.filter(filters__insight="RETENTION", deleted=False).exclude(
filters__show_mean__isnull=True
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: The filter excludes null show_mean values but doesn't handle cases where show_mean key doesn't exist in filters dict. Consider adding filters__has_key='show_mean' to the query.

Comment on lines +13 to +18
if isinstance(insight.filters.get("show_mean"), bool):
# Convert boolean to string - if True, use 'simple'
insight.filters["show_mean"] = "simple" if insight.filters["show_mean"] else None
insight.save()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: No error handling for corrupted data where filters['show_mean'] exists but is neither boolean nor string. Consider adding a try/except block.

Suggested change
if isinstance(insight.filters.get("show_mean"), bool):
# Convert boolean to string - if True, use 'simple'
insight.filters["show_mean"] = "simple" if insight.filters["show_mean"] else None
insight.save()
try:
if isinstance(insight.filters.get("show_mean"), bool):
# Convert boolean to string - if True, use 'simple'
insight.filters["show_mean"] = "simple" if insight.filters["show_mean"] else None
insight.save()
except Exception as e:
print(f"Error processing insight {insight.id}: {e}")

@posthog-bot
Copy link
Contributor

📸 UI snapshots have been updated

14 snapshot changes in total. 0 added, 14 modified, 0 deleted:

Triggered by this commit.

👉 Review this PR's diff of snapshots.

Copy link
Contributor

github-actions bot commented Feb 4, 2025

Size Change: 0 B

Total Size: 1.18 MB

ℹ️ View Unchanged
Filename Size
frontend/dist/toolbar.js 1.18 MB

compressed-size-action

@@ -587,7 +595,15 @@ export function crushDraftQueryForLocalStorage(query: Node<Record<string, any>>,

export function parseDraftQueryFromURL(query: string): Node<Record<string, any>> | null {
try {
return JSON.parse(query)
const parsedQuery = JSON.parse(query)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we put this repeated code in a helper function called something like "migrateDraftQuery"?

There will be a lot more of these I think.

@PostHog PostHog deleted a comment from greptile-apps bot Feb 5, 2025
@PostHog PostHog deleted a comment from greptile-apps bot Feb 5, 2025
label: 'Simple mean',
},
{
value: 'weighted',
Copy link
Contributor

@aspicer aspicer Feb 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we could add a tooltip here that just describes what this is in slightly more words?

Not sure - do you think it's more clear if these modes are called
"Average by week"
"Average by person"

Or do you think the current "Mean" and "weight mean" are more clear?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think simple mean and weighted mean are more clear. Even if someone doesn't know what a weighted mean is, it's something they can look up. I do think it might be helpful to have a concise tooltip though. Not sure what the copy should be, I'll sleep on it and add this in the AM.

Insight = apps.get_model("posthog", "Insight")

# Get all retention insights
retention_insights = Insight.objects.filter(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're going to want to chunk these to avoid potential memory issues (although I think it would be okay as is):
see 0459_convert_personsnode_insights_to_actorsquery.py for an example

Copy link
Contributor

@aspicer aspicer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple things to address but this looks (and works) fantastic! Amazing 🎉

@anirudhpillai
Copy link
Contributor

Great work again, much needed feature!
What do you think about defaulting to always show the simple mean?
Defaulting to showing the mean instead of hiding it makes more sense (especially later when we add breakdowns where means will become important)

@Sriram-bk Sriram-bk force-pushed the sri/retention-add-weighted-mean branch 2 times, most recently from ee2d752 to 1b92bf3 Compare February 6, 2025 00:10
@posthog-bot
Copy link
Contributor

📸 UI snapshots have been updated

1 snapshot changes in total. 0 added, 1 modified, 0 deleted:

  • chromium: 0 added, 1 modified, 0 deleted (diff for shard 1)
  • webkit: 0 added, 0 modified, 0 deleted

Triggered by this commit.

👉 Review this PR's diff of snapshots.

@posthog-bot
Copy link
Contributor

📸 UI snapshots have been updated

1 snapshot changes in total. 0 added, 1 modified, 0 deleted:

  • chromium: 0 added, 1 modified, 0 deleted (diff for shard 1)
  • webkit: 0 added, 0 modified, 0 deleted

Triggered by this commit.

👉 Review this PR's diff of snapshots.

@Sriram-bk Sriram-bk force-pushed the sri/retention-add-weighted-mean branch from 6a0f9fe to fdd5b67 Compare February 6, 2025 18:06
@Sriram-bk Sriram-bk force-pushed the sri/retention-add-weighted-mean branch from fdd5b67 to 7c24c03 Compare February 6, 2025 18:35
@posthog-bot
Copy link
Contributor

📸 UI snapshots have been updated

12 snapshot changes in total. 0 added, 12 modified, 0 deleted:

Triggered by this commit.

👉 Review this PR's diff of snapshots.

@Sriram-bk Sriram-bk merged commit c20e135 into master Feb 6, 2025
103 checks passed
@Sriram-bk Sriram-bk deleted the sri/retention-add-weighted-mean branch February 6, 2025 21:16
@isAdrisal
Copy link

isAdrisal commented Feb 7, 2025

@Sriram-bk is it possible this PR broke existing retention insights? I've just submitted help ticket #24857 with more details.

We have two retention charts that are now not loading, and the dashboard they were on shows an error message. All was working fine with these yesterday before this PR.

Also, when trying to recreate the insights to fix our dashboard, I get this error when toggling on the "Show mean across cohorts" option. If I toggle it back off, the error remains and I can't recover from it without having to start all over again.

Screenshot 2025-02-07 at 13 04 57

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants