Document new experiments methodology #10217
Funnel experiments use Bayesian statistics with a beta model to evaluate the **win probabilities** and **credible intervals** for an experiment. [Read the statistics primer for an overview](/docs/experiments/statistics-primer) if you haven't already.

## What the heck is a Beta model?
I'd probably tone down the casual tone here :D Though I generally appreciate it, Experiments have a high standard for rigor, so I feel more formal language communicates this better.
Let's drop it in the title but have fun with the example (like you do)
I only skimmed this due to a lack of time, but it seems like a good start! Maybe we want to avoid repeating the explanation of credible intervals and sampling three times, and instead cover it just once in the statistics primer? But I don’t have a strong opinion on this.
I think this is well written, it's just a really tricky subject to write about well.
The biggest thing is having a clear "why someone should read this" for each doc. How does this make them a better engineer, as well as someone who is better able to use our experiments product? Right now, it feels like it is missing a bit of that depth.
Say you just started an experiment a few hours ago and see these results:
* 1 in 10 people in the control group complete the funnel = 10% success rate.
* 1 in 9 people in the test variant group complete the funnel = 11% success rate.
* The control variant has a 46.7% probability of being better and the test variant has a 53.3% probability of being better.
Feel free to tell me it is too complicated, but how did these probabilities get calculated? It seems just like magic, but it would be helpful to know.
At the very least, something like this:
* The control variant has a 46.7% probability of being better and the test variant has a 53.3% probability of being better.
* Using Bayesian analysis, we'll find the control variant has a 46.7% probability of being better and the test variant has a 53.3% probability of being better.
Attempted to explain better with 5eb1f8f
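For readers following this thread, here is a minimal sketch of how win probabilities like the ones quoted above could be estimated. It assumes a uniform Beta(1, 1) prior and Monte Carlo sampling from each variant's beta posterior. The prior, the sample count, and the `win_probability` helper are illustrative assumptions, not necessarily what the PostHog implementation does.

```python
import random


def win_probability(control, test, samples=100_000, seed=0):
    """Estimate P(test rate > control rate) by sampling Beta posteriors.

    `control` and `test` are (successes, failures) tuples.
    Assumes a uniform Beta(1, 1) prior for both variants (an assumption).
    """
    rng = random.Random(seed)
    c_success, c_failure = control
    t_success, t_failure = test
    wins = 0
    for _ in range(samples):
        # Posterior is Beta(prior_alpha + successes, prior_beta + failures).
        c_rate = rng.betavariate(1 + c_success, 1 + c_failure)
        t_rate = rng.betavariate(1 + t_success, 1 + t_failure)
        if t_rate > c_rate:
            wins += 1
    return wins / samples


# Example from the doc: 1 of 10 convert in control, 1 of 9 in test.
p_test = win_probability(control=(1, 9), test=(1, 8))
print(f"test wins: {p_test:.1%}, control wins: {1 - p_test:.1%}")
```

Under these assumptions the estimate lands a little above 50% for the test variant, in the same neighborhood as the figures quoted above. The exact value depends on the prior and on sampling noise.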
## What the heck is a Beta model?

Imagine you run a pizza shop and want to know if customers say "yes" to adding pineapple. Some customers will say yes, others will say no.
Why would you care about this?
You want to know how much to promote pineapple? You want to know how much pineapple to order?
Added explanation in 5132470
Imagine you run a pizza shop and want to know if customers say "yes" to adding pineapple. Some customers will say yes, others will say no.

The **beta distribution** is a statistical model that's great for analyzing proportions or probabilities. It helps us understand:
1. The true probability of customers saying yes to adding pineapple.
What is the difference between probability and true probability?
1. The true probability of customers saying yes to adding pineapple.
1. The true probability of customers saying yes to adding pineapple.
Good question! Added an explanation c7aba65
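To illustrate the distinction raised in this thread, the sketch below simulates customers with a hidden "true" yes-rate and updates a beta distribution one answer at a time. The value of `TRUE_YES_RATE`, the uniform Beta(1, 1) starting point, and the `posterior_after` helper are all hypothetical choices for illustration. In a real experiment the true rate is never observed; only the estimate is.

```python
import random

TRUE_YES_RATE = 0.3  # hidden "true probability"; a made-up value for illustration


def posterior_after(n_customers, seed=42):
    """Ask n customers and return the Beta posterior parameters (alpha, beta).

    Starts from a uniform Beta(1, 1) prior, which is an assumption.
    """
    rng = random.Random(seed)
    alpha, beta = 1, 1
    for _ in range(n_customers):
        if rng.random() < TRUE_YES_RATE:
            alpha += 1  # one more "yes"
        else:
            beta += 1  # one more "no"
    return alpha, beta


for n in (10, 100, 2000):
    alpha, beta = posterior_after(n)
    estimate = alpha / (alpha + beta)  # posterior mean of the yes-rate
    print(f"{n:>5} customers: estimated yes-rate {estimate:.3f}")
```

The estimate is what the data supports so far; with more customers it tends to settle near the hidden rate.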
The **win probability** tells you how likely it is that a given variant has the highest conversion rate compared to all other variants in the experiment. It helps you determine whether the experiment shows a **statistically significant** real effect vs. simply random chance.

Let's say you're testing a new signup flow and have these results:
Could we use the same pineapple on pizza example? If it doesn't work here, maybe we should use one that works for both.
Maybe pizza website? Experiment to upsell pineapple?
## Credible intervals

A **credible interval** tells you the range where the true conversion rate lies with 95% probability. Unlike traditional confidence intervals, credible intervals give you a direct probability statement about the conversion rate.
> give you a direct probability statement about the conversion rate
What does "direct" mean?
Explained in 5c0e225
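As a rough illustration of what "direct" can mean here: a credible interval is a statement about the rate itself given the observed data and the prior, whereas a confidence interval describes the long-run coverage of the procedure that produced it. The sketch below approximates an equal-tailed 95% credible interval by sampling a beta posterior. The Beta(1, 1) prior, the example counts, and the `credible_interval` helper are assumptions made for illustration.

```python
import random


def credible_interval(successes, failures, level=0.95, samples=100_000, seed=0):
    """Approximate an equal-tailed credible interval for a conversion rate.

    Samples the Beta(1 + successes, 1 + failures) posterior; the uniform
    Beta(1, 1) prior is an assumption.
    """
    rng = random.Random(seed)
    draws = sorted(
        rng.betavariate(1 + successes, 1 + failures) for _ in range(samples)
    )
    tail = (1 - level) / 2
    lower = draws[int(tail * samples)]
    upper = draws[int((1 - tail) * samples) - 1]
    return lower, upper


# Hypothetical data: 100 conversions out of 1,000 users.
low, high = credible_interval(successes=100, failures=900)
print(f"95% credible interval: {low:.3f} to {high:.3f}")
```

Read the result as: given this data and prior, the conversion rate lies between `low` and `high` with roughly 95% probability.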
Co-authored-by: Ian Vanagas <[email protected]>
@jurajmajerik Good call out. I'd like to keep the repetition because they're important concepts and the definitions are relevant to the corresponding context.
Nice, much improved :)
- When we have very little data, the Gamma distribution is wide, saying "hey, the true rate could be anywhere in this broad range".
- As we collect more data, the Gamma distribution gets narrower, saying "we're getting more confident about what the true rate is".
nice
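The narrowing described in the two bullets above can be checked numerically. This sketch assumes Poisson-distributed counts with a Gamma(shape=1, rate=1) prior, so the posterior is Gamma(shape=1 + total events, rate=1 + exposures). The prior, the example counts, and the `gamma_interval_width` helper are illustrative assumptions rather than the documented implementation.

```python
import random


def gamma_interval_width(total_events, exposures, samples=50_000, seed=0):
    """Width of the 95% credible interval for an event rate per exposure.

    Assumes a Gamma(shape=1, rate=1) prior with Poisson counts, so the
    posterior is Gamma(shape=1 + total_events, rate=1 + exposures).
    """
    rng = random.Random(seed)
    shape = 1 + total_events
    scale = 1 / (1 + exposures)  # gammavariate takes a scale, i.e. 1 / rate
    draws = sorted(rng.gammavariate(shape, scale) for _ in range(samples))
    return draws[int(0.975 * samples) - 1] - draws[int(0.025 * samples)]


# Same underlying rate (about 2 events per user), different amounts of data.
few = gamma_interval_width(total_events=20, exposures=10)
many = gamma_interval_width(total_events=2000, exposures=1000)
print(f"10 users: width {few:.2f}; 1000 users: width {many:.2f}")
```

With a hundred times more data at the same observed rate, the interval should come out much narrower.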
Co-authored-by: Ian Vanagas <[email protected]>
Good starting point! 🙌
As we talked about in the huddle, I would like to suggest a slightly different structure. But as this documentation is a topic for next quarter assigned to me, I'll save it for that.
Et voilà! The test variant's win probability increased significantly, and the credible intervals became narrower and more distinct. You can decide on the winner now or continue to wait, depending on your business requirements.

## Supported methodologies
It's unclear, I think, what we mean by "methodologies" here. I would suggest calling this "Supported metric types" or something similar, as that is what our users care about.
I suggest something like this:
Supported metric types:
As different types of data have different shapes, we need to use different models to better match the true distribution of the data. For example, funnel conversions are always between 0 and 1 (0-100%), pageview counts can be any non-negative integer (0, 50, 280), and continuous values such as revenue can vary widely and tend to be right-skewed. The metric types we currently support are:
- [funnels (conversion rates)](/docs/experiments/funnels-statistics)
- [count data (page views, etc.)](/docs/experiments/trends-count-statistics)
- [continuous values (revenue, etc.)](/docs/experiments/trends-property-value-statistics)
I think we should only use capital letters if it is a title. In the middle of a sentence it should be "a gamma-poisson model". The exception is if one refers to a specific distribution, like this: "we use a Beta(1, 1) distribution as prior ..."
@@ -0,0 +1,71 @@
---
title: Statistical methodology for funnel experiments
I think we should write "Statistical methodology for funnel metrics".
Now, with the multiple metrics feature, we can have both funnel metrics and trend metrics in the same experiment, so it makes more sense to refer to these as different metrics rather than different experiments.
Updated with 1aafe97
Co-authored-by: Anders <[email protected]>
@andehen Thanks for the review!
Adapted with 1ef9b83
Fixed up with d7da622
Changes
Introduces a new set of docs to describe the statistical methodology introduced in PostHog/posthog#26713
I kept the old document around and linked to it as "Legacy methodology".