Document new experiments methodology #10217

danielbachhuber · 2024-12-24T18:05:00Z

Changes

Introduces a new set of docs to describe the statistical methodology introduced in PostHog/posthog#26713

I kept the old document around and linked to it as "Legacy methodology".

vercel · 2024-12-24T18:05:05Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Updated (UTC)
posthog	✅ Ready (Inspect)	Visit Preview	Jan 7, 2025 3:59pm

jurajmajerik · 2024-12-29T20:50:36Z

contents/docs/experiments/funnels-statistics.mdx

+
+Funnel experiments use Bayesian statistics with a beta model to evaluate the **win probabilities** and **credible intervals** for an experiment. [Read the statistics primer for an overview](/docs/experiments/statistics-primer) if you haven't already.
+
+## What the heck is a Beta model?


I'd probably tone down the casual tone here :D Though I generally appreciate it, Experiments have a high standard for rigor, so I feel more formal language better communicates this.

@Lior539 @ivanagas What do y'all think?

Let's drop it in the title but have fun with the example (like you do)

jurajmajerik · 2024-12-29T20:52:36Z

I only skimmed this due to a lack of time, but it seems like a good start! Maybe we want to avoid repeating the explanation of credible intervals and sampling three times, and instead cover it just once in the statistics primer? But I don’t have a strong opinion on this.

ivanagas

I think this is well written, it's just a really tricky subject to write about well.

The biggest thing is having a clear "why someone should read this" for each doc. How does this make them a better engineer, as well as someone who is better able to use our experiments product? Right now, it feels like it is missing a bit of that depth.

contents/docs/experiments/statistics-primer.mdx

ivanagas · 2025-01-02T16:31:25Z

contents/docs/experiments/statistics-primer.mdx

+Say you just started an experiment a few hours ago and see these results:
+* 1 in 10 people in the control group complete the funnel = 10% success rate.
+* 1 in 9 people in the test variant group complete the funnel = 11% success rate.
+* The control variant has a 46.7% probability of being better and the test variant has a 53.3% probability of being better.


Feel free to tell me it is too complicated, but how did these probabilities get calculated? It seems just like magic, and it would be helpful to know.

At the very least, something like this:

Suggested change

-* The control variant has a 46.7% probability of being better and the test variant has a 53.3% probability of being better.
+* Using Bayesian analysis, we'll find the control variant has a 46.7% probability of being better and the test variant has a 53.3% probability of being better.

Attempted to explain better with 5eb1f8f
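
For readers following this thread, here is a minimal sketch of one common way such win probabilities can be computed: give each variant a Beta posterior and count how often a sampled test rate beats a sampled control rate. The uniform Beta(1, 1) prior, the sample count, and the helper name `win_probability` are assumptions for illustration, not necessarily what the docs or PostHog use.

```python
import random


def win_probability(control, test, n_samples=100_000, seed=0):
    """Estimate P(test rate > control rate) by Monte Carlo sampling.

    `control` and `test` are (successes, exposures) tuples. A uniform
    Beta(1, 1) prior is assumed for both variants (an illustration choice).
    """
    rng = random.Random(seed)
    c_success, c_total = control
    t_success, t_total = test
    wins = 0
    for _ in range(n_samples):
        # Under the assumed prior, the posterior is
        # Beta(1 + successes, 1 + failures).
        c_rate = rng.betavariate(1 + c_success, 1 + c_total - c_success)
        t_rate = rng.betavariate(1 + t_success, 1 + t_total - t_success)
        if t_rate > c_rate:
            wins += 1
    return wins / n_samples


# Numbers from the example above: 1 of 10 in control, 1 of 9 in test.
p_test = win_probability(control=(1, 10), test=(1, 9))
print(f"test wins: {p_test:.1%}, control wins: {1 - p_test:.1%}")
```

With this little data the two posteriors overlap almost completely, so the estimate lands close to a coin flip.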

contents/docs/experiments/statistics-primer.mdx

ivanagas · 2025-01-02T16:58:32Z

contents/docs/experiments/funnels-statistics.mdx

+
+## What the heck is a Beta model?
+
+Imagine you run a pizza shop and want to know if customers say "yes" to adding pineapple. Some customers will say yes, others will say no.


Why would you care about this?

You want to know how much to promote pineapple? You want to know how much pineapple to order?

Added explanation in 5132470

ivanagas · 2025-01-02T16:59:03Z

contents/docs/experiments/funnels-statistics.mdx

+Imagine you run a pizza shop and want to know if customers say "yes" to adding pineapple. Some customers will say yes, others will say no.
+
+The **beta distribution** is a statistical model that's great for analyzing proportions or probabilities. It helps us understand:
+1. The true probability of customers saying yes to adding pineapple.


What is the difference between probability and true probability?

Suggested change

1. The true probability of customers saying yes to adding pineapple.

1. The true probability of customers saying yes to adding pineapple.

Good question! Added an explanation c7aba65
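
One way to picture the distinction: the true probability is the unknown long-run rate, while the beta distribution describes our belief about that rate after seeing some answers. A minimal sketch, assuming a uniform Beta(1, 1) prior and made-up customer counts (the helpers `beta_posterior`, `beta_mean`, and `beta_variance` are hypothetical names):

```python
def beta_posterior(yes, no, prior_alpha=1.0, prior_beta=1.0):
    """Return (alpha, beta) of the posterior after observing yes/no answers."""
    return prior_alpha + yes, prior_beta + no


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


def beta_variance(alpha, beta):
    """Variance of a Beta(alpha, beta) distribution."""
    total = alpha + beta
    return (alpha * beta) / (total ** 2 * (total + 1))


# Hypothetical data: 3 of 10 customers said yes, then 30 of 100.
small = beta_posterior(yes=3, no=7)    # Beta(4, 8)
large = beta_posterior(yes=30, no=70)  # Beta(31, 71)

# Both estimates sit near 30%, but the larger sample has far less spread.
print(beta_mean(*small), beta_variance(*small))
print(beta_mean(*large), beta_variance(*large))
```

The observed rate is the same in both cases; what changes is how tightly the distribution concentrates around it.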

contents/docs/experiments/funnels-statistics.mdx

ivanagas · 2025-01-02T17:00:05Z

contents/docs/experiments/funnels-statistics.mdx

+
+The **win probability** tells you how likely it is that a given variant has the highest conversion rate compared to all other variants in the experiment. It helps you determine whether the experiment shows a **statistically significant** real effect vs. simply random chance.
+
+Let's say you're testing a new signup flow and have these results:


Could we use the same pineapple on pizza example? If it doesn't work here, maybe we should use one that works for both.

Maybe pizza website? Experiment to upsell pineapple?

Yeah, good catch. Updated the examples with bb2d23c and b20beae
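
With more than two variants, "highest conversion rate compared to all other variants" can be read as the share of posterior draws in which a variant comes out on top. A sketch along the lines of the pizza upsell idea, with hypothetical variant names and counts and an assumed uniform Beta(1, 1) prior:

```python
import random


def win_probabilities(variants, n_samples=50_000, seed=0):
    """Return {name: probability of having the highest conversion rate}.

    `variants` maps a variant name to (conversions, exposures).
    """
    rng = random.Random(seed)
    wins = {name: 0 for name in variants}
    for _ in range(n_samples):
        # Draw one plausible conversion rate per variant from its posterior.
        draws = {
            name: rng.betavariate(1 + conv, 1 + total - conv)
            for name, (conv, total) in variants.items()
        }
        # Credit the variant with the highest rate in this draw.
        wins[max(draws, key=draws.get)] += 1
    return {name: count / n_samples for name, count in wins.items()}


# Hypothetical pineapple upsell experiment with three variants.
probs = win_probabilities({
    "control": (100, 1000),
    "pineapple_banner": (120, 1000),
    "pineapple_popup": (130, 1000),
})
print(probs)
```

Because exactly one variant wins each draw, the probabilities sum to 1 across variants.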

ivanagas · 2025-01-02T17:01:13Z

contents/docs/experiments/funnels-statistics.mdx

+
+## Credible intervals
+
+A **credible interval** tells you the range where the true conversion rate lies with 95% probability. Unlike traditional confidence intervals, credible intervals give you a direct probability statement about the conversion rate.


give you a direct probability statement about the conversion rate

What does "direct" mean?

Explained in 5c0e225
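
One reading of "direct" is that the interval is taken straight from the posterior distribution of the conversion rate, so it can be stated as "the rate lies in this range with 95% probability". A sketch of an equal-tailed interval from posterior samples; the Beta(1, 1) prior, the counts, and the helper name `credible_interval` are assumptions:

```python
import random


def credible_interval(conversions, exposures, level=0.95,
                      n_samples=100_000, seed=0):
    """Equal-tailed credible interval for a conversion rate.

    Assumes a uniform Beta(1, 1) prior, so the posterior is
    Beta(1 + conversions, 1 + exposures - conversions).
    """
    rng = random.Random(seed)
    draws = sorted(
        rng.betavariate(1 + conversions, 1 + exposures - conversions)
        for _ in range(n_samples)
    )
    tail = (1 - level) / 2
    # Cut off `tail` of the sorted draws on each side.
    lower = draws[int(tail * n_samples)]
    upper = draws[int((1 - tail) * n_samples) - 1]
    return lower, upper


# Hypothetical data: 120 conversions out of 1,000 exposures.
low, high = credible_interval(conversions=120, exposures=1000)
print(f"95% credible interval: {low:.3f} to {high:.3f}")
```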

Co-authored-by: Ian Vanagas <[email protected]>

danielbachhuber · 2025-01-06T17:10:28Z

Maybe we want to avoid repeating the explanation of credible intervals and sampling three times, and instead cover it just once in the statistics primer? But I don’t have a strong opinion on this.

@jurajmajerik Good call out. I'd like to keep the repetition because they're important concepts and the definitions are relevant to the corresponding context.

ivanagas

Nice, much improved :)

contents/docs/experiments/funnels-statistics.mdx

contents/docs/experiments/statistics-primer.mdx

contents/docs/experiments/trends-continuous-statistics.mdx

ivanagas · 2025-01-06T17:54:33Z

contents/docs/experiments/trends-count-statistics.mdx

+- When we have very little data, the Gamma distribution is wide, saying "hey, the true rate could be anywhere in this broad range".
+- As we collect more data, the Gamma distribution gets narrower, saying "we're getting more confident about what the true rate is".
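
The narrowing described in these two bullets can be checked by sampling from the Gamma posterior at two data volumes. A sketch assuming Poisson counts with a weak Gamma(shape=1, rate=1) prior; the prior, the counts, and the helper name `gamma_interval_width` are hypothetical:

```python
import random


def gamma_interval_width(total_events, n_users, n_samples=50_000, seed=0):
    """Width of the 95% credible interval for the events-per-user rate.

    Assumes a Gamma(shape=1, rate=1) prior with Poisson counts, which gives
    a Gamma(shape=1 + total_events, rate=1 + n_users) posterior.
    """
    rng = random.Random(seed)
    shape = 1 + total_events
    scale = 1 / (1 + n_users)  # random.gammavariate takes a scale, not a rate
    draws = sorted(rng.gammavariate(shape, scale) for _ in range(n_samples))
    return draws[int(0.975 * n_samples) - 1] - draws[int(0.025 * n_samples)]


# Same underlying rate (about 2 events per user), different amounts of data.
wide_after_10_users = gamma_interval_width(total_events=20, n_users=10)
narrow_after_1000_users = gamma_interval_width(total_events=2000, n_users=1000)
print(wide_after_10_users, narrow_after_1000_users)
```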


contents/docs/experiments/trends-count-statistics.mdx

contents/docs/experiments/statistics-primer.mdx

Co-authored-by: Ian Vanagas <[email protected]>

andehen

Good starting point! 🙌

As we talked about in the huddle, I would like to suggest a slightly different structure. But as this documentation is a topic for next quarter assigned to me, I'll save it for that.

andehen · 2025-01-07T10:23:52Z

contents/docs/experiments/statistics-primer.mdx

+
+Et voilà! The test variant's win probability increased significantly, and the credible intervals became narrower and more distinct. You can decide on the winner now or continue to wait, depending on your business requirements.
+
+## Supported methodologies


It's unclear, I think, what we mean by "methodologies" here. I would suggest calling this "Supported metric types" or something, as that is what our users care about.

I suggest something like this:

Supported metric types:

As different types of data have different shapes, we need to use different models to better match the true distribution of the data. For example, funnel conversions are always between 0 and 1 (0 to 100%), pageview counts can be any non-negative integer (0, 50, 280), and continuous values such as revenue can vary widely and tend to be right-skewed. The metric types we currently support are:

- [funnels (conversion rates)](/docs/experiments/funnels-statistics)
- [count data (page views, etc.)](/docs/experiments/trends-count-statistics)
- [continuous values (revenue, etc.)](/docs/experiments/trends-property-value-statistics)

andehen · 2025-01-07T10:31:08Z

contents/docs/experiments/funnels-statistics.mdx

I think we should only use capital letters if it is a title. In the middle of a sentence it should be "a gamma-poisson model". The exception is if one refers to a specific distribution, like this: "we use a Beta(1, 1) distribution as prior ..."

andehen · 2025-01-07T13:36:20Z

contents/docs/experiments/funnels-statistics.mdx

@@ -0,0 +1,71 @@
+---
+title: Statistical methodology for funnel experiments


I think we should write "Statistical methodology for funnel metrics".
Now, with the multiple metrics feature, we can have both funnel metrics and trend metrics in the same experiment, so it makes more sense to refer to these as different metrics rather than different experiments.

Updated with 1aafe97

contents/docs/experiments/trends-count-statistics.mdx

Co-authored-by: Anders <[email protected]>

danielbachhuber · 2025-01-07T15:40:44Z

@andehen Thanks for the review!

It's unclear, I think, what we mean by "methodologies" here. I would suggest calling this "Supported metric types" or something, as that is what our users care about.

Adapted with 1ef9b83

I think we should only use capital letters if it is a title. In the middle of a sentence it should be "a gamma-poisson model". The exception is if one refers to a specific distribution, like this: "we use a Beta(1, 1) distribution as prior ..."

Fixed up with d7da622

danielbachhuber added 4 commits December 24, 2024 05:50

Move Methodology below Features and rename

7043ff1

First pass at statistics primer

ef6cf80

Active voice

b1f2789

First pass at funnel statistics doc

2aa7a71

vercel bot deployed to Preview December 24, 2024 18:17 View deployment

danielbachhuber added 6 commits December 24, 2024 12:27

First pass at Trends count statistics

98f58c4

Add "What the heck?" sections

04d05e0

Edits

3f9f429

Edits

37d0d71

First pass at continuous trends

be3fac7

Link to all overviews

6eb49a2

vercel bot deployed to Preview December 24, 2024 21:40 View deployment

danielbachhuber added 2 commits December 24, 2024 13:42

Link in sidebar

1575c5b

Edits

c5bc948

vercel bot deployed to Preview December 24, 2024 22:12 View deployment

Edits

3898e7d

danielbachhuber marked this pull request as ready for review December 24, 2024 22:39

danielbachhuber requested review from a team, Lior539 and ivanagas December 24, 2024 22:39

danielbachhuber mentioned this pull request Dec 24, 2024

chore(experiments): Stats cleanup PostHog/posthog#27151

Merged

vercel bot deployed to Preview December 24, 2024 22:49 View deployment

jurajmajerik reviewed Dec 29, 2024


ivanagas requested changes Jan 2, 2025


danielbachhuber and others added 4 commits January 6, 2025 04:35

Formatting

1d8fe54

Co-authored-by: Ian Vanagas <[email protected]>

Edit

edffcff

Co-authored-by: Ian Vanagas <[email protected]>

Formatting

bf802a7

Co-authored-by: Ian Vanagas <[email protected]>

Formatting

cccb2cb

Co-authored-by: Ian Vanagas <[email protected]>

vercel bot deployed to Preview January 6, 2025 17:20 View deployment

danielbachhuber added 2 commits January 6, 2025 09:29

Clarify "Beta model" vs. "Beta distribution"

2293f14

Formatting

045c810

danielbachhuber requested review from ivanagas and andehen January 6, 2025 17:39

vercel bot deployed to Preview January 6, 2025 17:44 View deployment

Deprecate the legacy methodology

cd96971

ivanagas approved these changes Jan 6, 2025


danielbachhuber and others added 7 commits January 6, 2025 10:04

Explain true probability

c7aba65

Missing word

dc666e4

Co-authored-by: Ian Vanagas <[email protected]>

Missing word

2c0464b

Co-authored-by: Ian Vanagas <[email protected]>

Missing word

6f15396

Co-authored-by: Ian Vanagas <[email protected]>

Edits

db5a215

Co-authored-by: Ian Vanagas <[email protected]>

Two grafs

3acbd21

Co-authored-by: Ian Vanagas <[email protected]>

Two grafs

45057a3

Co-authored-by: Ian Vanagas <[email protected]>

vercel bot deployed to Preview January 6, 2025 18:28 View deployment

andehen reviewed Jan 7, 2025


danielbachhuber and others added 6 commits January 7, 2025 07:17

Merge branch 'master' into experiments/new-stats-methodology

b5baa94

Replace "experiments" with "metrics"

a282d20

Co-authored-by: Anders <[email protected]>

Edit

6634857

Co-authored-by: Anders <[email protected]>

"Funnel metrics", not "Funnel experiments"

1aafe97

Rewrite as "Supported metric types"

1ef9b83

Fix casing

d7da622

Rename to "Statisics overview"

3f8e672

danielbachhuber enabled auto-merge (squash) January 7, 2025 15:47

vercel bot deployed to Preview January 7, 2025 15:59 View deployment

danielbachhuber merged commit b87b539 into master Jan 7, 2025
4 checks passed

danielbachhuber deleted the experiments/new-stats-methodology branch January 7, 2025 15:59
