How to derive the sample size formula for A/B testing from scratch?

lzhangstat
Mar 10, 2021

Simple math. Math is sometimes self-explanatory :)

We assume throughout that the sample sizes are large enough for the sample means to be approximately normally distributed by the Central Limit Theorem.

Derive the sample size from the power calculation

Let X_i and Y_i be the observed values of the metric of interest in the test and control groups, respectively, and let X and Y denote the sample means of X_i and Y_i. Note that while we assume X_i and Y_i are normally distributed to keep the following derivation simple, this is not strictly necessary. What matters is the normality of the sample means X and Y, which holds approximately for large samples by the Central Limit Theorem.
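The derivation can be sketched as follows. This is the standard textbook argument rather than a reproduction of the original equations, and it rests on assumptions not stated in the text: equal group sizes n, a common known variance \sigma^2, a two-sided test at level \alpha, target power 1 - \beta, and a true effect \delta > 0. The sample means (written X and Y in the text) appear as \bar{X} and \bar{Y}.

```latex
% Assumed setup: equal group sizes n, common known variance \sigma^2,
% two-sided level \alpha, target power 1 - \beta, true effect \delta > 0.
\begin{align*}
Z &= \frac{\bar{X} - \bar{Y}}{\sqrt{2\sigma^2 / n}},
  \qquad Z \sim N(0, 1) \ \text{under } H_0: \mu_X = \mu_Y, \\
Z &\sim N\!\left(\delta \sqrt{\frac{n}{2\sigma^2}},\ 1\right)
  \ \text{under } H_1: \mu_X - \mu_Y = \delta, \\
\text{power} &= P\left(|Z| > z_{1-\alpha/2} \mid H_1\right)
  \approx 1 - \Phi\!\left(z_{1-\alpha/2} - \delta \sqrt{\frac{n}{2\sigma^2}}\right)
  \ge 1 - \beta, \\
&\Longrightarrow\quad
  z_{1-\alpha/2} - \delta \sqrt{\frac{n}{2\sigma^2}} \le z_{\beta} = -z_{1-\beta}, \\
&\Longrightarrow\quad
  n \ge \frac{2\sigma^2 \left(z_{1-\alpha/2} + z_{1-\beta}\right)^2}{\delta^2}.
\end{align*}
```

The approximation in the third line drops the negligible chance of rejecting in the wrong tail. With \alpha = 0.05 and power 0.8, (1.96 + 0.84)^2 is about 7.85, so n is roughly 15.7 \sigma^2 / \delta^2 per group, often rounded to 16 \sigma^2 / \delta^2.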

Appendix

When we are interested in count metrics like daily active users, it is straightforward to define the Z statistic as the standardized difference between the two group means.
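A minimal Python sketch of the usual two-sample Z statistic for a difference in means, namely the difference in sample means divided by its estimated standard error. The function name `two_sample_z` and the choice to plug in each group's sample variance are assumptions made for illustration, not taken from the original post.

```python
import math
from statistics import NormalDist, fmean, variance


def two_sample_z(test, control):
    """Return (z, two-sided p-value) for the difference in group means.

    Assumes both samples are large enough for the normal approximation;
    each group's sample variance is used in place of the true variance.
    """
    n_x, n_y = len(test), len(control)
    # Standard error of the difference between the two sample means.
    se = math.sqrt(variance(test) / n_x + variance(control) / n_y)
    z = (fmean(test) - fmean(control)) / se
    # Two-sided p-value under the standard normal null distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value
```

Calling `two_sample_z(test_values, control_values)` on two lists of per-unit observations returns the statistic together with its p-value. For very small samples a t-test would be the safer choice.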

Now, let’s consider a special case when the metric is a ratio.
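As a hedged illustration, suppose the ratio is a per-user proportion such as a conversion rate. That is an assumption, since the text does not spell out which ratio is meant. Each observation is then Bernoulli with variance p(1 - p), and the usual normal-approximation sample size per group is (z_{1-α/2} + z_{1-β})² · [p_1(1 - p_1) + p_2(1 - p_2)] / (p_2 - p_1)². The function name `sample_size_proportions` and its defaults are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_proportions(p_control, p_test, alpha=0.05, power=0.8):
    """Per-group sample size for a two-sided z-test on two proportions.

    Assumes equal group sizes and a Bernoulli metric (an assumption about
    what "ratio" means here), with unpooled variances p * (1 - p).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value of the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    # Sum of the Bernoulli variances of the two groups.
    var_sum = p_control * (1 - p_control) + p_test * (1 - p_test)
    delta = p_test - p_control
    return math.ceil((z_alpha + z_power) ** 2 * var_sum / delta ** 2)
```

For example, `sample_size_proportions(0.10, 0.12)` gives the number of users needed per group to detect a lift from 10% to 12% at the default 5% significance level and 80% power. Ratios of two random quantities, such as clicks per page view, need a different variance estimate (for example via the delta method) and are not covered by this sketch.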
