#StackBounty: #experiment-design #power #ab-test AB test sample size calculation by hand

Bounty: 50

Evan Miller has created a well-known online AB test sample size calculator. For the sake of being able to program and modify this formula, I would like to know how to calculate sample size Evan Miller-style by hand.

My approach is to work backwards from the 95% confidence interval that the z-test of proportions places around the difference in conversion rate between the two variations ($\hat{d}$), setting the bound of that interval to zero.

I’ll define/assume:

  • $\alpha$ = .05, $\beta$ = .2
  • a 50/50 split between the control and experiment, i.e. $n_{exp} = n_{control} = n$
  • the control conversion rate, i.e. the base rate before the experiment = $c$
  • $p$ = pooled conversion rate = (number of exp conversions + number of control conversions) / ($n_{control} + n_{exp}$), which in this context is $(nc+n(c+\hat{d}))/2n = (2c+\hat{d})/2$
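The critical values implied by these choices can be checked numerically. This is a minimal sketch using only Python's standard library; the variable names and the example values for `c` and `d_hat` are illustrative assumptions, not part of the setup above.

```python
from statistics import NormalDist

alpha = 0.05  # two-sided significance level
beta = 0.20   # type II error rate, so power = 1 - beta = 0.80

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Quantiles of the standard normal distribution
z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # approximately 1.96
z_beta = std_normal.inv_cdf(1 - beta)        # approximately 0.84

# Hypothetical example inputs (assumed for illustration only)
c = 0.10      # control conversion rate
d_hat = 0.02  # absolute difference in conversion rate

# Pooled rate under a 50/50 split: (2c + d_hat) / 2
p_pooled = (2 * c + d_hat) / 2

print(f"z_alpha={z_alpha:.4f}, z_beta={z_beta:.4f}, p={p_pooled:.4f}")
```

Note that 1.96 is the $1-\alpha/2$ quantile, and `z_beta` is the value that would be needed once power enters the calculation.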

Now time to solve for $n$

$$ \hat{d} + Z_{1-\alpha/2} * \text{StandardError} = 0 $$
$$ \hat{d} + 1.96 * \text{StandardError} = 0 $$
$$ \hat{d} + 1.96 * \sqrt{p(1-p)\left(\frac{1}{n_{exp}} + \frac{1}{n_{control}}\right)} = 0 $$
$$ \hat{d} + 1.96 * \sqrt{p(1-p)\left(\frac{2}{n}\right)} = 0 $$
$$ \sqrt{p(1-p)\left(\frac{2}{n}\right)} = \frac{-\hat{d}}{1.96} $$

Squaring both sides and solving for $n$, then substituting $p = (2c+\hat{d})/2$, we get:

$$ \frac{(1.96^2)\, 2p(1-p)}{\hat{d}^2} = n $$
$$ \frac{(1.96^2)\left(2c - 2c^2 + \hat{d} - 2c\hat{d} - \frac{1}{2}\hat{d}^2\right)}{\hat{d}^2} = n $$
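The closed form for $n$ in terms of $p$ can be sketched in Python as follows. The function name `sample_size_no_power` and the example inputs are assumptions made for illustration; the result is rounded up because $n$ must be a whole number of visitors per group.

```python
import math
from statistics import NormalDist


def sample_size_no_power(c, d_hat, alpha=0.05):
    """Per-group n from setting the confidence bound on d_hat to zero.

    Power is not accounted for here; this only mirrors the
    by-hand derivation n = z^2 * 2p(1-p) / d_hat^2.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    p = (2 * c + d_hat) / 2                  # pooled rate under a 50/50 split
    n = z**2 * 2 * p * (1 - p) / d_hat**2
    return math.ceil(n)


# Hypothetical example: 10% base rate, 2 percentage point absolute difference
n_per_group = sample_size_no_power(c=0.10, d_hat=0.02)
print(n_per_group)
```

Because $\hat{d}$ only appears squared, the sign chosen when setting the interval bound to zero does not affect the result.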

At the moment, though, my calculation doesn’t incorporate power ($1-\beta$), but Evan Miller’s does.

What should I think about as next steps to incorporate power into my sample size calculation?

(Feel free to also point out other errors in my calculation or assumptions!)
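For reference, one common textbook formula for a two-sided two-proportion z-test adds a $Z_{1-\beta}$ term scaled by the standard deviation under the alternative: $n = \left(Z_{1-\alpha/2}\sqrt{2p(1-p)} + Z_{1-\beta}\sqrt{c(1-c) + (c+\hat{d})(1-c-\hat{d})}\right)^2 / \hat{d}^2$. The sketch below implements that formula under the same 50/50 split assumption. Whether it matches Evan Miller's calculator exactly is not established here, and the function name and example inputs are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_with_power(c, d_hat, alpha=0.05, beta=0.20):
    """Per-group n for a two-sided two-proportion z-test (textbook formula).

    Assumes equal group sizes; c is the control rate and d_hat the
    absolute difference to detect.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # about 1.96
    z_beta = std_normal.inv_cdf(1 - beta)        # about 0.84

    p = (2 * c + d_hat) / 2  # pooled rate, used under the null hypothesis
    sd_null = math.sqrt(2 * p * (1 - p))
    # Sum of the two groups' variances under the alternative hypothesis
    sd_alt = math.sqrt(c * (1 - c) + (c + d_hat) * (1 - c - d_hat))

    n = (z_alpha * sd_null + z_beta * sd_alt) ** 2 / d_hat**2
    return math.ceil(n)


# Hypothetical example: 10% base rate, 2 percentage point absolute difference
n_with_power = sample_size_with_power(c=0.10, d_hat=0.02)
print(n_with_power)
```

Setting `beta=0.5` makes $Z_{1-\beta} = 0$, which reduces this to the confidence-interval-only formula derived above; in other words, that derivation implicitly targets 50% power.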
