
statsandstuff a blog on statistics and machine learning

Distinguishing proportions: the risk ratio

Suppose I conduct an experiment to determine whether to use font A or font B in an online ad.  After running the experiment, I find that there is a 1% chance that a user clicks on the font A ad, and a 0.8% chance that the user clicks on the font B ad.

We can compare these click rates on an absolute scale (font A increases the click rate by 0.2 percentage points) or on a relative scale (the click rate for font A is 1.25 times that for font B).  The relative comparison (the probability of clicking on a font A ad divided by the probability of clicking on a font B ad) is called the risk ratio in some fields.  To determine statistical significance, we can check whether the difference is significantly different from 0 or whether the ratio is significantly different from 1.  I’d highly recommend comparing the difference to 0 (in fact, we’ll handle the ratio by taking the log and turning it into a difference), but the ratio is nice for reporting purposes, so I’ll discuss computing a confidence interval for it.
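As a quick sanity check on the arithmetic, a few lines of Python compute both comparisons using the rates from the hypothetical experiment above:

```python
# Click rates from the hypothetical font experiment above.
p_a, p_b = 0.010, 0.008

abs_diff = p_a - p_b    # absolute comparison: +0.2 percentage points
risk_ratio = p_a / p_b  # relative comparison (risk ratio): 1.25

print(f"absolute difference: {abs_diff:.3%}")
print(f"risk ratio: {risk_ratio:.2f}")
```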

Suppose I show $n$ impressions of the ad with font A, and let $X_1, \ldots, X_n$ be the outcomes ($X_1 = 1$ means the user clicked on the ad and $X_1 = 0$ means the user did not click on the ad).  Similarly, let $Y_1, \ldots, Y_m$ be the outcomes of the $m$ impressions of the ad with font B I show.

The estimated probability of clicking on the font A ad is $\hat{p}_A = \sum_i X_i / n$.  By the central limit theorem, $\hat{p}_A$ is approximately distributed $N(p_A,\ p_A(1-p_A)/n)$, where $p_A$ is the true probability.  Similarly, $\hat{p}_B$ is approximately distributed $N(p_B,\ p_B(1-p_B)/m)$.

Thus the difference is approximately distributed

$$\hat{p}_A - \hat{p}_B \ \dot\sim\ N\big(p_A - p_B,\ p_A(1-p_A)/n + p_B(1-p_B)/m\big).$$

Under the null hypothesis, we assume $p_A = p_B$.  If we let $p$ denote this common value, we have

$$\frac{\hat{p}_A - \hat{p}_B}{\sqrt{p(1-p)\left(\frac{1}{n} + \frac{1}{m}\right)}} \ \dot\sim\ N(0, 1).$$

Slutsky’s theorem lets us replace $p$ with the pooled estimate $\hat{p} = \left(\sum_i X_i + \sum_i Y_i\right)/(n+m)$.  In summary, to test whether the difference $\hat{p}_A - \hat{p}_B$ is significantly different from 0, we compute the Z-score

$$\frac{\hat{p}_A - \hat{p}_B}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n} + \frac{1}{m}\right)}}$$

and see how extreme it is as a draw from a standard normal.
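Putting the pooled test together, here is a short Python sketch.  The click counts are hypothetical, chosen to match the 1% and 0.8% rates in the intro; the sample sizes are made up:

```python
import math
from statistics import NormalDist

def pooled_z(clicks_a, n, clicks_b, m):
    """Z-score for H0: p_A == p_B, using the pooled estimate of p."""
    p_a_hat, p_b_hat = clicks_a / n, clicks_b / m
    p_pool = (clicks_a + clicks_b) / (n + m)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n + 1 / m))
    return (p_a_hat - p_b_hat) / se

# Hypothetical counts matching 1% and 0.8% click rates.
z = pooled_z(clicks_a=1000, n=100_000, clicks_b=800, m=100_000)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided p-value
```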

As an aside, an alternative statistic to use to test significance is

$$Z = \frac{\hat{p}_A - \hat{p}_B}{\sqrt{\hat{p}_A(1-\hat{p}_A)/n + \hat{p}_B(1-\hat{p}_B)/m}}.$$

One nice feature of this statistic is that the level-$\alpha$ test based on it rejects exactly when the $(1-\alpha)$ confidence interval for the difference $\hat{p}_A - \hat{p}_B$ excludes 0.
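A minimal Python sketch of this equivalence, using the same hypothetical counts as before (the agreement between the test and the interval holds by construction):

```python
import math
from statistics import NormalDist

def unpooled_z_and_ci(clicks_a, n, clicks_b, m, alpha=0.05):
    """Z-statistic with unpooled variances, plus the matching
    (1 - alpha) confidence interval for the difference."""
    pa, pb = clicks_a / n, clicks_b / m
    se = math.sqrt(pa * (1 - pa) / n + pb * (1 - pb) / m)
    z = (pa - pb) / se
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, (pa - pb - z_crit * se, pa - pb + z_crit * se)

z, (lo, hi) = unpooled_z_and_ci(1000, 100_000, 800, 100_000)
significant = abs(z) > NormalDist().inv_cdf(0.975)
excludes_zero = not (lo <= 0 <= hi)  # always agrees with `significant`
```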

How can we find a confidence interval for the risk ratio?  The distribution of the ratio of two independent normals is complicated (unless both normals have zero mean, in which case the ratio is distributed Cauchy).  The trick is to turn the ratio into a difference by taking a log, use propagation of error, and then transform back.

Propagation of error approximately describes how the mean and variance of a random variable change under a transformation.  More precisely, $E(f(X)) \approx f(E(X))$ and $\mathrm{Var}(f(X)) \approx f'(E(X))^2\,\mathrm{Var}(X)$.  Propagation of error is often used in conjunction with the central limit theorem: if $\sqrt{n}(X_n - \mu)$ converges in distribution to $N(0, \sigma^2)$, then $\sqrt{n}(f(X_n) - f(\mu))$ converges to $N(0, f'(\mu)^2 \sigma^2)$.
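A quick simulation can sanity-check the delta-method variance for $f = \log$; the sample size and click probability here are arbitrary:

```python
import math
import random

random.seed(0)
n, p, reps = 2_000, 0.02, 1_000  # arbitrary simulation parameters

# Simulate many estimates of log(p_hat) from Bernoulli(p) samples.
log_p_hats = []
for _ in range(reps):
    clicks = sum(random.random() < p for _ in range(n))
    log_p_hats.append(math.log(clicks / n))

mean = sum(log_p_hats) / reps
empirical_var = sum((x - mean) ** 2 for x in log_p_hats) / (reps - 1)
predicted_var = (1 - p) / (n * p)  # delta-method variance of log(p_hat)
```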

By propagation of error, $\log \hat{p}_A$ is approximately distributed $N\!\left(\log p_A,\ \frac{1-p_A}{n p_A}\right)$.  The standard error for the difference of the estimated log probabilities is thus

$$SE = \sqrt{\frac{1-\hat{p}_A}{n\hat{p}_A} + \frac{1-\hat{p}_B}{m\hat{p}_B}},$$

and the 95% confidence interval for the difference is $\log(\hat{p}_A/\hat{p}_B) \pm 1.96\, SE$.  We exponentiate to get the confidence interval for the risk ratio:

$$\frac{\hat{p}_A}{\hat{p}_B}\, e^{\pm 1.96\, SE}.$$

Note the confidence interval is asymmetric!
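A small Python helper, again with hypothetical counts, makes the asymmetry concrete; note that $\frac{1-\hat{p}_A}{n\hat{p}_A}$ simplifies to (1 - pa) / clicks_a:

```python
import math

def risk_ratio_ci(clicks_a, n, clicks_b, m, z=1.96):
    """Approximate 95% CI for the risk ratio p_A / p_B via the log
    transform; (1 - pa) / (n * pa) simplifies to (1 - pa) / clicks_a."""
    pa, pb = clicks_a / n, clicks_b / m
    se = math.sqrt((1 - pa) / clicks_a + (1 - pb) / clicks_b)
    rr = pa / pb
    return rr * math.exp(-z * se), rr * math.exp(z * se)

# Hypothetical counts matching 1% and 0.8% click rates.
lo, hi = risk_ratio_ci(1000, 100_000, 800, 100_000)
# The interval is asymmetric about the point estimate 1.25.
```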