#StackBounty: #logistic #normalization #probit #logistic-distribution Comparison of logit and probit estimations and normalization

Bounty: 50

There are a lot of questions concerning the relation between logit and probit (led by 20523), but I’m still confused by a seemingly simple issue.

On the one hand, we often see that the ‘rule-of-thumb’ correction between the logit and probit $\beta$ uses the scalar $1.6$ (for example, Wooldridge, 5th ed., ch. 17, p. 586):

In the typical case that $g$ is a symmetric density about zero…

For example, in the probit case with $g(z) = \phi(z)$, $g(0) = \phi(0) = 1/\sqrt{2\pi} \approx .40$.

In the logit case, $g(z) = \exp(z)/[1 + \exp(z)]^2$, and so $g(0) = .25$.

And a bit later (Wooldridge, p. 593):

Still, sometimes one wants a quicker way to compare magnitudes of the
different estimates. As mentioned earlier, for probit $g(0) \approx .4$ and
for logit, $g(0) \approx .25$. Thus, to make the magnitudes of probit and logit
roughly comparable, we can multiply the probit coefficients by $.4/.25 = 1.6$,
or we can multiply the logit estimates by $.625$.
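
As a quick numerical check of the figures quoted from Wooldridge, here is a minimal sketch using only Python's standard library. The function names `normal_pdf` and `logistic_pdf` are my own and do not come from the textbook.

```python
import math

def normal_pdf(z):
    """Standard normal density phi(z), the probit g(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def logistic_pdf(z):
    """Standard logistic density exp(z) / [1 + exp(z)]^2, the logit g(z)."""
    ez = math.exp(z)
    return ez / (1.0 + ez) ** 2

g_probit = normal_pdf(0.0)    # roughly 0.40
g_logit = logistic_pdf(0.0)   # 0.25
ratio = g_probit / g_logit    # probit-to-logit multiplier, roughly 1.6
inverse = g_logit / g_probit  # logit-to-probit multiplier, roughly 0.625

print(g_probit, g_logit, ratio, inverse)
```

Without rounding $g(0)$ to $.4$, the ratio comes out slightly below $1.6$ and its inverse slightly above $.625$.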

On the other hand, I read in (Train, 2009, p. 24):

… the error variances in a standard logit model are traditionally
normalized to $\pi^2/6$, which is about $1.6$. In this case, the
preceding model becomes $U_{nj} = x'_{nj} (\beta/\sigma) \sqrt{1.6} + \varepsilon_{nj}$
with $Var(\varepsilon_{nj}) = 1.6$. The coefficients still reflect the variance
of the unobserved portion of utility. The only difference is that the
coefficients are larger by a factor of $\sqrt{1.6}$, the standard deviation of the extreme value distribution of the errors.

As stated earlier, the error variance is normalized to $1.6$ for logit.
Suppose the researcher normalized the probit to have error variances
of 1, which is traditional with independent probits. This difference
in normalization must be kept in mind when comparing estimates from
the two models. In particular, the coefficients in the logit model
will be $\sqrt{1.6}$ times larger than those for the probit model, simply due
to the difference in normalization.
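
To see the sizes involved in Train's normalization, the constants can be evaluated directly. This is only an arithmetic sketch; the variable names are mine, and the value $\pi^2/6$ is taken from the quoted passage.

```python
import math

# Error variance that Train quotes for the standard logit normalization.
var_ev = math.pi ** 2 / 6

# Corresponding standard deviation, and the same thing using the rounded 1.6.
sd_ev = math.sqrt(var_ev)
sd_rounded = math.sqrt(1.6)

print(round(var_ev, 4), round(sd_ev, 4), round(sd_rounded, 4))
```

So $\pi^2/6$ is closer to $1.645$ than to $1.6$, and the implied factor is about $1.28$ (or about $1.26$ when the rounded $1.6$ is used).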

The question. So we see that the logit estimate should usually be divided by approximately $1.6$ to match the probit estimate on the same data (this value is an approximation of $1/\sqrt{\pi/8}$), but Train suggests correcting by approximately $\sqrt{1.6}$, which is derived from $\sqrt{\pi^2/6}$.
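
Written out, the two candidate factors are as follows, where $g_{\text{logit}}$ is my own shorthand for the logistic density and is not notation from either book:

$$\frac{\phi(0)}{g_{\text{logit}}(0)} = \frac{1/\sqrt{2\pi}}{1/4} = \frac{4}{\sqrt{2\pi}} = \sqrt{\frac{8}{\pi}} = \frac{1}{\sqrt{\pi/8}} \approx 1.596, \qquad \sqrt{\frac{\pi^2}{6}} = \frac{\pi}{\sqrt{6}} \approx 1.283 .$$

So the density-ratio factor coincides exactly with $1/\sqrt{\pi/8}$, while the factor from Train's normalization is a different number and not merely a different rounding of the same one.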

Where does the difference come from? How do these approaches relate to each other? Is it the same correction after all?

