# #StackBounty: #mixed-model #generalized-linear-model #lme4-nlme #random-effects-model #multinomial-distribution Mixed logit as a Genera…

### Bounty: 150

I wonder if the Mixed Logit model could be understood, stated, and estimated as a Generalized Linear Mixed Model.

### Mixed Logit

Consider a standard discrete choice setting where individual $$n \in \{1,\dots,N\}$$ faces choice situations $$t \in \{1,\dots,T\}$$ and, in each one, chooses one alternative $$i \in \{1,\dots,J\}$$ from a set of mutually exclusive alternatives. Additionally, we define the binary variable $$y_{nit}$$, which takes the value $$1$$ when individual $$n$$ chooses alternative $$i$$ in choice situation $$t$$. Accordingly, the random utility maximization model is specified as

$$U_{nit} = \alpha_{i} + \boldsymbol{X}_{nit} \boldsymbol{\beta}_{n} + \varepsilon_{nit} = \alpha_{i} + \boldsymbol{X}^{F}_{nit} \boldsymbol{\beta}^{F} + \boldsymbol{X}^{R}_{nit} \boldsymbol{\beta}^{R}_{n} + \varepsilon_{nit}$$

where $$U_{nit}$$ is the random utility associated with individual $$n$$ choosing alternative $$i$$ during choice situation $$t$$, and $$\varepsilon_{nit}$$ is an iid extreme value type I preference shock. Moreover, both the alternative attributes and the preference parameters are sorted into two groups.

• On the one hand, $$\boldsymbol{\beta}^{F}$$ is a vector of fixed preference parameters, and $$\boldsymbol{X}^{F}_{nit}$$ is the attribute/covariate vector associated with these fixed parameters.
• On the other hand, $$\boldsymbol{\beta}^{R}_{n}$$ is a vector of random parameters, and $$\boldsymbol{X}^{R}_{nit}$$ is the attribute vector (or regressors) for which the researcher expects unobserved preference heterogeneity. Commonly, $$\boldsymbol{\beta}^{R}_{n}$$ is assumed to follow a parametric distribution, for instance a normal distribution, in which case $$\boldsymbol{\beta}^{R}_{n} \sim \mathcal{N}(\mu, \Omega)$$. This is why $$\boldsymbol{\beta}^{R}_{n}$$ carries the sub-index $$n$$: each individual has a different parameter vector, and these vectors are drawn from a Gaussian distribution.

Given this framework, it can be shown (see Train 2009, Section 3.10, Derivation of Logit Probabilities) that the probability that individual $$n$$ chooses alternative $$i$$ in choice situation $$t$$, conditional on knowing $$\boldsymbol{\beta}^{R}_{n}$$, is given by:

$$P_{int}(\boldsymbol{\beta}_{n}) = P(y_{nit}=1 \mid \boldsymbol{\beta}_{n}, \boldsymbol{X}_{nit}) = \dfrac{\exp(\boldsymbol{X}_{nit} \boldsymbol{\beta}_{n})}{\sum_{j=1}^{J}\exp(\boldsymbol{X}_{njt} \boldsymbol{\beta}_{n})}$$
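The conditional probability above is a softmax over the systematic utilities of the alternatives. A minimal Python sketch follows, assuming $$\boldsymbol{\beta}_{n}$$ is known; the function name `choice_probabilities` and the example numbers are hypothetical and only serve the illustration.

```python
import math

def choice_probabilities(X, beta):
    """Conditional logit probabilities for one individual in one choice situation.

    X    : list of J attribute vectors (one per alternative), each of length K
    beta : list of K preference parameters, treated as known here
    """
    # Systematic utility of each alternative: X_j . beta
    v = [sum(x_k * b_k for x_k, b_k in zip(x_j, beta)) for x_j in X]
    # Subtract the maximum before exponentiating for numerical stability;
    # the ratios (and hence the probabilities) are unchanged.
    v_max = max(v)
    exp_v = [math.exp(v_j - v_max) for v_j in v]
    denom = sum(exp_v)
    return [e / denom for e in exp_v]

# Hypothetical example: J = 3 alternatives, K = 2 attributes
X_example = [[1.0, 0.5], [0.0, 1.0], [0.5, 0.5]]
beta_example = [0.8, -0.4]
probs = choice_probabilities(X_example, beta_example)
```

The returned list has one probability per alternative and sums to one; the alternative with the highest systematic utility receives the highest probability.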

Additionally, given that the same individual chooses an alternative from the choice set in each of the $$T$$ choice situations, we define the probability of the sequence of choices made by individual $$n$$ as:

$$P_{n}(\boldsymbol{\beta}_{n}) = \prod_{t=1}^{T} \prod_{i=1}^{J} P_{int}(\boldsymbol{\beta}_{n})^{y_{nit}}$$

Finally, the log-likelihood is defined as

$$\ln L(\boldsymbol{\varphi}) = \sum_{n=1}^{N} \ln \left[ \int_{\boldsymbol{\beta}_{n}} P_{n}(\boldsymbol{\beta}_{n}) f(\boldsymbol{\beta} \mid \boldsymbol{\varphi}) \, d\boldsymbol{\beta} \right]$$

where $$f(\boldsymbol{\beta} \mid \boldsymbol{\varphi})$$ is the parametric distribution over the random parameters $$\boldsymbol{\beta}^{R}_{n}$$, which in this case, given that $$\boldsymbol{\beta}^{R}_{n} \sim \mathcal{N}(\mu, \Omega)$$, means that $$\boldsymbol{\varphi} = (\mu, \Omega, \boldsymbol{\beta}^{F})$$.

In a frequentist framework, the model can be fitted by simulated maximum likelihood, taking draws from the assumed parametric distribution of the random parameters (see Train (2009), Chapter 10).
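To make the simulation step concrete, here is a rough Python sketch of a simulated log-likelihood. It assumes a deliberately reduced setting that is not in the text above: one fixed coefficient and one scalar normal random coefficient, so $$\Omega$$ collapses to a single standard deviation `sigma`. All names (`logit_probs`, `simulated_loglik`, `data_example`) and numbers are hypothetical. The integral is replaced by an average over draws, and a fixed seed keeps the draws identical across evaluations.

```python
import math
import random

def logit_probs(X, beta):
    """Softmax over systematic utilities X_j . beta (one value per alternative)."""
    v = [sum(x * b for x, b in zip(x_j, beta)) for x_j in X]
    v_max = max(v)
    exp_v = [math.exp(v_j - v_max) for v_j in v]
    denom = sum(exp_v)
    return [e / denom for e in exp_v]

def simulated_loglik(data, beta_fixed, mu, sigma, n_draws=200, seed=0):
    """Simulated log-likelihood with one fixed and one normal random coefficient.

    data : list over individuals; each individual is a list of choice
           situations (X, chosen), where X holds J rows [x_fixed, x_random]
           and chosen is the index of the chosen alternative.
    """
    rng = random.Random(seed)  # fixed seed: same draws at every evaluation
    total = 0.0
    for person in data:
        sim_prob = 0.0
        for _ in range(n_draws):
            beta_r = rng.gauss(mu, sigma)  # one draw of the random coefficient
            seq_prob = 1.0
            for X, chosen in person:
                # Probability of the whole sequence, given this draw
                seq_prob *= logit_probs(X, [beta_fixed, beta_r])[chosen]
            sim_prob += seq_prob
        # Average over draws approximates the integral for this individual
        total += math.log(sim_prob / n_draws)
    return total

# Hypothetical data: 2 individuals, 2 choice situations, 3 alternatives
X_a = [[1.0, 0.2], [0.5, 1.0], [0.0, 0.0]]
X_b = [[0.3, 0.9], [1.0, 0.1], [0.0, 0.0]]
data_example = [[(X_a, 0), (X_b, 1)], [(X_a, 1), (X_b, 0)]]
ll = simulated_loglik(data_example, beta_fixed=0.5, mu=1.0, sigma=0.5)
```

In practice this function would be handed to a numerical optimizer over `beta_fixed`, `mu`, and `sigma`; the sketch only evaluates it at one parameter value.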

### Generalized Linear Mixed Models

In a Generalized Linear Mixed Model, we assume that the data consist of outcomes from $$m$$ clusters, with $$n_{i}$$ observations in cluster $$i$$ ($$i = 1,\dots,m$$). Outcomes within a cluster may be correlated, but conditional on the cluster-specific $$d \times 1$$ vector of random effects $$b_{i}$$, the outcomes $$y_{ij}$$ are independent and follow a generalized linear model with mean:

$$\mu_{ij} = E\left[ y_{ij} \mid \beta, b_{i} \right] = g^{-1}\left[\beta^{T}x_{ij} + b_{i}^{T}z_{ij} \right]$$

where $$x_{ij}$$ and $$z_{ij}$$ are covariate vectors for the fixed effects $$\beta$$ and the random effects $$b_{i}$$ of cluster $$i$$, and $$g(\cdot)$$ is the so-called link function. Additionally, $$b_{i}$$ can be assumed to be normally distributed, that is to say, $$b_{i} \sim \mathcal{N}(0, \Sigma)$$.
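As a small illustration of the conditional mean, the Python sketch below evaluates $$g^{-1}(\beta^{T}x_{ij} + b_{i}^{T}z_{ij})$$ for a binary outcome. The logit link is an assumption made only for this example (the definition above leaves $$g$$ general), and the function names and numbers are hypothetical.

```python
import math

def inverse_logit(eta):
    """Inverse of the logit link: maps the linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def conditional_mean(x, z, beta, b):
    """Conditional mean of y_ij given fixed effects beta and random effects b."""
    # Linear predictor: fixed part beta'x plus cluster-specific part b'z
    eta = sum(b_k * x_k for b_k, x_k in zip(beta, x))
    eta += sum(b_k * z_k for b_k, z_k in zip(b, z))
    return inverse_logit(eta)

# Hypothetical example: intercept and one covariate as fixed effects,
# plus a random intercept for the cluster (z = [1.0])
x_example = [1.0, 2.0]
beta_glmm = [-0.5, 0.25]
z_example = [1.0]
mu_cluster = conditional_mean(x_example, z_example, beta_glmm, [0.3])
mu_average = conditional_mean(x_example, z_example, beta_glmm, [0.0])
```

Here `mu_average` is the mean for a cluster whose random effect equals zero, and `mu_cluster` shows how a positive random intercept shifts that mean upward.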

### Mixed logit as a Generalized Linear Mixed Model.

The similarities between the two models are somewhat evident, since both assume a parametric distribution over a subset of the parameters. However, (1) it is not clear to me how to accommodate the estimation of the "mean" of the random parameters, as is done in the mixed logit case, where the random effects follow a Gaussian distribution with non-zero mean, and (2) I am not very familiar with the Generalized Linear Mixed Model, so is it possible to estimate a model where we have $$b_{i} \sim \mathcal{N}(\boldsymbol{\mu}, \Sigma)$$?
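For point (1), a standard property of the Gaussian distribution may be relevant: a normal vector with mean $$\mu$$ can be written as $$\mu$$ plus a zero-mean normal vector. The sketch below applies this to the mixed logit notation, where $$b_{n}$$ is a symbol introduced here, not taken from the model statement above, for the zero-mean deviation of individual $$n$$. It is only an identity on the linear index; it does not by itself establish that a given GLMM routine supports the multinomial response involved.

```latex
\boldsymbol{\beta}^{R}_{n} = \mu + b_{n}, \qquad b_{n} \sim \mathcal{N}(0, \Omega)

\boldsymbol{X}^{R}_{nit} \boldsymbol{\beta}^{R}_{n}
  = \boldsymbol{X}^{R}_{nit} \mu + \boldsymbol{X}^{R}_{nit} b_{n}
```

Read this way, $$\mu$$ would play the role of a fixed-effect coefficient on $$\boldsymbol{X}^{R}_{nit}$$, and $$b_{n}$$ the role of a zero-mean random effect with the same covariates.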

Finally, if it is possible to write a mixed logit as a Generalized Linear Mixed Model, would it be possible to fit a mixed logit using, for example, `glmer()` from the `lme4` R package instead of, say, `mlogit()` from the `mlogit` R package?