# #StackBounty: #mcmc #beta-distribution #stan #finite-mixture-model Finite Beta mixture model in stan — mixture components not identified

### Bounty: 50

I’m trying to model data $$0 < Y_i < 1$$ with a finite mixture of Beta components. To do this, I’ve adapted the code given in section 5.3 of the Stan manual. Instead of (log)normal priors, I am using $$\mathrm{Exponential}(1)$$ priors for the $$\alpha$$ and $$\beta$$ parameters. Thus, as I understand it, my model is as follows:

$$\begin{align} \alpha_k, \beta_k &\overset{iid}{\sim} \mathrm{Exponential}(1) \\ Z_i &\sim \mathrm{Categorical}(1, \ldots, K) \\ Y_i \mid \left(Z_i = k\right) &\sim \mathrm{Beta}_{\alpha_k, \beta_k} \end{align}$$

Now, for my implementation in Stan, I have the following two code chunks:

```r
# fit.R
library(rstan)

# simulate 200 observations from a two-component Beta mixture
y <- c(rbeta(100, 1, 5), rbeta(100, 2, 2))
fit <- stan(file = "mixture-beta.stan", data = list(y = y, K = 2, N = 200))
```

and

```stan
// mixture-beta.stan

data {
  int<lower=1> K;
  int<lower=1> N;
  real y[N];
}

parameters {
  simplex[K] theta;
  vector<lower=0>[K] alpha;
  vector<lower=0>[K] beta;
}

model {
  vector[K] log_theta = log(theta);

  // priors
  alpha ~ exponential(1);
  beta ~ exponential(1);

  for (n in 1:N) {
    vector[K] lps = log_theta;

    for (k in 1:K) {
      lps[k] += beta_lpdf(y[n] | alpha[k], beta[k]);
    }

    target += log_sum_exp(lps);
  }
}
```

After running the code above (with the defaults: 4 chains of 2000 iterations each, 1000 of them warmup), I find that the posterior components are essentially identical:

```
> print(fit)
Inference for Stan model: mixture-beta.
4 chains, each with iter=2000; warmup=1000; thin=1;
post-warmup draws per chain=1000, total post-warmup draws=4000.

          mean se_mean   sd  2.5%   25%   50%   75% 97.5% n_eff Rhat
theta[1]  0.50    0.01 0.13  0.26  0.42  0.50  0.58  0.75   259 1.01
theta[2]  0.50    0.01 0.13  0.25  0.42  0.50  0.58  0.74   259 1.01
alpha[1]  2.40    0.38 1.73  0.70  0.94  1.20  3.89  6.01    21 1.16
alpha[2]  2.57    0.37 1.74  0.70  0.96  2.29  4.01  6.05    22 1.16
beta[1]   3.54    0.11 1.10  1.84  2.66  3.46  4.26  5.81    93 1.04
beta[2]   3.58    0.12 1.07  1.88  2.77  3.49  4.26  5.89    82 1.05
lp__     30.80    0.05 1.74 26.47 29.92 31.21 32.08 33.02  1068 1.00

Samples were drawn using NUTS(diag_e) at Thu Sep 17 12:16:13 2020.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).
```

I read the warning about label switching, but I can’t see how to use the trick of `ordered[K] alpha`, since I also need to incorporate the constraint that $$\alpha$$ and $$\beta$$ are positive.
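For reference, here is the kind of change I have in mind, sketched with Stan’s `positive_ordered` type (which enforces $$0 < \alpha_1 < \cdots < \alpha_K$$, i.e. ordering and positivity together) — though I’m not sure whether this is the right fix:

```stan
// sketch: break the label-switching symmetry by ordering alpha
// (positive_ordered enforces 0 < alpha[1] < ... < alpha[K])
parameters {
  simplex[K] theta;
  positive_ordered[K] alpha;
  vector<lower=0>[K] beta;
}
```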

Could someone help explain what’s going on here?

