#StackBounty: #r #bayesian #convergence #jags #dirichlet-distribution Convergence issue dirichlet model JAGS, implemented in R

Bounty: 50

I have data on the relative abundances of 3 species, stored in the matrix r.spp.y. Species 1 has a negative relationship with the variable mat, and species 2 and 3 have a positive relationship with the variable mat:

#Simulate some species relative abundance data.
n <- 100
y1 <- round(rnorm(n, 100, 5)) 
mat <- rnorm(length(y1), 10, 3)
y1 <- y1 + round(mat*-3)
y2 <- round(rnorm(n, 100, 5))
y3 <- round(rnorm(n, 100, 5))
spp.y <- as.matrix((data.frame(y1,y2,y3)))
r.spp.y <- spp.y / rowSums(spp.y)

Here is a plot to show these relationships exist:

#show there are relationships with mat.
par(mfrow=c(1,3))
for(i in 1:ncol(r.spp.y)){
  plot(r.spp.y[,i] ~ mat)
  abline(lm(r.spp.y[,i] ~ mat), lwd = 2)
  rsq <- summary(lm(r.spp.y[,i] ~ mat))$r.squared
  txt <- paste0('R2 = ',round(rsq,2))
  mtext(txt, side = 3, line = -2, adj = 0.05)
}

enter image description here

I fit a multivariate model to these data using the dirlichet distribution, which is the multivariate generalization of the beta distribution using the runjags package in R. Code below.

dirlichet.model = "
model {
#setup priors for each species
for(j in 1:N.spp){
m0[j] ~ dnorm(0, 1.0E-3) #intercept prior
m1[j] ~ dnorm(0, 1.0E-3) #      mat prior
}

#implement dirlichet
for(i in 1:N){
for(j in 1:N.spp){
log(a0[i,j]) <- m0[j] + m1[j] * mat[i]
}
y[i,1:N.spp] ~ ddirch(a0[i,1:N.spp]) 
}

} #close model loop.
"

jags.data <- list(y = r.spp.y,mat = mat, N = nrow(r.spp.y), N.spp = ncol(r.spp.y))
jags.out <- run.jags(dirlichet.model,
                     data=jags.data,
                     adapt = 200,
                     burnin = 2000,
                     sample = 2000,
                     n.chains=3,
                     monitor=c('m0','m1'))
out <- summary(jags.out)

When I look at the model parameter summary I see two things: (1) none of the chains really converged, indicated by the prsf values. (2) None of the parameter 95% credible intervals for mat are different from zero. Increasing sample size or running a longer JAGS simulation does not change this outcome. Output printed here:

          Lower95        Median     Upper95          Mean         SD        Mode       MCerr MC%ofSD SSeff
m0[1]  5.35768574  5.8718514712 6.228604005  5.8278869822 0.28491858  5.99532850 0.050381909    17.7    32
m0[2]  5.34849586  5.8649746778 6.225999953  5.8255381716 0.28831543  6.01183353 0.058322339    20.2    24
m0[3]  5.30951948  5.8417698981 6.195775429  5.7972049003 0.29268382  5.96165385 0.056296339    19.2    27
m1[1] -0.07914617 -0.0363984822 0.008964854 -0.0366993309 0.02742381 -0.01897293 0.006479847    23.6    18
m1[2] -0.04133772  0.0001891246 0.045319610  0.0007892048 0.02775694  0.01978320 0.006837205    24.6    16
m1[3] -0.03889988  0.0033254536 0.049436277  0.0028891798 0.02820870  0.02239507 0.006558280    23.2    19
          AC.10     psrf
m0[1] 0.9043977 7.045678
m0[2] 0.9117633 7.159424
m0[3] 0.9177679 7.143409
m1[1] 0.9345297 5.923798
m1[2] 0.9440124 5.907549
m1[3] 0.9479331 5.925029

HOWEVER: Looking at the predicted vs. observed plots, the model parameters do a reasonable job fitting the data, despite the lack of convergence! So, it seems there are multiple parameter combinations that can generate this outcome. Whats the best way to handle this problem? Its clear that the mat predictor is important for modeling these relative abundances, but I cannot conclude this from these parameter credible intervals. Model fits visualized here:

#Get predicted values for each species.
pred.list <- list()
data <- as.matrix(data.frame(rep(1,nrow(r.spp.y)),mat,map))
for(i in 1:ncol(r.spp.y)){
  a <- c('m0','m1','m2')
  to.grep <- paste0('[',i,']')
  to.grep <- paste0(a,to.grep)
  preds <- out[rownames(out) %in% to.grep,]
  pred.list[[i]] <- exp(data %*% preds[,4])
}
pred.list <- (as.matrix(do.call('cbind', pred.list)))
pred.list <- pred.list / rowSums(pred.list)

#Plot predicted vs. observed and the 1:1 line.
par(mfrow = c(1,3))
for(i in 1:ncol(r.spp.y)){
  plot(r.spp.y[,i] ~ pred.list[,i])
  r.sq <- summary(lm(r.spp.y[,i] ~ pred.list[,i]))$r.squared
  abline(0,1, lwd = 2)
  txt <- paste0('R2 = ',round(r.sq,2))
  mtext(txt, side = 3, line = -2, adj = 0.05)
}

enter image description here


Get this bounty!!!

Leave a Reply