#StackBounty: #r #mixed-model #lme4-nlme #categorical-encoding What is the appropriate zero-correlation parameter model for factors in …

Bounty: 50

To specify an lmer model that includes variance components but no correlation parameters (in contrast to m1, which estimates the full covariance matrix), categorical predictors (factors) have to be converted to numeric covariates or wrapped in lme4::dummy(), because the double-bar syntax does not split a factor into separate uncorrelated terms.
Until now I thought that m2a (or, equivalently, m2b with the double-bar syntax) was the correct way to specify such a zero-correlation-parameter model.

But Rune Haubo Bojesen Christensen pointed out that this model does not make sense to him. Instead he suggests m3a, which is the same as m3b and m3c, as an appropriate model.

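library("lme4")
data("Machines", package = "MEMSS")
d <- Machines

# m1: full model, all variances and correlations of the Machine effects estimated
m1 <- lmer(score ~ Machine + (Machine | Worker), d)

# m2: numeric contrast columns taken from the fixed-effects design matrix
# (under R's default treatment contrasts: c1 flags Machine B, c2 flags Machine C)
mm1 <- model.matrix(~ Machine, d)
c1 <- mm1[, 2]
c2 <- mm1[, 3]
m2a <- lmer(score ~ Machine + (1 | Worker) + (0 + c1 | Worker) + (0 + c2 | Worker), d)
m2b <- lmer(score ~ Machine + (c1 + c2 || Worker), d)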
VarCorr(m2a)
 Groups   Name        Std.Dev.
 Worker   (Intercept) 4.07425 
 Worker.1 c1          5.88935 
 Worker.2 c2          3.64708 
 Residual             0.96228 
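
To make the structure implied by m2a concrete, here is a minimal base-R sketch that turns the standard deviations printed above into the covariance matrix of one worker's effects on the three machines. It assumes R's default treatment contrasts with Machine A as the reference level, so that c1 and c2 are the indicators for Machines B and C. The names sd_m2a, Z_m2a and V_m2a are introduced only for this illustration.

```r
# Standard deviations copied from the VarCorr(m2a) output above
sd_m2a <- c(intercept = 4.07425, c1 = 5.88935, c2 = 3.64708)

# Random-effect design for a single worker (assumes treatment coding):
# rows are machines, columns are (Intercept), c1 (Machine B), c2 (Machine C)
Z_m2a <- rbind(A = c(1, 0, 0),
               B = c(1, 1, 0),
               C = c(1, 0, 1))

# Zero correlation parameters mean a diagonal random-effect covariance matrix
V_m2a <- Z_m2a %*% diag(sd_m2a^2) %*% t(Z_m2a)

round(V_m2a, 2)           # implied covariance of the worker effects by machine
round(cov2cor(V_m2a), 2)  # implied correlations
```

Under this coding the worker variance for Machine A consists of the intercept variance alone, so it cannot exceed the variance for B or C, and the implied structure would change if a different reference level were chosen.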

# m3: one indicator column per Machine level (no reference level)
mm0 <- model.matrix(~ 0 + Machine, d)
A <- mm0[, 1]
B <- mm0[, 2]
C <- mm0[, 3]
m3a <- lmer(score ~ Machine + (1 | Worker) + (0 + dummy(Machine, "A") | Worker) + 
                                             (0 + dummy(Machine, "B") | Worker) +
                                             (0 + dummy(Machine, "C") | Worker), d)
m3b <- lmer(score ~ Machine + (1 | Worker) + (0 + A | Worker) + (0 + B | Worker) + (0 + C | Worker), d)
m3c <- lmer(score ~ Machine + (1 | Worker) + (0 + A + B + C || Worker), d)
VarCorr(m3a)
 Groups   Name                Std.Dev.
 Worker   (Intercept)         3.78595 
 Worker.1 dummy(Machine, "A") 1.94032 
 Worker.2 dummy(Machine, "B") 5.87402 
 Worker.3 dummy(Machine, "C") 2.84547 
 Residual                     0.96158
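
The same kind of calculation for m3a, again as a hedged base-R sketch rather than anything produced by lme4 itself: each machine gets its own indicator column, so no level plays the role of a reference. The names sd_m3, Z_m3 and V_m3 are hypothetical helpers for this illustration.

```r
# Standard deviations copied from the VarCorr(m3a) output above
sd_m3 <- c(intercept = 3.78595, A = 1.94032, B = 5.87402, C = 2.84547)

# Random-effect design for a single worker:
# a shared intercept column plus one indicator column per machine
Z_m3 <- cbind(1, diag(3))
dimnames(Z_m3) <- list(c("A", "B", "C"), c("(Intercept)", "A", "B", "C"))

# Zero correlation parameters mean a diagonal random-effect covariance matrix
V_m3 <- Z_m3 %*% diag(sd_m3^2) %*% t(Z_m3)

round(V_m3, 2)           # implied covariance of the worker effects by machine
round(cov2cor(V_m3), 2)  # implied correlations
```

Here every machine has its own variance on top of a common covariance equal to the intercept variance, and the structure does not depend on which level is listed first.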


m4 <- lmer(score ~ Machine + (1 | Worker) + (1 | Worker:Machine), d)
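
For the comparison with m4 it may help to see which covariance structure m4 implies. The sketch below uses hypothetical standard deviations (no m4 estimates are reported here) and checks, in base R only, that the m4 structure coincides with an m3-type structure in which all three indicator variances are forced to be equal. All object names are assumptions made for this illustration.

```r
# Hypothetical standard deviations, chosen only for illustration
sd_worker <- 4          # (1 | Worker)
sd_worker_machine <- 3  # (1 | Worker:Machine)
machines <- c("A", "B", "C")

# m4: a shared worker effect plus independent worker-by-machine effects
V_m4 <- sd_worker^2 * matrix(1, 3, 3) + sd_worker_machine^2 * diag(3)
dimnames(V_m4) <- list(machines, machines)

# m3-type structure: intercept plus one indicator per machine,
# with the three indicator standard deviations set to the same value
Z <- cbind(1, diag(3))
V_m3_equal <- Z %*% diag(c(sd_worker, rep(sd_worker_machine, 3))^2) %*% t(Z)
dimnames(V_m3_equal) <- list(machines, machines)

all.equal(V_m4, V_m3_equal)
```

Read this way, m4 looks like the equal-variance special case of the m3 models, whereas the treatment-coded m2a fixes the variance for the reference machine at the intercept variance and so does not contain m4 in the same way.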

How should one specify an LMM without correlation parameters for factors, and what are the differences between m2a and the m3 models?
Is there a preferred model for comparison with m4 (which is discussed here)?


Update: there is an ongoing discussion regarding this and related questions on the R-sig-mixed-models mailing list.

