# #StackBounty: #anova #mixed-model #lme4-nlme #repeated-measures #lm Repeated measures anova: lm vs lmer

### Bounty: 50

I’m trying to reproduce several interaction test between with both `lm` and `lmer` on repeated measures (2x2x2). The reason I want to compare both methods is because SPSS’s GLM for repeated measures yields the exact same results as the `lm` approach presented here, so at the end I want to compare SPSS vs R-lmer. So far, I’ve only managed to reproduce (closely) some of these interactions.

You’ll find below a script to better illustrate my point:

``````library(data.table)
library(tidyr)
library(lmerTest)
library(MASS)

set.seed(1)

N     <- 100 # number of subjects
sigma <- 1   # popuplation sd
rho   <- .6  # correlation between variables

# X1:   a  a  a  a  b  b  b  b
# X2:   a  a  b  b  a  a  b  b
# X3:   a  b  a  b  a  b  a  b
mu <- c(5, 3, 3, 5, 3, 5, 5, 3) # means

# Simulate the data
sigma.mat <- rep(sigma, length(mu))
S <- matrix(sigma.mat, ncol = length(sigma.mat), nrow = length(sigma.mat))
Sigma <- t(S) * S * rho
diag(Sigma) <- sigma**2
X <- data.table( mvrnorm(N, mu, Sigma) )
setnames(X, names(X), c("aaa", "aab", "aba", "abb", "baa", "bab", "bba", "bbb"))
X[, id := 1:.N]

# Long format
XL <- data.table( gather(X, key, Y, aaa:bbb) )
XL[, X1 := substr(key, 1, 1)]
XL[, X2 := substr(key, 2, 2)]
XL[, X3 := substr(key, 3, 3)]

# Recode long format (a = +1; b = -1)
XL[, X1c := ifelse(X1 == "a", +1, -1)]
XL[, X2c := ifelse(X2 == "a", +1, -1)]
XL[, X3c := ifelse(X3 == "a", +1, -1)]

### Composite scores to be used with lm
# X2:X3 2-way interaction (for half the data; i.e. when X1 == "a")
X[, X1a_X2.X3 := (aaa - aab) - (aba - abb)]

# X2:X3 2-way interaction (for all the data)
X[, aa := (aaa + baa) / 2]
X[, ab := (aab + bab) / 2]
X[, ba := (aba + bba) / 2]
X[, bb := (abb + bbb) / 2]
X[, X2.X3 := (aa - ab) - (ba - bb)]

# X1:X2:X3 3-way interaction (for all the data)
X[, X1.X2.X3 := ( (aaa - aab) - (aba - abb) ) - ( (baa - bab) - (bba - bbb) )]

### Fit models
# X2:X3 2-way interaction (for half the data; i.e. when X1 == "a")
summary( lm(X1a_X2.X3 ~ 1, X) ) # t = 34.13303
summary( lmer(Y ~ X2c*X3c + (X2c+X3c|id), XL[X1 == "a"]) ) # t = 34.132846  close match
summary( lmer(Y ~ X2c*X3c + (X2c+X3c||id), XL[X1 == "a"]) ) # t = 34.134624  close match

# X2:X3 2-way interaction (for all the data)
summary( lm(X2.X3 ~ 1, X) ) # t = 0.3075025
summary( lmer(Y ~ X2c*X3c + (X2c+X3c|id), XL) ) # t = 0.1641932
summary( lmer(Y ~ X2c*X3c + (X2c+X3c||id), XL) ) # t = 0.1640710
summary( lmer(Y ~ X2c*X3c + (X2c*X3c|id), XL) ) # t = 0.1641765
anova(   lmer(Y ~ X2c*X3c + (X2c*X3c|id), XL), ddf = "Kenward-Roger" ) # t = 0.1643168
summary( lmer(Y ~ X2c*X3c + (X2c*X3c|id), XL, REML = FALSE) ) # t = 0.1645303
summary( lmer(Y ~ X2c*X3c + (X2c*X3c||id), XL) ) # t = 0.1640704

# X1:X2:X3 3-way interaction (for all the data)
summary( lm(X1.X2.X3 ~ 1, X) ) # t = 46.50177
summary( lmer(Y ~ X1c*X2c*X3c + (X1c*X2c*X3c - X1c:X2c:X3c|id), XL) ) # t = 49.0317599
anova(   lmer(Y ~ X1c*X2c*X3c + (X1c*X2c*X3c - X1c:X2c:X3c|id), XL), ddf = "Kenward-Roger" ) # t = 49.03176
summary( lmer(Y ~ X1c*X2c*X3c + (X1c*X2c*X3c - X1c:X2c:X3c|id), XL, REML = FALSE) ) # t = 49.2677606
summary( lmer(Y ~ X1c*X2c*X3c + (X1c*X2c*X3c - X1c:X2c:X3c||id), XL) ) # t = 46.5193774 close match
summary( lmer(Y ~ X1c*X2c*X3c + (X1c*X2c*X3c|id), XL) ) # unidentifiable
summary( lmer(Y ~ X1c*X2c*X3c + (X1c*X2c*X3c|id), XL,
control = lmerControl(check.nobs.vs.nRE="ignore")) ) # t = 46.5148684 close match
``````

As you can see from above, none of the `lm` estimations are exactly matched by the `lmer` ones. Although some of the results are very similar and may differ only due because of numerical/computational reasons. The gap between both estimation method is specially large for the `X2:X3 2-way interaction (for all the data)`.

My question is if there’s a way to obtain the exact same results with both methods, and if there is a correct way to perform the analyses with `lmer` (although it may not match the `lm` results).

Bonus:

I’ve noticed that the `t value` associated with the 3-way interaction is affected by the way factors are coded, which seems very strange to me:

``````summary( lmer(Y ~ X1*X2*X3 + (X1*X2*X3 - X1:X2:X3||id), XL) ) # t = 48.36
summary( lmer(Y ~ X1c*X2c*X3c + (X1c*X2c*X3c - X1c:X2c:X3c||id), XL) ) # t = 56.52
``````

Get this bounty!!!

This site uses Akismet to reduce spam. Learn how your comment data is processed.