#StackBounty: #r #anova #lme4-nlme #ancova #lsmeans Different ways to include pre-test performance as a covariate in a linear mixed-eff…

Bounty: 50

I have a pre-post experimental design, where I have measured participants’ performance in three courses (tasks) at both pre and post-test. The participants were quasi-randomly assigned to two groups. The data structure looks like this:

subjectID  Course  Group  pre-test.score  post-test.score
1          A       ii     ###             #
1          B       ii     ###             #
1          C       ii     ###             #
2          A       b      ###             #
2          B       b      ###             #
2          C       b      ###             #

I have analysed these data with a linear mixed-effects regression model that predicts post-test performance while controlling for pre-test performance, which enters the model as an additive covariate (with the + sign):

# I fit these models with lmer (lme4 package) in R
library(lme4)

CI_post <- lmer(
  post.diff ~ pre.diff + group * course + (1 | subjectID),
  data = dat,
  REML = FALSE
)

Using the emmeans package with Satterthwaite degrees of freedom, I get:

CI_post_interaction_coursegroup <- emmeans(CI_post, specs = c("course", "group"), lmer.df = "satterthwaite")

 course group       emmean    SE   df lower.CL upper.CL
 A      blocked      0.311 0.191 6.65  -0.1452    0.768
 B      blocked      0.649 0.180 5.38   0.1954    1.102
 C      blocked      1.141 0.195 7.28   0.6847    1.598
 A      interleaved  0.189 0.194 7.15  -0.2666    0.645
 B      interleaved  0.497 0.179 5.31   0.0451    0.949
 C      interleaved  1.046 0.191 6.72   0.5907    1.502

But I could perhaps also fit the same model with pre-test performance entered as an interaction term, so that the fixed-effects part of the model becomes pre.diff * group * course:

CI_post <- lmer(
  post.diff ~ pre.diff * group * course + (1 | subjectID),
  data = dat,
  REML = FALSE
)

This gives very different estimates:

 course group        emmean    SE    df lower.CL upper.CL
 A      blocked     -0.0669 0.188 11.10   -0.481    0.347
 B      blocked      0.6466 0.161  6.09    0.255    1.038
 C      blocked      1.1980 0.194 12.65    0.778    1.618
 A      interleaved -0.1520 0.211 16.76   -0.597    0.293
 B      interleaved  0.4872 0.160  6.12    0.098    0.876
 C      interleaved  1.0593 0.181  9.82    0.654    1.464
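
One possible reason the two sets of estimates differ is where the covariate is held fixed: by default, emmeans evaluates a numeric covariate at its overall mean. With a single common slope, every course/group cell is adjusted to that mean with the same slope. With the three-way interaction, each cell is adjusted with its own slope. The following is a simplified sketch in base R that mimics this with lm and predict on simulated data. It omits the random intercept, and every name and number in it (sim, pre_mean, pre_slope, the sample size) is a hypothetical choice for illustration, not taken from the real data.

```r
set.seed(42)

# Hypothetical simulated data: one row per subject x course (not the real data)
sim <- expand.grid(subjectID = 1:60, course = c("A", "B", "C"))
sim$group <- ifelse(sim$subjectID %% 2 == 0, "blocked", "interleaved")
crs <- as.character(sim$course)

# Assumed course-specific pre-test means and pre-test slopes
pre_mean  <- c(A = -1.0, B = 0.0, C = 1.0)
pre_slope <- c(A =  0.9, B = 0.1, C = 0.5)

sim$pre.diff  <- rnorm(nrow(sim), mean = pre_mean[crs])
sim$post.diff <- unname(0.5 + pre_slope[crs] * sim$pre.diff +
                          rnorm(nrow(sim), sd = 0.3))

# Additive covariate (one common slope) vs. covariate interacting with the factors
fit_add <- lm(post.diff ~ pre.diff + group * course, data = sim)
fit_int <- lm(post.diff ~ pre.diff * group * course, data = sim)

# Predict every course x group cell at the overall mean of pre.diff,
# which is where emmeans holds a covariate by default
grid <- expand.grid(course   = c("A", "B", "C"),
                    group    = c("blocked", "interleaved"),
                    pre.diff = mean(sim$pre.diff))
grid$adj_additive    <- predict(fit_add, newdata = grid)
grid$adj_interaction <- predict(fit_int, newdata = grid)
print(grid)
```

In this toy setup, the course whose pre-test mean lies far from the overall mean and whose slope differs from the pooled slope is the one whose adjusted mean moves most between the two specifications. Whether the same mechanism explains the change for course A above would need to be checked on the real data.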

I am trying to understand the exact differences between these two models, and which of them is "correct". Long (see comments below) gave a helpful comment that "if you want the group:course interaction to vary depending on the value of pre.diff then you fit the 2nd model with the 3-way interaction". But my pre-test score is a continuous variable that measures each participant's performance separately on course A, course B and course C. Will the model use this data structure when I add an interaction term between the covariate and the factors?
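
A way to inspect what the interaction model does with the covariate is to look at the pre.diff slope it estimates for each course/group cell, and at the estimated means for several pre.diff values instead of only the overall mean. The sketch below uses emtrends and the at argument of emmeans on simulated data in the same long layout. The data, the object names (sim, m_int) and the chosen pre.diff values are assumptions for illustration. It uses lmer.df = "asymptotic" only to avoid extra package dependencies, whereas the analysis above uses "satterthwaite".

```r
library(lme4)
library(emmeans)

set.seed(7)

# Hypothetical simulated data: 40 subjects x 3 courses (not the real data)
sim <- expand.grid(subjectID = factor(1:40), course = c("A", "B", "C"))
sim$group <- factor(ifelse(as.integer(sim$subjectID) %% 2 == 0,
                           "blocked", "interleaved"))
crs  <- as.character(sim$course)
subj <- rnorm(40, sd = 0.3)[as.integer(sim$subjectID)]  # subject intercepts

# Assumed course-specific pre-test means and slopes
sim$pre.diff  <- rnorm(nrow(sim), mean = c(A = -1, B = 0, C = 1)[crs])
sim$post.diff <- unname(0.5 + c(A = 0.9, B = 0.1, C = 0.5)[crs] * sim$pre.diff +
                          subj + rnorm(nrow(sim), sd = 0.3))

m_int <- lmer(post.diff ~ pre.diff * group * course + (1 | subjectID),
              data = sim, REML = FALSE)

# One estimated pre.diff slope per course x group cell
slopes <- emtrends(m_int, ~ course * group, var = "pre.diff",
                   lmer.df = "asymptotic")
print(slopes)

# Estimated means at several pre.diff values, not only the overall mean
means_at <- emmeans(m_int, ~ course * group | pre.diff,
                    at = list(pre.diff = c(-1, 0, 1)),
                    lmer.df = "asymptotic")
print(means_at)
```

Because each row carries its own pre.diff value for that subject and course, the pre.diff:course terms let the slope differ by course. If the slopes from emtrends are similar across cells, the additive model and the interaction model should give similar estimated means.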

