#StackBounty: #python #random-forest #random-effects-model Mixed effects random forest (MERF) in Python: random slope or just intercept?

Bounty: 100

Does the MERF implementation in python allow for random slope effects?
If so, how?

I assume yes, and that it is achieved using the Z matrix. With a column of ones for the intercept and a column for the independent variable?

However, if that is the case, as the $b_i$ is drawn from a $N(0,sigma_b)$ distribution, do these $b_i$ adjust an average slope? How is this information accessed in this implementation?


Get this bounty!!!

#StackBounty: #lme4-nlme #random-effects-model #non-independent Estimating expected values for correlated data using random effects mod…

Bounty: 100

Statement of the problem: in a study, continuous and dichotomous variables were measured for both eyes for 60 individuals. The researchers needs estimates of expected values (means and proportions) for those measurements for all 60 subjects across bot eyes. In order to do this, the 120 eyes from the 60 subjects must be used to provide a pooled estimate.

The proposed random effects models to achieve this are as follows:

$E(y_{ij})=mu+alpha_j+epsilon_{ij}$

and

$Logit(p_{ij})=gamma+alpha_j$

Where $mu$ is the overall mean for a continuous variable $y_{ij}$, $gamma$ is the overall log odds of the probability for the dichotomous variables, $alpha_j, epsilon {ij}$ are random, uncorrelated random effects with normal distributions ($alpha_j sim N(0,sigma{gamma}), ; epsilon_{ij} sim N(0,sigma_{epsilon}), Cov(alpha_j,epsilon_{ij})=0$). Index $j$ stands for subject and index $i$ stands for eye nested in the subject.

A more complex nested random effects model could be appropriate, however, for the sake of simplicity will be ignored.

I have made a github project with the data and code in R to do this (https://github.com/nmolanog/glmer_question).

Now I present the main issue of this post: for dichotomous variables I am observing huge differences in estimates ignoring correlation of eyes nested in subjects vs estimates provided by the random effect models. Those differences are so important, that researchers are questioning and mistrusting the approach and its results. For continuous variables the differences in estimates are almost none existent and (as expected) the main differences are found in confidence intervals, where random effects models provides wider CI’s (see figure).
Estimates and confidence intervals for random effects models and classical estimation procedures

See for example variables M and N, the differences between approaches are huge. In the github repo I explored a nested random effects models for variable K obtaining very similar results to those provided by the simpler random effects model.

How those differences could be explained? Is there any problem with the approach?

Update-Sample code:

###estimate proportion for variable K using glm
mk_glm<-glm(K~1,data = ldf, family = binomial(link = "logit"))
mk_glm_ci<-inv.logit(confint(mk_glm))

##arrange result from glm model
(res_df<-data.frame(method="glm",estimate=inv.logit(mk_glm$coefficients),LCI=mk_glm_ci[1],UCI=mk_glm_ci[2]))

#compare to  raw estimate:
ldf$K%>%table()%>%{.[2]/sum(.)}

###estimate proportion for variable K using glmer model 1
mk_glmer<-glmer(K~1+(1|Id),data = ldf, family = binomial(link = "logit"),control=glmerControl(optimizer = "bobyqa"),nAGQ = 20)
mk_glmer_ci<-confint(mk_glmer)
#add result to res_df
(res_df<-rbind(res_df,data.frame(method="glmer",estimate=inv.logit(fixef(mk_glmer)),LCI=inv.logit(mk_glmer_ci[2,1]),UCI=inv.logit(mk_glmer_ci[2,2]))))

###estimate proportion for variable K using glmer model 2, nested random effects
mk_glmer_2<-glmer(K~1+(1|Id/eye),data = ldf, family = binomial(link = "logit"),control=glmerControl(optimizer = "bobyqa"))
mk_glmer_2_ci<-confint(mk_glmer_2)
(res_df<-rbind(res_df,data.frame(method="glmer2",estimate=inv.logit(fixef(mk_glmer_2)),LCI=inv.logit(mk_glmer_2_ci[3,1]),UCI=inv.logit(mk_glmer_2_ci[3,2]))))

Outuput

             method  estimate       LCI       UCI
(Intercept)     glm 0.7083333 0.6231951 0.7846716
(Intercept)1  glmer 0.9230166 0.7399146 0.9990011
(Intercept)2 glmer2 0.9999539 0.9991883 0.9999995

The dataset and code can be found in https://github.com/nmolanog/glmer_question


Get this bounty!!!

#StackBounty: #regression #mixed-model #forecasting #references #random-effects-model Predictions and forecasting with mixed-effects mo…

Bounty: 50

I am not sure I fully understand how mixed-effects models (such as mixed-effects PK/PD models) can be used for forecasting.

Some notations

Let $p in mathbb{N}$ with $p geq 2$. We assume that for each individual $i in lbrace 1,ldots,p rbrace$, we have $k_i in mathbb{N}^{ast}$ scalar observations $(y_{i,j}){1 leq j leq k_i}$ obtained at times $(t{i,j}){1 leq j leq k_i}$. Therefore, for each individual, the observations are $left( y{i,j}, t_{i,j} right)_{1 leq j leq k_i}$. We also assume the following model:

$$ y_{i,j} = fleft( t_{i,j}, b_i, theta right) + varepsilon_{i,j} $$

where $theta$ is a vector of parameters which contains fixed effects and variance-covariance parameters ; $b_i$ is a vector of individual random effects ; $f$ is sometimes called the structural model ; $varepsilon_{i,j}$ is the observation noise. We assume that:

$$ b_i sim mathcal{N}left( 0, mathbf{D} right), quad text{and} quad varepsilon_i = begin{bmatrix} varepsilon_{i,1} \ vdots \ varepsilon_{i, k_i} end{bmatrix} sim mathcal{N}left( 0, mathbf{Sigma} right). $$

The individual random effects $b_i$ are assumed i.i.d. and independent from $varepsilon_i$.

The question

Given $left( y_{i,j}, t_{i,j} right)_{substack{1 leq i leq p \ 1 leq j leq k_i}}$, one can obtain an estimate $hat{theta}$ of the model parameters $theta$ (which contain the unique coefficients in $mathbf{D}$ and $mathbf{Sigma}$) by maximizing the model likelihood. This can be done, for instance, using stochastic versions of the EM algorithm (see link above).

Assume that $hat{theta}$ is available.

If we are given some observations $y_{s}^{mathrm{new}}$ for a new individual $s notin lbrace 1, ldots, p rbrace$, its individual random effects are estimated by:

$$ widehat{b_s} = mathop{mathrm{argmax}} limits_{b_s} pleft( b_s mid y_{s}^{mathrm{new}}, hat{theta} right) $$

where $pleft( cdot mid y_{s}^{mathrm{new}}, hat{theta} right)$ is the posterior distribution of the random effects given the new observations $y_{s}^{mathrm{new}}$ and the point estimate of the model parameters $hat{theta}$. Thanks to Bayes’ theorem, this is equivalent to maximizing the product "likelihood $times$ prior:

$$ widehat{b_s} = mathop{mathrm{argmax}} limits_{b_s} pleft( y_{s}^{mathrm{new}} mid b_{s}, hat{theta} right) pleft( b_{s} mid hat{theta} right). $$

Now, if $t , longmapsto , f(t, cdot, cdot)$ is a continuous function of time, we may call it a growth curve. It describes the evolution of the measurements with time. Let $i_{0} in lbrace 1, ldots, p rbrace$ and $t$ such that $t_{i_{0},1} < ldots < t_{i_{0},k_i} < t$.

How can we use this mixed-effects model to predict the most likely value $y_{i_{0}}^{ast}$ for individual $i_{0}$ at time $t$? This relates to forecasting since we want to predict the measurement value at a future time.

Naively, I would do as follows. Given $left( y_{i,j}, t_{i,j} right){substack{1 leq i leq p \ 1 leq j leq k_i}}$, I would estimate $hat{theta}$ (we estimate the model parameters using all the data including the past observations for individual $i{0}$). Then I would estimate $widehat{b_{i_{0}}}$ as described above. Eventually, I would say that:

$$ y_{i_{0}}^{ast} = fleft( t, widehat{b_{i_{0}}}, hat{theta} right). $$

If this is right, I don’t see how I would prove it mathematically. Still, I’m feeling like I’m missing something because this predicted value $y_{i_{0}}^{ast}$ does not take into account the noise distribution. Also, I do not see how I would be able to estimate CIs for $y_{i_{0}}^{ast}$ with this.

In a Bayesian setting (with a prior distribution on $theta$), would I need to use the posterior predictive distribution (see this post and these notes)? From what I understand, if $y_{i_{0}}$ denotes the vector of the past observations for individual $i_{0}$, this posterior predictive distribution is given by:

$$ pleft( y_{i_{0}}^{ast} mid y_{i_{0}} right) = int_{Theta} pleft( y_{i_{0}}^{ast} mid theta, y_{i_{0}} right) pleft( theta mid y_{i_{0}} right) , dtheta. $$

However, I’m not sure it applies here and I’m not sure where the random effects come in.

Any reference, explanation, hint,… is welcome ! 🙂


Get this bounty!!!

#StackBounty: #r #random-variable #random-effects-model #factor-analysis #dimensionality-reduction Is it possible to run an EFA in R th…

Bounty: 50

Suppose I have the following data, in which 20 participants each rated 10 items on five different dimensions. In reality I have much more of each, but I am just using this as an example.

set.seed(123)    

library(data.table)
library(psych)

ppt <- rep(1:20, each = 10)
item <- rep(1:10, times = 20)
dim_a <- rnorm(200)
dim_b <- rnorm(200)
dim_c <- rnorm(200)
dim_d <- rnorm(200)
dim_e <- rnorm(200)

d <- as.data.table(cbind(ppt, item, dim_a, dim_b, dim_c, dim_d, dim_e))

I am interested in conducting a factor analysis on the five dimensions. I could do so with something like this. (In this example it suggests no factors be extracted because of the random data, but I proceed as though it suggested two.)

parallel <- fa.parallel(d[, c(3:7)], fm = 'pa', fa = 'fa') 
efa2 <- fa(d[, c(3:7)], nfactors =2, rotate = "oblimin", fm = 'pa', max.iter = 100000) 
print(efa2$loadings,cutoff = 0.5)

However, this doesn’t account for the fact that ratings were nested in participants, and in items. It seems like what I need is to add crossed random effects for subjects and items. After a lot of searching, it seems like this isn’t possible in R. Am I right?


Get this bounty!!!

#StackBounty: #mixed-model #random-effects-model #factor-analysis #sem How to account for multiple raters in an exploratory factor anal…

Bounty: 100

I have a dataset on which I want to perform an exploratory factor analysis. There are ~150 subjects that are rated by ~25 raters on ~70 items (the items are about certain behaviors of the subjects). Different subjects can be rated by different people, and different people can rate different subjects as well (~500 observations in total).

I suspect that perhaps the raters could be accounted for by including them as random effects? Is that possible in an exploratory factor analysis context? Or would another approach be more suited?


Get this bounty!!!