# #StackBounty: #r #regression #feature-selection How to code coverage probability for SIS screening in R

### Bounty: 50

I have a high-dimensional multivariate regression model $$(p \gg n: p=10{,}000,\ n=200)$$

$$Y = X^T\beta + \epsilon$$

which may be sparse, so I have tried some screening methods (SIS, ISIS, LARS…) and I want to compute some of the statistics below, which are defined in a paper I read here. The “coverage probability” is the proportion of non-zero parameters of T that are also in S and is defined by:

CP = $$\frac{1}{n}\sum_{k} 1_{\{T \subset \hat{S}_{(k)}\}}$$

Say $$T = (1,2,3,\dots,q)$$ is the true model, of size $$q=8$$ for example. Say also that $$\hat{S}_{(k)} = \{ j : |\hat{\beta}_j| > 0 \}$$ is the simulated model. Any advice on how I can code this?

My code so far is:

``````
#Model T which is of length 8 (i.e. only 8 values of beta are non-zero)
Y<-X%*%beta + rnorm(n)

#SIS model of length 38
library(SIS)
sismodel=SIS(X, Y, family='gaussian')
#Coefficients
beta_hat<-sismodel$coef.est
#Index of coef
path<-sismodel$ix
``````

So I believe I need to find the index values (path) of beta which are the same as those of $$\hat{\beta}$$… is this right? I’m thinking along the lines of using the code below somehow, but I have come a bit unstuck. Any advice on how to proceed would be helpful!

``````
beta_hat = rep(0,d)
beta_hat[setdiff(path,beta)] =1
``````
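One possible way to code the CP formula, as a rough sketch rather than a definitive answer: read the indicator as "1 if every index in T appears in the selected set of replicate k, else 0" and average it over the replicates. Everything named below is an assumption for illustration: `coverage_probability()` and `screen_marginal()` are hypothetical helpers, the simulation settings (`p = 1000`, 20 replicates, coefficients of 3) are made up so the example runs quickly, and the screener is a plain marginal-correlation ranking in base R standing in for the SIS package, not its actual implementation.

```r
# Share of replicates whose selected index set contains ALL of the true set T.
coverage_probability <- function(true_set, selected_sets) {
  covered <- vapply(selected_sets,
                    function(s) all(true_set %in% s),
                    logical(1))
  mean(covered)
}

# Stand-in screener (assumption, NOT the SIS package): keep the d predictors
# with the largest absolute marginal correlation with y.
screen_marginal <- function(X, y, d) {
  score <- abs(cor(X, y))[, 1]
  order(score, decreasing = TRUE)[seq_len(d)]
}

set.seed(1)
n <- 200; p <- 1000; q <- 8; n_rep <- 20   # illustrative sizes only
true_set <- seq_len(q)                     # T = (1, ..., q)
beta <- c(rep(3, q), rep(0, p - q))        # only the first q coefficients are non-zero

selected_sets <- lapply(seq_len(n_rep), function(k) {
  X <- matrix(rnorm(n * p), n, p)
  y <- drop(X %*% beta + rnorm(n))
  # With the SIS package, the selected indices would presumably be taken
  # from SIS(X, y, family = 'gaussian')$ix instead of this stand-in.
  screen_marginal(X, y, d = floor(n / log(n)))
})

cp <- coverage_probability(true_set, selected_sets)
cp
```

If what is wanted is instead the per-replicate share of true indices recovered (closer to the verbal description in the question), `mean(true_set %in% s)` for a selected set `s` would give that. Note that `setdiff(path, beta)` compares index values with coefficient values, so comparing `path` against `which(beta != 0)` is probably what was intended.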

