#StackBounty: #r #regression #logistic #statistical-significance #instrumental-variables Logistic regression including instrumental var…

Bounty: 50

I am trying to run a logistic regression with instrumental variables using the “ivprobit” function from the R package of the same name.

Without the instrumental variables, the “glm” function with “family=binomial” works for my purpose. It gives the following logistic regression result, with many significant coefficients, including those of my five main variables x1, x2, x3, x4 and x5, which MUST have significant coefficients in all cases.

    Call:               
    glm(formula = y ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 * x10 + x8 + x9 + x11, family = binomial(link = "logit"), data = DATA)               

    Deviance Residuals:                 
        Min       1Q   Median       3Q      Max                 
    -1.8332  -0.7871  -0.5911   0.8679   2.5837                 

    Coefficients:               
                       Estimate   Std. Error z value Pr(>|z|)                   
    (Intercept)        4.442e+00  2.461e+00   1.805  0.07110 .                  
    x1                -7.339e-02  6.159e-03 -11.916  < 2e-16 ***                
    x2                 1.649e+00  2.123e-01   7.767 8.03e-15 ***                
    x3                -1.083e-01  3.966e-02  -2.730  0.00632 **                 
    x4                -2.072e-04  5.082e-05  -4.078 4.55e-05 ***                
    x5                 9.218e-02  6.320e-03  14.587  < 2e-16 ***                
    x6                 2.444e-02  9.314e-03   2.624  0.00869 **                 
    x7                -4.513e-01  2.950e-01  -1.530  0.12605                    
    x10               -6.248e+00  2.743e+00  -2.278  0.02275 *                  
    x8                 2.228e-02  2.114e-02   1.054  0.29173                    
    x9                 4.433e-01  1.358e-01   3.263  0.00110 **                 
    x11                8.334e-03  1.802e-01   0.046  0.96311                    
    x7:x10             7.130e-01  3.331e-01   2.140  0.03233 *                  
    ---             
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1              

    (Dispersion parameter for binomial family taken to be 1)                

        Null deviance: 3846.4  on 3291  degrees of freedom              
    Residual deviance: 3405.7  on 3279  degrees of freedom              
      (4236 observations deleted due to missingness)                
    AIC: 3431.7             

    Number of Fisher Scoring iterations: 5  

I then ran the regression with the instrumental variables z1, z2 and z3 using the “ivprobit” function and got the results below.

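    library(ivprobit)
    # Formula layout: outcome ~ exogenous regressors | endogenous regressor (x4) | exogenous regressors + instruments
    logit <- ivprobit(y ~ x1 + x2 + x3 + x5 + x6 + x7 * x10 + x8 + x9 + x11 | x4 | x1 + x2 + x3 + x5 + x6 + x7 * x10 + x8 + x9 + x11 + z1 + z2 + z3, data = DATA)
    summary(logit)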

                       Coef        S.E.        t-stat   p-val    
    Intercep           2.60317729  2.40349393  1.0831  0.278852    
    x1                -0.04107591  0.01011537 -4.0607 5.006e-05 ***
    x2                 0.92400412  0.28985010  3.1879  0.001447 ** 
    x3                -0.07930707  0.15858798 -0.5001  0.617051    
    x4                -0.00008158  0.00070110 -0.1164  0.907373    
    x5                 0.05451171  0.00370911 14.6967 < 2.2e-16 ***
    x6                 0.01453500  0.00520783  2.7910  0.005285 ** 
    x7                -0.27094763  0.22488458 -1.2048  0.228356    
    x8                 0.01302787  0.01546667  0.8423  0.399671    
    x9                 0.26098242  0.20932203  1.2468  0.212560    
    x10               -3.69160559  3.25415202 -1.1344  0.256697    
    x11                0.00096123  0.26835992  0.0036  0.997142    
    x7:x10             0.41670471  0.38242267  1.0896  0.275950    
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

However, unlike the “glm” result, which excludes the instrumental variables, the “ivprobit” result, which includes them, does not show all of my main variables x1, x2, x3, x4 and x5 as significant: x3 and x4 no longer are.
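
One caveat when reading the two tables side by side: as its name suggests, “ivprobit” fits a probit model, whereas the “glm” call above uses a logit link, so the two sets of coefficients are on different scales. Logit coefficients are typically around 1.6 to 1.8 times their probit counterparts. The sketch below illustrates this on simulated data; the variables `x` and `y` and the true coefficients are made up for illustration and have nothing to do with `DATA`.

```r
# Illustrative only: simulated data, not the poster's DATA.
set.seed(1)
n <- 5000
x <- rnorm(n)
y <- rbinom(n, 1, plogis(0.5 + 1.0 * x))   # true model is a logit

# Same data, same formula, two different link functions
fit_logit  <- glm(y ~ x, family = binomial(link = "logit"))
fit_probit <- glm(y ~ x, family = binomial(link = "probit"))

# Ratio of the slope estimates (logit scale / probit scale)
scale_ratio <- unname(coef(fit_logit)["x"] / coef(fit_probit)["x"])

print(round(cbind(logit = coef(fit_logit), probit = coef(fit_probit)), 3))
print(scale_ratio)
```

In the tables above, x1 moves from -0.0734 to -0.0411 and x5 from 0.0922 to 0.0545, ratios of roughly 1.8 and 1.7, so much of the change in magnitude may simply reflect the link function. The test statistics and standard errors are the more meaningful quantities to compare.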

Unfortunately, “ivprobit” seems to be one of the few functions in R that can run a logistic regression with instrumental variables.

Therefore, I would appreciate answers to the following three questions.

  1. Is there a way to get significant coefficients in the “ivprobit” result (the logit regression with instrumental variables), as in the “glm” result (the logit regression without instrumental variables)?

  2. Otherwise, is there a recommended R package that I can use to run a logistic regression with instrumental variables?

  3. Finally, is there a way to derive goodness-of-fit measures, such as chi-square, R-squared or log-likelihood, from the “ivprobit” output?
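
For context on questions 2 and 3, one approach that needs only base R is two-stage residual inclusion (2SRI, a control-function method). This is a different technique from what “ivprobit” implements, and the sketch below is only an illustration on simulated data: the data frame `sim`, the residual column `v_hat` and all coefficients are hypothetical. The first stage also yields an F statistic for the instruments, and because the second stage is an ordinary “glm” fit, the usual log-likelihood, likelihood-ratio chi-square and McFadden pseudo R-squared can be read off it.

```r
# Illustrative only: simulated data with one endogenous regressor (x4)
# and three instruments (z1, z2, z3). Names mirror the question but the
# data are made up.
set.seed(42)
n  <- 5000
z1 <- rnorm(n); z2 <- rnorm(n); z3 <- rnorm(n)
x1 <- rnorm(n)
u  <- rnorm(n)                                   # unobserved confounder
x4 <- 0.8 * z1 + 0.5 * z2 + 0.5 * z3 + 0.5 * x1 + u + rnorm(n)
y  <- rbinom(n, 1, plogis(-0.5 + 0.7 * x1 + 0.5 * x4 - 1.0 * u))
sim <- data.frame(y, x1, x4, z1, z2, z3)

# First stage: regress the endogenous variable on exogenous regressors
# and instruments, then test the instruments jointly (weak-instrument check).
fs_full <- lm(x4 ~ x1 + z1 + z2 + z3, data = sim)
fs_rest <- lm(x4 ~ x1, data = sim)
first_stage_F <- anova(fs_rest, fs_full)$F[2]

# Second stage: logit including the first-stage residuals as a regressor.
sim$v_hat <- residuals(fs_full)
fit_2sri <- glm(y ~ x1 + x4 + v_hat, family = binomial(link = "logit"), data = sim)
fit_null <- glm(y ~ 1, family = binomial(link = "logit"), data = sim)

# Goodness-of-fit measures from the second-stage glm object.
ll_full     <- as.numeric(logLik(fit_2sri))
ll_null     <- as.numeric(logLik(fit_null))
mcfadden_r2 <- 1 - ll_full / ll_null
lr_chisq    <- fit_2sri$null.deviance - fit_2sri$deviance
lr_df       <- fit_2sri$df.null - fit_2sri$df.residual
lr_p        <- pchisq(lr_chisq, df = lr_df, lower.tail = FALSE)

print(first_stage_F)
print(c(logLik = ll_full, McFadden_R2 = mcfadden_r2, LR_chisq = lr_chisq, df = lr_df))
```

Two caveats apply. The standard errors printed by `summary(fit_2sri)` ignore the fact that `v_hat` is estimated, so a bootstrap over both stages is usually advisable. With missing data, as in `DATA`, the null model should also be fitted on exactly the rows used by the full model, otherwise the comparison is not valid.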

