*Bounty: 50*

I am trying to obtain goodness-of-fit measures, such as R-squared and chi-squared, from the `ivglm()` function in the `ivtools` package in R.

However, I could not find a way to get these from its output.

For reference, each variable also has a different number of missing values.

For instance, I run the following code and get the output shown below.

```
reg_X.LZ <- glm(reg[, 5] + reg[, 3] + reg[, 6] ~ reg[, 14] + reg[, 25] + reg[, 15] + reg[, 46], data = reg)
reg_Y.LX <- glm(reg[, 8] ~ reg[, 5] + reg[, 7] + reg[, 6] + reg[, 3] + reg[, 4] + reg[, 9] + reg[, 10]*reg[, 13] + reg[, 11] + reg[, 12] + reg[, 14], data = reg, family = binomial(link = "logit"))
reg_logit <- ivglm(estmethod = "ts", fitX.LZ = reg_X.LZ, fitY.LX = reg_Y.LX, data = reg, family = binomial(link = "logit"))
> summary(reg_logit)
Call:
ivglm(estmethod = "ts", fitX.LZ = reg_X.LZ, fitY.LX = reg_Y.LX,
data = reg, family = binomial(link = "logit"))
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) 2.582e+00 1.673e+00 1.543 0.122738
reg[, 5] -7.177e-02 4.150e-03 -17.293 < 2e-16 ***
reg[, 7] 1.666e+00 1.163e-01 14.331 < 2e-16 ***
reg[, 6] -1.339e-01 2.393e-02 -5.596 2.19e-08 ***
reg[, 3] -1.678e-04 2.763e-05 -6.075 1.24e-09 ***
reg[, 4] 1.016e-01 3.873e-03 26.235 < 2e-16 ***
reg[, 9] 2.169e-02 6.504e-03 3.335 0.000854 ***
reg[, 10] -2.127e-01 1.870e-01 -1.137 0.255463
reg[, 13] -4.391e+00 1.899e+00 -2.313 0.020721 *
reg[, 11] 4.420e-02 1.112e-02 3.976 7.01e-05 ***
reg[, 12] 3.070e-01 6.807e-02 4.510 6.48e-06 ***
reg[, 14] 1.919e-01 7.351e-02 2.610 0.009046 **
reg[, 10]:reg[, 13] 4.545e-01 2.138e-01 2.126 0.033488 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
```

However, I could not find a single way to get any goodness-of-fit measures.

I also tried some crude ways to compute these goodness-of-fit measures, as below, but none of them worked:

```
# These components exist on "glm" objects but apparently not on the
# "ivglm" object, which seems to be why none of these ever worked:
1 - pchisq(reg_logit$null.deviance - reg_logit$deviance, reg_logit$df.null - reg_logit$df.residual)
reg_logit$null.deviance - reg_logit$deviance
1 - reg_logit$deviance/reg_logit$null.deviance
```
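One likely reason these never worked: `ivglm()` returns an object of class `"ivglm"`, not `"glm"`, and in R indexing a list component that does not exist silently returns `NULL` instead of raising an error. A minimal illustration with a stand-in list (the `fit` list here is a toy stand-in; check your real object with `names(reg_logit)` or `str(reg_logit)`):

```r
# stand-in for an "ivglm" result, which carries "est" but no deviance slots
fit <- list(est = c(2.58, -0.07))

is.null(fit$null.deviance)        # TRUE: the component simply is not there
fit$null.deviance - fit$deviance  # length-zero result, not an error
```

Because arithmetic on `NULL` yields a zero-length vector rather than an error, the `pchisq()` expressions above fail silently instead of telling you the components are missing.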

Below I show the alternative methods I tried to get (1) R-squared values and (2) chi-squared values, for which I need your help.

**(1) R-squared values**: I tried the following crude method to get the R-squared value, replacing the missing values (NAs) with zeros, using the code below:

```
# linear predictor assembled from the estimated coefficients
# (same order as in the summary output above)
lp <- reg_logit$est[1] +
  reg_logit$est[2]*reg[, 5]  + reg_logit$est[3]*reg[, 7] +
  reg_logit$est[4]*reg[, 6]  + reg_logit$est[5]*reg[, 3] +
  reg_logit$est[6]*reg[, 4]  + reg_logit$est[7]*reg[, 9] +
  reg_logit$est[8]*reg[, 10] + reg_logit$est[9]*reg[, 13] +
  reg_logit$est[10]*reg[, 11] + reg_logit$est[11]*reg[, 12] +
  reg_logit$est[12]*reg[, 14] + reg_logit$est[13]*reg[, 10]*reg[, 13]
predicted <- 1/(1 + exp(-lp))      # inverse logit
y <- reg[, 8]
predicted[is.na(predicted)] <- 0   # NAs recoded to zero
1 - sum((y - predicted)^2)/sum((y - mean(y))^2)
```
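As an alternative to recoding NAs to zero, the same sum-of-squares R-squared can be computed on complete cases only, with the linear predictor built from a model matrix instead of a hand-typed sum. A self-contained sketch with simulated data (the data frame `d`, its columns, and the ordinary `glm()` fit are all stand-ins for your `reg` and `reg_logit`):

```r
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- rbinom(n, 1, plogis(0.5*d$x1 - d$x2))
d$x1[sample(n, 20)] <- NA               # inject some missing values

cc  <- complete.cases(d)                # keep rows with no NAs instead of zero-filling
fit <- glm(y ~ x1 + x2, data = d[cc, ], family = binomial)

# model.matrix() lines the columns up with the coefficient vector automatically
p  <- plogis(drop(model.matrix(~ x1 + x2, d[cc, ]) %*% coef(fit)))
y  <- d$y[cc]
r2 <- 1 - sum((y - p)^2)/sum((y - mean(y))^2)   # sum-of-squares R-squared, complete cases
r2
```

With your two-stage fit, you would replace `coef(fit)` with `reg_logit$est` and the formula with your second-stage formula, so the probabilities are only ever computed for rows with no missing data.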

However, the R-squared value I get from this is -0.8353449, a negative value. I have read that the R-squared calculation for instrumental-variable regressions should be treated differently (if I understand correctly), so it may be natural to get a negative R-squared when following the basic formula.

The source of this information is as below:

https://www.stata.com/support/faqs/statistics/two-stage-least-squares/#example

**My main concern is how to deal with this negative R-squared value and obtain a meaningful (positive) R-squared value, following the advice in the link above.**
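One alternative that avoids the sum-of-squares formula entirely is a likelihood-based (McFadden-style) pseudo R-squared computed from the fitted probabilities on complete cases. A sketch (the `pseudo_r2` helper is hypothetical, not part of `ivtools`; note also that for a two-stage estimator the coefficients are not a maximum-likelihood solution, so even this measure is not mathematically guaranteed to be positive):

```r
# hypothetical helper: McFadden-style pseudo R-squared from a 0/1 outcome y
# and fitted probabilities p (both already restricted to complete cases)
pseudo_r2 <- function(y, p) {
  ll_fit  <- sum(y*log(p) + (1 - y)*log(1 - p))   # log-likelihood at fitted probs
  p0      <- mean(y)                              # intercept-only (null) probability
  ll_null <- sum(y*log(p0) + (1 - y)*log(1 - p0))
  1 - ll_fit/ll_null
}

# toy check: probabilities that loosely track y give a value between 0 and 1
set.seed(2)
y <- rbinom(100, 1, 0.4)
p <- ifelse(y == 1, 0.6, 0.3)
pseudo_r2(y, p)
```

Because it compares log-likelihoods rather than squared residuals, this measure behaves more predictably for binary outcomes than the linear-regression R-squared formula.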

**(2) Chi-squared values:** I again tried the following crude method to get the chi-squared value for my instrumental-variable logistic regression, which, unlike other types of regressions, does not report it by default.

```
chi <- ((y - predicted)^2)/predicted   # per-observation Pearson-style contribution
chi[is.na(chi)] <- 0                   # NAs recoded to zero
chisq_rs <- sum(chi)
p_val <- pchisq(chisq_rs, df = 11, lower.tail = FALSE)   # df chosen by hand
```

Here I get a large chi-squared value of "2250.203" and a p-value of "0", which sort of makes sense. However, unlike the method above, I did not have to worry about converting missing values (NAs) into zeros, since this method still worked. **However, since I do not treat NAs as zeros here in the chi-squared calculation, while I do treat NAs as zeros in my R-squared calculation above, I still need help deciding which way is correct and how to fix this inconsistency in missing-data treatment.**
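For consistency, one option is to restrict both measures to the same complete-case rows, so neither calculation needs NA-to-zero recoding; for a binary outcome, the usual Pearson denominator is also p(1 - p) rather than p alone. A self-contained sketch with simulated data (again, `d` and the plain `glm()` fit stand in for your `reg` and `reg_logit`):

```r
set.seed(3)
d <- data.frame(y = rbinom(120, 1, 0.5), x = rnorm(120))
d$x[c(3, 9, 40)] <- NA

cc  <- complete.cases(d)                 # one shared row mask for every measure
fit <- glm(y ~ x, data = d[cc, ], family = binomial)
y   <- d$y[cc]
p   <- fitted(fit)

r2  <- 1 - sum((y - p)^2)/sum((y - mean(y))^2)  # same rows as the chi-squared below
chi <- sum((y - p)^2/(p*(1 - p)))               # Pearson statistic for a binary outcome
c(r2 = r2, chi = chi)
```

Using a single `complete.cases()` mask up front means the R-squared and the chi-squared are computed over exactly the same observations, which removes the inconsistency described above.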

I would appreciate any help with this R-squared calculation, along with the chi-squared calculation, as goodness-of-fit measures for my instrumental-variable logistic regression.