I’m using a binomial logistic regression to identify if exposure to
has_y impacts the likelihood that a user will click on something. My model is the following:
fit = glm(formula = has_clicked ~ has_x + has_y, data=df, family = binomial())
This is the output from my model:
    Call:
    glm(formula = has_clicked ~ has_x + has_y, family = binomial(),
        data = active_domains)

    Deviance Residuals:
        Min       1Q   Median       3Q      Max
    -0.9869  -0.9719  -0.9500   1.3979   1.4233

    Coefficients:
                 Estimate Std. Error z value Pr(>|z|)
    (Intercept) -0.504737   0.008847 -57.050  < 2e-16 ***
    has_xTRUE   -0.056986   0.010201  -5.586 2.32e-08 ***
    has_yTRUE    0.038579   0.010202   3.781 0.000156 ***
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    (Dispersion parameter for binomial family taken to be 1)

        Null deviance: 217119  on 164182  degrees of freedom
    Residual deviance: 217074  on 164180  degrees of freedom
    AIC: 217080

    Number of Fisher Scoring iterations: 4
As each coefficient is significant, I can use this model to get the predicted click probability for any combination of has_x and has_y, like this:
predict(fit, data.frame(has_x = T, has_y=T), type = "response")
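For reference, the value returned by `predict(..., type = "response")` can be reproduced by hand from the coefficients printed in the summary. The following is only a sketch: the coefficient values are copied from the output above, and the variable names (`b0`, `bx`, `by`, `eta`, `p_hat`) are introduced here for illustration.

```r
# Coefficients copied from the model summary above
b0 <- -0.504737   # (Intercept)
bx <- -0.056986   # has_xTRUE
by <-  0.038579   # has_yTRUE

# Linear predictor (log-odds) for has_x = TRUE, has_y = TRUE
eta <- b0 + bx * 1 + by * 1

# Inverse logit maps the log-odds back to a probability
p_hat <- plogis(eta)
print(round(p_hat, 3))  # about 0.372, i.e. roughly 37%
```

This shows that the prediction is a nonlinear transformation of the sum of all three coefficients, which is why the standard error of a single coefficient cannot be read off directly as the standard error of the prediction.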
I don’t understand how I can report on the Std. Error of the prediction.
- Do I just need to use $1.96*SE$? Or do I need to convert the
$SE$ using an approach described here?
- If I want the standard error of a prediction that involves both variables,
how would I account for that?
Unlike this question, I am interested in the upper and lower bounds of the error expressed as a percentage. For example, if my prediction shows a value of 37% for
has_x = TRUE, has_y = TRUE, can I say that this is $\pm 0.3\%$ with a $95\%$ CI? (0.3% chosen only to illustrate my point.)
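One common way to get such bounds, sketched here under assumptions, is to ask `predict()` for the fit and its standard error on the link (log-odds) scale, build the $\pm 1.96 \times SE$ interval there, and then back-transform both ends with the inverse logit. The data below are simulated stand-ins (the real `df` is not available), and the names `new`, `pr`, `ci`, `x0` and `se_manual` are made up for this illustration.

```r
# Simulated stand-in for the real data (hypothetical values,
# chosen only to roughly resemble the fitted model above)
set.seed(1)
n   <- 20000
df  <- data.frame(has_x = runif(n) < 0.5, has_y = runif(n) < 0.5)
eta <- -0.5 - 0.06 * df$has_x + 0.04 * df$has_y
df$has_clicked <- as.integer(runif(n) < plogis(eta))

fit <- glm(has_clicked ~ has_x + has_y, data = df, family = binomial())

new <- data.frame(has_x = TRUE, has_y = TRUE)

# Fit and standard error on the link (log-odds) scale
pr <- predict(fit, newdata = new, type = "link", se.fit = TRUE)

# 95% interval on the link scale, back-transformed to probabilities
z  <- qnorm(0.975)  # approximately 1.96
ci <- data.frame(
  estimate = plogis(pr$fit),
  lower    = plogis(pr$fit - z * pr$se.fit),
  upper    = plogis(pr$fit + z * pr$se.fit)
)
print(ci)

# The link-scale SE combines all coefficients through their covariance
# matrix: sqrt(x0' V x0), with x0 = (intercept, has_xTRUE, has_yTRUE)
x0        <- c(1, 1, 1)
se_manual <- sqrt(drop(t(x0) %*% vcov(fit) %*% x0))
print(c(manual = se_manual, predict = unname(pr$se.fit)))
```

Because the interval is built on the log-odds scale and then transformed, the resulting bounds always stay within 0 and 1 and are generally not symmetric around the estimate, so they are better reported as a lower and upper percentage than as a single $\pm$ value.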