# #StackBounty: #r #regression #machine-learning #predictive-models #r-squared Multiple Regression, good P-value, but Low R2

### Bounty: 50

I am trying to build a model in R to predict conversion rate (CR) from age, gender, and interest (and also the campaign ID, `xyz_campaign_id`).

The CR values look like this:

The correlation coefficients are not very promising:

`rcorr(as.matrix(data.numeric))`

Correlations with CR: xyz_campaign_id (-0.19), age (-0.1), gender (-0.04), interest (-0.03)

So, the model below:

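```r
library(caret)  # loaded in the original post; not used in the code shown here
set.seed(100)   # make the random split reproducible

# Random 80/20 train/test split
TrainIndex <- sample(seq_len(nrow(data)), floor(0.8 * nrow(data)))
data.train <- data[TrainIndex, ]
data.test <- data[-TrainIndex, ]
nrow(data.test)

# Linear model of CR on the four predictors, fitted on the training rows
model <- lm(CR ~ age + gender + interest + xyz_campaign_id, data = data.train)
summary(model)
```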

has a poor adjusted R-squared (about 0.046):

```
Call:
lm(formula = CR ~ age + gender + interest + xyz_campaign_id,
    data = data.train)

Residuals:
    Min      1Q  Median      3Q     Max
-18.636 -11.858  -4.087   0.115  96.421

Coefficients:
                 Estimate Std. Error t value Pr(>|t|)
(Intercept)     47.231250   6.287738   7.512  1.4e-13 ***
age35-39         1.214713   1.916649   0.634  0.52639
age40-44        -1.971037   1.986316  -0.992  0.32131
age45-49        -3.064858   1.866713  -1.642  0.10097
genderM          3.709192   1.412311   2.626  0.00878 **
interest         0.030384   0.027617   1.100  0.27154
xyz_campaign_id -0.037856   0.006076  -6.231  7.1e-10 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 21.16 on 907 degrees of freedom
Multiple R-squared:  0.05237,   Adjusted R-squared:  0.04611
F-statistic: 8.355 on 6 and 907 DF,  p-value: 7.81e-09
```
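As a sanity check on the summary above, the adjusted R-squared can be recomputed from the reported multiple R-squared and the degrees of freedom. This is a small sketch that uses only numbers printed in the output; the helper name `adjusted_r2` is my own and not part of any package.

```r
# Adjusted R^2 from R^2, the number of observations n, and the number of
# estimated slope coefficients p (intercept excluded).
adjusted_r2 <- function(r2, n, p) {
  1 - (1 - r2) * (n - 1) / (n - p - 1)
}

# Values read off the summary: 6 slope terms and 907 residual degrees of
# freedom, so n = 907 + 6 + 1 = 914 training rows.
r2 <- 0.05237
p <- 6
n <- 907 + p + 1

adj <- adjusted_r2(r2, n, p)
round(adj, 3)
```

The result is about 0.046, in line with the reported 0.04611 (the small gap comes from the rounded R-squared). With 914 rows and only 6 slope terms the adjustment is tiny, so the low value reflects the weak predictors rather than a penalty for model size.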

I also understand that I should probably convert `interest` from numeric to a factor. I have tried that too, but I kept all 40 interest levels, which is not ideal.
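One possible way to avoid carrying all 40 levels is to merge the rarely seen interest codes into a single catch-all level before fitting. The sketch below is only an illustration in base R: the helper `lump_rare_levels`, the threshold of 30, and the toy vector are all assumptions of mine, not something taken from the data.

```r
# Hypothetical helper: keep levels that occur at least `min_n` times and
# merge every other non-missing value into a single `other` level.
lump_rare_levels <- function(x, min_n = 30, other = "Other") {
  x <- as.character(x)
  counts <- table(x)
  keep <- names(counts)[counts >= min_n]
  x[!is.na(x) & !(x %in% keep)] <- other
  factor(x)
}

# Toy vector standing in for the real `interest` column (made-up values)
interest_toy <- c(rep(10, 40), rep(16, 35), rep(2, 3), rep(7, 1))
interest_f <- lump_rare_levels(interest_toy, min_n = 30)
table(interest_f)
```

On the real data this would presumably be called as `lump_rare_levels(data.train$interest)`, with the levels kept from the training rows reused for the test rows so that both sets share the same factor levels.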

So, based on the information provided, is there any way to improve the model? What other models should I try besides linear models to make sure that I have a good predictive model?
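Whichever model is tried, one way to judge whether it is a good predictive model is to score it on the held-out rows rather than on the training fit. The following is a minimal sketch under my own assumptions: `holdout_metrics` is a hypothetical helper, and the simulated `toy` data only stands in for the real data set.

```r
# Hypothetical helper: RMSE and out-of-sample R^2 of a fitted model on
# held-out rows. `response` is the name of the outcome column.
holdout_metrics <- function(model, test, response) {
  pred <- predict(model, newdata = test)
  obs <- test[[response]]
  rmse <- sqrt(mean((obs - pred)^2))
  # Baseline for R^2: always predicting the mean of the held-out outcome
  r2 <- 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)
  c(rmse = rmse, r2 = r2)
}

# Simulated data standing in for the real data set (made-up numbers)
set.seed(1)
toy <- data.frame(x = rnorm(200))
toy$y <- 2 * toy$x + rnorm(200)
idx <- sample(seq_len(nrow(toy)), 160)
fit <- lm(y ~ x, data = toy[idx, ])
metrics <- holdout_metrics(fit, toy[-idx, ], "y")
metrics
```

For the model in this question the call would be `holdout_metrics(model, data.test, "CR")`, and the same function could be applied to any alternative model that has a `predict` method, which makes the comparison between candidates direct.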

If you need more information, the challenge is available Here. Data is Here
