# #StackBounty: #model-selection #r-squared #performance #rms Why does the rank order of models differ for R squared and RMSE?

### Bounty: 50

I am comparing \$R^2\$ and RMSE of different models. Interestingly, the rank ordering of the models with respect to \$-R^2\$ and RMSE is different and I do not understand why.

Here is an example in R:

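``````
library(caret)

set.seed(0)
d <- SLC14_1(n = 1000)                        # simulated regression data from caret
tc <- trainControl(method = "cv", number = 10)  # 10-fold cross-validation
t1 <- train(y ~ ., data = d, method = "glmnet", trControl = tc)
# Compare the model ranking by RMSE with the ranking by -R^2
order(t1$results$RMSE) == order(-t1$results$Rsquared)
``````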

Output:

``````
[1]  TRUE FALSE  TRUE FALSE  TRUE FALSE FALSE FALSE  TRUE
``````

Thus, the rank order differs between \$-R^2\$ and \$RMSE\$.

The question is, why.

Let \$SS_{res}\$ be the sum of squared residuals \$\sum_i (y_i-f_i)^2\$.

\$RMSE\$ is defined as \$\sqrt{SS_{res}/n}\$.

\$R^2\$ is defined as \$1-SS_{res}/SS_{tot}\$ where \$SS_{tot}\$ is \$\sum_i (y_i-\overline{y})^2\$.

Since \$SS_{res}=n(RMSE)^2\$, we can write \$R^2\$ as \$1-n(RMSE)^2/SS_{tot}\$.
Since \$n\$ and \$SS_{tot}\$ are constant and the same for all models, \$-R^2\$ and \$RMSE\$ should be strictly monotonically related, so both should rank the models identically. However, in practice they do not: the rank order differs (see the example code above).
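To make the argument concrete, here is a minimal base-R sketch that checks the identity \$R^2 = 1-n(RMSE)^2/SS_{tot}\$ numerically. It assumes that every model is scored on the same observations and that \$R^2\$ is computed with exactly the formula above. The simulated data, the three sets of predictions and the helper names `rmse` and `r2_traditional` are made up for illustration and are not part of caret.

```r
# Hypothetical data: one fixed set of observations shared by all "models"
set.seed(1)
n <- 200
x <- rnorm(n)
y <- 2 * x + rnorm(n)

# Three illustrative sets of predictions of differing quality
preds <- list(
  good = 2 * x,
  fair = 1.5 * x,
  poor = rep(mean(y), n)
)

# RMSE = sqrt(SS_res / n)
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))

# R^2 = 1 - SS_res / SS_tot (the definition used in the argument above)
r2_traditional <- function(obs, pred) {
  1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)
}

rmse_vals <- sapply(preds, rmse, obs = y)
r2_vals   <- sapply(preds, r2_traditional, obs = y)
ss_tot    <- sum((y - mean(y))^2)

# The identity holds up to floating-point error ...
print(all.equal(r2_vals, 1 - n * rmse_vals^2 / ss_tot))

# ... and so both metrics rank the predictions identically
print(order(rmse_vals) == order(-r2_vals))
```

Under these assumptions the two rankings agree, so the disagreement in the caret output presumably comes from one of the assumptions not holding there, for example how `Rsquared` is computed or how the values are aggregated over the cross-validation folds.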

What is wrong with my argument?
