I am using `caret` to evaluate the classification performance of several models on a small dataset (190 observations) with two classes and just a handful of features.

When I inspect the `train()` object for one of the models, I get what appear to be the mean metric values (ROC, Sens, and Spec).

```
Resampling: Cross-Validated (10 fold, repeated 5 times)
Summary of sample sizes: 171, 171, 171, 171, 171, 171, ...
Resampling results across tuning parameters:

  nIter  method         ROC        Sens       Spec
   50    Adaboost.M1    0.8866667  0.9866667  0.58
   50    Real adaboost  0.5566667  0.9844444  0.50
  100    Adaboost.M1    0.8844444  0.9877778  0.58
  100    Real adaboost  0.5738889  0.9833333  0.52
  150    Adaboost.M1    0.8800000  0.9877778  0.60
  150    Real adaboost  0.5994444  0.9833333  0.52
```

When I use the `resamples()` function and put all of the models in a list, I get the means again, but also the medians (other models' results omitted for clarity).

```
Models: RF, GBM, SVM, ADABOOST, C5, NB
Number of resamples: 50

ROC
            Min. 1st Qu. Median   Mean 3rd Qu. Max. NA's
ADABOOST 0.25000  0.8958 0.9444 0.8867       1    1    0

Sens
           Min. 1st Qu. Median   Mean 3rd Qu.   Max. NA's
ADABOOST 0.8889  1.0000 1.0000 0.9867  1.0000 1.0000    0

Spec
         Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
ADABOOST    0       0      1 0.58       1    1    0
```
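
As a rough illustration of how far apart the two summaries can sit, the sketch below uses a made-up vector of 50 per-resample specificities, each either 0 or 1. The values are hypothetical, picked only because they are consistent with the `Spec` row above; they are not the actual resampling results.

```r
# Hypothetical per-resample specificities (NOT the real results):
# 21 resamples with Spec = 0 and 29 with Spec = 1, chosen only so that
# the summary statistics line up with the Spec row shown above.
spec <- c(rep(0, 21), rep(1, 29))

# Five-number summary plus the mean, as resamples() would summarise it
print(summary(spec))

mean_spec   <- mean(spec)    # share of resamples with Spec = 1
median_spec <- median(spec)  # middle value of the sorted resamples

cat("mean:", mean_spec, " median:", median_spec, "\n")
```

With values like these, the median can only land on 0, 0.5, or 1, while the mean tracks the share of resamples in which the held-out negatives were classified correctly.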

The `bwplot()` function appears to display the median values as the point estimates.

It seems to me that the `train()` output wants me to evaluate the models based on the means, while `bwplot()` focuses on the medians. My first thought was that the median would be a better metric with such spread.
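
For reference, here is a minimal sketch of where the two numbers come from in a `train()` object. It is not the code behind the output above: the two-class `iris` subset and the `lda` model are stand-ins chosen so the example runs without extra data, and it assumes the `caret`, `MASS`, and `pROC` packages are installed.

```r
library(caret)

set.seed(1)

# Stand-in data: a two-class subset of the built-in iris data
dat <- droplevels(subset(iris, Species != "setosa"))

# Same resampling scheme as in the output above: 10-fold CV, repeated 5 times
ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 5,
                     classProbs = TRUE, summaryFunction = twoClassSummary)

# Stand-in model (lda has no tuning parameters)
fit <- train(Species ~ ., data = dat, method = "lda",
             metric = "ROC", trControl = ctrl)

# One row per held-out fold (50 rows), for the selected model
per_fold <- fit$resample[, c("ROC", "Sens", "Spec")]

fold_means   <- colMeans(per_fold)          # what print(fit) reports
fold_medians <- apply(per_fold, 2, median)  # the centre dot in bwplot()

print(rbind(mean = fold_means, median = fold_medians))
```

Printing both rows side by side makes it easy to see, metric by metric, how much the choice of summary changes the picture.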

Which would you use, and why?

