#StackBounty: #cross-validation #model-selection #gaussian-process Cross-validation and building a final model when using hyperparamete…

Bounty: 50

I am trying to build a Gaussian process (GP) regression model, assessed with cross-validation, for a problem in which each experiment is computationally expensive. Here is my procedure:

  • Build the GP regressor on the full available dataset, optimizing the hyperparameters (anisotropic Gaussian kernel)
  • Perform 10-fold cross-validation with the hyperparameters fixed to the optimized values from the previous step
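
For concreteness, the two steps above can be sketched as follows. This is only an illustration under assumptions that are not in the question: scikit-learn as the GP library, a synthetic two-input dataset standing in for the expensive experiments, and R² as the score.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for the expensive experiments (assumption, not real data).
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(60, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2

# Anisotropic Gaussian (RBF) kernel: one length scale per input dimension.
kernel = ConstantKernel(1.0) * RBF(length_scale=[1.0, 1.0])

# Step 1: fit on the full dataset; hyperparameters are optimized by
# maximizing the log marginal likelihood.
full_model = GaussianProcessRegressor(
    kernel=kernel, normalize_y=True, n_restarts_optimizer=2, random_state=0
)
full_model.fit(X, y)

# Step 2: 10-fold cross-validation with the hyperparameters frozen at the
# values found in step 1 (optimizer=None disables re-optimization per fold).
fixed_model = GaussianProcessRegressor(
    kernel=full_model.kernel_, optimizer=None, normalize_y=True
)
cv = KFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(fixed_model, X, y, cv=cv, scoring="r2")

print("optimized kernel:", full_model.kernel_)
print("per-fold R^2:", np.round(scores, 3))
```

Here `full_model` corresponds to option #1 below, and each fold of `cross_val_score` fits one of the 10 cross-validation models with the same frozen hyperparameters.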

Now, what model should I select as the output of my procedure?

  1. The model trained on the full dataset, considering that its overall performance is validated by each cross-validation fold?
  2. A combination of the 10 models from cross-validation?
  3. The model from cross-validation with the highest score?

I’m currently going for #1, but it was objected that my model is then not properly validated. I think it is implicitly validated, because I used the same hyperparameters in cross-validation as in the final model. #2 would perhaps be better, but it does not feel right to me. #3, in my opinion, is not an option, because it could mean selecting a model that performs well only because the few validation cases in its fold happen to suit it.
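
One way to probe the objection empirically is to compare cross-validation scores with the hyperparameters frozen from the full-data fit against scores where the hyperparameters are re-optimized on each training fold. The sketch below is only an illustration and makes the same assumptions as before, none of which come from the question: scikit-learn, synthetic data, and RMSE as the error measure.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for the expensive experiments (assumption, not real data).
rng = np.random.default_rng(0)
X = rng.uniform(-2.0, 2.0, size=(60, 2))
y = np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2

kernel = ConstantKernel(1.0) * RBF(length_scale=[1.0, 1.0])
cv = KFold(n_splits=10, shuffle=True, random_state=0)
scoring = "neg_root_mean_squared_error"

# Variant A: hyperparameters tuned once on ALL data, then frozen in every fold.
# The held-out points have already influenced the hyperparameters.
full_model = GaussianProcessRegressor(
    kernel=kernel, normalize_y=True, random_state=0
).fit(X, y)
frozen = GaussianProcessRegressor(
    kernel=full_model.kernel_, optimizer=None, normalize_y=True
)
scores_frozen = cross_val_score(frozen, X, y, cv=cv, scoring=scoring)

# Variant B: hyperparameters re-optimized on each training fold only,
# so the held-out points never influence them.
refit = GaussianProcessRegressor(kernel=kernel, normalize_y=True, random_state=0)
scores_refit = cross_val_score(refit, X, y, cv=cv, scoring=scoring)

print("mean RMSE, frozen hyperparameters:      ", -scores_frozen.mean())
print("mean RMSE, re-optimized in every fold:  ", -scores_refit.mean())
```

If the two estimates are close, freezing the hyperparameters has little effect on the reported error for that dataset; a large gap would suggest the frozen-hyperparameter scores are optimistic.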

Am I doing the process right?
