#StackBounty: #python #scikit-learn #xgboost #optimization Is it possible to get a worse model after optimization?

Bounty: 50

I have recently been trying to optimize my models, but whenever I run the optimization, the final model score is worse than before, so I believe I am doing something wrong.

To optimize my model, I define a parameter grid, fit the search on the training data, and then, based on the results, run it again with new parameters, e.g.:

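import xgboost as xgb
from sklearn.model_selection import GridSearchCV

# Round 1: first, coarse grid (clf_xgb is the XGBClassifier created earlier, not shown)
param_grid = {
    'max_depth': [3, 4, 5],
    'learning_rate': [0.1, 0.01, 0.05],
    'gamma': [0, 0.25, 1.0],
    'reg_lambda': [0, 1.0, 10.0],
    'scale_pos_weight': [1, 3, 5],
}

# No scoring= is passed, so the estimator's default score (accuracy) ranks the candidates
grid_search = GridSearchCV(estimator=clf_xgb, param_grid=param_grid,
                           cv=3, n_jobs=-1, verbose=2)
grid_search.fit(X_train, y_train)
grid_search.best_params_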

>>>.....

(and then, based on the result, I change the parameters…)

After this step, I choose the best hyperparameters and fit the model:
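
To make the comparison concrete, here is a minimal sketch of one tuning round measured against an untuned baseline on the same folds and with the same metric. It is not the setup from this question: the data is synthetic, scikit-learn's GradientBoostingClassifier stands in for XGBClassifier so that only scikit-learn is needed, and the grid values and the average_precision metric are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split

# Synthetic, imbalanced stand-in data (assumption: the real data set is not shown).
X, y = make_classification(n_samples=400, n_features=10,
                           weights=[0.8, 0.2], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=42)

# Stand-in estimator (assumption: used here instead of xgb.XGBClassifier).
base = GradientBoostingClassifier(random_state=42)

# Baseline: default hyperparameters, scored with the same CV and metric as the search.
baseline_cv = cross_val_score(base, X_train, y_train,
                              cv=3, scoring="average_precision").mean()

# Illustrative grid; it deliberately contains the defaults (3, 0.1, 100).
param_grid = {
    "max_depth": [2, 3],
    "learning_rate": [0.05, 0.1],
    "n_estimators": [50, 100],
}
search = GridSearchCV(base, param_grid, cv=3, scoring="average_precision")
search.fit(X_train, y_train)

print("baseline CV score:", baseline_cv)
print("best CV score:    ", search.best_score_)
print("best params:      ", search.best_params_)
```

Because this grid includes the estimator's default values, the best cross-validated score should not fall below the baseline. A grid that leaves the defaults out carries no such guarantee.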

clf_xgb = xgb.XGBClassifier(seed=42,
                            objective='binary:logistic',
                            gamma=0,
                            learning_rate=0.7,
                            max_depth=6,
                            reg_lambda=0.8,
                            scale_pos_weight=1,
                            subsample=0.9,
                            colsample_bytree=0.5)

clf_xgb.fit(X_train,
           y_train,
           verbose=True,
           early_stopping_rounds=10,
           eval_metric='aucpr',
           eval_set=[(X_test,y_test)])

The problem is that when I check the model score:

clf_xgb.score(X_test,y_test)

I always get a lower score than I got before the optimization, which makes me suspect that I'm missing something in how I do it, or some basic principle of this process.

Is it possible that after running the optimization my score won't get better (or will even get worse)? Where is my mistake? Are there other parameters that could influence or improve my model?
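
One thing worth checking when comparing scores is which metric each number measures. On a scikit-learn style classifier, score() returns mean accuracy of the predicted labels, while a metric such as aucpr summarizes the precision-recall curve from predicted probabilities. The sketch below shows the two side by side. The synthetic data and the GradientBoostingClassifier stand-in are assumptions, and average_precision_score is used as a related, not identical, summary of the precision-recall curve.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score, average_precision_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in data (assumption: the real data set is not shown).
X, y = make_classification(n_samples=400, n_features=10,
                           weights=[0.8, 0.2], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=42)

# Stand-in estimator (assumption: used here instead of xgb.XGBClassifier).
clf = GradientBoostingClassifier(random_state=42).fit(X_train, y_train)

# score() is the mean accuracy of the hard label predictions.
acc_from_score = clf.score(X_test, y_test)
acc_manual = accuracy_score(y_test, clf.predict(X_test))

# Average precision is computed from the positive-class probabilities instead.
ap = average_precision_score(y_test, clf.predict_proba(X_test)[:, 1])

print("score():          ", acc_from_score)
print("accuracy_score:   ", acc_manual)
print("average precision:", ap)
```

The two numbers are on different scales, so a "before" and an "after" score are only comparable when both come from the same metric on the same data.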

