gridSearch performance measure effect


March 2019


Viewed 6 times


I have an assignment and it asks me to:

Improve the performance of the models from the previous step with hyperparameter tuning, and select a final optimal model using grid search based on a metric (or metrics) that you choose. Choosing an optimal model for a given task (comparing multiple regressors on a specific domain) requires selecting performance measures, for example R2 (coefficient of determination) and/or RMSE (root mean squared error), to compare model performance.

I used this code for hyperparameter tuning:

model_example = GradientBoostingRegressor()
parameters = {'learning_rate': [0.1, 1],
              'max_depth': [5, 10]}

model_best = GridSearchCV(model_example, parameters, scoring='r2')
model_best.fit(X_train, y_train)

The grid search found learning_rate=0.1 and max_depth=5. I chose scoring='r2' as the performance measure, but it doesn't have any effect on my model's accuracy when I use this code to build my best model:

my_best_model = GradientBoostingRegressor(learning_rate=0.1,
                                          max_depth=5)

Do you know what's wrong with my work?


1 answer


Try setting a random_state as a parameter of your GradientBoostingRegressor(). For example, GradientBoostingRegressor(random_state=1).

The model will then produce the same results on the same data. Without that parameter, there's an element of randomness in the fitting procedure that makes it difficult to compare different model fits fairly.

Setting a random_state on the train/test split (train_test_split) will also help with this.
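Putting it together, here's a minimal sketch of the reproducible workflow. The toy dataset from make_regression and the 80/20 split are my own illustrative choices, not from your assignment; the point is that random_state is fixed in both the split and the regressor, so the refit "best model" matches what the grid search evaluated:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Illustrative toy data; substitute your own X and y.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=1)

# Fix the split so every run evaluates on the same test data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1)

parameters = {'learning_rate': [0.1, 1],
              'max_depth': [5, 10]}

# Fix the model's own randomness too, so refitting with the best
# hyperparameters reproduces the grid search's result.
model_best = GridSearchCV(GradientBoostingRegressor(random_state=1),
                          parameters, scoring='r2', cv=5)
model_best.fit(X_train, y_train)

# Rebuild the final model from the tuned hyperparameters, keeping
# the same random_state.
my_best_model = GradientBoostingRegressor(random_state=1,
                                          **model_best.best_params_)
my_best_model.fit(X_train, y_train)
print(my_best_model.score(X_test, y_test))  # R2 on held-out data
```

With both random states fixed, running the script twice yields identical scores, so a change in the scoring metric (or any hyperparameter) is the only thing that can move the result.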