Tuning XGBoost Hyperparameters with Grid Search


In this code snippet we train an XGBoost classifier, using GridSearchCV to tune five hyperparameters.

In the example we tune subsample, colsample_bytree, max_depth, min_child_weight and learning_rate. Each hyperparameter is given two values to try during cross-validation, so the grid contains 2^5 = 32 combinations; with 3-fold cross-validation that means 96 model fits (plus one final refit on the best parameters). A minimal data-preparation sketch follows, then the snippet itself.
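The snippet assumes X_train, y_train, X_val and y_val already exist. As a minimal sketch of how they could be prepared, here is one option using a synthetic dataset from scikit-learn's make_classification (the dataset and split sizes are illustrative assumptions, not part of the original snippet):

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# illustrative synthetic data; substitute your own features and labels
X, y = make_classification(n_samples=5000, n_features=20, random_state=42)

# hold out a validation set for early stopping, separate from the
# training data that GridSearchCV will cross-validate over
X_train, X_val, y_train, y_val = train_test_split(X, y,
                                                  test_size=0.2,
                                                  random_state=42)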

from sklearn.model_selection import GridSearchCV
import xgboost as xgb

# create a dictionary containing the hyperparameters
# to tune and the range of values to try
PARAMETERS = {"subsample": [0.75, 1],
              "colsample_bytree": [0.75, 1],
              "max_depth": [2, 6],
              "min_child_weight": [1, 5],
              "learning_rate": [0.1, 0.01]}

# create a validation set which will be used for early stopping
eval_set = [(X_val, y_val)]

# initialise an XGBoost classifier, set the number of estimators,
# evaluation metric & early stopping rounds
estimator = xgb.XGBClassifier(n_estimators=100,
                              n_jobs=-1,
                              eval_metric="logloss",
                              early_stopping_rounds=10)

# initialise GridSearchCV by passing the XGBoost classifier we
# initialised in the last step along with the dictionary of parameters
# and values to try. We also set the number of folds to validate over
# along with the scoring metric to use
model = GridSearchCV(estimator=estimator,
                     param_grid=PARAMETERS,
                     cv=3,
                     scoring="neg_log_loss")

# fit the grid search; eval_set is forwarded to each underlying
# XGBoost fit so early stopping applies to every candidate model
model.fit(X_train,
          y_train,
          eval_set=eval_set,
          verbose=0)

# print out the best hyperparameters
print(model.best_params_)
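Once the search has finished, the fitted GridSearchCV object can be used directly, since by default it refits the best estimator on the full training set. As a short sketch of inspecting and using the result (best_score_ and predict_proba are standard scikit-learn attributes; predicting on X_val here is just for illustration):

# best cross-validated score; scoring was neg_log_loss,
# so values closer to 0 are better
print(model.best_score_)

# the refitted best estimator is used for prediction automatically
probs = model.predict_proba(X_val)[:, 1]
print(probs[:5])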