Tuning XGBoost Hyperparameters with Grid Search
In this code snippet we train an XGBoost classifier, using GridSearchCV to tune five hyperparameters: subsample, colsample_bytree, max_depth, min_child_weight and learning_rate. Each hyperparameter is given two candidate values to try during cross-validation, so the grid search evaluates 2^5 = 32 combinations per fold.
```python
from sklearn.model_selection import GridSearchCV
import xgboost as xgb

# create a dictionary containing the hyperparameters
# to tune and the range of values to try
PARAMETERS = {"subsample": [0.75, 1],
              "colsample_bytree": [0.75, 1],
              "max_depth": [2, 6],
              "min_child_weight": [1, 5],
              "learning_rate": [0.1, 0.01]}

# create a validation set which will be used for early stopping
# (X_train, y_train, X_val, y_val are assumed to be defined already)
eval_set = [(X_val, y_val)]

# initialise an XGBoost classifier, set the number of estimators,
# evaluation metric & early stopping rounds
estimator = xgb.XGBClassifier(n_estimators=100,
                              n_jobs=-1,
                              eval_metric="logloss",
                              early_stopping_rounds=10)

# initialise GridSearchCV by passing the XGBoost classifier we
# initialised in the last step along with the dictionary of parameters
# and values to try. We also set the number of folds to validate over
# along with the scoring metric to use
model = GridSearchCV(estimator=estimator,
                     param_grid=PARAMETERS,
                     cv=3,
                     scoring="neg_log_loss")

# fit the model; eval_set is forwarded to each underlying XGBoost fit
# so that early stopping can monitor the validation log loss
model.fit(X_train,
          y_train,
          eval_set=eval_set,
          verbose=0)

# print out the best hyperparameters
print(model.best_params_)
```