Tuning XGBoost Hyperparameters with Grid Search
In this code snippet we train an XGBoost classifier, using GridSearchCV to tune five hyperparameters: subsample, colsample_bytree, max_depth, min_child_weight and learning_rate. Each hyperparameter is given two candidate values to try during cross-validation, so the grid search evaluates 2^5 = 32 combinations per fold.
```python
from sklearn.model_selection import GridSearchCV
import xgboost as xgb

# create a dictionary containing the hyperparameters
# to tune and the range of values to try
PARAMETERS = {"subsample": [0.75, 1],
              "colsample_bytree": [0.75, 1],
              "max_depth": [2, 6],
              "min_child_weight": [1, 5],
              "learning_rate": [0.1, 0.01]}

# create a validation set which will be used for early stopping
# (X_train, y_train, X_val, y_val are assumed to be defined already)
eval_set = [(X_val, y_val)]

# initialise an XGBoost classifier, set the number of estimators,
# evaluation metric & early stopping rounds
estimator = xgb.XGBClassifier(n_estimators=100,
                              n_jobs=-1,
                              eval_metric="logloss",
                              early_stopping_rounds=10)

# initialise GridSearchCV by passing the XGBoost classifier we
# initialised in the last step along with the dictionary of parameters
# and values to try. We also set the number of folds to validate over
# along with the scoring metric to use
model = GridSearchCV(estimator=estimator,
                     param_grid=PARAMETERS,
                     cv=3,
                     scoring="neg_log_loss")

# fit the model; eval_set is forwarded to each underlying XGBoost fit
# so that early stopping can monitor the validation log loss
model.fit(X_train,
          y_train,
          eval_set=eval_set,
          verbose=0)

# print out the best hyperparameters
print(model.best_params_)
```