XGBoost Early Stopping To Avoid Overfitting

Python

Early stopping in XGBoost is a way to find the optimal number of estimators: the model's performance is monitored on a validation set during training, and training halts when that performance stops improving.

In the example below, we monitor the model's performance on the validation set, scoring it with the RMSE metric. We set early_stopping_rounds to 10, so if the validation score hasn't improved after 10 consecutive rounds, training stops.

from xgboost import XGBRegressor

# Step 1: Initialise XGBoost regression model
# (in recent XGBoost versions, eval_metric and early_stopping_rounds
# are passed to the constructor rather than to fit())
model = XGBRegressor(objective='reg:squarederror',
                     n_estimators=1000,
                     eval_metric='rmse',
                     early_stopping_rounds=10,
                     random_state=101)

# Step 2: Declare evaluation set (held-out data the model is scored on)
eval_set = [(X_test, y_test)]

# Step 3: Fit model; training stops if the validation RMSE
# fails to improve for 10 consecutive rounds
model.fit(X_train, y_train, eval_set=eval_set)