How to Train XGBoost with Imbalanced Data Using scale_pos_weight

In this example we train a binary classification model with XGBoost on imbalanced data, where one class appears far more often than the other.
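
To make the example self-contained, here is a minimal data-setup sketch. The 95/5 class split and the use of scikit-learn's make_classification are assumptions for illustration; only the X_train and y_train names come from the original snippet.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Assumed setup: a synthetic dataset with roughly 95% negative
# and 5% positive examples.
X, y = make_classification(n_samples=10_000, n_features=20,
                           weights=[0.95, 0.05], random_state=101)

# Stratified split so train and test keep the same class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=101)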

To remedy this we calculate a weight, spw: the count of the class that appears most frequently in the data divided by the count of the class that appears least frequently. Strictly speaking, scale_pos_weight is the ratio of negative to positive examples, so this calculation assumes the positive class is the minority, which is the typical case.

Finally, this weight is passed to the model as the scale_pos_weight parameter.

from xgboost import XGBClassifier

# y_train.sum() counts the positive (label 1) examples, which are
# assumed to be the minority class here.
positive_count = y_train.sum()
negative_count = len(y_train) - positive_count

# scale_pos_weight = negative examples / positive examples
spw = negative_count / positive_count

model = XGBClassifier(booster='gbtree',
                      objective='binary:logistic',
                      max_depth=12,
                      learning_rate=0.1,
                      n_estimators=10,
                      scale_pos_weight=spw,
                      random_state=101,
                      n_jobs=-1)

model.fit(X_train, y_train)
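
As a follow-up, one way to sanity-check the fitted model is to look at metrics that stay informative under imbalance, such as per-class precision/recall and average precision. This sketch assumes the held-out X_test and y_test from the setup above.

from sklearn.metrics import average_precision_score, classification_report

# Plain accuracy is misleading on imbalanced data, so report
# per-class precision/recall and average precision instead.
proba = model.predict_proba(X_test)[:, 1]
print(classification_report(y_test, model.predict(X_test)))
print('Average precision:', average_precision_score(y_test, proba))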

By analyseup - Last Updated April 7, 2022, 10:26 a.m.
