Classification with HistGradientBoostingClassifier in Scikit-Learn: An Implementation Guide

Python

This code snippet trains a classification model using scikit-learn's HistGradientBoostingClassifier.

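Histogram-based gradient boosting first groups the values of each continuous feature into a small number of integer bins (at most 255), which makes training much faster than standard gradient boosting on large datasets. It also supports categorical features natively, so they can be included in the training data without one-hot encoding.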
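The snippet below expects X_train, X_test, y_train and y_test to exist already. As a minimal sketch of one way to obtain them, the following builds a small synthetic dataset whose column 1 is a categorical feature stored as integer codes. The data, the column layout and the target rule are all assumptions made purely for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(101)
n_samples = 500

# Column 0: numeric feature.
# Column 1: categorical feature stored as integer codes 0..3 (hypothetical).
numeric = rng.normal(size=n_samples)
category = rng.integers(0, 4, size=n_samples)
X = np.column_stack([numeric, category])

# Hypothetical binary target that depends on both columns.
y = ((numeric + (category == 2)) > 0.5).astype(int)

# Hold out 25% of the rows for testing, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=101
)
```

With real data, the categorical column would typically be converted to integer codes first (for example with scikit-learn's OrdinalEncoder), and its number of distinct categories should stay below max_bins.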

from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.metrics import classification_report, log_loss, roc_auc_score

# X_train, y_train, X_test and y_test are assumed to exist already,
# with a binary target in y_train / y_test.

# Step 1: Create list containing the indices of categorical features
# (here column 1; in a NumPy array its values should be non-negative
# integer codes, with fewer distinct categories than max_bins)
categorical_features = [1]

# Step 2: Initialise histogram gradient boosting classification
# model and fit
model = HistGradientBoostingClassifier(learning_rate=0.1,
                                       max_depth=None,
                                       max_bins=255,
                                       categorical_features=categorical_features,
                                       random_state=101)
model.fit(X_train, y_train)

# Step 3: Make predictions for test data & evaluate performance
y_pred = model.predict(X_test)          # hard class labels
y_proba = model.predict_proba(X_test)   # class probabilities
print('Classification Report:\n', classification_report(y_test, y_pred))
# Log loss and ROC AUC are computed from probabilities, not hard labels
print('Log Loss:', log_loss(y_test, y_proba))
print('ROC AUC:', roc_auc_score(y_test, y_proba[:, 1]))