CatBoost Multiclass Classification

Supervised Learning

This code snippet shows how to train and evaluate a multiclass classification model using the CatBoostClassifier.

First, we initialise and fit the CatBoostClassifier with the desired hyperparameters such as the loss function, number of estimators, maximum depth, learning rate, and L2 regularization. Once the model is trained, we can save it for future use.
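
One way to keep these settings reusable is to hold them in a single dictionary and wrap training and reloading in small helpers. The sketch below is only an illustration: the names PARAMS, train_and_save and load_saved are hypothetical, and train_dir is left out so CatBoost falls back to its default log directory. It assumes the catboost package is installed when the helpers are called.

```python
# Hyperparameters from this snippet, gathered in one place so they can be
# logged, tweaked, or reused. PARAMS is an illustrative name.
PARAMS = {
    "loss_function": "MultiClass",
    "n_estimators": 1000,
    "max_depth": 4,
    "learning_rate": 0.1,
    "l2_leaf_reg": 1,
    "random_seed": 101,
}


def train_and_save(X_train, y_train, path="cat_classification.model"):
    """Fit a CatBoostClassifier with PARAMS and write it to `path`."""
    # Imported lazily so this module can be imported without catboost.
    from catboost import CatBoostClassifier

    model = CatBoostClassifier(**PARAMS)
    model.fit(X_train, y_train, verbose=False)
    model.save_model(path)
    return model


def load_saved(path="cat_classification.model"):
    """Rebuild a classifier from a file written by save_model()."""
    from catboost import CatBoostClassifier

    model = CatBoostClassifier()
    model.load_model(path)
    return model
```

A later session could then call load_saved() and go straight to predict() without retraining.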

To understand which features are important for our model, we plot the feature importances using matplotlib. This gives us an insight into the most relevant features for our model, which can be useful for feature selection and engineering.
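
A bar chart of importances is usually easier to read when the features are sorted first. The helper below is a minimal sketch of that step in plain Python. The function rank_features and the sample names and values are made up for illustration; in practice the inputs would be X_train.columns and model.feature_importances_.

```python
def rank_features(names, importances, top_k=None):
    """Return (name, importance) pairs sorted from most to least important.

    Ties keep their original order. If top_k is given, only the first
    top_k pairs are returned.
    """
    if len(names) != len(importances):
        raise ValueError("names and importances must have the same length")
    ranked = sorted(zip(names, importances),
                    key=lambda pair: pair[1],
                    reverse=True)
    return ranked if top_k is None else ranked[:top_k]


# Hypothetical feature names and importance values, for illustration only.
names = ["age", "income", "city", "tenure"]
importances = [12.5, 40.0, 7.5, 40.0]

for name, value in rank_features(names, importances, top_k=3):
    print(f"{name:<10}{value:6.1f}")
```

The sorted pairs can be unzipped and passed to plt.barh in place of the unsorted columns, and top_k is a simple way to shortlist candidates for feature selection.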

Finally, we make predictions for the test data using the predict() method and evaluate the performance of the model using the accuracy_score() function from sklearn.metrics.
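
Accuracy is the fraction of test rows whose predicted label matches the true label. With several classes, a single accuracy figure can hide a class that the model rarely gets right, so a per-class breakdown is a helpful companion. The sketch below computes both in plain Python. The function names and the sample labels are illustrative and are not part of sklearn.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot score an empty sample")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """For each true class, the share of its rows predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Made-up labels, for illustration only.
y_true = ["cat", "cat", "dog", "dog", "dog", "bird"]
y_pred = ["cat", "dog", "dog", "dog", "dog", "cat"]

print("Accuracy", accuracy(y_true, y_pred))
print("Per-class recall", per_class_recall(y_true, y_pred))
```

Here four of the six predictions are correct, yet the "bird" class is never predicted correctly, which the overall accuracy alone would not show.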

from catboost import CatBoostClassifier
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score

# X_train, y_train, X_test and y_test are assumed to exist already,
# with X_train as a pandas DataFrame (its columns label the chart below).

# Step 1: Initialise and fit CatBoost multiclass model
model = CatBoostClassifier(loss_function='MultiClass',
                           n_estimators=1000,
                           max_depth=4,
                           learning_rate=0.1,
                           l2_leaf_reg=1,
                           random_seed=101,
                           train_dir='/train-dir')  # where CatBoost writes training logs
model.fit(X_train, y_train, verbose=False)

# Save the trained model for future use
model.save_model('cat_classification.model')

# Step 2: Plot feature importances
features = X_train.columns
importance_values = model.feature_importances_

plt.barh(y=range(len(features)),
         width=importance_values,
         tick_label=features)
plt.show()  # display the chart when running outside a notebook

# Step 3: Make predictions for test data & evaluate performance
y_pred = model.predict(X_test)
print('Accuracy', accuracy_score(y_test, y_pred))