Evaluating Poor Predictions Using SHAP
Python
In this code snippet we sort the predictions on a test set by squared error, largest first. We then use the SHAP force plot to display how each feature pushes a given prediction above or below the model's expected value.
import pandas as pd
import shap

shap.initjs()

pred = model.predict(X_test)
pred = pd.DataFrame(pred, columns=['predictions'])

# Blend X_test, y_test and pred, calculate the error between predictions
# and actuals, then sort by largest error first.
def create_test_df(X_test, y_test, pred):
    # Reset the indexes on copies so the three objects align row by row
    # without modifying the caller's X_test and y_test.
    X_test = X_test.reset_index(drop=True)
    y_test = y_test.reset_index(drop=True)
    test_df = pd.concat(objs=[X_test, y_test, pred], axis=1)
    # y_test is assumed to be a Series named 'price'.
    test_df['error'] = (test_df['price'] - test_df['predictions']) ** 2
    test_df = test_df.sort_values(by='error', ascending=False)
    # Keep only the feature columns, now ordered by error.
    return test_df.drop(['price', 'predictions', 'error'], axis=1)

test_df = create_test_df(X_test, y_test, pred)

explainer_model = shap.TreeExplainer(model)
shap_values = explainer_model.shap_values(test_df)

def force_plot(index):
    # Return the plot so a notebook renders it as the cell output.
    return shap.force_plot(explainer_model.expected_value, shap_values[index], test_df.iloc[[index]])

# Outputs the force plot for the prediction with the largest error.
# Change 0 to 1 to view the force plot of the prediction with the 2nd largest error.
force_plot(0)
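To read the force plot, it may help to see the additive decomposition it draws: a base value plus one contribution per feature, summing to the model's prediction. The sketch below is not the snippet's TreeExplainer. It uses a hypothetical linear price model, where SHAP values have a simple closed form (weight times the feature's deviation from its mean, assuming independent features). The feature names, weights, and data are all made up for illustration.

```python
# Hypothetical linear price model: f(x) = intercept + sum(weight * value).
# Every name and number here is invented for illustration only.
weights = {"sqft": 100.0, "age": -500.0}
intercept = 50_000.0

# Tiny made-up background dataset, used only to compute feature means.
background = [
    {"sqft": 1000.0, "age": 10.0},
    {"sqft": 2000.0, "age": 30.0},
    {"sqft": 1500.0, "age": 20.0},
]

def predict(row):
    """Prediction of the hypothetical linear model for one row."""
    return intercept + sum(weights[name] * row[name] for name in weights)

def linear_shap(row):
    """Return (base_value, contributions) for one row.

    For a linear model with independent features, each SHAP value is
    weight * (value - feature mean), and the base value is the
    prediction at the feature means.
    """
    means = {
        name: sum(r[name] for r in background) / len(background)
        for name in weights
    }
    base_value = predict(means)
    contributions = {
        name: weights[name] * (row[name] - means[name]) for name in weights
    }
    return base_value, contributions

row = {"sqft": 1800.0, "age": 5.0}
base_value, contributions = linear_shap(row)

# Base value plus all contributions reconstructs the prediction.
reconstructed = base_value + sum(contributions.values())

print("base value:", base_value)
print("contributions:", contributions)
print("prediction:", predict(row), "reconstructed:", reconstructed)
```

In the force plot, features with positive contributions push the prediction above the base value and features with negative contributions push it below. For tree models the individual values are computed differently, but they are expected to sum to the prediction in the same way.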