Using Sklearn RFE to Select Features
Recursive Feature Elimination (RFE) is a feature selection method that fits a model, removes the lowest-ranked features, and repeats until only a chosen number of top-ranking features from the training data remain. Sklearn's RFE works with any estimator that assigns weights to features, for example through a coef_ or feature_importances_ attribute.
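To make the elimination loop concrete, here is a minimal sketch of the idea. It is not Sklearn's actual implementation: the helper name manual_rfe, the synthetic data from make_regression, and the small random forest settings are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor


def manual_rfe(X, y, n_features_to_select):
    """Hypothetical helper: drop the least important feature until n remain."""
    # Start with every column index as a candidate
    remaining = list(range(X.shape[1]))
    while len(remaining) > n_features_to_select:
        # Refit the model on the columns that are still in play
        model = RandomForestRegressor(n_estimators=20, random_state=0)
        model.fit(X[:, remaining], y)
        # Remove the column with the smallest importance in this round
        weakest = int(np.argmin(model.feature_importances_))
        remaining.pop(weakest)
    return remaining


# Synthetic data (assumption): 10 features, of which 3 carry signal
X, y = make_regression(n_samples=150, n_features=10, n_informative=3, random_state=0)
selected = manual_rfe(X, y, n_features_to_select=4)
print("Columns kept:", selected)
```

Sklearn's RFE class wraps this kind of loop and also records the round in which each feature was dropped, which is what the ranking reflects.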
In the example below we use a Random Forest Regression algorithm. Once initialised, it is passed to RFE along with the number of features we want to select, in this case 8. After this, as with any other Sklearn estimator, we can fit the model and make predictions by calling fit and predict.
Lastly, we combine the feature names from our training data and the ranking array from RFE (rfe.ranking_) into a dataframe and print this to analyse how RFE has ranked the features. A ranking of 1 marks a feature that was selected.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE

# X_train, y_train and X_test are assumed to exist already,
# with X_train as a pandas DataFrame
rf = RandomForestRegressor(random_state=101)

# Wrap the model in RFE and keep the 8 top-ranked features
rfe = RFE(rf, n_features_to_select=8)
rfe = rfe.fit(X_train, y_train)

# Predict using the model fitted on the selected features
predictions = rfe.predict(X_test)

# Print feature rankings
feature_rankings = pd.DataFrame({'feature_names': np.array(X_train.columns), 'feature_ranking': rfe.ranking_})
print(feature_rankings)
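The snippet above relies on X_train, y_train and X_test being defined elsewhere. For a version that runs on its own, the sketch below builds a synthetic dataset with make_regression and splits it with train_test_split. The data, the column names (feature_0, feature_1, ...) and the reduced n_estimators value are assumptions made purely for illustration. It also uses rfe.support_, the boolean mask of selected features, to list the chosen columns.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFE
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (assumption): 12 features with made-up names
X_raw, y = make_regression(n_samples=200, n_features=12, n_informative=5, random_state=101)
X = pd.DataFrame(X_raw, columns=[f"feature_{i}" for i in range(X_raw.shape[1])])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=101)

# A small forest keeps the repeated refits quick
rf = RandomForestRegressor(n_estimators=25, random_state=101)
rfe = RFE(rf, n_features_to_select=8)
rfe.fit(X_train, y_train)

predictions = rfe.predict(X_test)

# support_ is True for each selected feature; ranking_ is 1 for the same features
selected_columns = list(X_train.columns[rfe.support_])
feature_rankings = pd.DataFrame(
    {"feature_names": X_train.columns, "feature_ranking": rfe.ranking_}
).sort_values("feature_ranking")

print(selected_columns)
print(feature_rankings)
```

Sorting by feature_ranking puts the selected features first and the earliest-eliminated features last, which makes the table easier to scan.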