Training a Model Using Stratified K-Fold
Python
Trains a logistic regression model using stratified 5-fold cross-validation and prints the accuracy for each fold. For a full tutorial on this method, visit www.analyseup.com
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold

df = pd.read_csv('data/processed_data.csv')

# Stratify on the target so each fold keeps the same class proportions
skf = StratifiedKFold(n_splits=5)
target = df.loc[:, 'Returned_Units']

model = LogisticRegression()

def train_model(train, test, fold_no):
    # Feature columns and target column
    X = ['Retail_Price', 'Discount']
    y = 'Returned_Units'  # a plain string, so train[y] is a 1-D Series
    X_train = train[X]
    y_train = train[y]
    X_test = test[X]
    y_test = test[y]
    # Refit the model on this fold's training rows, then score on the held-out rows
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    print('Fold', fold_no, 'Accuracy:', accuracy_score(y_test, predictions))

fold_no = 1
for train_index, test_index in skf.split(df, target):
    # split() yields positional indices, so select rows with iloc rather than loc
    train = df.iloc[train_index, :]
    test = df.iloc[test_index, :]
    train_model(train, test, fold_no)
    fold_no += 1
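The same per-fold evaluation can also be written more compactly with scikit-learn's cross_val_score. The following is a minimal sketch, not the tutorial's own code: it assumes a binary Returned_Units target and uses synthetic data in place of data/processed_data.csv (the generated values, the shuffle and random_state settings, and max_iter=1000 are all assumptions made for illustration).

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for data/processed_data.csv (values are invented for illustration)
rng = np.random.default_rng(0)
n_rows = 200
df = pd.DataFrame({
    'Retail_Price': rng.uniform(5, 100, n_rows),
    'Discount': rng.uniform(0, 0.5, n_rows),
})
# Assumption: a binary target where larger discounts make a return more likely
noise = rng.normal(0, 0.15, n_rows)
df['Returned_Units'] = (df['Discount'] + noise > 0.25).astype(int)

X = df[['Retail_Price', 'Discount']]
y = df['Returned_Units']

# Shuffling with a fixed seed keeps the folds reproducible
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# cross_val_score fits a fresh clone of the model on each training split
# and returns one accuracy value per held-out fold
scores = cross_val_score(
    LogisticRegression(max_iter=1000), X, y, cv=skf, scoring='accuracy'
)

for fold_no, score in enumerate(scores, start=1):
    print('Fold', fold_no, 'Accuracy:', score)
print('Mean accuracy:', scores.mean())
```

Because cross_val_score clones the estimator for every fold, no fitted state carries over between folds, and the mean of the returned scores gives a single summary figure.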