Logistic Regression Using Gradient Descent from Scratch

In this code snippet we implement logistic regression from scratch, using gradient descent to optimise the feature weights.

As we intend to build a logistic regression model, we will use the sigmoid function as our hypothesis. Writing the features as x0, x1... and their associated weights (which will be found using gradient descent) as w0, w1..., the linear function of the features is g(x) = w0x0 + w1x1 + ..., and the hypothesis is the sigmoid of this: h(x) = 1 / (1 + e^(-g(x))).
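As a quick illustration (with made-up feature values and weights), the sigmoid maps any real-valued g(x) into the interval (0, 1), which lets us interpret the output as a probability:

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the open interval (0, 1)
    return 1 / (1 + np.exp(-z))

# g(x) = w0*x0 + w1*x1 for one example with x = [1, 2] and w = [0.5, -0.25]
g = 0.5 * 1 + (-0.25) * 2   # = 0.0
p = sigmoid(g)              # sigmoid(0) = 0.5
```
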

import numpy as np

Note: X_train, y_train, X_test and y_test are NumPy arrays

Create Constant Term Feature:
Insert a column of 1s as the constant-term feature x0 (its weight w0 acts as the intercept).
X_0 = np.ones((len(X_train),1))
X_train = np.insert(X_train,[0],X_0, axis=1)
X_0 = np.ones((len(X_test),1))
X_test = np.insert(X_test,[0],X_0, axis=1)
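To see what this step does, here is a self-contained sketch on a small made-up feature matrix: np.insert with axis=1 prepends the column of ones, so each row gains a constant feature x0 = 1.

```python
import numpy as np

# Toy feature matrix: 3 examples, 2 features (illustrative values only)
X_demo = np.array([[2.0, 3.0],
                   [4.0, 5.0],
                   [6.0, 7.0]])
ones = np.ones((len(X_demo), 1))
# Prepend the column of ones at index 0 along the feature axis
X_demo = np.insert(X_demo, [0], ones, axis=1)
# X_demo is now 3x3, with the first column all ones (the constant feature x0)
```
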

Hypothesis Function:
Function that returns an array that is the result of applying 
the sigmoid function to the dot product of the training features 
array and array of associated weights.

def hypothesis(X,w):
    #Ensure w is a column vector (one weight per feature)
    w = np.array(w).reshape((-1,1))
    z =,w)
    h = 1 / (1 + np.exp(-1 * z))
    #Clip probabilities away from exactly 0 and 1 so the logs in the cost stay finite
    h = np.clip(h, 0.00001, 0.99999)
    return h

Cost Function:
Function that returns the cost for regularised logistic regression 
given training features X, target y, feature weights w, number of 
training examples m and regularisation term r

def cost(X,y,w,m,r):
    h = hypothesis(X,w)
    #Regularisation conventionally excludes the constant-term weight w0
    c = (-1/m)*np.sum(y*np.log(h) + (1-y)*np.log(1-h)) + (r/(2*m))*np.sum(np.square(w[1:]))
    return c

Model Training Using Gradient Descent Function:
Function to train the model on training features X and target y, given 
initial weights w, number of training examples m, learning rate lr, 
regularisation term r, and e number of epochs. The cost of each epoch 
is printed if verbose is set to True; otherwise only the model summary 
is printed after training is complete.

def train_model(X, y, w, m, lr, e, r, verbose=False):
    #Get cost for initial weights and set as current minimum cost
    c = cost(X,y,w,m,r)
    cost_min = c
    #Set current optimum weights to initial weights
    optimum_weight = w
    epoch_min = 1
    #Perform gradient descent for e number of epochs
    for i in range(1,e+1):
        #Initialise empty weights list for epoch i
        w_epoch = []
        #Hypothesis for the current weights (identical for every feature update)
        h = hypothesis(X,w)
        #Calculate new weights for each feature j 
        for j in range(0,len(w)):
            #Gradient step for weight j; the constant-term weight (j = 0)
            #is not regularised
            reg = (r/m)*w[j] if j > 0 else 0
            w_j = w[j] - lr*((1/m)*np.sum((h - y)*X[:,[j]]) + reg)
            #Append new weight to list of weights for epoch i
            w_epoch.append(w_j)
        #Assign epoch i weights as model weights   
        w = np.array(w_epoch)
        #Calculate cost for new weights derived in epoch i
        c = cost(X,y,w,m,r)
        #If cost for the weights derived in this epoch are lower than the previous
        #lowest cost then set optimum_weight to this and min_cost to the cost
        if c < cost_min:
            optimum_weight = w
            cost_min = c
            epoch_min = i
        #Print cost of epoch if verbose set to True
        if verbose:
            print('epoch ' + str(i) + ': Cost=' + str(c))  
    #Print model summary
    print('Final Summary:')
    print('Min cost: ' + str(cost_min))
    print('Minimum found at epoch ' + str(epoch_min))
    print('Optimum weights: ' + str(optimum_weight))
    #Return feature weights
    return optimum_weight

Train Model

learning_rate = 0.1
reg_term = 0.2
weights = np.zeros((X_train.shape[1],1)) #One weight per feature, including the constant term
epochs = 500
m = len(X_train)
model = train_model(X_train, y_train, weights, m, learning_rate, epochs, reg_term) 

Predict Target for Training and Test Sets

def pred(X,w):
    h = hypothesis(X,w)
    return h
y_train_pred = pred(X_train,model)
y_train_pred = np.round(y_train_pred)
y_test_pred = pred(X_test,model)
y_test_pred = np.round(y_test_pred)
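With the rounded predictions in hand, accuracy is simply the fraction of labels that match. A self-contained sketch with made-up ground-truth and prediction arrays:

```python
import numpy as np

# Hypothetical labels and rounded predictions (illustrative values only)
y_true = np.array([[1.0], [0.0], [1.0], [1.0]])
y_hat  = np.array([[1.0], [0.0], [0.0], [1.0]])
# Elementwise comparison gives booleans; the mean is the accuracy
accuracy = (y_true == y_hat).mean()  # 3 of 4 correct -> 0.75
```
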

By detro - Last Updated Dec. 5, 2021, 8:48 p.m.
