Logistic Regression Using Gradient Descent from Scratch

In this code snippet we implement logistic regression from scratch, using gradient descent to optimise the feature weights.

As we intend to build a logistic regression model, we will use the sigmoid function as our hypothesis. Writing the features as x0, x1... and their associated weights (which will be found using gradient descent) as w0, w1..., the linear function of the features is g(x) = w0x0 + w1x1 + ..., and the hypothesis is the sigmoid of this: h(x) = 1 / (1 + e^(-g(x))).
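As a quick illustration (with made-up feature values and weights), the sigmoid maps any real-valued g(x) into the interval (0, 1), which lets us interpret the output as a probability:

```python
import numpy as np

def sigmoid(z):
    # Squashes any real number into the open interval (0, 1)
    return 1 / (1 + np.exp(-z))

# g(x) = w0*x0 + w1*x1 for one example with x = [1, 2] and w = [0.5, -0.25]
g = 0.5 * 1 + (-0.25) * 2   # = 0.0
p = sigmoid(g)              # sigmoid(0) = 0.5
```
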

import numpy as np

Note: X_train, y_train, X_test and y_test are NumPy arrays

Create Constant Term Feature:
Insert a column of 1s as the constant-term feature x0 (its weight w0 acts as the intercept).
X_0 = np.ones((len(X_train),1))
X_train = np.insert(X_train,[0],X_0, axis=1)
X_0 = np.ones((len(X_test),1))
X_test = np.insert(X_test,[0],X_0, axis=1)
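To see what this step does, here is a self-contained sketch on a small made-up feature matrix: np.insert with axis=1 prepends the column of ones, so each row gains a constant feature x0 = 1.

```python
import numpy as np

# Toy feature matrix: 3 examples, 2 features (illustrative values only)
X_demo = np.array([[2.0, 3.0],
                   [4.0, 5.0],
                   [6.0, 7.0]])
ones = np.ones((len(X_demo), 1))
# Prepend the column of ones at index 0 along the feature axis
X_demo = np.insert(X_demo, [0], ones, axis=1)
# X_demo is now 3x3, with the first column all ones (the constant feature x0)
```
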

Hypothesis Function:
Function that returns an array that is the result of applying 
the sigmoid function to the dot product of the training features 
array and array of associated weights.

def hypothesis(X,w):
    #Ensure w is a column vector (one weight per feature)
    w = np.array(w).reshape((-1,1))
    z =,w)
    h = 1 / (1 + np.exp(-1 * z))
    #Clip probabilities away from exactly 0 and 1 so the logs in the cost stay finite
    h = np.clip(h, 0.00001, 0.99999)
    return h

Cost Function:
Function that returns the cost for regularised logistic regression 
given training features X, target y, feature weights w, number of 
training examples m and regularisation term r

def cost(X,y,w,m,r):
    h = hypothesis(X,w)
    #Regularisation conventionally excludes the constant-term weight w0
    c = (-1/m)*np.sum(y*np.log(h) + (1-y)*np.log(1-h)) + (r/(2*m))*np.sum(np.square(w[1:]))
    return c

Model Training Using Gradient Descent Function:
Function to train the model on training features X and target y, given 
initial weights w, number of training examples m, learning rate lr, 
regularisation term r, and e number of epochs. The cost of each epoch 
is printed if verbose is set to True; otherwise only the model summary 
is printed after training is complete.

def train_model(X, y, w, m, lr, e, r, verbose=False):
    #Get cost for initial weights and set as current minimum cost
    c = cost(X,y,w,m,r)
    cost_min = c
    #Set current optimum weights to initial weights
    optimum_weight = w
    epoch_min = 1
    #Perform gradient descent for e number of epochs
    for i in range(1,e+1):
        #Initialise empty weights list for epoch i
        w_epoch = []
        #Hypothesis for the current weights (identical for every feature update)
        h = hypothesis(X,w)
        #Calculate new weights for each feature j 
        for j in range(0,len(w)):
            #Gradient step for weight j; the constant-term weight (j = 0)
            #is not regularised
            reg = (r/m)*w[j] if j > 0 else 0
            w_j = w[j] - lr*((1/m)*np.sum((h - y)*X[:,[j]]) + reg)
            #Append new weight to list of weights for epoch i
            w_epoch.append(w_j)
        #Assign epoch i weights as model weights   
        w = np.array(w_epoch)
        #Calculate cost for new weights derived in epoch i
        c = cost(X,y,w,m,r)
        #If cost for the weights derived in this epoch are lower than the previous
        #lowest cost then set optimum_weight to this and min_cost to the cost
        if c < cost_min:
            optimum_weight = w
            cost_min = c
            epoch_min = i
        #Print cost of epoch if verbose set to True
        if verbose:
            print('epoch ' + str(i) + ': Cost=' + str(c))  
    #Print model summary
    print('Final Summary:')
    print('Min cost: ' + str(cost_min))
    print('Minimum found at epoch ' + str(epoch_min))
    print('Optimum weights: ' + str(optimum_weight))
    #Return feature weights
    return optimum_weight

Train Model

learning_rate = 0.1
reg_term = 0.2
weights = np.zeros((X_train.shape[1],1)) #One weight per feature, including the constant term
epochs = 500
m = len(X_train)
model = train_model(X_train, y_train, weights, m, learning_rate, epochs, reg_term) 

Predict Target for Training and Test Sets

def pred(X,w):
    h = hypothesis(X,w)
    return h
y_train_pred = pred(X_train,model)
y_train_pred = np.round(y_train_pred)
y_test_pred = pred(X_test,model)
y_test_pred = np.round(y_test_pred)
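With the rounded predictions in hand, accuracy is simply the fraction of labels that match. A self-contained sketch with made-up ground-truth and prediction arrays:

```python
import numpy as np

# Hypothetical labels and rounded predictions (illustrative values only)
y_true = np.array([[1.0], [0.0], [1.0], [1.0]])
y_hat  = np.array([[1.0], [0.0], [0.0], [1.0]])
# Elementwise comparison gives booleans; the mean is the accuracy
accuracy = (y_true == y_hat).mean()  # 3 of 4 correct -> 0.75
```
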

By detro - Last Updated Dec. 5, 2021, 8:48 p.m.
