Regression is a branch of machine learning, a field that helps solve tasks which can't be explicitly programmed.
There are various techniques used in machine learning, including supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning.
Supervised learning is one of the most popular of these, since it is easy to understand, relatively easy to implement, and yields relevant outputs.
Consider this example: how does a child learn? It is taught how to walk, run, and talk, and it is made to understand the difference between walking and running.
Supervised learning works in a similar way: human supervision is involved in the form of labelled features, feedback on the data (whether the prediction was correct and, if not, what the right prediction should have been), and so on.
Once the algorithm has been fully trained on such data, it can predict outputs for never-before-seen inputs, in line with the data the model was trained on, with good accuracy. It is also understood as a task-oriented approach, since it focuses on a single task and is trained on a huge number of examples until it predicts the output accurately.
Supervised learning algorithms can be divided into regression and classification problems. Regression problems include linear regression and logistic regression, while classification problems include multi-class classification, decision trees, and much more.
A regression problem means the model yields a real, i.e. continuous, value. The simplest model used to predict continuous variables is Linear Regression.
Linear Regression is an approach/algorithm that establishes a linear relationship between a dependent variable and an independent variable.
As the name indicates, the process is linear: in its simplest form it is two-dimensional, i.e. it involves two variables. These variables take continuous values (in contrast to the 0s and 1s of logistic regression). The word 'regression' refers to finding the relationship between two variables, one of which is the dependent variable and the other the independent variable.
In simple words, it goes like this: we are given a basic linear equation, say y = 3x - 1. Here 'y' is the dependent variable (since it depends on the value of x) and 'x' (trivially) is the independent variable. As 'x' changes, the value of 'y' changes according to the linear equation above. Different values for 'x' are supplied, which yields various values for 'y', shown in the table below:
| x | y |
|---|---|
| 1 | 2 |
| 2 | 5 |
| 3 | 8 |
| 4 | 11 |
| 5 | 14 |
| 6 | 17 |
| 7 | 20 |
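As a quick check, these table values can be reproduced with a couple of lines of Python:

```python
# Reproduce the table above from the linear equation y = 3x - 1
for x in range(1, 8):
    print(x, 3 * x - 1)
```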
These values are plotted on a graph, and we try to fit a straight line through all of these points (or as many as possible). While fitting, the line is chosen so that the points' vertical distances from it are as small as possible. Some points don't fall on the straight line, since they don't contribute to forming it; these are the ones whose vertical distance from the line isn't minimal.
The idea is to capture the points in the graph with a straight line whose vertical distance from them is minimum. Below is an example illustrating this:
When the points that don't fit the straight line outnumber the ones that do, the 'prediction error' is considered large. The 'error' refers to the shortest (vertical) distance between the line and a point.
From the above graph, it can be observed that points 1, 2, 3, and 4, counting from the bottom-left corner, don't really fit the line and don't contribute to forming it.
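To make the idea of vertical distances concrete, here is a minimal sketch; the data points are made up (slightly noisy values around y = 3x - 1), not the exact points from the graph above:

```python
import numpy as np

# Hypothetical noisy observations scattered around the line y = 3x - 1
x = np.array([1, 2, 3, 4, 5, 6, 7])
y = np.array([2.5, 4.2, 8.8, 10.5, 14.9, 16.4, 20.3])

# Fit a straight line y = slope * x + intercept by least squares
slope, intercept = np.polyfit(x, y, deg=1)

# Residuals: the vertical distance of each point from the fitted line
residuals = y - (slope * x + intercept)
print("slope:", slope, "intercept:", intercept)
print("residuals:", residuals)
```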
When such a linear regression model is trained, a 'cost function' is computed that measures the Root Mean Squared Error, or RMSE for short. RMSE captures the differences between the predicted values and the actual values: the differences are squared (which removes any negative signs), averaged (i.e. divided by the total number of observations), and finally the square root of the result is taken.
The result is a single number that indicates how well the regression algorithm predicts the output for a given input, i.e. how close the prediction is to the actual output. The cost function needs to be minimal, corresponding to a minimum difference between the actual and predicted values.
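The computation described above can be written out directly. In this sketch, `y_actual` and `y_predicted` are assumed to be NumPy arrays of the same length, and the prediction values are hypothetical:

```python
import numpy as np

def rmse(y_actual, y_predicted):
    # Square the errors (removing negative signs), average them,
    # then take the square root
    return np.sqrt(np.mean((y_actual - y_predicted) ** 2))

# Example: observed values vs. hypothetical predictions from a fitted line
y_actual = np.array([2.0, 5.0, 8.0, 11.0, 14.0, 17.0, 20.0])
y_predicted = np.array([2.1, 4.8, 8.3, 10.9, 14.2, 16.7, 19.8])
print(rmse(y_actual, y_predicted))
```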
Logistic Regression is a supervised classification algorithm that is used to differentiate between different events or classes, for example, filtering spam emails, classifying a transaction as legitimate or fraudulent, and much more. The variable in question is classified as 0 or 1, True or False, Yes or No, depending on the input.
It is a regression technique used to build a model that predicts the probability of a data item belonging to a particular category. Logistic Regression uses the 'sigmoid' function, defined below:
g(z) = 1 / (1 + e^(-z))
Note: the output of Logistic Regression always lies between 0 and 1; it can be neither greater than 1 nor less than 0.
Logistic regression becomes a classification method once a decision threshold comes into play.
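As an illustration, the sigmoid and a decision threshold can be sketched in a few lines; the 0.5 cut-off used here is a common convention, not a requirement:

```python
import numpy as np

def sigmoid(z):
    # Maps any real number into the open interval (0, 1)
    return 1 / (1 + np.exp(-z))

z = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])
probs = sigmoid(z)

# A decision threshold of 0.5 turns probabilities into class labels
labels = (probs >= 0.5).astype(int)
print(probs)   # values strictly between 0 and 1
print(labels)  # 0 or 1
```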
Other types of regression include polynomial, ridge, and lasso regression, among others.
Plotted, the sigmoid/logistic function produces the characteristic S-shaped curve, flattening towards 0 for large negative inputs and towards 1 for large positive inputs.
Logistic Regression can be implemented from scratch, without using the scikit-learn module, as shown below.
```python
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import scipy.optimize as opt


def data_loading(path, header):
    marks_data_frame = pd.read_csv(path, header=header)
    return marks_data_frame


def sigmoid(x):
    # Activation function that maps a real value into the range (0, 1)
    return 1 / (1 + np.exp(-x))


def total_input(theta, x):
    # Computes the weighted sum of inputs
    return np.dot(x, theta)


def probability(theta, x):
    # Returns the probability after passing the weighted sum
    # through the sigmoid function
    return sigmoid(total_input(theta, x))


def cost_function(theta, x, y):
    # Computes the cost over all the training samples
    m = x.shape[0]
    total_cost = -(1 / m) * np.sum(
        y * np.log(probability(theta, x))
        + (1 - y) * np.log(1 - probability(theta, x)))
    return total_cost


def gradient(theta, x, y):
    # Computes the gradient of the cost function at the point theta
    m = x.shape[0]
    return (1 / m) * np.dot(x.T, sigmoid(total_input(theta, x)) - y)


def fit(x, y, theta):
    opt_weights = opt.fmin_tnc(func=cost_function, x0=theta,
                               fprime=gradient, args=(x, y.flatten()))
    return opt_weights[0]


def predict(x, parameters):
    theta = parameters[:, np.newaxis]
    return probability(theta, x)


def accuracy(x, actual_classes, parameters, prob_threshold=0.5):
    predicted_classes = (predict(x, parameters) >= prob_threshold).astype(int)
    predicted_classes = predicted_classes.flatten()
    return np.mean(predicted_classes == actual_classes) * 100


if __name__ == "__main__":
    # load data from the file
    data = data_loading("path to marks.csv file", None)
    # X = feature values, all columns except the last one
    X_data = data.iloc[:, :-1]
    # y = target values, last column of the data frame
    y_data = data.iloc[:, -1]

    # filter out the applicants who were eligible
    admitted = data.loc[y_data == 1]
    # filter out the applicants who weren't eligible
    not_admitted = data.loc[y_data == 0]

    # plot the insights
    plt.scatter(admitted.iloc[:, 0], admitted.iloc[:, 1], s=10,
                label='Eligible')
    plt.scatter(not_admitted.iloc[:, 0], not_admitted.iloc[:, 1], s=10,
                label='Not eligible')
    plt.legend()
    plt.show()

    # prepend a column of ones so theta[0] acts as the intercept
    X_data = np.c_[np.ones((X_data.shape[0], 1)), X_data]
    y_data = y_data.values[:, np.newaxis]
    theta = np.zeros((X_data.shape[1], 1))

    parameters = fit(X_data, y_data, theta)

    # plot the decision boundary
    x_values = [np.min(X_data[:, 1] - 5), np.max(X_data[:, 2] + 5)]
    y_values = -(parameters[0] + np.dot(parameters[1], x_values)) / parameters[2]
    plt.plot(x_values, y_values, label='Decision Boundary')
    plt.xlabel('Marks in 1st Exam')
    plt.ylabel('Marks in 2nd Exam')
    plt.legend()
    plt.show()

    print(accuracy(X_data, y_data.flatten(), parameters))
```
Output:
88.88888888888889
Logistic Regression implemented using the scikit-learn module
Scikit-learn fits the model using MLE (Maximum Likelihood Estimation), an iterative process: initial (random) weights are assigned to the independent variables, and the weights are updated repeatedly until optimal values are reached, after which further changes to the weights produce little to no change in the output.
```python
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score


def data_loading(path, header):
    marks_data_frame = pd.read_csv(path, header=header)
    return marks_data_frame


if __name__ == "__main__":
    # load data from the file
    data = data_loading("path-to-marks.csv file", None)
    # X = feature values, all columns except the last one
    X_data = data.iloc[:, :-1]
    # y = target values, last column of the data frame
    y_data = data.iloc[:, -1]

    # filter out applicants who are eligible
    admitted = data.loc[y_data == 1]
    # filter out applicants who aren't eligible
    not_admitted = data.loc[y_data == 0]

    # plot the insights
    plt.scatter(admitted.iloc[:, 0], admitted.iloc[:, 1], s=10,
                label='Eligible')
    plt.scatter(not_admitted.iloc[:, 0], not_admitted.iloc[:, 1], s=10,
                label='Not eligible')
    plt.legend()
    plt.show()

    # fit the model and measure its accuracy on the training data
    model = LogisticRegression()
    model.fit(X_data, y_data)
    predicted_classes = model.predict(X_data)
    accuracy = accuracy_score(y_data, predicted_classes)
    parameters = model.coef_
    print(accuracy)
```
Output:
Applications of logistic regression include filtering spam emails, classifying transactions as legitimate or fraudulent, and other binary yes/no decisions.
In this post, we looked at what Logistic Regression means and at its Python implementation, both from scratch and using the scikit-learn library.