Before jumping into the implementation details of Logistic Regression using PyTorch, it is essential to understand what Logistic Regression is, what PyTorch is, how the two work together, and what kind of output and predictions the resulting model produces.
Logistic Regression is a supervised classification algorithm used to differentiate between discrete events or values, for example, filtering spam emails or classifying a transaction as legitimate or fraudulent. Depending on the input, the variable in question is classified as 0 or 1, True or False, Yes or No.
It is a regression-based model that predicts the probability of a data item belonging to a certain category. Logistic Regression uses the 'sigmoid' function, defined below:
g(z) = 1 / (1 + e^(-z))
Plotted, the sigmoid (logistic) function is an S-shaped curve that rises smoothly from 0 toward 1 as z increases.
Note: The output of Logistic Regression always lies between 0 and 1; it can never be greater than 1 or less than 0.
Logistic regression becomes a classification method once a decision threshold comes into play: predicted probabilities above the threshold are assigned to one class, and the rest to the other.
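A minimal sketch of this idea in PyTorch, assuming a commonly used threshold of 0.5 (the input values here are made up for illustration):

import torch

# Raw model scores (logits) for three hypothetical inputs
logits = torch.tensor([-2.0, 0.0, 3.0])

# The sigmoid squashes each score into a probability in (0, 1)
probabilities = torch.sigmoid(logits)
print(probabilities)  # tensor([0.1192, 0.5000, 0.9526])

# Applying the 0.5 decision threshold turns the probabilities
# into hard 0/1 class labels
predictions = (probabilities > 0.5).long()
print(predictions)    # tensor([0, 0, 1])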
PyTorch is an open-source machine learning library developed and maintained by social media giant Facebook. It is based on the Torch library, an open-source scientific computing framework built on the Lua scripting language, which is no longer under active development; PyTorch was created to carry that work forward. It is widely used for building deep-learning models and for natural language processing (NLP) tasks, since it offers first-class Python support, an easy-to-use API, and dynamic ("on-the-go") computational graphs. It contains multiple building blocks that can be combined in Python to build interesting applications and solve real-life problems. It also comes with CUDA support, which delivers higher speed by allowing computations to run on a GPU; this feature is optional and can be ignored depending on our requirements.
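For example, a typical (optional) device-selection snippet looks like the following; the rest of this post runs fine on the CPU:

import torch

# Pick the GPU when CUDA is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
x = torch.randn(3, 3).to(device)  # move a tensor to the chosen device
print(x.device)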
Now, let us dive into implementing Logistic Regression using PyTorch.
Implementing Logistic Regression using PyTorch to classify the MNIST dataset
The MNIST dataset is first downloaded into the ./data folder. The hyperparameters are then initialized and the dataset is wrapped in data loaders. Once this is done, the Logistic Regression model is defined and instantiated. Finally, the model is trained for 5 epochs and evaluated on the MNIST test set.
import torch
import torch.nn as nn
import torchvision.datasets as dsets
import torchvision.transforms as transforms

# Downloading the MNIST dataset
train_dataset = dsets.MNIST(root='./data',
                            train=True,
                            transform=transforms.ToTensor(),
                            download=True)

test_dataset = dsets.MNIST(root='./data',
                           train=False,
                           transform=transforms.ToTensor())

Initializing the hyperparameters (these must come before the data loaders, which use batch_size)

input_size = 784       # each 28 x 28 image is flattened into a 784-dimensional vector
num_classes = 10       # digits 0-9
num_epochs = 5
batch_size = 100
learning_rate = 0.001

Loading the dataset

train_loader = torch.utils.data.DataLoader(dataset=train_dataset,
                                           batch_size=batch_size,
                                           shuffle=True)

test_loader = torch.utils.data.DataLoader(dataset=test_dataset,
                                          batch_size=batch_size,
                                          shuffle=False)
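As a quick, optional sanity check (not part of the original listing), one batch can be pulled from the train loader to confirm its shape:

images, labels = next(iter(train_loader))
print(images.shape)  # torch.Size([100, 1, 28, 28]) - one batch of images
print(labels.shape)  # torch.Size([100])            - one label per image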
Model definition
class LogisticRegression(nn.Module):
    def __init__(self, input_size, num_classes):
        super(LogisticRegression, self).__init__()
        # A single linear layer: logistic regression is a linear model
        self.linear = nn.Linear(input_size, num_classes)

    def forward(self, x):
        # Return raw logits; the softmax is applied inside the loss function
        out = self.linear(x)
        return out

model = LogisticRegression(input_size, num_classes)
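As an optional sanity check, a dummy flattened image can be passed through the model; it should produce one logit per digit class:

dummy = torch.randn(1, input_size)   # one fake flattened 28 x 28 image
print(model(dummy).shape)            # torch.Size([1, 10]) - one logit per digit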
Defining the loss function and the optimizer

# CrossEntropyLoss combines LogSoftmax and NLLLoss, which is why the
# model outputs raw logits rather than probabilities
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=learning_rate)
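To see how this loss behaves, here is a small illustrative sketch with made-up logits and integer class labels:

dummy_logits = torch.randn(2, num_classes)   # fake scores for 2 samples
dummy_labels = torch.tensor([3, 7])          # fake target classes
print(criterion(dummy_logits, dummy_labels)) # a single scalar loss tensor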
Training the model
for epoch in range(num_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Flatten each 28 x 28 image into a 784-dimensional vector
        images = images.view(-1, 28 * 28)

        # Forward + Backward + Optimize
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        if (i + 1) % 100 == 0:
            print('Epoch: [% d/% d], Step: [% d/% d], Loss: %.4f'
                  % (epoch + 1, num_epochs, i + 1,
                     len(train_dataset) // batch_size, loss.item()))

Testing the model

correct = 0
total = 0
with torch.no_grad():  # no gradients are needed during evaluation
    for images, labels in test_loader:
        images = images.view(-1, 28 * 28)
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)  # index of the highest logit
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the model on test images: % d %%' % (100 * correct / total))
Output:
Epoch: [ 1/ 5], Step: [ 100/ 600], Loss: 2.1282
Epoch: [ 1/ 5], Step: [ 200/ 600], Loss: 2.0498
Epoch: [ 1/ 5], Step: [ 300/ 600], Loss: 1.9539
Epoch: [ 1/ 5], Step: [ 400/ 600], Loss: 1.8876
Epoch: [ 1/ 5], Step: [ 500/ 600], Loss: 1.8286
Epoch: [ 1/ 5], Step: [ 600/ 600], Loss: 1.8078
Epoch: [ 2/ 5], Step: [ 100/ 600], Loss: 1.6117
Epoch: [ 2/ 5], Step: [ 200/ 600], Loss: 1.6151
Epoch: [ 2/ 5], Step: [ 300/ 600], Loss: 1.5423
Epoch: [ 2/ 5], Step: [ 400/ 600], Loss: 1.5010
Epoch: [ 2/ 5], Step: [ 500/ 600], Loss: 1.4743
Epoch: [ 2/ 5], Step: [ 600/ 600], Loss: 1.3641
Epoch: [ 3/ 5], Step: [ 100/ 600], Loss: 1.4000
Epoch: [ 3/ 5], Step: [ 200/ 600], Loss: 1.4146
Epoch: [ 3/ 5], Step: [ 300/ 600], Loss: 1.4325
Epoch: [ 3/ 5], Step: [ 400/ 600], Loss: 1.2283
Epoch: [ 3/ 5], Step: [ 500/ 600], Loss: 1.2623
Epoch: [ 3/ 5], Step: [ 600/ 600], Loss: 1.2492
Epoch: [ 4/ 5], Step: [ 100/ 600], Loss: 1.2188
Epoch: [ 4/ 5], Step: [ 200/ 600], Loss: 1.3165
Epoch: [ 4/ 5], Step: [ 300/ 600], Loss: 1.1442
Epoch: [ 4/ 5], Step: [ 400/ 600], Loss: 1.1946
Epoch: [ 4/ 5], Step: [ 500/ 600], Loss: 1.1096
Epoch: [ 4/ 5], Step: [ 600/ 600], Loss: 1.0626
Epoch: [ 5/ 5], Step: [ 100/ 600], Loss: 1.0550
Epoch: [ 5/ 5], Step: [ 200/ 600], Loss: 1.1386
Epoch: [ 5/ 5], Step: [ 300/ 600], Loss: 1.0494
Epoch: [ 5/ 5], Step: [ 400/ 600], Loss: 0.9888
Epoch: [ 5/ 5], Step: [ 500/ 600], Loss: 0.9876
Epoch: [ 5/ 5], Step: [ 600/ 600], Loss: 1.0121
Accuracy of the model on test images: 82 %
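If the trained model is to be reused later, its parameters can be saved with PyTorch's standard serialization utilities (the file name below is just an example):

# Save the learned parameters to disk
torch.save(model.state_dict(), 'logistic_regression_mnist.pth')

# Later, restore them into a freshly constructed model
restored = LogisticRegression(input_size, num_classes)
restored.load_state_dict(torch.load('logistic_regression_mnist.pth'))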
Conclusion
In this post, we saw how the MNIST handwritten digits dataset can be classified with the help of Logistic Regression implemented in PyTorch.