Welcome to Machinfy Academy
We can make a computer learn to recognize handwritten digits using deep learning. Deep learning is part of a broader family of machine learning methods based on artificial neural networks with representation learning.
In this article, we will develop a handwritten digit classifier from scratch using PyTorch.
Dataset
The dataset we will use is MNIST. The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. The dataset is originally available on Yann LeCun’s website.
Step 1: Import Necessary Packages
# Scientific computing
import numpy as np
# PyTorch packages
import torch
from torch import nn
import torch.optim as optim
import torchvision
from torchvision import datasets, transforms
# Visualization
import matplotlib.pyplot as plt
# Others
import time
import copy
Step 2: Download The Dataset & Preprocessing
Before downloading the data, let us define the transformations that will be applied to the images before feeding them into the pipeline.
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,)),
                                ])
transforms.ToTensor(): converts a PIL Image or numpy.ndarray (H x W x C) in the range [0, 255] to a torch.Tensor (C x H x W) in the range [0.0, 1.0].
transforms.Normalize((0.5,), (0.5,)): normalizes a tensor image with mean and standard deviation. Given mean: (mean[1], ..., mean[n]) and std: (std[1], ..., std[n]) for n channels, this transform normalizes each channel of the input torch.*Tensor, i.e. output[channel] = (input[channel] - mean[channel]) / std[channel].
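To see what the combined transform does, here is a quick sanity check on a small dummy array (this snippet is illustrative and not part of the original code): with mean 0.5 and std 0.5, pixel values 0, 128, and 255 end up roughly at -1, 0, and 1.
# A dummy 1x3 grayscale "image" (H x W x C) with pixel values 0, 128, 255
dummy = np.array([[[0], [128], [255]]], dtype=np.uint8)
print(transform(dummy))
# tensor([[[-1.0000,  0.0039,  1.0000]]]) -- [0, 255] mapped to roughly [-1, 1]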
Next, we download the dataset and apply the transformations:
trainset = datasets.MNIST('data/', download=True, train=True, transform=transform)
valset = datasets.MNIST('data/', download=True, train=False, transform=transform)
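As a quick check (illustrative, not in the original), the dataset sizes and the shape of a single sample match what was described above:
print(len(trainset), len(valset))  # 60000 10000
image, label = trainset[0]
print(image.shape)                 # torch.Size([1, 28, 28]) -- one grayscale channel, 28x28 pixels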
A DataLoader shuffles the data and provides an iterable over the dataset in mini-batches.
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)
valloader = torch.utils.data.DataLoader(valset, batch_size=64, shuffle=True)
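The train_model function later in the article expects the loaders grouped in a dictionary, along with the dataset sizes and class names. Those helper variables are not defined in the original snippets, so here is one way they might be set up (variable names assumed from how they are used later):
# Group the loaders and sizes the way the training loop expects them
dataloaders = {'train': trainloader, 'val': valloader}
dataset_sizes = {'train': len(trainset), 'val': len(valset)}
class_names = trainset.classes  # e.g. ['0 - zero', '1 - one', ..., '9 - nine']

# One batch from the training loader: 64 grayscale 28x28 images and 64 labels
images, labels = next(iter(trainloader))
print(images.shape, labels.shape)  # torch.Size([64, 1, 28, 28]) torch.Size([64])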
Step 3: Building The Neural Network
PyTorch’s torch.nn module allows us to build the network very simply, and the resulting code is easy to understand. Look at the code below.
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # First conv block: 1 input channel (grayscale) -> 32 feature maps
        self.conv_block1 = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(3, stride=2),
        )
        # Second conv block: 32 -> 64 feature maps
        self.conv_block2 = nn.Sequential(
            nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1),
            nn.BatchNorm2d(64),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(3, stride=2),
        )
        # Fully connected classifier head: 2304 flattened features -> 10 digit classes
        self.fcs = nn.Sequential(
            nn.Linear(2304, 1152),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(1152, 576),
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),
            nn.Linear(576, 10)
        )

    def forward(self, x):
        x = self.conv_block1(x)
        x = self.conv_block2(x)
        x = x.reshape(x.shape[0], -1)  # flatten to (batch_size, 2304)
        x = self.fcs(x)
        return x
model = Net()
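The 2304 in the first fully connected layer is the number of features after flattening: each MaxPool2d(3, stride=2) shrinks the feature maps from 28x28 to 13x13 and then to 6x6, so the second block outputs 64 x 6 x 6 = 2304 values per image. A quick shape check (an illustrative sketch, not part of the original code) confirms this:
check = Net().eval()
with torch.no_grad():
    dummy = torch.zeros(1, 1, 28, 28)                     # one fake grayscale 28x28 image
    feats = check.conv_block2(check.conv_block1(dummy))
    print(feats.shape)                                    # torch.Size([1, 64, 6, 6])
    print(check(dummy).shape)                             # torch.Size([1, 10]) -- one score per digit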
We use Adam as the optimization algorithm and cross-entropy loss (nn.CrossEntropyLoss, which applies softmax internally) as the loss function.
optimizer = optim.Adam(params=model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()
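The training function below also uses a device and a learning-rate scheduler (exp_lr_scheduler) that the original snippets do not define. A minimal setup, assuming a StepLR schedule (the exact scheduler settings here are an assumption, not taken from the article):
from torch.optim import lr_scheduler

# Use the GPU if one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)

# Assumed schedule: decay the learning rate by 10x every 7 epochs
exp_lr_scheduler = lr_scheduler.StepLR(optimizer, step_size=7, gamma=0.1)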
Step 4: Training and Evaluating
The train_model function below runs the training and validation loop and keeps the weights of the best-performing model.
def train_model(model, criterion, optimizer, scheduler, dataset_sizes, dataloaders, num_epochs=7):
    since = time.time()

    best_model_wts = copy.deepcopy(model.state_dict())
    best_acc = 0.0

    for epoch in range(num_epochs):
        print('Epoch {}/{}'.format(epoch, num_epochs - 1))
        print('-' * 10)

        # Each epoch has a training and validation phase
        for phase in ['train', 'val']:
            if phase == 'train':
                model.train()  # Set model to training mode
            else:
                model.eval()   # Set model to evaluate mode

            running_loss = 0.0
            running_corrects = 0

            # Iterate over data.
            for inputs, labels in dataloaders[phase]:
                inputs = inputs.to(device)
                labels = labels.to(device)

                # zero the parameter gradients
                optimizer.zero_grad()

                # forward
                # track history only if in train
                with torch.set_grad_enabled(phase == 'train'):
                    outputs = model(inputs)
                    _, preds = torch.max(outputs, 1)
                    loss = criterion(outputs, labels)

                    # backward + optimize only if in training phase
                    if phase == 'train':
                        loss.backward()
                        optimizer.step()

                # statistics
                running_loss += loss.item() * inputs.size(0)
                running_corrects += torch.sum(preds == labels.data)

            if phase == 'train':
                scheduler.step()

            epoch_loss = running_loss / dataset_sizes[phase]
            epoch_acc = running_corrects.double() / dataset_sizes[phase]

            print('{} Loss: {:.4f} Acc: {:.4f}'.format(phase, epoch_loss, epoch_acc))

            # deep copy the model
            if phase == 'val' and epoch_acc > best_acc:
                best_acc = epoch_acc
                best_model_wts = copy.deepcopy(model.state_dict())

        print()

    time_elapsed = time.time() - since
    print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))
    print('Best val Acc: {:4f}'.format(best_acc))

    # load best model weights
    model.load_state_dict(best_model_wts)
    return model
model_ft = train_model(model, criterion, optimizer, exp_lr_scheduler, dataset_sizes, dataloaders, num_epochs=7)
Let’s look at the results. Our model reaches a validation accuracy above 99%:
Epoch 0/6
----------
train Loss: 0.1817 Acc: 0.9446
val Loss: 0.0368 Acc: 0.9890
Epoch 1/6
----------
train Loss: 0.0724 Acc: 0.9808
val Loss: 0.0290 Acc: 0.9914
Epoch 2/6
----------
train Loss: 0.0589 Acc: 0.9841
val Loss: 0.0362 Acc: 0.9909
Epoch 3/6
----------
train Loss: 0.0490 Acc: 0.9873
val Loss: 0.0281 Acc: 0.9925
Epoch 4/6
----------
train Loss: 0.0382 Acc: 0.9897
val Loss: 0.0328 Acc: 0.9907
Epoch 5/6
----------
train Loss: 0.0355 Acc: 0.9907
val Loss: 0.0248 Acc: 0.9932
Epoch 6/6
----------
train Loss: 0.0329 Acc: 0.9917
val Loss: 0.0228 Acc: 0.9942
Training complete in 2m 53s
Best val Acc: 0.994200
Step 5: Visualizing the Model Predictions
This helper function visualizes a single image, optionally with a title.
def imshow(inp, title=None):
    """Imshow for Tensor."""
    inp = inp.numpy().squeeze()
    plt.imshow(inp, cmap='gray_r')
    if title is not None:
        plt.title(title)
This function predicts the classes of a few images from the validation data and displays each image with its predicted label.
def visualize_model(model, num_images=6):
    was_training = model.training
    model.eval()
    images_so_far = 0

    with torch.no_grad():
        for i, (inputs, labels) in enumerate(dataloaders['val']):
            inputs = inputs.to(device)
            labels = labels.to(device)

            outputs = model(inputs)
            _, preds = torch.max(outputs, 1)

            for j in range(inputs.size()[0]):
                images_so_far += 1
                ax = plt.subplot(num_images // 2, 2, images_so_far)
                ax.axis('off')
                ax.set_title('predicted: {}'.format(class_names[preds[j]]))
                imshow(inputs.cpu().data[j])

                if images_so_far == num_images:
                    model.train(mode=was_training)
                    return
        model.train(mode=was_training)
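A minimal usage sketch (the figure size is an arbitrary choice, not from the original): create a figure, run the function on the trained model, and show the plot.
plt.figure(figsize=(6, 9))
visualize_model(model_ft, num_images=6)
plt.show()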
Step 6: Saving The Model
Now that we are done with everything, we do not want to lose the trained model or retrain it every time we use it. For this purpose, we will save the model. When we need it in the future, we can load it and use it directly without further training.
torch.save(model, './my_mnist_model.pt')
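Later, the saved model can be loaded back and used for inference right away. Note that loading a full pickled model this way requires the Net class to be defined (or importable) in the loading script, and on recent PyTorch versions you may need to pass weights_only=False to torch.load:
# Reload the trained model and switch to inference mode
model = torch.load('./my_mnist_model.pt')
model.eval()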
You can find the entire code on GitHub (see [4] in the references).
References
[1] MNIST.
[2] PyTorch Docs.
[3] Adam.
[4] Entire Code on GitHub.