The IPython notebook for this tutorial is available here.

The examples in this notebook assume that you are familiar with the theory of neural networks. To learn more about neural networks, you can refer to the resources mentioned here.

Convolutional Neural Networks (CNNs) have been used for several image classification tasks. They require a lot of data and time to train. However, sometimes the available dataset is too small to train a CNN from scratch. In such a scenario, it is helpful to use a pre-trained CNN, which has already been trained on a large dataset. We will use the pre-trained VGG-19 CNN, a 19-layer network trained on ImageNet. Details about the VGG-19 model architecture are available here. Other pre-trained models in Keras are available here.

In this notebook, we will learn to use a pre-trained model for:

  • Image Classification: If the new dataset has the same classes as the training dataset, then the pre-trained CNN can be used directly to predict the class of the images from the new dataset.

  • Feature Extraction: A CNN can also be used as a feature extractor instead of a classifier. The last layer of the CNN can be removed and an image can be passed through the rest of the network to obtain its feature vector. For example, in the VGG-19 model, removing the last layer (the 1000-dimensional predictions layer) leaves the fully connected layer fc2 as the output, which gives a 4096-dimensional feature vector representation of an input image. After extracting features from all the training images, a classifier such as an SVM or logistic regression can be trained for image classification.

Another way of using pre-trained CNNs for transfer learning is to fine-tune them: the network weights are initialized from a pre-trained network and the network is then re-trained on the new dataset. Fine-tuning CNNs will be covered in the next tutorial.

Import necessary modules

# Use GPU for Theano, comment to use CPU instead of GPU
# Tensorflow uses GPU by default
import os
os.environ["THEANO_FLAGS"] = "mode=FAST_RUN,device=gpu,floatX=float32"
# import necessary modules
import time
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
np.random.seed(2017) 
from keras.applications.vgg19 import VGG19
from keras.applications.vgg19 import preprocess_input, decode_predictions
from keras.preprocessing import image
from keras.models import Model
import cv2
Using TensorFlow backend.

Pre-trained model for image classification

# load pre-trained model
model = VGG19(weights='imagenet', include_top=True)
# display model layers
model.summary()
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
input_1 (InputLayer)             (None, 224, 224, 3)   0                                            
____________________________________________________________________________________________________
block1_conv1 (Convolution2D)     (None, 224, 224, 64)  1792        input_1[0][0]                    
____________________________________________________________________________________________________
block1_conv2 (Convolution2D)     (None, 224, 224, 64)  36928       block1_conv1[0][0]               
____________________________________________________________________________________________________
block1_pool (MaxPooling2D)       (None, 112, 112, 64)  0           block1_conv2[0][0]               
____________________________________________________________________________________________________
block2_conv1 (Convolution2D)     (None, 112, 112, 128) 73856       block1_pool[0][0]                
____________________________________________________________________________________________________
block2_conv2 (Convolution2D)     (None, 112, 112, 128) 147584      block2_conv1[0][0]               
____________________________________________________________________________________________________
block2_pool (MaxPooling2D)       (None, 56, 56, 128)   0           block2_conv2[0][0]               
____________________________________________________________________________________________________
block3_conv1 (Convolution2D)     (None, 56, 56, 256)   295168      block2_pool[0][0]                
____________________________________________________________________________________________________
block3_conv2 (Convolution2D)     (None, 56, 56, 256)   590080      block3_conv1[0][0]               
____________________________________________________________________________________________________
block3_conv3 (Convolution2D)     (None, 56, 56, 256)   590080      block3_conv2[0][0]               
____________________________________________________________________________________________________
block3_conv4 (Convolution2D)     (None, 56, 56, 256)   590080      block3_conv3[0][0]               
____________________________________________________________________________________________________
block3_pool (MaxPooling2D)       (None, 28, 28, 256)   0           block3_conv4[0][0]               
____________________________________________________________________________________________________
block4_conv1 (Convolution2D)     (None, 28, 28, 512)   1180160     block3_pool[0][0]                
____________________________________________________________________________________________________
block4_conv2 (Convolution2D)     (None, 28, 28, 512)   2359808     block4_conv1[0][0]               
____________________________________________________________________________________________________
block4_conv3 (Convolution2D)     (None, 28, 28, 512)   2359808     block4_conv2[0][0]               
____________________________________________________________________________________________________
block4_conv4 (Convolution2D)     (None, 28, 28, 512)   2359808     block4_conv3[0][0]               
____________________________________________________________________________________________________
block4_pool (MaxPooling2D)       (None, 14, 14, 512)   0           block4_conv4[0][0]               
____________________________________________________________________________________________________
block5_conv1 (Convolution2D)     (None, 14, 14, 512)   2359808     block4_pool[0][0]                
____________________________________________________________________________________________________
block5_conv2 (Convolution2D)     (None, 14, 14, 512)   2359808     block5_conv1[0][0]               
____________________________________________________________________________________________________
block5_conv3 (Convolution2D)     (None, 14, 14, 512)   2359808     block5_conv2[0][0]               
____________________________________________________________________________________________________
block5_conv4 (Convolution2D)     (None, 14, 14, 512)   2359808     block5_conv3[0][0]               
____________________________________________________________________________________________________
block5_pool (MaxPooling2D)       (None, 7, 7, 512)     0           block5_conv4[0][0]               
____________________________________________________________________________________________________
flatten (Flatten)                (None, 25088)         0           block5_pool[0][0]                
____________________________________________________________________________________________________
fc1 (Dense)                      (None, 4096)          102764544   flatten[0][0]                    
____________________________________________________________________________________________________
fc2 (Dense)                      (None, 4096)          16781312    fc1[0][0]                        
____________________________________________________________________________________________________
predictions (Dense)              (None, 1000)          4097000     fc2[0][0]                        
====================================================================================================
Total params: 143,667,240
Trainable params: 143,667,240
Non-trainable params: 0
____________________________________________________________________________________________________
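
The parameter counts in the summary can be reproduced by hand. A 3x3 convolution with c_in input channels and c_out filters has 3*3*c_in*c_out weights plus c_out biases, and a dense layer has n_in*n_out weights plus n_out biases. The sketch below recomputes the totals from the layer shapes listed above. The helper names are made up for illustration, and the 3x3 kernel size is an assumption taken from the VGG architecture rather than from the summary itself.

```python
# Recompute the VGG-19 parameter counts shown in model.summary().
# Helper names are illustrative; they are not part of Keras.

def conv_params(c_in, c_out, k=3):
    """Weights plus biases of a k x k convolution layer."""
    return k * k * c_in * c_out + c_out

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

# (number of conv layers, output channels) for the five blocks
blocks = [(2, 64), (2, 128), (4, 256), (4, 512), (4, 512)]

total = 0
channels = 3      # RGB input
size = 224        # input height and width
for n_convs, c_out in blocks:
    for _ in range(n_convs):
        total += conv_params(channels, c_out)
        channels = c_out
    size //= 2    # each block ends with 2x2 max pooling

flat = size * size * channels          # 7 * 7 * 512 = 25088
total += dense_params(flat, 4096)      # fc1
total += dense_params(4096, 4096)      # fc2
total += dense_params(4096, 1000)      # predictions

print(flat)    # 25088
print(total)   # 143667240
```

Most of the parameters sit in fc1 (102,764,544 of the 143,667,240), which is one reason the convolutional part of the network is often reused on its own.
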
# display the image
img_disp = cv2.imread('./data/peacock.jpg')
img_disp = cv2.cvtColor(img_disp, cv2.COLOR_BGR2RGB)
plt.imshow(img_disp)
plt.axis("off")  
plt.show()


# pre-process the image
img = image.load_img('./data/peacock.jpg', target_size=(224, 224))
img = image.img_to_array(img)
img = np.expand_dims(img, axis=0)
img = preprocess_input(img)
# predict the output 
preds = model.predict(img)
# decode the prediction
pred_class = decode_predictions(preds, top=3)[0][0]
print("Predicted Class: %s" % pred_class[1])
print("Confidence: %s" % pred_class[2])
Predicted Class: peacock
Confidence: 0.999984
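
To make the preprocessing step less opaque: for the VGG models, preprocess_input converts the image from RGB to BGR and subtracts a per-channel ImageNet mean, without scaling the pixel values to [0, 1]. The NumPy sketch below mimics that behaviour for channels-last input. Note that vgg_preprocess is a hypothetical helper, not a Keras function, and the mean values are an assumption (the ones commonly used with the original Caffe VGG weights), so treat this as an illustration rather than a replacement for preprocess_input.

```python
import numpy as np

# Per-channel ImageNet means in BGR order (assumed values, as used with
# the original Caffe VGG weights).
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def vgg_preprocess(batch_rgb):
    """Mimic VGG preprocessing for a (N, H, W, 3) RGB batch in [0, 255]."""
    batch = np.asarray(batch_rgb, dtype=np.float32)
    batch_bgr = batch[..., ::-1]            # RGB -> BGR
    return batch_bgr - IMAGENET_MEAN_BGR    # zero-center each channel

# Stand-in for a loaded image: one 224x224 image, all pixels pure red.
dummy = np.zeros((1, 224, 224, 3), dtype=np.float32)
dummy[..., 0] = 255.0                       # R channel

out = vgg_preprocess(dummy)
print(out.shape)        # (1, 224, 224, 3)
print(out[0, 0, 0])     # B, G, R values after mean subtraction
```

After the flip, the red value ends up in the last channel, so the first two entries of the printed pixel are negative (zero minus the mean) and the last one is positive.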

Pre-trained model as a feature extractor

# load pre-trained model
base_model = VGG19(weights='imagenet')
# pre-process the image
img = image.load_img('./data/peacock.jpg', target_size=(224, 224))
img = image.img_to_array(img)
img = np.expand_dims(img, axis=0)
img = preprocess_input(img)
# define model from base model for feature extraction from fc2 layer
model = Model(input=base_model.input, output=base_model.get_layer('fc2').output)
# obtain the output of fc2 layer
fc2_features = model.predict(img)
print("Feature vector dimensions: %s" % (fc2_features.shape,))
Feature vector dimensions: (1, 4096)
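
The last step described in the introduction, training a classifier on the extracted features, can be sketched as follows. The features here are synthetic random arrays standing in for real fc2 outputs, and the two classes are made up. The logistic regression is written in plain NumPy to keep the example dependency-free; in practice a library implementation of logistic regression or an SVM would be the more usual choice.

```python
import numpy as np

rng = np.random.RandomState(2017)

# Synthetic stand-in for fc2 features: 200 "images", 4096 dimensions.
# Real features would come from model.predict on each preprocessed image.
n_per_class, dim = 100, 4096
feats_a = np.abs(rng.randn(n_per_class, dim))          # class 0
feats_b = np.abs(rng.randn(n_per_class, dim)) + 0.5    # class 1 (shifted)
X = np.vstack([feats_a, feats_b])
y = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])

# Shuffle, then hold out 25% of the samples for evaluation.
order = rng.permutation(len(y))
X, y = X[order], y[order]
split = int(0.75 * len(y))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Standardize with statistics from the training set only.
mean, std = X_train.mean(axis=0), X_train.std(axis=0) + 1e-8
X_train = (X_train - mean) / std
X_test = (X_test - mean) / std

def sigmoid(z):
    # Clip the logits to avoid overflow in exp.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

# Binary logistic regression trained with batch gradient descent.
w, b, lr = np.zeros(dim), 0.0, 0.01
for _ in range(100):
    p = sigmoid(X_train.dot(w) + b)
    grad = p - y_train
    w -= lr * X_train.T.dot(grad) / len(y_train)
    b -= lr * grad.mean()

pred = (sigmoid(X_test.dot(w) + b) >= 0.5).astype(int)
accuracy = (pred == y_test).mean()
print("Held-out accuracy: %.2f" % accuracy)
```

With real data, X would be built by stacking the fc2 feature vectors of all training images, and y would hold their class labels.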