In this tutorial, we demonstrate how to manually build a Convolutional Neural Network (CNN) using the MatConvNet library. The network will use MatConvNet's SimpleNN wrapper and will be trained on the CIFAR-10 dataset.

Part 1: Building Training and Testing Data

Before we build the network, we need to set up our training and testing data. Let's begin by clearing the workspace and closing all previous outputs.


clc, clear all, close all;

Next, we can run the required MatConvNet setup file. To do this, just provide the absolute or relative path to the vl_setupnn.m file. Afterwards, we can download and organize the CIFAR-10 data using a short script. Note that helperCIFAR10Data() is only available in MATLAB R2016 and later. For a manual install, the CIFAR-10 data can be downloaded from https://www.cs.toronto.edu/~kriz/cifar.html


run ../../../matconvnet-1.0-beta24/matlab/vl_setupnn ;                                                      % Run required MatConvNet setup file, absolute or relative path
%-----------------------------------------------------------------------------------
% Load in training data
%-----------------------------------------------------------------------------------

cifar10Data = 'data';                                                                           % Directory to store the CIFAR-10 data

url = 'https://www.cs.toronto.edu/~kriz/cifar-10-matlab.tar.gz';                                % URL for Cifar 10

helperCIFAR10Data.download(url, cifar10Data);                                                   % Useful function

% Load the CIFAR-10 training and test data.
[trainingImages, trainingLabels, testImages, testLabels] = helperCIFAR10Data.load(cifar10Data);
info = load('data/cifar-10-batches-mat/batches.meta.mat');
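If helperCIFAR10Data() is unavailable in your MATLAB version, the archive can be unpacked and loaded by hand. The following is a minimal sketch, assuming the archive has already been extracted to data/cifar-10-batches-mat with the standard data_batch_*.mat files (each containing a 10000 x 3072 uint8 data matrix and a labels vector):

```matlab
% Minimal manual loader for the CIFAR-10 MATLAB batches (assumes the
% archive was already extracted to data/cifar-10-batches-mat).
batchDir = 'data/cifar-10-batches-mat';
trainingImages = zeros(32, 32, 3, 0, 'uint8');
trainingLabels = [];
for b = 1:5
    s = load(fullfile(batchDir, sprintf('data_batch_%d.mat', b)));
    % Each row of s.data is one image: 1024 R, 1024 G, then 1024 B values,
    % stored row-major, so reshape and then swap the first two dimensions.
    imgs = permute(reshape(s.data', 32, 32, 3, []), [2 1 3 4]);
    trainingImages = cat(4, trainingImages, imgs);
    trainingLabels = [trainingLabels; double(s.labels) + 1];        % Shift labels to 1-based
end
```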


Now we can partition the training data into training and validation subsets. The assignment is random, with a larger share of the data used for training than for validation. Alternatively, the held-out test data can be used as the validation set.



%--------------------------------------------------------------------
% Assigning train/validation set randomly, 60% train, 40% validation
%--------------------------------------------------------------------

numSamples = size(trainingImages, 4);                                                           % Total number of training samples
set = zeros(1, numSamples);
for i = 1:numSamples                                                                            % Loop over number of samples
    if(randi(10, 1) <= 6)
        set(i) = 1;                                                                             % Training = 1 (~60% of samples)
    else
        set(i) = 2;                                                                             % Validation = 2, arbitrary
    end
end
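The per-sample loop above can also be written as a single vectorized assignment, which is more idiomatic MATLAB. This sketch reuses the trainingImages variable from the loading step and assumes the same ~60/40 split:

```matlab
% Vectorized random 60/40 train/validation split (equivalent to the loop)
numSamples = size(trainingImages, 4);
set = 2 * ones(1, numSamples);                  % Default every sample to validation (2)
set(rand(1, numSamples) <= 0.6) = 1;            % ~60% of samples become training (1)
```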


Lastly, we can build and save the dataset (imdb.mat) containing the normalized data (single precision, zero mean), the labels, and miscellaneous (meta) information.


%------------------------------------------------------------------------------------
% Setting up IMDB structure
%------------------------------------------------------------------------------------

trainingImages_ = single(trainingImages);
dataMean = mean(trainingImages_(:,:,:,set == 1), 4);                                             % Zero Mean
data = bsxfun(@minus, trainingImages_, dataMean);                                                % Finalize Zero Mean

images.data = data;                                                                             % IMDB train data parameter
images.data_mean = dataMean;                                                                    % IMDB data mean
images.labels = double(trainingLabels');                                                        % IMDB labels parameter
images.set = set;                                                                               % IMDB set train/val set parameter
images.original = trainingImages;

%----------------------------------------------------------------------------------
% Finalizing IMDB
%----------------------------------------------------------------------------------

% Setting Misc parameters for IMDB
meta.sets = {'train', 'val'};                                                                   % Assigning IMDB set names
meta.classes = info.label_names;                                                                % Assigning IMDB class names

% Completing IMDB
imdb.meta = meta;                                                                               % Assign IMDB meta (miscellaneous)
imdb.images = images;                                                                           % Assign IMDB images

fprintf('\n***** IMDB.mat has been created! *****\n');
save('imdb.mat', 'imdb', '-v7.3');                                                              % Saving IMDB as imdb.mat


Part 2: Create Convolutional Neural Network (CNN)

First clear your workspace and close all previous outputs. Afterwards run the required MatConvNet setup file by providing the relative or absolute file path.


clc, clear all, close all;

run ../../../matconvnet-1.0-beta24/matlab/vl_setupnn;                       % Run required MatConvNet Setup


To create the CNN, we must load the previously created database, initialize MatConvNet's SimpleNN network, and define the important training parameters. Batch size determines how many samples are loaded into memory at once during training; the CNN still processes all of the training data, but only in increments of the specified batch size. Batching exists for computational efficiency, and the best value depends entirely on the user's available hardware. An epoch is one complete forward and backward pass over the entire training set. It is usually beneficial to set the number of epochs high and then reduce it once one is satisfied with the convergence at a particular epoch. The learning rate is a very sensitive parameter that pushes the network towards convergence, so finding its best value is an empirical process unless one invokes more powerful techniques such as AdaDelta or batch normalization. For more detail on training parameters, explore the related literature. It is important to note that GPU training dramatically reduces training time, but it has to be set up manually by the user.
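As a sketch of that manual GPU setup, MatConvNet is recompiled with GPU support via its vl_compilenn script. The CUDA toolkit path below is an assumption that depends on your installation:

```matlab
% Recompile MatConvNet with GPU (CUDA) support; run once from the
% MatConvNet root directory. The cudaRoot path is an assumption --
% point it at your own CUDA toolkit installation.
cd ../../../matconvnet-1.0-beta24
run matlab/vl_setupnn ;
vl_compilenn('enableGpu', true, 'cudaRoot', '/usr/local/cuda') ;
```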


%--------------------------------------------------------------
% Initialize Parameters
%--------------------------------------------------------------

opts.train = struct();                                                          % Initialize SimpleNN training options
opts.train.gpus = [1];                                                          % Training with GPU, so setting to [1], otherwise set to []
opts.train.continue = true;                                                     % Keep true, so will continue every epoch
opts.expDir = 'epoch_data';                                                     % Folder used for storing each epoch

% opts.whitenData = true ;                                                      % Optional parameter for whitening data
% opts.contrastNormalization = true ;                                           % Optional parameter for contrast normalization

% -------------------------------------------------------------------------
% Prepare model and data
% -------------------------------------------------------------------------

load('imdb.mat');                                                               % Load in Cifar Data or IMDB
net.meta.classes.name = imdb.meta.classes(:)';                                  % Assign class names from IMDB

% -------------------------------------------------------------------------
% Create CNN Architecture and Train
% -------------------------------------------------------------------------

net.layers = {};                                                                % Initialize CNN architecture, layers

net.meta.inputSize = [32 32 3];                                                 % Assign data input size, cifar sample is [32, 32, 3]
net.meta.trainOpts.learningRate = 1e-4 ;                                        % Assign learning rate
net.meta.trainOpts.weightDecay = 3e-4 ;                                         % Assign weight decay
net.meta.trainOpts.batchSize = 100 ;                                            % Assign batch size
net.meta.trainOpts.momentum = 0.9;                                              % Assign momentum
net.meta.trainOpts.numEpochs = 251 ;                                            % Assign number of epochs

Now we can build the CNN by creating each layer individually. This network uses convolutional layers, ReLU layers, pooling layers, and a softmaxloss layer. For the convolutional layers, the 'weights' are initialized from a Gaussian distribution. Convolutional filters are specified in the format [Height x Width x Depth x Number of Filters], 'stride' in the format [Height Width], and 'pad' (zero padding) in the format [Top Bottom Left Right]. Pooling layers are very similar, except that one can also specify the pooling method, and the pooling window is restricted to [Height x Width]. Lastly, with the SimpleNN wrapper one can easily append to the network using {end + 1} indexing. Unlike the DagNN wrapper, SimpleNN is by default a linear chain of layers and thus gives less freedom in architectural design.


%-------------------------------------------------------------------
% CNN Architecture Design
% - Each block is just a set of layers,
% - Padding goes [TOP BOTTOM LEFT RIGHT]
%-------------------------------------------------------------------

% Block 1
net.layers{end+1} = struct('type', 'conv', 'weights', {{0.05*randn(5,5,3,32, 'single'), zeros(1, 32, 'single')}}, 'stride', [1 1], 'pad', [0 0 0 0]) ;
net.layers{end+1} = struct('type', 'relu') ;
net.layers{end+1} = struct('type', 'pool', 'method', 'max', 'pool', [2 2], 'stride', [2 2], 'pad', [0 0 0 0]);

% Block 2
net.layers{end+1} = struct('type', 'conv', 'weights', {{0.05*randn(5,5,32,32, 'single'), zeros(1, 32, 'single')}}, 'stride', [1 1], 'pad', [0 0 0 0]) ;
net.layers{end+1} = struct('type', 'relu') ;
net.layers{end+1} = struct('type', 'pool', 'method', 'avg', 'pool', [2 2], 'stride', [2 2], 'pad', [0 0 0 0]);

% Block 3
net.layers{end+1} = struct('type', 'conv', 'weights', {{0.05*randn(5,5,32,64, 'single'), zeros(1,64,'single')}}, 'stride', [1 1], 'pad', [0 0 0 0]) ;
net.layers{end+1} = struct('type', 'relu') ;

% Block Multi-Layer-Perceptron
net.layers{end+1} = struct('type', 'conv', 'weights', {{0.05*randn(1,1,64,10, 'single'), zeros(1,10,'single')}}, 'stride', [1 1], 'pad', [0 0 0 0]) ;

% Loss layer
net.layers{end+1} = struct('type', 'softmaxloss') ;

net = vl_simplenn_tidy(net) ;                                                               % Fill in default values


Next, one can view how the data dimensionality is reduced throughout the network using the vl_simplenn_display() command. This is a useful tool for monitoring how each layer modifies the spatial size of the input as it is passed through the network. vl_simplenn_display() can be thought of as a network debugging tool that gives insight into convolutional filter decisions, pooling decisions, and more.


vl_simplenn_display(net)


     layer|    0|     1|     2|     3|     4|     5|     6|     7|     8|     9|     10|
      type|input|  conv|  relu| mpool|  conv|  relu| apool|  conv|  relu|  conv|softmxl|
      name|  n/a|layer1|layer2|layer3|layer4|layer5|layer6|layer7|layer8|layer9|layer10|
----------|-----|------|------|------|------|------|------|------|------|------|-------|
   support|  n/a|     5|     1|     2|     5|     1|     2|     5|     1|     1|      1|
  filt dim|  n/a|     3|   n/a|   n/a|    32|   n/a|   n/a|    32|   n/a|    64|    n/a|
filt dilat|  n/a|     1|   n/a|   n/a|     1|   n/a|   n/a|     1|   n/a|     1|    n/a|
 num filts|  n/a|    32|   n/a|   n/a|    32|   n/a|   n/a|    64|   n/a|    10|    n/a|
    stride|  n/a|     1|     1|     2|     1|     1|     2|     1|     1|     1|      1|
       pad|  n/a|     0|     0|     0|     0|     0|     0|     0|     0|     0|      0|
----------|-----|------|------|------|------|------|------|------|------|------|-------|
   rf size|  n/a|     5|     5|     6|    14|    14|    16|    32|    32|    32|     32|
 rf offset|  n/a|     3|     3|   3.5|   7.5|   7.5|   8.5|  16.5|  16.5|  16.5|   16.5|
 rf stride|  n/a|     1|     1|     2|     2|     2|     4|     4|     4|     4|      4|
----------|-----|------|------|------|------|------|------|------|------|------|-------|
 data size|   32|    28|    28|    14|    10|    10|     5|     1|     1|     1|      1|
data depth|    3|    32|    32|    32|    32|    32|    32|    64|    64|    10|      1|
  data num|    1|     1|     1|     1|     1|     1|     1|     1|     1|     1|      1|
----------|-----|------|------|------|------|------|------|------|------|------|-------|
  data mem| 12KB|  98KB|  98KB|  24KB|  12KB|  12KB|   3KB|  256B|  256B|   40B|     4B|
 param mem|  n/a|  10KB|    0B|    0B| 100KB|    0B|    0B| 200KB|    0B|   3KB|     0B|

parameter memory|312KB (8e+04 parameters)|
     data memory|261KB (for batch size 1)|
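The 'data size' row above follows the standard convolution/pooling output formula, out = floor((in - support + padTop + padBottom) / stride) + 1. As a check, the sketch below reproduces the final spatial size for this architecture, with each layer's support, stride, and pad taken from the table:

```matlab
% Reproduce the final 'data size' entry of vl_simplenn_display by applying
% the output-size formula layer by layer (conv/relu/pool layers only).
support = [5 1 2 5 1 2 5 1 1];     % filter or pooling window per layer
stride  = [1 1 2 1 1 2 1 1 1];
pad     = [0 0 0 0 0 0 0 0 0];
sz = 32;                           % CIFAR-10 input is 32 x 32
for l = 1:numel(support)
    sz = floor((sz - support(l) + 2*pad(l)) / stride(l)) + 1;
end
fprintf('final spatial size: %d\n', sz);   % 1, matching the table
```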

Lastly, we can begin training the network by supplying the created database, the constructed network, and the batch-loading function. The term 'val' in this code stands for validation data, which, as noted, can also serve as testing data. When training the CNN, only the data marked for training (set = 1) actually plays a role in minimizing the network's error: the training data passes through both the forward and the backward pass. The validation data is used only to see how the CNN responds to new, similar data that it is not being trained on, so it is fed through the forward pass alone. Afterwards, we save the trained CNN and prepare for the testing phase.


% -------------------------------------------------------------------------
% Call CNN_Train Function
% -------------------------------------------------------------------------

trainfn = @cnn_train ;

[net, info] = trainfn(net, imdb, getBatch(opts), 'expDir', opts.expDir, net.meta.trainOpts, opts.train, 'val', find(imdb.images.set == 2));

fprintf('\n***** CNN.mat has been created! *****\n');
save('CNN.mat', 'net', '-v7.3');                                                              % Saving CNN as .mat file


The following functions fetch the next batch of images and labels; the number of samples per batch is determined by the batch size.


%-----------------------------------------------
% Creating Batch of Images + Labels
%-----------------------------------------------

function fn = getBatch(opts)                                                                  % Function to get next batch for SGD
    fn = @(x,y) getSimpleNNBatch(x,y) ;
end

function [images, labels] = getSimpleNNBatch(imdb, batch)

    images = imdb.images.data(:,:,:,batch) ;                                                  % Get next batch of data
    labels = imdb.images.labels(1,batch) ;                                                    % Get corresponding labels for data
    if rand > 0.5, images = fliplr(images) ; end                                              % Random horizontal flip (simple data augmentation)
end

During the training phase, the SimpleNN wrapper produces three plots (top-1 error, top-5 error, and objective) after each successful epoch. The top-1 error is the fraction of samples for which the class with the highest predicted probability is not the true target; in other words, it measures how often the CNN fails to guess the target correctly. The top-5 error is the fraction of samples for which the true target is not among the five highest-probability classes. The objective (unlike the objective in the DagNN wrapper) is the value of the loss function versus training epoch, and for SimpleNN it should mirror the shape of the top-1 and top-5 error curves. In all plots, training error is shown in blue and validation error in orange.
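To make the two error measures concrete, here is a small sketch that computes top-1 and top-5 error from a matrix of class scores. The dummy scores matrix and its layout are illustrative assumptions, not MatConvNet output:

```matlab
% Toy top-1 / top-5 error computation: scores is [numClasses x numSamples],
% labels is [1 x numSamples] with 1-based class indices.
scores = rand(10, 5);                            % 10 classes, 5 samples (dummy data)
labels = randi(10, 1, 5);
[~, ranked] = sort(scores, 1, 'descend');        % ranked(1,:) is the predicted class
top1err = mean(ranked(1, :) ~= labels);
top5err = mean(~any(ranked(1:5, :) == labels, 1));   % uses implicit expansion (R2016b+)
fprintf('top-1 error: %.2f, top-5 error: %.2f\n', top1err, top5err);
```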

Part 3: Testing Convolutional Neural Network (CNN)

First clear your workspace and close all previous outputs. Afterwards run the required MatConvNet setup file by providing the relative or absolute file path.


clc, clear all, close all;

run ../../../matconvnet-1.0-beta24/matlab/vl_setupnn;                       % Run required MatConvNet Setup


Next load the created database and the trained SimpleNN network into memory. Assign the class names from the database to the trained CNN.


%--------------------------------------------------------------------
% Loading created CNN and IMDB
%--------------------------------------------------------------------

load('imdb.mat');                                                                                   % Load IMDB created in Part 1
load('CNN.mat');                                                                                    % Load trained CNN from Part 2
net = vl_simplenn_tidy(net) ;                                                                       % Trained Cifar CNN is a simpleNN, thus treat it as such
net.layers{end}.type = 'softmax';                                                                   % Trained with softmaxloss, but evaluate with softmax
net.meta.classes.name = imdb.meta.classes(:)' ;                                                     % Give CNN the needed class descriptions for testing

Lastly, use the created database as testing data. We use the database samples here, but one can also test with the test set returned by the helperCIFAR10Data() function above.


%--------------------------------------------------------------------
% Loading IMDB test data
%--------------------------------------------------------------------

i = 1;
while(1)                                                                                            % Run until satisfied

    im = imdb.images.data(:,:,:,i);                                                                 % Current sample from IMDB (zero mean + single precision)
    orig = imdb.images.original(:,:,:,i);                                                           % Load in corresponding image to go with sample (For viewing)
    i = i + 1;                                                                                      % Next Sample

    res = vl_simplenn(net, im);                                                                     % Run test sample through CNN
    scores = squeeze(gather(res(end).x));                                                           % Gather all scores

    % show the classification results
    [bestScore, best] = max(scores);                                                                % Acquire best score and associated class name
    figure(1) ; clf ; imagesc(orig);                                                                % Plot Results from sample, use original image for visual assessment
    title(sprintf('%s (%d), score %.3f', net.meta.classes.name{best}, best, bestScore));
    output = net.meta.classes.name{best};                                                           % Outputs associated class name of best score
    keyboard;

end


The following are example classification outputs from the SimpleNN network on the CIFAR-10 data. Due to the [32 x 32] size of the CIFAR images, the displayed images appear low-resolution, as expected.