In this tutorial, we demonstrate how to build a Convolutional Neural Network (CNN) by hand using the MatConvNet library. The network will be of the DagNN type and will be trained on the MNIST dataset.

Part 1: Building Training and Testing Data

Before we build the network, we need to set up our training and testing data. Let us begin by first clearing our workspace and closing all previous outputs.

		  

clc, clear all, close all;

Next, we run the required MatConvNet setup file. To do this, provide the absolute or relative path to the vl_setupnn.m file. Afterwards, we list the MNIST file names and download them with a short script. For a manual install, the MNIST data can be found at: http://yann.lecun.com/exdb/mnist/

		  

run ../../../matconvnet-1.0-beta24/matlab/vl_setupnn ;                                                      % Run the required MatConvNet setup file, absolute or relative path
files = {'train-images-idx3-ubyte', ...                                                                     % All required filenames that will be downloaded online
         'train-labels-idx1-ubyte', ...
         't10k-images-idx3-ubyte', ...
         't10k-labels-idx1-ubyte'} ;
for i=1:4                                                                                                   % Loop over each file
    
    url = sprintf('http://yann.lecun.com/exdb/mnist/%s.gz',files{i}) ;                                      % Create the URL
    fprintf('downloading %s\n', url) ;                                                                      % Report which URL is being downloaded
    gunzip(url, 'data') ;                                                                                   % Download the file and unpack it into the data folder

end

Now, we invoke the following commands to read all the information for both training and testing data:

		  


f=fopen(fullfile('data', 'train-images-idx3-ubyte'),'r') ;                                                  % Open training images for reading.  
x1=fread(f,inf,'uint8');                                                                                    % Read file and store
fclose(f) ;                                                                                                 % Close training images
x1=permute(reshape(x1(17:end),28,28,60e3),[2 1 3]) ;                                                        % Skip the 16-byte header, reshape to 28x28x60000, and swap rows and columns

f=fopen(fullfile('data', 't10k-images-idx3-ubyte'),'r') ;                                                   % Open testing images for reading.  
x2=fread(f,inf,'uint8');                                                                                    % Read file and store
fclose(f) ;                                                                                                 % Close testing images
x2=permute(reshape(x2(17:end),28,28,10e3),[2 1 3]) ;                                                        % Skip the 16-byte header, reshape to 28x28x10000, and swap rows and columns

f=fopen(fullfile('data', 'train-labels-idx1-ubyte'),'r') ;                                                  % Open training labels file for reading 
y1=fread(f,inf,'uint8');                                                                                    % Read file and store
fclose(f) ;                                                                                                 % Close training labels
y1=double(y1(9:end)')+1 ;                                                                                   % Skip the 8-byte header and shift labels from 0..9 to 1..10

f=fopen(fullfile('data', 't10k-labels-idx1-ubyte'),'r') ;                                                   % Open testing labels for reading
y2=fread(f,inf,'uint8');                                                                                    % Read file and store
fclose(f) ;                                                                                                 % Close testing labels
y2=double(y2(9:end)')+1 ;                                                                                   % Skip the 8-byte header and shift labels from 0..9 to 1..10
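
The offsets 17 and 9 used above skip the IDX file headers: 16 bytes for the image files and 8 bytes for the label files, stored as big-endian 32-bit integers. As a minimal sketch, the header of the training image file can be decoded by hand. The header bytes are typed in here for illustration rather than read from disk, and the helper name be32 is hypothetical.

```matlab
% Header bytes of train-images-idx3-ubyte, typed in by hand for illustration:
% magic number, image count, rows, columns (four big-endian 32-bit integers).
hdr = uint8([0 0 8 3,  0 0 234 96,  0 0 0 28,  0 0 0 28]);

% be32 (hypothetical helper) combines four bytes into one big-endian integer.
be32 = @(b) sum(double(b(:)') .* 256.^(3:-1:0));

magic  = be32(hdr(1:4));     % 2051 identifies an IDX image file
count  = be32(hdr(5:8));     % 60000 training images
height = be32(hdr(9:12));    % 28 rows
width  = be32(hdr(13:16));   % 28 columns

fprintf('magic=%d count=%d size=%dx%d\n', magic, count, height, width);
```

Checking these four values against the bytes actually read from disk is a cheap way to confirm that a download was not truncated before reshaping the data.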


Assign the downloaded data to training (set = 1) and validation/testing (set = 3). Concatenate the images, concatenate the labels, and reshape into the appropriate size. Lastly, build and save the dataset (imdb.mat), which holds the normalized data (single precision, zero mean), the labels, and miscellaneous (meta) information.

		  

set = [ones(1,numel(y1)) 3*ones(1,numel(y2))];                                                              % Assign sets to the data (1 = training, 3 = validation/testing)
data = single(reshape(cat(3, x1, x2),28,28,1,[]));                                                          % Convert to single precision and reshape to 28x28x1xN

dataMean = mean(data(:,:,:,set == 1), 4);                                                                   % Mean image, computed over the training data only
data = bsxfun(@minus, data, dataMean) ;                                                                     % Subtract the training mean from every image

% data = imresize(data, [128, 128]);                                                                        % If one wants to resize the data 
imdb.images.data = data ;                                                                                   % Assign IMDB images data 
imdb.images.data_mean = dataMean;                                                                           % Assign IMDB data mean 
imdb.images.labels = double(cat(2, y1, y2)) ;                                                               % Assign IMDB labels
imdb.images.set = set ;                                                                                     % Assign IMDB Set
imdb.meta.sets = {'train', 'val', 'test'} ;                                                                 % meta = miscellaneous, in this case set names 
imdb.meta.classes = arrayfun(@(x)sprintf('%d',x),0:9,'uniformoutput',false) ;                               % Assigning Class names

fprintf('\n***** imdb.mat has been created! *****\n');
save('imdb.mat', 'imdb', '-v7.3');                                                                          % Saving IMDB as imdb.mat
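
The normalization above subtracts the mean of the training images only, so the test images are shifted by the same amount without influencing it. A minimal sketch on made-up 2x2 "images" (the toy array and the variable name sets are assumptions for illustration):

```matlab
% Toy stand-in for the MNIST array: four 2x2 single-channel "images".
data = single(reshape(1:16, 2, 2, 1, 4));
sets = [1 1 1 3];                                   % three training samples, one test sample

dataMean = mean(data(:,:,:,sets == 1), 4);          % mean image over the training samples only
centered = bsxfun(@minus, data, dataMean);          % subtract it from every sample

trainMean  = mean(centered(:,:,:,sets == 1), 4);    % zero at every pixel
testSample = centered(:,:,:,sets == 3);             % shifted by the training mean, not zero
disp(trainMean); disp(testSample);
```

Only the training subset ends up with zero mean; the test sample keeps whatever offset it has relative to the training data.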


	

Part 2: Create Convolutional Neural Network (CNN)

First clear your workspace and close all previous outputs. Afterwards run the required MatConvNet setup file by providing the relative or absolute file path.

		  

clc, clear all, close all;
    
run ../../../matconvnet-1.0-beta24/matlab/vl_setupnn;                       % Run required MatConvNet Setup


In order to create the CNN, we must load the previously created database, initialize MatConvNet's DagNN network, and then define a few important training parameters. The batch size determines how many samples are loaded into memory at once during training. The CNN still processes all of the training data, but in increments of the specified batch size. Batching is used for computational efficiency, and a suitable value depends on the user's available hardware. An epoch is one complete pass over the training data, with a forward pass and a backward pass for every batch. It is usually beneficial to set the number of epochs high at first and to reduce it once one is satisfied with the convergence reached at a particular epoch. The learning rate is a very sensitive parameter that controls the size of each update step towards convergence, so finding a good value is an empirical process unless one invokes more robust techniques such as AdaDelta or batch normalization. For more details on training parameters, consult the related literature. It is important to note that GPU training dramatically reduces the training time of the CNN, but it has to be set up manually by the user.
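
To make the relationship between batch size and epochs concrete, the following sketch counts the batches processed with the values chosen in this tutorial (batch size 80, 15 epochs, 60000 training images). It ignores shuffling and is only meant as arithmetic, not as a description of the trainer's internals.

```matlab
numTrain  = 60000;      % MNIST training images
batchSize = 80;         % samples per batch
numEpochs = 15;         % passes over the training data

batchesPerEpoch = ceil(numTrain / batchSize);        % 750 batches per epoch
totalBatches    = batchesPerEpoch * numEpochs;       % 11250 batches overall

% Sample indexes covered by the last batch of an epoch (no shuffling, for illustration).
lastBatch = ((batchesPerEpoch-1)*batchSize + 1) : min(batchesPerEpoch*batchSize, numTrain);

fprintf('%d batches per epoch, %d in total\n', batchesPerEpoch, totalBatches);
```

Doubling the batch size halves the number of batches per epoch, but each batch then needs roughly twice the memory.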

		  

%--------------------------------------------------------------
% Initialize Parameters
%--------------------------------------------------------------

opts.train.batchSize = 80;                                                  % Number of samples processed per batch
opts.train.numEpochs = 15 ;                                                 % Number of epochs used for training
opts.train.continue = true ;                                                % Flag, set true to resume from the last epoch saved in expDir
opts.train.gpus = [1] ;                                                     % Index of the GPU to train on, or [] for CPU training. MUST have GPU setup
opts.train.learningRate = 1e-3;                                             % Learning rate value, very sensitive unless batch normalization is present
opts.train.expDir = 'epoch_data';                                           % Directory for storing epoch files, created if it does not exist
opts.train.numSubBatches = 1;                                               % Keep at 1
 
% opts.train.weightDecay  = 3e-4;                                           % Weight decay value, usually set small. A default is present in cnn_train_dag.m
% opts.train.momentum = 0.9;                                                % Incorporate this once convergence is reached. A default is present in cnn_train_dag.m

bopts.useGpu = numel(opts.train.gpus) >  0 ;                                % True when a GPU is listed above; getBatch then moves each batch to the GPU

load('imdb.mat');                                                           % Load Created IMDB
net = dagnn.DagNN() ;                                                       % Define Dag network, see documentation for DAG vs Simple

Now we can initialize the CNN by creating each layer individually. For this CNN, we will use convolutional layers, ReLU layers, pooling layers, and a softmax layer. 'fc1' acts as a fully connected layer (as in a multi-layer perceptron), but at its core it is just a convolutional layer with 1x1 filters. Afterwards, we add objective and error layers, which provide a graphical view of the training and validation convergence after each epoch. For the convolutional layers, 'size' is the filter size [Height x Width x Depth x Neurons], 'stride' is [Height x Width], and 'pad' is the zero padding [Top Bottom Left Right]. The pooling layers are very similar, except that one specifies the pooling type (max, average, etc.) and the 'poolSize', which is restricted to [Height x Width]. Lastly, unlike the SimpleNN network, the DagNN network lets one explicitly name the inputs and outputs of each layer. This gives it a lot of flexibility when creating very complex architectures such as GoogLeNet.
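
The filter size, stride, and padding together fix the spatial size of each layer's output. A minimal sketch of the usual formula, floor((in + 2*pad - filter) / stride) + 1, applied to the layers of this tutorial is given here. The helper outSize is hypothetical (it is not a MatConvNet function) and assumes the same padding on both sides of a dimension.

```matlab
% outSize: spatial output size along one dimension (hypothetical helper).
outSize = @(in, k, stride, pad) floor((in + 2*pad - k) / stride) + 1;

c1 = outSize(28, 5, 1, 0);      % conv1: 5x5 filters, stride 1 -> 24
p1 = outSize(c1, 2, 2, 0);      % pool1: 2x2 pooling, stride 2 -> 12
c2 = outSize(p1, 5, 1, 0);      % conv2: 5x5 filters, stride 1 -> 8
p2 = outSize(c2, 2, 2, 0);      % pool2: 2x2 pooling, stride 2 -> 4
c3 = outSize(p2, 4, 1, 0);      % conv3: 4x4 filters, stride 1 -> 1

fprintf('28 -> %d -> %d -> %d -> %d -> %d\n', c1, p1, c2, p2, c3);
```

Because conv3 reduces the feature map to 1x1, the following 1x1 convolutions ('fc1' and 'classifier') behave like fully connected layers.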

 
%-------------------------------------------------------------------
% CNN Architecture Design
% - Each block is just a set of layers, 
% - Padding goes [TOP BOTTOM LEFT RIGHT]
% - Dagnn.Loss computes the loss incurred by the prediction scores 
% - ConvNf and ConvNb stand for Convolution filters and bias
%-------------------------------------------------------------------

% Block #1
net.addLayer('conv1', dagnn.Conv('size', [5 5 1 32], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]), {'input'}, {'conv1'},  {'conv1f'  'conv1b'}); 
net.addLayer('relu1', dagnn.ReLU(), {'conv1'}, {'relu1'}, {});
net.addLayer('pool1', dagnn.Pooling('method', 'max', 'poolSize', [2, 2], 'stride', [2 2], 'pad', [0 0 0 0]), {'relu1'}, {'pool1'}, {});

% Block #2
net.addLayer('conv2', dagnn.Conv('size', [5 5 32 32], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]), {'pool1'}, {'conv2'},  {'conv2f'  'conv2b'});
net.addLayer('relu2', dagnn.ReLU(), {'conv2'}, {'relu2'}, {});
net.addLayer('pool2', dagnn.Pooling('method', 'max', 'poolSize', [2, 2], 'stride', [2 2], 'pad', [0 0 0 0]), {'relu2'}, {'pool2'}, {});

% Block #3
net.addLayer('conv3', dagnn.Conv('size', [4 4 32 64], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]), {'pool2'}, {'conv3'},  {'conv3f'  'conv3b'}); 

% Multi-Layer Perceptron
net.addLayer('fc1', dagnn.Conv('size', [1 1 64 256], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]), {'conv3'}, {'fc1'},  {'conv5f'  'conv5b'});
net.addLayer('relu5', dagnn.ReLU(), {'fc1'}, {'relu5'}, {});

net.addLayer('classifier', dagnn.Conv('size', [1 1 256 10], 'hasBias', true, 'stride', [1, 1], 'pad', [0 0 0 0]), {'relu5'}, {'classifier'},  {'conv6f'  'conv6b'});
net.addLayer('prediction', dagnn.SoftMax(), {'classifier'}, {'prediction'}, {});
     
net.addLayer('objective', dagnn.Loss('loss', 'log'), {'prediction', 'label'}, {'objective'}, {});                           % MatConvNet is minimizing objective
net.addLayer('error', dagnn.Loss('loss', 'classerror'), {'prediction','label'}, 'error') ;                                  % Used for calculating other error statistics    
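
The last three layers turn the classifier scores into probabilities, a loss to minimize, and an error indicator. The following is a minimal sketch of those computations for a single sample, written in plain MATLAB with made-up scores; it illustrates the standard softmax, log loss, and top-1 error definitions rather than MatConvNet's internal code.

```matlab
scores = single([2.0 1.0 0.1]);            % made-up classifier outputs for 3 classes
label  = 1;                                % made-up ground-truth class index

e    = exp(scores - max(scores));          % subtract the max for numerical stability
prob = e / sum(e);                         % softmax: positive values that sum to 1

objective = -log(prob(label));             % log loss of the true class

[~, predicted] = max(prob);                % class with the highest probability
classError = double(predicted ~= label);   % 0 when the top prediction is correct

fprintf('objective = %.4f, error = %d\n', objective, classError);
```

The objective shrinks towards zero as the probability assigned to the true class approaches 1, which is why minimizing it also tends to reduce the classification error.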

		

Next, one can view how the data dimensionality is reduced throughout the network using the net.print() command. This is a useful tool that will monitor how each layer modifies the spatial size of the input data as it's passed throughout the network. Therefore, net.print() can be thought of as a network debugging tool that gives insight on convolutional filter decisions, pooling decisions, and more.

 
net.print({'input',[28 28 1 40000]})
		

		  

   func|        conv1|relu1|  pool1|        conv2|relu2|  pool2|        conv3|          fc1|relu5|   classifier|prediction|       objective|           error|
-------|-------------|-----|-------|-------------|-----|-------|-------------|-------------|-----|-------------|----------|----------------|----------------|
   type|         Conv| ReLU|Pooling|         Conv| ReLU|Pooling|         Conv|         Conv| ReLU|         Conv|   SoftMax|            Loss|            Loss|
 inputs|        input|conv1|  relu1|        pool1|conv2|  relu2|        pool2|        conv3|  fc1|        relu5|classifier|prediction label|prediction label|
outputs|        conv1|relu1|  pool1|        conv2|relu2|  pool2|        conv3|          fc1|relu5|   classifier|prediction|       objective|           error|
 params|conv1f conv1b|     |       |conv2f conv2b|     |       |conv3f conv3b|conv5f conv5b|     |conv6f conv6b|          |                |                |
    pad|            0|  n/a|      0|            0|  n/a|      0|            0|            0|  n/a|            0|       n/a|             n/a|             n/a|
 stride|            1|  n/a|      2|            1|  n/a|      2|            1|            1|  n/a|            1|       n/a|             n/a|             n/a|


   var|        input|         conv1|         relu1|         pool1|       conv2|       relu2|       pool2|       conv3|          fc1|        relu5|  classifier|  prediction|label|  objective|      error|
------|-------------|--------------|--------------|--------------|------------|------------|------------|------------|-------------|-------------|------------|------------|-----|-----------|-----------|
  dims|28x28x1x4e+04|24x24x32x4e+04|24x24x32x4e+04|12x12x32x4e+04|8x8x32x4e+04|8x8x32x4e+04|4x4x32x4e+04|1x1x64x4e+04|1x1x256x4e+04|1x1x256x4e+04|1x1x10x4e+04|1x1x10x4e+04|  n/a|1x1x1x4e+04|1x1x1x4e+04|
   mem|        120MB|           3GB|           3GB|         703MB|       312MB|       312MB|        78MB|        10MB|         39MB|         39MB|         2MB|         2MB|  NaN|      156KB|      156KB|
 fanin|            0|             1|             1|             1|           1|           1|           1|           1|            1|            1|           1|           1|    0|          1|          1|
fanout|            1|             1|             1|             1|           1|           1|           1|           1|            1|            1|           1|           2|    2|          0|          0|


params| 0B|
  vars|7GB|
 total|7GB|
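
The mem row can be checked by hand: each variable holds one single-precision number (4 bytes) per element. The sketch below recomputes the first three entries, assuming 1024-based units, which appears consistent with the rounded values in the table.

```matlab
bytesPerSingle = 4;                                  % size of one single-precision value
names = {'input', 'conv1', 'pool1'};
dims  = [28 28  1 40000; ...                         % dims copied from the table above
         24 24 32 40000; ...
         12 12 32 40000];

memMB = prod(dims, 2) * bytesPerSingle / 1024^2;     % memory per variable in MB

for k = 1:numel(names)
    fprintf('%-6s %8.1f MB\n', names{k}, memMB(k));
end
```

The conv1 output alone needs about 2.7 GB for 40000 samples, which is one reason training works on small batches rather than on the whole dataset at once.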

After the network layers have been created, the weights need to be initialized. Here this is done with a helper function, initNet, which draws the weights from a Gaussian distribution scaled by the factor f.

 
initNet(net, 1/100);
		

		  

function initNet(net, f)                                                                                                    % Function to initialize weights

    net.initParams();                                                                                                       % Call the default initialization of every layer

    f_ind = net.layers(1).paramIndexes(1);                                                                                  % Get paramIndex from 1st Conv layer, filters
    b_ind = net.layers(1).paramIndexes(2);                                                                                  % Get paramIndex from 1st Conv layer, biases (unused, biases keep their initParams values)
    net.params(f_ind).value = 10*f*randn(size(net.params(f_ind).value), 'single');                                          % Random Gaussian distribution, 10x larger scale for the first layer
    net.params(f_ind).learningRate = 1;                                                                                     % Initialize local learning rate
    net.params(f_ind).weightDecay = 1;                                                                                      % Initialize local weight decay

    for l=2:length(net.layers)                                                                                              % Iterate through the remaining layers
        if isa(net.layers(l).block, 'dagnn.Conv')                                                                           % If this is a convolution layer
            f_ind = net.layers(l).paramIndexes(1);                                                                          % Get paramIndex, filters
            b_ind = net.layers(l).paramIndexes(2);                                                                          % Get paramIndex, biases

            net.params(f_ind).value = f*randn(size(net.params(f_ind).value), 'single');                                     % Random Gaussian distribution, filters
            net.params(f_ind).learningRate = 1;                                                                             % Local learning rate
            net.params(f_ind).weightDecay = 1;                                                                              % Local weight decay

            net.params(b_ind).value = f*randn(size(net.params(b_ind).value), 'single');                                     % Random Gaussian distribution, biases
            net.params(b_ind).learningRate = 0.5;                                                                           % Local learning rate, biases
            net.params(b_ind).weightDecay = 1;                                                                              % Local weight decay, biases
        end
    end
end
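
Multiplying randn by f yields zero-mean Gaussian values with a standard deviation of about f. A quick sketch that checks this for an array shaped like the conv2 filters (the fixed seed is only there to make the sketch repeatable):

```matlab
rng(0);                                        % fix the seed so the sketch is repeatable
f = 1/100;                                     % scale passed to initNet in this tutorial
w = f * randn([5 5 32 32], 'single');          % same shape as the conv2 filters

fprintf('mean = %.4f, std = %.4f\n', mean(w(:)), std(w(:)));
```

The printed standard deviation should be close to 0.01, and the mean close to 0.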

Lastly, we can begin training the neural network by supplying the constructed network, the created database, and a function that returns the current batch of data. The term 'val' in this code stands for validation data. However, it is important to note that validation data can also be used as testing data. When training the CNN, only the data specified for training (set = 1) actually plays a role in minimizing the error of the CNN. The training data is fed through the network for both the forward pass and the backward pass. The validation data is only used to see how the CNN responds to new, similar data that it is not being trained on, so it is only fed through the forward pass of the network. Afterwards, we save the trained CNN and prepare for the testing phase.

 

%-----------------------------------------------------------
% Train the CNN Network
% - Make sure the validation set value (3) matches the one stored in the imdb
%-----------------------------------------------------------
	
info = cnn_train_dag(net, imdb, @(i,b) getBatch(bopts,i,b), opts.train, 'val', find(imdb.images.set == 3)) ;                % MatConvNet DagNN Training Function
 
myCNN = net.saveobj();                                                                                                      % Save the DagNN trained CNN
save('myCNNdag.mat', '-struct', 'myCNN')                                                                                    % Store it as .mat file
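
The 'val' argument in the training call is simply a list of sample indexes. A toy sketch of how the set vector selects training and validation samples (eight made-up samples instead of the 70000 in the real database; the variable name sets is an assumption for illustration):

```matlab
sets = [ones(1, 6) 3*ones(1, 2)];     % toy version: 6 training and 2 validation samples

trainIdx = find(sets == 1);           % 1 2 3 4 5 6
valIdx   = find(sets == 3);           % 7 8

disp(trainIdx); disp(valIdx);
```

In the real database built in Part 1, the same expression with set == 3 returns the indexes 60001 to 70000.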
	 
		

The following function returns the next batch of data. The number of samples it returns is determined by the batch size.

		  

%-----------------------------------------------
% Creating Batch of Images + Labels
%-----------------------------------------------

function inputs = getBatch(opts, imdb, batch)                                                                               % This function generates each new mini batch
    images = imdb.images.data(:,:,:,batch) ;                                                                                % Selects the images at the requested indexes
    labels = imdb.images.labels(1,batch) ;                                                                                  % Gets the associated labels
    if opts.useGpu > 0
        images = gpuArray(images) ;                                                                                         % Move the images to the GPU
    end

    inputs = {'input', images, 'label', labels} ;                                                                           % Assigns images and labels to the network inputs
end
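
To see what getBatch hands to the network, the sketch below repeats its CPU path on a made-up database of ten blank images. The toy imdb and the chosen batch indexes are assumptions for illustration.

```matlab
imdb.images.data   = zeros(28, 28, 1, 10, 'single');   % ten blank stand-in images
imdb.images.labels = 1:10;                             % stand-in labels in 1..10
batch = [2 5 7];                                       % indexes requested for this batch

images = imdb.images.data(:,:,:,batch);                % 28x28x1x3 block of images
labels = imdb.images.labels(1, batch);                 % matching labels: 2 5 7
inputs = {'input', images, 'label', labels};           % name/value pairs for the network

disp(size(inputs{2}));
disp(inputs{4});
```

The names 'input' and 'label' must match the variable names used when the layers were added to the network.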


During the training phase of the CNN, each completed epoch updates up to two plots, one for the objective layer and one for the error layer. For the DagNN network, MatConvNet minimizes the objective, while the error plot reports the classification error as an additional statistic. In both plots, the training curve is shown in blue and the validation curve in orange. The error curve should follow a trend similar to the objective curve.

Part 3: Testing Convolutional Neural Network (CNN)

First clear your workspace and close all previous outputs. Afterwards run the required MatConvNet setup file by providing the relative or absolute file path.

		  

clc, clear all, close all;
    
run ../../../matconvnet-1.0-beta24/matlab/vl_setupnn;                       % Run required MatConvNet Setup


Next load the created database and the trained DagNN network into memory. Assign the class names from the database to the trained CNN.

		  

%--------------------------------------------------------------------
% Loading created CNN and IMDB
%--------------------------------------------------------------------
 
load('imdb.mat');                                                                                   % Load the IMDB saved in Part 1
netStruct = load('myCNNdag.mat');                                                                   % Load the trained CNN saved in Part 2
net = dagnn.DagNN.loadobj(netStruct);                                                               % Unwrap into DagNN Object
net.meta.classes.description = imdb.meta.classes;                                                   % Set up Class Descriptions

Lastly, use the created database as testing data. We can test testing/validation data (set = 3) or training data (set = 1).

		  

%--------------------------------------------------------------------
% Loading IMDB test data
%--------------------------------------------------------------------

i = 1;                                                                                              % Start at sample 1 (training data); the test data (set = 3) starts at 60001
while i <= size(imdb.images.data, 4)                                                                % Run until satisfied, or until the data runs out
    
    im = imdb.images.data(:,:,:,i);                                                                 % Current sample from IMDB
    i = i + 1;                                                                                      % Next sample
    
    net.eval({'input', im});                                                                        % Run the sample through the trained CNN

    % obtain the CNN output
    scores = net.vars(net.getVarIndex('prediction')).value;                                         % Make sure the variable name matches, e.g. prob or prediction
    scores = squeeze(gather(scores));                                                               % Copy from the GPU if needed and drop singleton dimensions
    
    % show the classification results
    [bestScore, best] = max(scores);                                                                % Acquire best score and associated class index
    figure(1) ; clf ; imagesc(im);
    title(sprintf('%s (%d), score %.3f', net.meta.classes.description{best}, best, bestScore));
    output = net.meta.classes.description{best};                                                    % Class name associated with the best score
    
    keyboard;                                                                                       % Pause here; type dbcont to continue or dbquit to stop
   
end
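
Because the labels were stored as digit + 1 in Part 1, the index returned by max is one larger than the digit it stands for, and the class-name list maps it back. A small sketch with made-up scores:

```matlab
classes = arrayfun(@(x) sprintf('%d', x), 0:9, 'UniformOutput', false);   % built as imdb.meta.classes was in Part 1

scores = zeros(10, 1, 'single');      % made-up network output
scores(4) = 0.9;                      % most of the probability on class index 4
scores(9) = 0.1;

[bestScore, best] = max(scores);      % best is the class index, 4
digit = classes{best};                % '3', because labels were stored as digit + 1

fprintf('index %d -> digit %s (score %.1f)\n', best, digit, bestScore);
```

This is why the plot title in the loop above shows both the class name and the raw index.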

The following are example classification outputs from the DagNN network on the MNIST data.