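The CNN is one of the most successful models in deep learning, so it deserves a closer look. An earlier post, "Stacked CNN簡單介紹", gave a rough introduction to using a CNN, but under one assumption: the CNN's parameters had already been learned. The main goal of this post is to explain how CNN parameters are trained with the bp (backpropagation) algorithm. A CNN contains convolutional layers and subsampling layers, so although its bp is essentially the same as the MLP's, the form differs somewhat; understanding bp for an MLP is clearly a prerequisite for working through CNN backpropagation. The experiment follows the new Stanford UFLDL tutorial, UFLDL: Exercise: Convolutional Neural Network. It recognizes MNIST digits with a supervised CNN. Most parameters and structural choices in the experiment are the ones given in the tutorial and were not tuned here.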
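The details of differentiating through a CNN during backpropagation are covered in the paper Notes on Convolutional Neural Networks by Jake Bouvrie. The paper is thorough; for example, it considers pooling layers that also carry weights, biases, and a nonlinear activation (since these two kinds of values also need to be learned). For a walkthrough of the paper, see zouxy09's post "CNN卷積神經網絡推導和實現". Besides the bp algorithm itself, I believe that understanding the following 4 sub-problems is essentially enough to understand CNN differentiation (bp is not covered here, since plenty of material is available online). To make some details clear, the examples in this post use simplified conditions. Editing text and formulas under Linux is also painful, so some mistakes are inevitable; please bear with them.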
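First, consider the objective function of the CNN. Suppose there are m samples (xi, yi), the network has L layers including several convolutional and pooling layers, and the output of the last layer is f(xi). The loss of the system is then the average cross-entropy over the m samples plus a penalty on the weights (classification generally uses the cross-entropy form).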
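Written out, a form consistent with this description and with the cost computed in cnnCost further down would be the following. The symbols λ (penalty weight), K (number of classes), and W^(l) (weights of layer l) are notation assumed here for illustration.

```latex
J(W, b) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{c=1}^{K} 1\{y_i = c\} \, \log f(x_i)_c
          + \frac{\lambda}{2} \sum_{l} \lVert W^{(l)} \rVert_F^2
```

The biases are left out of the penalty term, which matches the code, where only Wc and Wd are penalized.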
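Now consider a single input sample (x, y). The loss is a cross-entropy as above, the weight regularization term is ignored for the moment, the label uses one-hot encoding, and the last layer of the CNN is a fully connected softmax layer (the usual choice of output layer for multi-class problems). With f(x) denoting the final output of the network for the sample (x, y), the loss for this sample is the cross-entropy between the one-hot label and f(x).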
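As a sketch of that single-sample loss, assuming K classes and writing f(x)_c for the c-th softmax output (both pieces of notation are assumptions made here):

```latex
L(x, y) = -\sum_{c=1}^{K} 1\{y = c\} \, \log f(x)_c = -\log f(x)_y
```

Because the label is one-hot, only the term for the true class y survives.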
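The derivation starts from the error sensitivity of a single output-layer node c (see Larochelle's deep learning course slides, "Output layer gradient"). For softmax with cross-entropy, the sensitivity of node c is the predicted probability of class c minus the c-th entry of the one-hot label.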
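A short version of that derivation, assuming a_c denotes the input to the softmax at node c (notation introduced here, not taken from the original formula), could read:

```latex
\begin{aligned}
f(x)_c &= \frac{\exp(a_c)}{\sum_{k} \exp(a_k)}, \qquad L(x, y) = -\log f(x)_y \\
\frac{\partial L}{\partial a_c}
  &= -\frac{1}{f(x)_y} \frac{\partial f(x)_y}{\partial a_c}
   = -\frac{1}{f(x)_y} \, f(x)_y \left( 1\{y = c\} - f(x)_c \right)
   = f(x)_c - 1\{y = c\}
\end{aligned}
```

In vector form this is f(x) minus the one-hot label, which agrees with the line delta_d = -(groundTruth-probs) in cnnCost.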
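Suppose layer l (a lowercase L, not the digit 1) is a convolutional layer and layer l+1 is a pooling layer. In notation similar to Bouvrie's notes, let the error sensitivity of the pooling layer be $\delta_j^{l+1}$ and that of the convolutional layer be $\delta_j^{l}$. The two are then related by $\delta_j^{l} = \mathrm{upsample}(\delta_j^{l+1}) \circ h'(a_j^{l})$, where $\circ$ is element-wise multiplication, $a_j^{l}$ is the input to the activation function $h$ for map j of layer l, and upsample() spreads each pooled error back over its pooling region.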
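Suppose the convolutional layer's map is 4×4 and the pooling region is 2×2, so the pooled map is 2×2. (This post assumes throughout that pooling regions do not overlap and that convolution moves one pixel at a time, i.e. with overlap; this is not restated later.) Suppose further that the error sensitivities of the 2×2 pooled map are known.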
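For mean pooling, the upsample operation can be implemented with the MATLAB function kron(), because it is a matrix Kronecker product. C=kron(A, B) multiplies the matrix B by each element of the matrix A and places each product at the corresponding block position in C.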
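A minimal sketch of this upsampling step for mean pooling follows. The values of delta_pool and the activations a are hypothetical and chosen only for illustration; the sigmoid activation matches the one used in cnnConvolve further down.

```matlab
% Hypothetical 2x2 error sensitivities of the pooling layer (illustrative values).
delta_pool = [1 3; 2 4];
poolDim = 2;

% kron copies each entry of delta_pool over a poolDim x poolDim block.
upsampled = kron(delta_pool, ones(poolDim));

% Mean pooling weights every input by 1/poolDim^2, so each pooled error
% is shared equally among the poolDim^2 inputs of its block.
delta_up = upsampled / poolDim^2;

% Hypothetical sigmoid activations of the 4x4 convolutional map.
a = 0.5 * ones(4);

% Error sensitivity of the convolutional layer: upsampled error times
% the sigmoid derivative a.*(1-a).
delta_conv = delta_up .* a .* (1 - a);

disp(upsampled);
disp(delta_conv);
```

Dividing by poolDim^2 keeps the total error unchanged: the entries of delta_up sum to the same value as the entries of delta_pool.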
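So what justifies solving problem 3 this way? It is really quite simple and is still bp in essence: the error sensitivity of layer l equals the error sensitivity of layer l+1 multiplied by the weights between the two layers. The only difference is that a convolution is used here, and an overlapping one, so a single point in layer l affects several points in layer l+1. For example, where does the 0.3 at the very center of the final result matrix come from? When a 2×2 kernel is convolved over a 3×3 input, the kernel takes 4 positions, and the central value of the 3×3 matrix contributes to the output at all 4 of them, sitting at the kernel's bottom-right, bottom-left, top-right, and top-left corner respectively. Its value is therefore 2×0.2+1×0.1+1×0.1-1×0.3=0.3. Working this out by hand is recommended; it makes clear why conv2() with the 'full' option can be used here.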
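The argument can be checked numerically. The sketch below uses hypothetical values for the kernel k and the output error delta (they are not the numbers from the example above) and assumes the forward pass is conv2(x, k, 'valid') on a 3×3 input. It accumulates, position by position, how much each input pixel contributes to each output, and compares the result with a single 'full' convolution using the kernel rotated by 180 degrees.

```matlab
% Hypothetical 2x2 kernel and 2x2 output error (illustrative values only).
k     = [1 0; -1 2];
delta = [0.1 0.2; 0.2 0.3];

% Backward pass in one call: full convolution with the rotated kernel.
dx_full = conv2(delta, rot90(k, 2), 'full');

% Same quantity by brute force. For the forward pass z = conv2(x, k, 'valid'),
% output (i,j) depends on inputs (a,b) with a in {i,i+1}, b in {j,j+1},
% and dz(i,j)/dx(a,b) = k(i-a+2, j-b+2).
dx_loop = zeros(3, 3);
for i = 1:2
    for j = 1:2
        for a = i:i+1
            for b = j:j+1
                dx_loop(a, b) = dx_loop(a, b) + delta(i, j) * k(i-a+2, j-b+2);
            end
        end
    end
end

disp(dx_full);
disp(max(abs(dx_full(:) - dx_loop(:))));  % should be (numerically) zero
```

The center entry dx_loop(2,2) receives four contributions, one from each kernel position, which is the overlap effect described above.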
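The derivative with respect to the bias bj is then 1.2: simply sum the error sensitivities over region j (0.8+0.1-0.6+0.3+0.5+0.7-0.4-0.2=1.2). Because b contributes to every node in j, this follows directly from the sum rule of differentiation (the derivative of a sum is the sum of the derivatives).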
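In code this is a single sum. The sketch below uses the eight values from the text; their layout within the map is not given, so a flat vector is assumed.

```matlab
% Error sensitivities of map j, taken from the sum in the text
% (stored as a flat vector; the original 2-D layout is an assumption-free detail here).
delta_j = [0.8 0.1 -0.6 0.3 0.5 0.7 -0.4 -0.2];

% The bias feeds every node of the map, so its gradient is the sum.
grad_b = sum(delta_j(:));
fprintf('bias gradient = %.1f\n', grad_b);
```

cnnCost does the same thing per filter with sum(bc_i), then divides by the number of images to average over the mini-batch.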
7. In MATLAB, conv2() moves the kernel one pixel at a time by default, i.e. it is the convolution with the highest degree of overlap.
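The output size of conv2() depends on its shape option. A small sketch with an arbitrary 4×4 input and 2×2 kernel (both chosen here only for illustration):

```matlab
A = magic(4);   % arbitrary 4x4 input
K = ones(2);    % arbitrary 2x2 kernel

sz_full  = size(conv2(A, K));           % default shape is 'full': (4+2-1) x (4+2-1)
sz_same  = size(conv2(A, K, 'same'));   % same size as A
sz_valid = size(conv2(A, K, 'valid'));  % only complete overlaps: (4-2+1) x (4-2+1)

disp(sz_full);
disp(sz_same);
disp(sz_valid);
```

The forward pass in cnnConvolve uses 'valid', which is why a 28×28 image and a 9×9 filter give a 20×20 map.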
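Following the tutorial pages UFLDL: Exercise: Convolutional Neural Network and UFLDL: Optimization: Stochastic Gradient Descent, the network was trained to recognize the MNIST database. After completing the YOUR CODE HERE parts of the exercise, the recognition rate of this CNN is: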
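Only one convolutional layer, one pooling layer, and one softmax layer are used. The convolutional layer has 20 features, the kernel size is 9×9, and the pooling region is 2×2.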
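With these settings the layer sizes and the total number of parameters follow directly. The sketch below repeats the arithmetic that cnnInitParams performs, using the configuration values from the script that follows.

```matlab
imageDim = 28; filterDim = 9; numFilters = 20; poolDim = 2; numClasses = 10;

convDim    = imageDim - filterDim + 1;   % side of each convolved map: 20
outDim     = convDim / poolDim;          % side of each pooled map: 10
hiddenSize = outDim^2 * numFilters;      % inputs to the softmax layer: 2000

% Wc + Wd + bc + bd
numParams = filterDim^2 * numFilters + numClasses * hiddenSize ...
            + numFilters + numClasses;

fprintf('convDim=%d outDim=%d hiddenSize=%d numParams=%d\n', ...
        convDim, outDim, hiddenSize, numParams);
```

The unrolled parameter vector theta therefore has 21650 entries.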
%% Convolution Neural Network Exercise

%  Instructions
%  ------------
%
%  This file contains code that helps you get started in building a single
%  layer convolutional neural network. In this exercise, you will only
%  need to modify cnnCost.m and cnnminFuncSGD.m. You will not need to
%  modify this file.

%%======================================================================
%% STEP 0: Initialize Parameters and Load Data
%  Here we initialize some parameters used for the exercise.

% Configuration
imageDim = 28;
numClasses = 10;  % Number of classes (MNIST images fall into 10 classes)
filterDim = 9;    % Filter size for conv layer, 9*9
numFilters = 20;  % Number of filters for conv layer
poolDim = 2;      % Pooling dimension, (should divide imageDim-filterDim+1)

% Load MNIST Train
addpath ./common/;
images = loadMNISTImages('./common/train-images-idx3-ubyte');
images = reshape(images,imageDim,imageDim,[]);
labels = loadMNISTLabels('./common/train-labels-idx1-ubyte');
labels(labels==0) = 10; % Remap 0 to 10

% Initialize Parameters, theta=(21650,1), 21650=9*9*20+20+100*20*10+10
theta = cnnInitParams(imageDim,filterDim,numFilters,poolDim,numClasses);

%%======================================================================
%% STEP 1: Implement convNet Objective
%  Implement the function cnnCost.m.

%%======================================================================
%% STEP 2: Gradient Check
%  Use the file computeNumericalGradient.m to check the gradient
%  calculation for your cnnCost.m function.  You may need to add the
%  appropriate path or copy the file to this directory.

DEBUG = false;  % set this to true to check gradient
%DEBUG = true;
if DEBUG
    % To speed up gradient checking, we will use a reduced network and
    % a debugging data set
    db_numFilters = 2;
    db_filterDim = 9;
    db_poolDim = 5;
    db_images = images(:,:,1:10);
    db_labels = labels(1:10);
    db_theta = cnnInitParams(imageDim,db_filterDim,db_numFilters,...
                db_poolDim,numClasses);

    [cost, grad] = cnnCost(db_theta,db_images,db_labels,numClasses,...
                                db_filterDim,db_numFilters,db_poolDim);

    % Check gradients
    numGrad = computeNumericalGradient( @(x) cnnCost(x,db_images,...
                                db_labels,numClasses,db_filterDim,...
                                db_numFilters,db_poolDim), db_theta);

    % Use this to visually compare the gradients side by side
    disp([numGrad grad]);

    diff = norm(numGrad-grad)/norm(numGrad+grad);
    % Should be small. In our implementation, these values are usually
    % less than 1e-9.
    disp(diff);

    assert(diff < 1e-9,...
        'Difference too large. Check your gradient computation again');
end;

%%======================================================================
%% STEP 3: Learn Parameters
%  Implement minFuncSGD.m, then train the model.

% Mini-batch gradient descent is used, so far fewer passes over the
% samples are needed than with standard (batch) gradient descent, because
% the weights are already updated many times within each pass.
options.epochs = 3;
options.minibatch = 256;
options.alpha = 1e-1;
options.momentum = .95;

opttheta = minFuncSGD(@(x,y,z) cnnCost(x,y,z,numClasses,filterDim,...
                      numFilters,poolDim),theta,images,labels,options);
save('theta.mat','opttheta');

%%======================================================================
%% STEP 4: Test
%  Test the performance of the trained model using the MNIST test set. Your
%  accuracy should be above 97% after 3 epochs of training

testImages = loadMNISTImages('./common/t10k-images-idx3-ubyte');
testImages = reshape(testImages,imageDim,imageDim,[]);
testLabels = loadMNISTLabels('./common/t10k-labels-idx1-ubyte');
testLabels(testLabels==0) = 10; % Remap 0 to 10

[~,cost,preds] = cnnCost(opttheta,testImages,testLabels,numClasses,...
                filterDim,numFilters,poolDim,true);

acc = sum(preds==testLabels)/length(preds);

% Accuracy should be around 97.4% after 3 epochs
fprintf('Accuracy is %f\n',acc);
function convolvedFeatures = cnnConvolve(filterDim, numFilters, images, W, b)
%cnnConvolve Returns the convolution of the features given by W and b with
%the given images
%
% Parameters:
%  filterDim - filter (feature) dimension
%  numFilters - number of feature maps
%  images - large images to convolve with, matrix in the form
%           images(r, c, image number)
%  W, b - W, b for features from the sparse autoencoder
%         (the W passed in is already in matrix form)
%         W is of shape (filterDim,filterDim,numFilters)
%         b is of shape (numFilters,1)
%
% Returns:
%  convolvedFeatures - matrix of convolved features in the form
%                      convolvedFeatures(imageRow, imageCol, featureNum, imageNum)

numImages = size(images, 3);
imageDim = size(images, 1);
convDim = imageDim - filterDim + 1;

convolvedFeatures = zeros(convDim, convDim, numFilters, numImages);

% Instructions:
%   Convolve every filter with every image here to produce the
%   (imageDim - filterDim + 1) x (imageDim - filterDim + 1) x numFeatures x numImages
%   matrix convolvedFeatures, such that
%   convolvedFeatures(imageRow, imageCol, featureNum, imageNum) is the
%   value of the convolved featureNum feature for the imageNum image over
%   the region (imageRow, imageCol) to (imageRow + filterDim - 1, imageCol + filterDim - 1)
%
% Expected running times:
%   Convolving with 100 images should take less than 30 seconds
%   Convolving with 5000 images should take around 2 minutes
%   (So to save time when testing, you should convolve with fewer images, as
%   described earlier)

for imageNum = 1:numImages
  for filterNum = 1:numFilters

    % convolution of image with feature matrix
    convolvedImage = zeros(convDim, convDim);

    % Obtain the feature (filterDim x filterDim) needed during the convolution
    %%% YOUR CODE HERE %%%
    filter = W(:,:,filterNum);
    bc = b(filterNum);

    % Flip the feature matrix because of the definition of convolution, as explained later
    filter = rot90(squeeze(filter),2);

    % Obtain the image
    im = squeeze(images(:, :, imageNum));

    % Convolve "filter" with "im", adding the result to convolvedImage
    % be sure to do a 'valid' convolution
    %%% YOUR CODE HERE %%%
    convolvedImage = conv2(im, filter, 'valid');

    % Add the bias unit
    % Then, apply the sigmoid function to get the hidden activation
    %%% YOUR CODE HERE %%%
    convolvedImage = sigmoid(convolvedImage+bc);

    convolvedFeatures(:, :, filterNum, imageNum) = convolvedImage;
  end
end

end

function sigm = sigmoid(x)
    sigm = 1./(1+exp(-x));
end
function pooledFeatures = cnnPool(poolDim, convolvedFeatures)
%cnnPool Pools the given convolved features
%
% Parameters:
%  poolDim - dimension of pooling region
%  convolvedFeatures - convolved features to pool (as given by cnnConvolve)
%                      convolvedFeatures(imageRow, imageCol, featureNum, imageNum)
%
% Returns:
%  pooledFeatures - matrix of pooled features in the form
%                   pooledFeatures(poolRow, poolCol, featureNum, imageNum)
%

numImages = size(convolvedFeatures, 4);
numFilters = size(convolvedFeatures, 3);
convolvedDim = size(convolvedFeatures, 1);

pooledFeatures = zeros(convolvedDim / poolDim, ...
        convolvedDim / poolDim, numFilters, numImages);

% Instructions:
%   Now pool the convolved features in regions of poolDim x poolDim,
%   to obtain the
%   (convolvedDim/poolDim) x (convolvedDim/poolDim) x numFeatures x numImages
%   matrix pooledFeatures, such that
%   pooledFeatures(poolRow, poolCol, featureNum, imageNum) is the
%   value of the featureNum feature for the imageNum image pooled over the
%   corresponding (poolRow, poolCol) pooling region.
%
%   Use mean pooling here.

%%% YOUR CODE HERE %%%
% convolvedFeatures(imageRow, imageCol, featureNum, imageNum)
% pooledFeatures(poolRow, poolCol, featureNum, imageNum)
for numImage = 1:numImages
    for numFeature = 1:numFilters
        for poolRow = 1:convolvedDim / poolDim
            offsetRow = 1+(poolRow-1)*poolDim;
            for poolCol = 1:convolvedDim / poolDim
                offsetCol = 1+(poolCol-1)*poolDim;
                % take out one poolDim x poolDim patch
                patch = convolvedFeatures(offsetRow:offsetRow+poolDim-1, ...
                    offsetCol:offsetCol+poolDim-1,numFeature,numImage);
                pooledFeatures(poolRow,poolCol,numFeature,numImage) = mean(patch(:));
            end
        end
    end
end

end
function [cost, grad, preds] = cnnCost(theta,images,labels,numClasses,...
                                filterDim,numFilters,poolDim,pred)
% Calculate cost and gradient for a single layer convolutional
% neural network followed by a softmax layer with cross entropy
% objective.
%
% Parameters:
%  theta      -  unrolled parameter vector
%  images     -  stores images in imageDim x imageDim x numImages
%                array
%  numClasses -  number of classes to predict
%  filterDim  -  dimension of convolutional filter
%  numFilters -  number of convolutional filters
%  poolDim    -  dimension of pooling area
%  pred       -  boolean only forward propagate and return
%                predictions
%
% Returns:
%  cost       -  cross entropy cost
%  grad       -  gradient with respect to theta (if pred==False)
%  preds      -  list of predictions for each example (if pred==True)

if ~exist('pred','var')
    pred = false;
end;

imageDim = size(images,1);  % height/width of image
numImages = size(images,3); % number of images
lambda = 3e-3;              % weight decay parameter

%% Reshape parameters and setup gradient matrices

% Wc is filterDim x filterDim x numFilters parameter matrix
% bc is the corresponding bias

% Wd is numClasses x hiddenSize parameter matrix where hiddenSize
% is the number of output units from the convolutional layer
% bd is corresponding bias
[Wc, Wd, bc, bd] = cnnParamsToStack(theta,imageDim,filterDim,numFilters,...
                        poolDim,numClasses); % theta consists of Wc, Wd, bc, bd in order

% Same sizes as Wc,Wd,bc,bd. Used to hold gradient w.r.t above params.
Wc_grad = zeros(size(Wc));
Wd_grad = zeros(size(Wd));
bc_grad = zeros(size(bc));
bd_grad = zeros(size(bd));

%%======================================================================
%% STEP 1a: Forward Propagation
%  In this step you will forward propagate the input through the
%  convolutional and subsampling (mean pooling) layers.  You will then use
%  the responses from the convolution and pooling layer as the input to a
%  standard softmax layer.

%% Convolutional Layer
%  For each image and each filter, convolve the image with the filter, add
%  the bias and apply the sigmoid nonlinearity.  Then subsample the
%  convolved activations with mean pooling.  Store the results of the
%  convolution in activations and the results of the pooling in
%  activationsPooled.  You will need to save the convolved activations for
%  backpropagation.
convDim = imageDim-filterDim+1;   % dimension of convolved output
outputDim = (convDim)/poolDim;    % dimension of subsampled output

% convDim x convDim x numFilters x numImages tensor for storing activations
activations = zeros(convDim,convDim,numFilters,numImages);

% outputDim x outputDim x numFilters x numImages tensor for storing
% subsampled activations
activationsPooled = zeros(outputDim,outputDim,numFilters,numImages);

%%% YOUR CODE HERE %%%
% forward pass; the result has already gone through the activation function
convolvedFeatures = cnnConvolve(filterDim, numFilters, images, Wc, bc);
activationsPooled = cnnPool(poolDim, convolvedFeatures);

% Reshape activations into 2-d matrix, hiddenSize x numImages,
% for Softmax layer
activationsPooled = reshape(activationsPooled,[],numImages);

%% Softmax Layer
%  Forward propagate the pooled activations calculated above into a
%  standard softmax layer. For your convenience we have reshaped
%  activationPooled into a hiddenSize x numImages matrix.  Store the
%  results in probs.

% numClasses x numImages for storing probability that each image belongs to
% each class.
probs = zeros(numClasses,numImages);

%%% YOUR CODE HERE %%%
% Wd = (numClasses,hiddenSize); each column of probs is the output for one sample
M = Wd*activationsPooled+repmat(bd,[1,numImages]);
M = bsxfun(@minus,M,max(M,[],1));    % subtract the column max for numerical stability
M = exp(M);
probs = bsxfun(@rdivide, M, sum(M)); % rdivide: divide each column element-wise by its sum

%%======================================================================
%% STEP 1b: Calculate Cost
%  In this step you will use the labels given as input and the probs
%  calculate above to evaluate the cross entropy objective.  Store your
%  results in cost.

cost = 0; % save objective into cost

%%% YOUR CODE HERE %%%
% cost = -1/numImages*labels(:)'*log(probs(:));
% First convert labels to one-hot encoding. The explicit size keeps the
% matrix numClasses x numImages even if some class is absent from the batch.
groundTruth = full(sparse(labels, 1:numImages, 1, numClasses, numImages));
% cross entropy plus a weight penalty term
cost = -1./numImages*groundTruth(:)'*log(probs(:))+(lambda/2.)*(sum(Wd(:).^2)+sum(Wc(:).^2));
% cost = -1./numImages*groundTruth(:)'*log(probs(:));

% Makes predictions given probs and returns without backpropagating errors.
if pred
    [~,preds] = max(probs,[],1);
    preds = preds';
    grad = 0;
    return;
end;

%% Steps 1c and 1d are merged below
%======================================================================
% STEP 1c: Backpropagation
%  Backpropagate errors through the softmax and convolutional/subsampling
%  layers.  Store the errors for the next step to calculate the gradient.
%  Backpropagating the error w.r.t the softmax layer is as usual.  To
%  backpropagate through the pooling layer, you will need to upsample the
%  error with respect to the pooling layer for each filter and each image.
%  Use the kron function and a matrix of ones to do this upsampling
%  quickly.

% STEP 1d: Gradient Calculation
%  After backpropagating the errors above, we can use them to calculate the
%  gradient with respect to all the parameters.  The gradient w.r.t the
%  softmax layer is calculated as usual.  To calculate the gradient w.r.t.
%  a filter in the convolutional layer, convolve the backpropagated error
%  for that filter with each image and aggregate over images.

%%% YOUR CODE HERE %%%
%%% YOUR CODE HERE %%%

% network structure: images--> convolvedFeatures--> activationsPooled--> probs
% Wd = (numClasses,hiddenSize)
% bd = (numClasses,1)
% Wc = (filterDim,filterDim,numFilters)
% bc = (numFilters,1)
% activationsPooled = zeros(outputDim,outputDim,numFilters,numImages);
% convolvedFeatures = (convDim,convDim,numFilters,numImages)
% images(imageDim,imageDim,numImages)

% softmax layer's preactivation; every sample has its own error
% sensitivity at each layer
delta_d = -(groundTruth-probs);
Wd_grad = (1./numImages)*delta_d*activationsPooled'+lambda*Wd;
bd_grad = (1./numImages)*sum(delta_d,2); % note: sum over samples here

delta_s = Wd'*delta_d; % the pooling/sample layer's preactivation
delta_s = reshape(delta_s,outputDim,outputDim,numFilters,numImages);

% Upsampling: kron only works on 2-D matrices, so each map is handled separately
%delta_c = convolvedFeatures.*(1-convolvedFeatures).*(1./poolDim^2)*kron(delta_s, ones(poolDim));
delta_c = zeros(convDim,convDim,numFilters,numImages);
for i=1:numImages
    for j=1:numFilters
        delta_c(:,:,j,i) = (1./poolDim^2)*kron(squeeze(delta_s(:,:,j,i)), ones(poolDim));
    end
end
delta_c = convolvedFeatures.*(1-convolvedFeatures).*delta_c;

% Wc_grad = convn(images,rot90(delta_c,2,'valid'))+ lambda*Wc;
for i=1:numFilters
    Wc_i = zeros(filterDim,filterDim);
    for j=1:numImages
        Wc_i = Wc_i+conv2(squeeze(images(:,:,j)),rot90(squeeze(delta_c(:,:,i,j)),2),'valid');
    end
    % Wc_i = convn(images,rot180(squeeze(delta_c(:,:,i,:))),'valid');
    % add the penalty term
    Wc_grad(:,:,i) = (1./numImages)*Wc_i+lambda*Wc(:,:,i);

    bc_i = delta_c(:,:,i,:);
    bc_i = bc_i(:);
    bc_grad(i) = sum(bc_i)/numImages;
end

%% Unroll gradient into grad vector for minFunc
grad = [Wc_grad(:) ; Wd_grad(:) ; bc_grad(:) ; bd_grad(:)];

end

function X = rot180(X)
    X = flipdim(flipdim(X, 1), 2);
end
function [opttheta] = minFuncSGD(funObj,theta,data,labels,...
                        options)
% Runs stochastic gradient descent with momentum to optimize the
% parameters for the given objective.
%
% Parameters:
%  funObj     -  function handle which accepts as input theta,
%                data, labels and returns cost and gradient w.r.t
%                to theta.
%  theta      -  unrolled parameter vector
%  data       -  stores data in m x n x numExamples tensor
%  labels     -  corresponding labels in numExamples x 1 vector
%  options    -  struct to store specific options for optimization
%
% Returns:
%  opttheta   -  optimized parameter vector
%
% Options (* required)
%  epochs*     - number of epochs through data
%  alpha*      - initial learning rate
%  minibatch*  - size of minibatch
%  momentum    - momentum constant, defaults to 0.9

%%======================================================================
%% Setup
assert(all(isfield(options,{'epochs','alpha','minibatch'})),...
        'Some options not defined');
if ~isfield(options,'momentum')
    options.momentum = 0.9;
end;
epochs = options.epochs;
alpha = options.alpha;
minibatch = options.minibatch;
m = length(labels); % training set size

% Setup for momentum
mom = 0.5;
momIncrease = 20;
velocity = zeros(size(theta));

%%======================================================================
%% SGD loop
it = 0;
for e = 1:epochs

    % randomly permute indices of data for quick minibatch sampling
    rp = randperm(m);

    for s=1:minibatch:(m-minibatch+1)
        it = it + 1;

        % increase momentum after momIncrease iterations
        if it == momIncrease
            mom = options.momentum;
        end;

        % get next randomly selected minibatch
        mb_data = data(:,:,rp(s:s+minibatch-1)); % training samples of the current mini-batch
        mb_labels = labels(rp(s:s+minibatch-1));

        % evaluate the objective function on the next minibatch
        [cost, grad] = funObj(theta,mb_data,mb_labels);

        % Instructions: Add in the weighted velocity vector to the
        % gradient evaluated above scaled by the learning rate.
        % Then update the current weights theta according to the
        % sgd update rule

        %%% YOUR CODE HERE %%%
        % see the UFLDL tutorial Optimization: Stochastic Gradient Descent
        velocity = mom*velocity+alpha*grad;
        theta = theta-velocity;

        fprintf('Epoch %d: Cost on iteration %d is %f\n',e,it,cost);
    end;

    % anneal learning rate by factor of two after each epoch
    alpha = alpha/2.0;

end;

opttheta = theta;

end
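The update carried out inside the SGD loop of minFuncSGD can be summarized as follows. The symbols v (velocity), γ (momentum), α (learning rate), and J_B (the cost on the current mini-batch B) are notation chosen here for illustration.

```latex
\begin{aligned}
v &\leftarrow \gamma \, v + \alpha \, \nabla_{\theta} J_B(\theta) \\
\theta &\leftarrow \theta - v
\end{aligned}
```

In the code, γ is 0.5 until iteration 20 and options.momentum from then on, and α is halved at the end of every epoch.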
function numgrad = computeNumericalGradient(J, theta)
% numgrad = computeNumericalGradient(J, theta)
% theta: a vector of parameters
% J: a function that outputs a real-number. Calling y = J(theta) will return the
% function value at theta.

% Initialize numgrad with zeros
numgrad = zeros(size(theta));

%% ---------- YOUR CODE HERE --------------------------------------
% Instructions:
% Implement numerical gradient checking, and return the result in numgrad.
% (See Section 2.3 of the lecture notes.)
% You should write code so that numgrad(i) is (the numerical approximation to) the
% partial derivative of J with respect to the i-th input argument, evaluated at theta.
% I.e., numgrad(i) should be (approximately) the partial derivative of J with
% respect to theta(i).
%
% Hint: You will probably want to compute the elements of numgrad one at a time.

epsilon = 1e-4;

for i = 1:length(numgrad)
    oldT = theta(i);
    theta(i) = oldT+epsilon;
    pos = J(theta);
    theta(i) = oldT-epsilon;
    neg = J(theta);
    numgrad(i) = (pos-neg)/(2*epsilon);
    theta(i) = oldT;
    if mod(i,100)==0
        fprintf('Done with %d\n',i);
    end;
end;

%% ---------------------------------------------------------------
end
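To see the central-difference check at work without the full network, the sketch below applies the same formula to a toy objective whose gradient is known exactly. The function J and the point theta are hypothetical stand-ins; the loop is inlined so the snippet runs on its own.

```matlab
% Toy objective with known gradient 2*theta (hypothetical example).
J = @(t) sum(t.^2);
theta = [1; -2; 0.5];

epsilon = 1e-4;
numgrad = zeros(size(theta));
for i = 1:numel(theta)
    e = zeros(size(theta));
    e(i) = epsilon;
    % central difference, as in computeNumericalGradient
    numgrad(i) = (J(theta + e) - J(theta - e)) / (2*epsilon);
end

grad = 2*theta;   % analytic gradient

% Same relative-error measure as in the gradient check of the main script.
relErr = norm(numgrad - grad) / norm(numgrad + grad);
disp(relErr);
```

A relative error far below the 1e-9 threshold used in the main script indicates that the analytic and numerical gradients agree.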
function theta = cnnInitParams(imageDim,filterDim,numFilters,...
                                poolDim,numClasses)
% Initialize parameters for a single layer convolutional neural
% network followed by a softmax layer.
%
% Parameters:
%  imageDim   -  height/width of image
%  filterDim  -  dimension of convolutional filter
%  numFilters -  number of convolutional filters
%  poolDim    -  dimension of pooling area
%  numClasses -  number of classes to predict
%
% Returns:
%  theta      -  unrolled parameter vector with initialized weights

%% Initialize parameters randomly based on layer sizes.
assert(filterDim < imageDim,'filterDim must be less than imageDim');

Wc = 1e-1*randn(filterDim,filterDim,numFilters);

outDim = imageDim - filterDim + 1; % dimension of convolved image

% assume outDim is multiple of poolDim
assert(mod(outDim,poolDim)==0,...
       'poolDim must divide imageDim - filterDim + 1');

outDim = outDim/poolDim;
hiddenSize = outDim^2*numFilters;

% we'll choose weights uniformly from the interval [-r, r]
r  = sqrt(6) / sqrt(numClasses+hiddenSize+1);
Wd = rand(numClasses, hiddenSize) * 2 * r - r;

bc = zeros(numFilters, 1); % initialized to 0
bd = zeros(numClasses, 1);

% Convert weights and bias gradients to the vector form.
% This step will "unroll" (flatten and concatenate together) all
% your parameters into a vector, which can then be used with minFunc.
theta = [Wc(:) ; Wd(:) ; bc(:) ; bd(:)];

end
function [Wc, Wd, bc, bd] = cnnParamsToStack(theta,imageDim,filterDim,...
                                 numFilters,poolDim,numClasses)
% Converts unrolled parameters for a single layer convolutional neural
% network followed by a softmax layer into structured weight
% tensors/matrices and corresponding biases
%
% Parameters:
%  theta      -  unrolled parameter vector
%  imageDim   -  height/width of image
%  filterDim  -  dimension of convolutional filter
%  numFilters -  number of convolutional filters
%  poolDim    -  dimension of pooling area
%  numClasses -  number of classes to predict
%
% Returns:
%  Wc      -  filterDim x filterDim x numFilters parameter matrix
%  Wd      -  numClasses x hiddenSize parameter matrix, hiddenSize is
%             calculated as numFilters*((imageDim-filterDim+1)/poolDim)^2
%  bc      -  bias for convolution layer of size numFilters x 1
%  bd      -  bias for dense layer of size numClasses x 1

outDim = (imageDim - filterDim + 1)/poolDim;
hiddenSize = outDim^2*numFilters;

%% Reshape theta
indS = 1;
indE = filterDim^2*numFilters;
Wc = reshape(theta(indS:indE),filterDim,filterDim,numFilters);
indS = indE+1;
indE = indE+hiddenSize*numClasses;
Wd = reshape(theta(indS:indE),numClasses,hiddenSize);
indS = indE+1;
indE = indE+numFilters;
bc = theta(indS:indE);
bd = theta(indE+1:end);

end
tmp = conv2(convolvedFeatures(:,:,numFeature,numImage), ones(poolDim),'valid');
pooledFeatures(:,:,numFeature,numImage) =1./(poolDim^2)*tmp(1:poolDim:end,1:poolDim:end);
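The two lines above compute mean pooling with conv2() instead of explicit row and column loops. A quick sketch, using an arbitrary 4×4 map and poolDim = 2 (both chosen here for illustration), suggests that this gives the same result as averaging each non-overlapping patch directly:

```matlab
A = magic(4);    % arbitrary 4x4 feature map
poolDim = 2;

% Sliding-window sums, then keep every poolDim-th row and column so that
% only the non-overlapping windows remain.
tmp = conv2(A, ones(poolDim), 'valid');
pooled_conv = tmp(1:poolDim:end, 1:poolDim:end) / poolDim^2;

% Reference: mean of each non-overlapping patch, as in the loop of cnnPool.
n = size(A, 1) / poolDim;
pooled_loop = zeros(n);
for r = 1:n
    for c = 1:n
        patch = A((r-1)*poolDim+1 : r*poolDim, (c-1)*poolDim+1 : c*poolDim);
        pooled_loop(r, c) = mean(patch(:));
    end
end

disp(pooled_conv);
disp(max(abs(pooled_conv(:) - pooled_loop(:))));  % should be (numerically) zero
```

The conv2() version still computes the overlapping windows that are then discarded, so it trades some extra arithmetic for fewer interpreted loop iterations.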
