[⋆] warmUpExercise.m - Simple example function in Octave/MATLAB
[⋆] plotData.m - Function to display the dataset
[⋆] computeCost.m - Function to compute the cost of linear regression
[⋆] gradientDescent.m - Function to run gradient descent
[†] computeCostMulti.m - Cost function for multiple variables
[†] gradientDescentMulti.m - Gradient descent for multiple variables
[†] featureNormalize.m - Function to normalize features
[†] normalEqn.m - Function to compute the normal equations
In the file warmUpExercise.m you will find the outline of an Octave/MATLAB function. Modify it to return a 5 x 5 identity matrix by filling in the following code:
function A = warmUpExercise()
%WARMUPEXERCISE Example function in octave
% A = WARMUPEXERCISE() is an example function that returns the 5x5 identity matrix
% ============= YOUR CODE HERE ==============
A = eye(5);
% ==========================================
end
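When you are finished, run ex1.m; the output should show the 5 x 5 identity matrix.
Plotting helps us visualize the data. For this dataset we can use a scatter plot, because it has only two properties: profit and population. (Many real-world problems are multi-dimensional and cannot be plotted in two dimensions.)
Load the data: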
data = load('ex1data1.txt');
X = data(:, 1);
y = data(:, 2);
m = length(y); % number of training examples
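Next, the script calls the plotData function to draw a scatter plot of the data. Your job is to complete plotData.m so that it draws the plot. Modify the code as follows: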
function plotData(x, y)
% ====================== YOUR CODE HERE ======================
% Hint: You can use the 'rx' option with plot to make the markers
%       appear as red crosses. You can also make the markers larger
%       with plot(..., 'rx', 'MarkerSize', 10).
figure;                                   % open a new figure window
plot(x, y, 'rx', 'MarkerSize', 10);       % 'r' = red, 'x' = cross marker, 10 = marker size
ylabel('Profit in $10,000s');             % label the y-axis
xlabel('Population of City in 10,000s');  % label the x-axis
% ============================================================
end
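After running ex1.m, you should see Figure 1.
The objective of linear regression is to minimize the cost function:
$J(\theta) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$
where the hypothesis $h_\theta(x)$ is given by the linear model $h_\theta(x) = \theta^T x = \theta_0 + \theta_1 x_1$.
As you perform gradient descent to minimize the cost function $J(\theta)$, it is helpful to monitor convergence by computing the cost. In this section you will implement a function that computes $J(\theta)$ so that you can check the convergence of your gradient descent implementation.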
function J = computeCost(X, y, theta)
%COMPUTECOST Compute cost for linear regression
%   J = COMPUTECOST(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.
J = (1/(2*m)) * sum(((X*theta) - y).^2);  % trailing semicolon suppresses printing
% =========================================================================
end
Once you have completed the function, ex1.m will run computeCost with θ initialized to zeros, and the resulting cost is printed to the screen.
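As a quick sanity check of the cost formula, the sketch below evaluates it by hand on a tiny dataset. The data and the variable names `theta_zero` and `theta_fit` are made up for illustration and are not part of ex1data1.txt or the exercise files.

```octave
% Toy data (hypothetical, not from ex1data1.txt): y equals x exactly.
X = [1 1; 1 2; 1 3];   % first column of ones is the intercept term
y = [1; 2; 3];
m = length(y);

% Cost with theta = [0; 0]: every prediction is 0,
% so J = (1 + 4 + 9) / (2*3) = 14/6.
theta_zero = [0; 0];
J_zero = (1/(2*m)) * sum((X*theta_zero - y).^2);

% Cost with theta = [0; 1]: predictions match y exactly, so J = 0.
theta_fit = [0; 1];
J_fit = (1/(2*m)) * sum((X*theta_fit - y).^2);

fprintf('J_zero = %.4f, J_fit = %.4f\n', J_zero, J_fit);
```

If a computeCost implementation returns the same two values for these inputs, the formula is likely implemented correctly.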
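Next, your task is to complete gradientDescent.m and implement gradient descent. As you program, make sure you understand what you are trying to optimize and what is being updated. Keep in mind that the cost $J(\theta)$ is a function of the parameter vector θ, not of X and y. That is, we minimize $J(\theta)$ by changing the values of θ, not by changing X or y. A good way to verify that gradient descent is working correctly is to look at the value of $J(\theta)$ and check that it decreases with each iteration. If gradient descent and computeCost are implemented correctly, $J(\theta)$ should never increase, and it should converge to a steady value by the end of the algorithm.
We minimize the cost function $J(\theta)$ by adjusting the values of the parameter θ. Batch gradient descent achieves this. In batch gradient descent, each iteration performs the following update:
$\theta_j := \theta_j - \alpha \frac{1}{m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right) x_j^{(i)}$ (simultaneously update $\theta_j$ for all $j$)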
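The per-parameter gradient step can be collected into a single matrix expression, which is the form usually used in vectorized code. The short derivation below assumes that $X$ holds one training example per row (including the intercept column of ones) and introduces the error vector $e$ purely as a notational convenience.

```latex
\begin{aligned}
e &= X\theta - y, \qquad e_i = h_\theta(x^{(i)}) - y^{(i)} \\
\left[ X^T e \right]_j &= \sum_{i=1}^{m} x_j^{(i)} \, e_i \\
\theta &:= \theta - \frac{\alpha}{m} \, X^T (X\theta - y)
\end{aligned}
```

Because the right-hand side is computed from the old θ in one step, all components are updated simultaneously.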
function [theta, J_history] = gradientDescent(X, y, theta, alpha, num_iters)
%GRADIENTDESCENT Performs gradient descent to learn theta
%   theta = GRADIENTDESCENT(X, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCost) and gradient here.
    %
    theta = theta - alpha*(1/m)*X'*((X*theta) - y);  % vectorized simultaneous update
    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCost(X, y, theta);
end
end
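The convergence check described earlier (the cost should not increase from one iteration to the next) can be sketched on a toy problem. The data, the learning rate, and the name `is_monotone` are assumptions chosen for illustration; the loop is written inline so the snippet runs without the exercise files.

```octave
% Toy data (hypothetical): y = 1 + 2*x, with an intercept column of ones.
x = [1; 2; 3; 4];
X = [ones(length(x), 1), x];
y = 1 + 2*x;
m = length(y);

alpha = 0.05;        % assumed learning rate, small enough for this data
num_iters = 200;
theta = zeros(2, 1);
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    % Vectorized batch update of all parameters at once.
    theta = theta - (alpha/m) * X' * (X*theta - y);
    % Record the cost after this step.
    J_history(iter) = (1/(2*m)) * sum((X*theta - y).^2);
end

% With a suitable alpha the cost should not increase between iterations.
is_monotone = all(diff(J_history) <= 1e-12);
fprintf('Cost never increases: %d\n', is_monotone);
fprintf('First cost: %.4f, last cost: %.4f\n', J_history(1), J_history(end));
```

If `is_monotone` comes out false on your own data, a learning rate that is too large is a common cause.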
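When you are finished, ex1.m will use the final parameters to plot the linear fit. The result should look like the figure below:
In the data, house sizes are about 1000 times the number of bedrooms. When features differ by orders of magnitude, performing feature scaling first can make gradient descent converge much more quickly. Several ways to scale features are covered in [Feature Scaling] (Section 4.3).
Your task is to complete featureNormalize.m: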
function [X_norm, mu, sigma] = featureNormalize(X)
%FEATURENORMALIZE Normalizes the features in X
%   FEATURENORMALIZE(X) returns a normalized version of X where
%   the mean value of each feature is 0 and the standard deviation
%   is 1. This is often a good preprocessing step to do when
%   working with learning algorithms.

% You need to set these values correctly
X_norm = X;
mu = zeros(1, size(X, 2));
sigma = zeros(1, size(X, 2));

% ====================== YOUR CODE HERE ======================
% Instructions: First, for each feature dimension, compute the mean
%               of the feature and subtract it from the dataset,
%               storing the mean value in mu. Next, compute the
%               standard deviation of each feature and divide
%               each feature by its standard deviation, storing
%               the standard deviation in sigma.
%
%               Note that X is a matrix where each column is a
%               feature and each row is an example. You need
%               to perform the normalization separately for
%               each feature.
%
% Hint: You might find the 'mean' and 'std' functions useful.
%
mu = mean(X_norm);     % row vector of per-column means
sigma = std(X_norm);   % row vector of per-column standard deviations
% This dataset has exactly two features, so each column is scaled explicitly.
X_norm(:,1) = (X_norm(:,1) - mu(1)) ./ sigma(1);
X_norm(:,2) = (X_norm(:,2) - mu(2)) ./ sigma(2);
% ============================================================
end
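The solution above handles the two columns one at a time. As a hedged alternative, the sketch below normalizes any number of columns at once with `bsxfun`, assuming no feature is constant (a zero standard deviation would cause division by zero). The sample values and the name `x_new` are invented for illustration.

```octave
% Hypothetical feature matrix: column 1 ~ house size, column 2 ~ bedrooms.
X = [2104 3; 1600 3; 2400 4; 1416 2; 3000 4];

mu = mean(X);      % 1 x n row vector of column means
sigma = std(X);    % 1 x n row vector of column standard deviations

% Subtract the mean from every row, then divide every row by sigma.
X_norm = bsxfun(@rdivide, bsxfun(@minus, X, mu), sigma);

% A new example must be scaled with the SAME mu and sigma before prediction.
x_new = [1650 3];
x_new_norm = (x_new - mu) ./ sigma;

disp(mean(X_norm));   % each entry should be close to 0
disp(std(X_norm));    % each entry should be close to 1
```

Storing `mu` and `sigma` matters because a prediction for a new house is only meaningful if its features are scaled in the same way as the training data.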
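The cost function and gradient descent for univariate linear regression were completed earlier. The multivariate versions are essentially the same; the only difference is that the data matrix X contains one more feature.
Note: in the multivariate case, the cost function can also be written in the following vectorized form, which makes the computation more efficient:
$J(\theta) = \frac{1}{2m} (X\theta - y)^T (X\theta - y)$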
function J = computeCostMulti(X, y, theta)
%COMPUTECOSTMULTI Compute cost for linear regression with multiple variables
%   J = COMPUTECOSTMULTI(X, y, theta) computes the cost of using theta as the
%   parameter for linear regression to fit the data points in X and y

% Initialize some useful values
m = length(y); % number of training examples

% You need to return the following variables correctly
J = 0;

% ====================== YOUR CODE HERE ======================
% Instructions: Compute the cost of a particular choice of theta
%               You should set J to the cost.
J = (1/(2*m)) * (X*theta - y)' * (X*theta - y);  % vectorized form, including the 1/(2m) factor
% Or, equivalently:
% J = (1/(2*m)) * sum(((X*theta) - y).^2);
% =========================================================================
end
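The two expressions for the cost used in the code above give the same value. A brief justification, writing $e = X\theta - y$ for the vector of prediction errors (a symbol introduced here only for the derivation):

```latex
\begin{aligned}
e^T e &= \sum_{i=1}^{m} e_i^2
       = \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2 \\
J(\theta) &= \frac{1}{2m} \, e^T e
           = \frac{1}{2m} (X\theta - y)^T (X\theta - y)
\end{aligned}
```

Note that the factor $\frac{1}{2m}$ belongs to both forms; omitting it from the inner-product version changes the reported cost by a factor of $2m$.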
function [theta, J_history] = gradientDescentMulti(X, y, theta, alpha, num_iters)
%GRADIENTDESCENTMULTI Performs gradient descent to learn theta
%   theta = GRADIENTDESCENTMULTI(x, y, theta, alpha, num_iters) updates theta by
%   taking num_iters gradient steps with learning rate alpha

% Initialize some useful values
m = length(y); % number of training examples
J_history = zeros(num_iters, 1);

for iter = 1:num_iters
    % ====================== YOUR CODE HERE ======================
    % Instructions: Perform a single gradient step on the parameter vector
    %               theta.
    %
    % Hint: While debugging, it can be useful to print out the values
    %       of the cost function (computeCostMulti) and gradient here.
    %
    theta = theta - (alpha/m)*(X')*(X*theta - y);
    % ============================================================

    % Save the cost J in every iteration
    J_history(iter) = computeCostMulti(X, y, theta);
end
end
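Instead of iterating, we can also solve for θ with the normal equation:
$\theta = (X^T X)^{-1} X^T y$
Using this formula requires no feature scaling, and it gives an exact solution in a single computation.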
function [theta] = normalEqn(X, y)
%NORMALEQN Computes the closed-form solution to linear regression
%   NORMALEQN(X,y) computes the closed-form solution to linear
%   regression using the normal equations.

theta = zeros(size(X, 2), 1);

% ====================== YOUR CODE HERE ======================
% Instructions: Complete the code to compute the closed form solution
%               to linear regression and put the result in theta.
%
% ---------------------- Sample Solution ----------------------
theta = pinv(X'*X)*(X'*y);  % pinv also copes with a singular X'*X
% -------------------------------------------------------------
% ============================================================
end
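To see the closed-form solution at work, the sketch below applies the same `pinv` expression to a small made-up dataset that lies exactly on the line y = 1 + 2x, so the recovered parameters should be close to [1; 2]. The data and the name `prediction` are assumptions for illustration only.

```octave
% Toy data (hypothetical): points that lie exactly on y = 1 + 2*x.
x = [1; 2; 3; 4];
X = [ones(length(x), 1), x];   % add the intercept column of ones
y = 1 + 2*x;

% Closed-form least-squares solution; no learning rate, no iterations,
% and no feature scaling required.
theta = pinv(X' * X) * (X' * y);

% Predict for a new input x = 5 (remember the leading 1 for the intercept).
prediction = [1, 5] * theta;

fprintf('theta = [%.4f; %.4f], prediction at x = 5: %.4f\n', ...
        theta(1), theta(2), prediction);
```

On larger problems, parameters found by gradient descent on normalized features should give predictions close to those from the normal equation, which makes the two methods a useful cross-check for each other.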