"A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E."app
Example: playing checkers.less
E = the experience of playing many games of checkerside
T = the task of playing checkers.函數
P = the probability that the program will win the next game.學習
In general, any machine learning problem can be assigned to one of two broad classifications:優化
supervised learning, ORthis
unsupervised learning.idea
Definition of Supervised Learning
In supervised learning, we are given a data set and already know what our correct output should look like, having the idea that there is a relationship between the input and the output.
Example: given data about the size of houses on the real estate market, try to predict their price. Price as a function of size is a continuous output, so this is a regression problem.
We could turn this example into a classification problem by instead making our output about whether the house "sells for more or less than the asking price." Here we are classifying the houses based on price into two discrete categories.
Definition of Unsupervised Learning
Unsupervised learning, on the other hand, allows us to approach problems with little or no idea what our results should look like. We can derive structure from data where we don't necessarily know the effect of the variables.
We can derive this structure by clustering the data based on relationships among the variables in the data.
With unsupervised learning there is no feedback based on the prediction results, i.e., there is no teacher to correct you.
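To make this concrete, below is a minimal clustering sketch. The two-dimensional points and the choice of two clusters are invented for the example, and the k-means routine is a naive hand-rolled version rather than a library call:

```python
import numpy as np

def kmeans(points, k, iters=100):
    """Naive k-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its assigned points."""
    rng = np.random.default_rng(0)
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # distance of every point to every centroid
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # recompute centroids; keep the old one if a cluster went empty
        centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
    return labels, centroids

# two made-up blobs; no labels are given, the structure is discovered
data = np.array([[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],
                 [5.0, 5.2], [5.1, 4.9], [4.8, 5.0]])
labels, centroids = kmeans(data, k=2)
print(labels, centroids)
```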
Definition of Linear Regression with One Variable
Recall that in regression problems, we are taking input variables and trying to fit the output onto a continuous expected result function.
Linear regression with one variable is also known as "univariate linear regression."
Univariate linear regression is used when you want to predict a single output value y from a single input value x. We're doing supervised learning here, so that means we already have an idea about what the input/output cause and effect should be.
Definition of the Hypothesis Function
$$\hat{y} = h_\theta(x) = \theta_0 + \theta_1 x$$
| input x | output y |
| --- | --- |
| 0 | 4 |
| 1 | 7 |
| 2 | 7 |
| 3 | 8 |
Fitting these (x, y) pairs with a line y = kx + b (a first-degree equation in one variable), the values of k and b are actually solved by the method of least squares; substituting k and b back into the hypothesis function gives the equation that linear regression has fitted. Does this remind you of the normal distribution from the high-school textbook? Indeed, this fit is only the first step: next you compute the variance of the hypothesis function's errors, and substituting that variance into the Gaussian function gives the normal distribution.
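As a sketch, here is the least-squares fit of k and b for the small table above using numpy; `np.polyfit` with degree 1 solves exactly this least-squares problem:

```python
import numpy as np

x = np.array([0, 1, 2, 3])
y = np.array([4, 7, 7, 8])

# degree-1 least-squares fit: returns [k, b] for y = k*x + b
k, b = np.polyfit(x, y, 1)
print(k, b)         # 1.2 and 4.7 for this data
print(k * 1.5 + b)  # the hypothesis h(x) evaluated at a new input x = 1.5
```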
Treat k and b as a 2×1 matrix, and treat the input as a 1×2 matrix whose second column is 1; multiplying the two matrices yields the hypothesis function:

$$\begin{bmatrix} x & 1 \end{bmatrix} \begin{bmatrix} k \\ b \end{bmatrix} = kx + b$$
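The same matrix form in numpy, as a minimal sketch: stacking a column of ones next to x lets a single matrix product evaluate the hypothesis for every input at once (the theta values are taken from the fit above).

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
theta = np.array([1.2, 4.7])  # [k, b] from the least-squares fit

# design matrix: each row is [x_i, 1], so X @ theta = k*x + b row by row
X = np.column_stack([x, np.ones_like(x)])
print(X @ theta)              # h(x) for all four inputs at once
```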
The cost function is given below; it is really just the least-squares criterion refined by a factor of 1/2m:

$$J(\theta_0, \theta_1) = \frac{1}{2m} \sum_{i=1}^{m} \left( h_\theta(x^{(i)}) - y^{(i)} \right)^2$$

The cost function looks very similar to the variance formula, but it is not the same. People have also asked why the square root is not taken: the square root would add computational cost, so the squared form is optimized instead.
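A direct translation of J into code, as a sketch using the same x, y, and theta = [k, b] as above:

```python
import numpy as np

def cost(theta, x, y):
    """J(theta) = 1/(2m) * sum((h(x_i) - y_i)^2)."""
    m = len(y)
    predictions = theta[0] * x + theta[1]  # h(x) = k*x + b
    return np.sum((predictions - y) ** 2) / (2 * m)

x = np.array([0, 1, 2, 3])
y = np.array([4, 7, 7, 8])
print(cost(np.array([1.2, 4.7]), x, y))   # cost at the least-squares solution
```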
Gradient descent repeatedly takes the slope (the derivative) of the cost function. Ignoring the hypothesis function's b, the cost function behaves like a bowl-shaped convex function, so it has a single global minimum; by repeatedly stepping down the slope we arrive at the cost function's minimum, and when the cost function is at its minimum its slope is 0.
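A minimal sketch of that loop, applying the simultaneous update θ_j := θ_j − α · ∂J/∂θ_j at each step; the learning rate alpha and the iteration count are arbitrary choices for this example:

```python
import numpy as np

def gradient_descent(x, y, alpha=0.1, iters=1000):
    """Minimize J(k, b) by repeatedly stepping opposite the gradient."""
    m = len(y)
    k, b = 0.0, 0.0                     # start anywhere; J is convex
    for _ in range(iters):
        error = (k * x + b) - y         # h(x_i) - y_i for every example
        grad_k = (error * x).sum() / m  # dJ/dk
        grad_b = error.sum() / m        # dJ/db
        k -= alpha * grad_k             # simultaneous update of both parameters
        b -= alpha * grad_b
    return k, b

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([4.0, 7.0, 7.0, 8.0])
print(gradient_descent(x, y))           # converges toward (1.2, 4.7)
```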