A kernel function K is a function of the form K(x, y) = <f(x), f(y)>, where x and y are n-dimensional inputs, f(·) is a mapping from n dimensions to m dimensions (usually m >> n), and <x, y> is the inner product (also called the dot product) of x and y.
Here's a tiny example.
Let x = (x1, x2, x3, x4) and y = (y1, y2, y3, y4);
let f(x) = (x1x1, x1x2, x1x3, x1x4, x2x1, x2x2, x2x3, x2x4, x3x1, x3x2, x3x3, x3x4, x4x1, x4x2, x4x3, x4x4), and likewise for f(y);
let the kernel function be K(x, y) = (<x, y>)^2.
Now let's plug in some simple numbers and see what happens: x = (1, 2, 3, 4), y = (5, 6, 7, 8). Then:
f(x) = ( 1, 2, 3, 4, 2, 4, 6, 8, 3, 6, 9, 12, 4, 8, 12, 16) ;
f(y) = (25, 30, 35, 40, 30, 36, 42, 48, 35, 42, 49, 56, 40, 48, 56, 64) ;
<f(x), f(y)> = 25+60+105+160+60+144+252+384+105+252+441+672+160+384+672+1024
= 4900.
And what if we use the kernel function instead?
K(x, y) = (5+12+21+32)^2 = 70^2 = 4900.
That's it!
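To make this concrete, here is a minimal NumPy sketch (my own addition, not part of the original post) that reproduces the numbers above by computing both sides:

```python
import numpy as np

def f(v):
    # explicit feature map: all 16 pairwise products v_i * v_j
    return np.outer(v, v).ravel()

x = np.array([1, 2, 3, 4])
y = np.array([5, 6, 7, 8])

print(np.dot(f(x), f(y)))   # 4900, computed in the 16-dimensional feature space
print(np.dot(x, y) ** 2)    # 4900, computed directly with K(x, y) = (<x, y>)^2
```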
So now you can see it: the kernel is really a "shortcut" that spares us the tedious computation in the high-dimensional space. It can even handle problems that cannot be computed explicitly at all, because sometimes f(·) maps the n-dimensional space into an infinite-dimensional space.
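The Gaussian (RBF) kernel is the classic case: its implicit feature map is infinite-dimensional, yet evaluating the kernel only touches the two n-dimensional inputs. A small sketch (the bandwidth sigma = 1.0 is an arbitrary choice for illustration):

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    # K(x, y) = exp(-||x - y||^2 / (2 * sigma^2)); the implicit f(.) is infinite-dimensional
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([5.0, 6.0, 7.0, 8.0])
print(rbf_kernel(x, y))  # a similarity in (0, 1]; f(x) is never formed explicitly
```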
So what role does the kernel actually play in an SVM?
Beginners often misread the kernel as the thing that projects points from the low-dimensional space into a higher-dimensional space and thereby makes them linearly separable. That is not the case: it conflates the kernel with the feature space transformation. (This mistake is actually pretty silly; if you derive the SVM carefully from start to finish, you won't make the mistake I made.)
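The reason is that in the SVM dual formulation the training data enter only through inner products <f(x_i), f(x_j)>, so the kernel's only job is to supply those inner products; the mapping itself is never computed. A hedged sketch of this using scikit-learn's precomputed-kernel interface (the toy data and the polynomial kernel here are my own choices, not from the original answer):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
X = rng.randn(40, 2)
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1).astype(int)   # ring-shaped labels, not linearly separable in 2-d

def poly_kernel(A, B, degree=2):
    # K(a, b) = (<a, b> + 1)^degree -- the SVM only ever sees these values
    return (A @ B.T + 1) ** degree

K_train = poly_kernel(X, X)                   # Gram matrix K[i, j] = K(x_i, x_j)
clf = SVC(kernel="precomputed").fit(K_train, y)
print(clf.score(K_train, y))                  # trained and evaluated from kernel values alone
```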
Once the data have been mapped into the new space and we successfully find the separating boundary there, that is the most intuitive picture of the kernel!
This may not be entirely rigorous, but it is roughly what the kernel is about; the detailed mathematical definition has already been explained well in the answers above, so I won't repeat it.
To quote the professor of this course:
"Over your lifetime you may go through many changes and may even turn into a completely different person, yet there is only one you in this world. How could I separate the different versions of 'you'? The most intuitive way is to add 'time' as an extra dimension. Although there is only one you on this earth, and that you cannot be split apart, 'the you who was in China yesterday' and 'the you who is in the US today' can indeed be separated along the dimensions of time plus space."
We know that everything in the world can be decomposed into a combination of basic elements. For example, water is a combination of hydrogen and oxygen. Similarly, in mathematics a basis is used to represent all kinds of objects in a simple and unified way.
In R^n, any vector can be represented as a linear combination of n linearly independent vectors, so those n vectors can be viewed as a basis. There are infinitely many bases of R^n. Among them, bases whose vectors are mutually orthogonal are of special interest. For example, {e_i}_{i=1}^n is a special basis of mutually orthogonal vectors of the same length, where e_i is the vector whose entries are all zero except the i-th entry, which equals 1.
The inner product measures the similarity between vectors: for two vectors x and y, the inner product gives the projection of one vector onto the other.
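A tiny numerical illustration of these two ideas (not in the cited post): coordinates are inner products with an orthonormal basis, and the inner product gives a projection length.

```python
import numpy as np

e = np.eye(3)                                 # standard orthonormal basis e_1, e_2, e_3 of R^3
x = np.array([2.0, -1.0, 3.0])
coords = np.array([np.dot(x, e[i]) for i in range(3)])
print(coords)                                 # [ 2. -1.  3.] -- x recovered coordinate by coordinate

y = np.array([1.0, 1.0, 0.0])
print(np.dot(x, y) / np.linalg.norm(y))       # length of the projection of x onto y
```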
A function f(x) can be viewed as an infinite-dimensional vector, and a function of two variables K(x, y) can then be viewed as an infinite matrix. If K(x, y) = K(y, x) and
∫∫ f(x) K(x, y) f(y) dx dy >= 0
for any function f, then K(x, y) is symmetric and positive definite, in which case K(x, y) is a kernel function.
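As a sanity check of this condition in the finite-sample case (my sketch, not from the cited post): for a valid kernel, any Gram matrix built from a set of points should be symmetric with no negative eigenvalues.

```python
import numpy as np

def rbf(a, b, sigma=1.0):
    return np.exp(-np.sum((a - b) ** 2) / (2 * sigma ** 2))

X = np.random.RandomState(1).randn(20, 3)                 # 20 arbitrary points in R^3
G = np.array([[rbf(a, b) for b in X] for a in X])          # Gram matrix G[i, j] = K(x_i, x_j)

print(np.allclose(G, G.T))                                 # True: symmetric
print(np.linalg.eigvalsh(G).min() >= -1e-10)               # True: eigenvalues nonnegative (up to rounding)
```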
Here are some commonly used kernels:
Linear kernel: K(x, y) = <x, y>
Polynomial kernel: K(x, y) = (<x, y> + c)^d
Gaussian (radial basis function) kernel: K(x, y) = exp(-||x - y||^2 / (2*sigma^2))
Sigmoid kernel: K(x, y) = tanh(a<x, y> + b)
The hyperbolic tangent used in the sigmoid kernel is tanh(t) = (e^t - e^(-t)) / (e^t + e^(-t)).
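For reference, a compact sketch of these kernels as code (the parameter defaults c, d, sigma, a, b below are placeholders of my own, not values from the original post):

```python
import numpy as np

def linear(x, y):
    return np.dot(x, y)

def polynomial(x, y, c=1.0, d=2):
    return (np.dot(x, y) + c) ** d

def gaussian(x, y, sigma=1.0):
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

def sigmoid(x, y, a=1.0, b=0.0):
    return np.tanh(a * np.dot(x, y) + b)
```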
Treat {√λ_i ψ_i}_{i=1}^∞ as a set of orthogonal basis vectors (here λ_i and ψ_i are the eigenvalues and eigenfunctions of K, i.e. ∫ K(x, y) ψ_i(x) dx = λ_i ψ_i(y)) and use it to construct a Hilbert space H. Any function or vector in the space can be represented as a linear combination of the basis. Suppose f = Σ_{i=1}^∞ f_i √λ_i ψ_i; then we can denote f as an infinite vector in H: f = (f_1, f_2, ...)_H^T. For another function g = (g_1, g_2, ...)_H^T, we have <f, g>_H = Σ_{i=1}^∞ f_i g_i.
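The construction continues to the reproducing property, which is the punchline; a brief sketch of the standard argument (my summary of standard RKHS material, not verbatim from the cited post):

```latex
% K(x, \cdot) itself lies in H, with coordinates \sqrt{\lambda_i}\,\psi_i(x) in the basis above:
K(x, \cdot) = \sum_{i=1}^{\infty} \lambda_i \psi_i(x)\,\psi_i
            = \sum_{i=1}^{\infty} \bigl(\sqrt{\lambda_i}\,\psi_i(x)\bigr)\,\sqrt{\lambda_i}\,\psi_i
% so the inner product of two such elements reproduces the kernel:
\langle K(x, \cdot),\, K(y, \cdot) \rangle_H
    = \sum_{i=1}^{\infty} \lambda_i \psi_i(x)\,\psi_i(y) = K(x, y)
```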
References:
[1] 機器學習裏的kernel是指什麼? (What does "kernel" mean in machine learning?) - 知乎. http://www.zhihu.com/question/30371867 [accessed 2016-09-06]
[2] Story of basis and kernel, part 1. http://songcy.net/posts/story-of-basis-and-kernel-part-1/
[3] Story of basis and kernel, part 2. http://songcy.net/posts/story-of-basis-and-kernel-part-2/