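As the saying goes, three cobblers with their wits combined equal Zhuge Liang. Boosting does roughly this. You can also think of it as a brilliant manager who makes full use of each employee's strengths and puts everyone's abilities to good use.
Put simply, boosting combines many different classifiers, each with its own weight, into a new strong classifier that does the classification.
I'm writing down my study notes here so that I can refer back to them when I use this algorithm in the future.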
How do we obtain the rough rules of thumb?
How do we combine these rules of thumb into one good rule?
What is the procedure for choosing examples in each round?
a) What is the procedure for choosing examples in each round?
We focus on the hardest examples, that is, the examples that the previous rules of thumb misclassified.
b) How do we combine these rules of thumb into one good rule?
Take a majority vote or a weighted majority vote.
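Choose a different subset of examples in each round by constructing a distribution D_t over the examples.
Choosing a subset and constructing a distribution are the same thing.
The weights in D_t indicate how much we concentrate on each example in a particular round of boosting.
Combine all of the h_t into a classifier H_final.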
Q:
1. How do we construct the distribution D_t?
2. How do we combine all of the h_t into a classifier H_final?
AdaBoost answers both of these questions.
A:
1. In the very first round we don't have any information, so we use uniform weights.
In the following rounds we try to focus on the incorrectly classified examples: we cut the weights of the examples that were classified correctly, so that after normalization the misclassified examples carry relatively more weight.
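The reweighting described above can be sketched in a few lines of Python. This is only a minimal sketch: it assumes the standard AdaBoost exponential update, which these notes do not spell out, and the function name `update_distribution` and the ten-example setup are made up for illustration.

```python
import math

def update_distribution(weights, correct, alpha):
    """Return D_{t+1} given D_t.

    weights: list of floats summing to 1 (the current distribution D_t)
    correct: list of bools, True where h_t classified example i correctly
    alpha:   vote weight of h_t (assumed positive)
    """
    # Assumed standard AdaBoost rule: shrink correctly classified examples
    # by exp(-alpha) and grow misclassified ones by exp(+alpha).
    scaled = [w * math.exp(-alpha if ok else alpha)
              for w, ok in zip(weights, correct)]
    z = sum(scaled)  # normalization constant so the result sums to 1
    return [s / z for s in scaled]

n = 10
d1 = [1.0 / n] * n                        # round 1: uniform weights
correct = [True] * 7 + [False] * 3        # hypothetical: 3 of 10 misclassified
alpha1 = 0.5 * math.log((1 - 0.3) / 0.3)  # vote weight for error rate 0.3
d2 = update_distribution(d1, correct, alpha1)
print(d2[0], d2[9])  # a correctly classified vs. a misclassified example
```

With this particular choice of alpha, the misclassified examples end up holding half of the total weight in the next round, which is what forces the next weak rule to pay attention to them.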
2. H_final(x) = sign(sum_t alpha_t * h_t(x))
This formula is the weighted vote of the weak rules of thumb.
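As a rough illustration of such a weighted vote, here is a small Python sketch. It assumes labels are +1 and -1 and that each weak rule is a plain function; the three threshold rules and their vote weights are invented for the example.

```python
def h_final(x, weak_rules, alphas):
    """Weighted majority vote: sign of sum_t alpha_t * h_t(x)."""
    total = sum(a * h(x) for h, a in zip(weak_rules, alphas))
    return 1 if total >= 0 else -1  # ties are broken toward +1 (an arbitrary choice)

# Three hypothetical weak rules on a single number x.
rules = [
    lambda x: 1 if x > 2 else -1,
    lambda x: 1 if x > 5 else -1,
    lambda x: -1 if x > 8 else 1,
]
alphas = [0.42, 0.65, 0.92]  # hypothetical vote weights

for x in (1, 3, 9):
    print(x, h_final(x, rules, alphas))
```

Note that at x = 1 two of the three rules vote -1 and win, while at x = 9 the single rule with the largest weight votes -1 and is outvoted by the other two, whose combined weight is larger.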
There are ten samples, and three of them are misclassified.
With uniform weights in the first round, epsilon_1 = 3 / 10 = 0.30.
alpha_t is larger when the classifier has a lower error rate.
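To make the relation between the error rate and alpha_t concrete, here is a minimal Python sketch. It assumes the usual AdaBoost formula alpha_t = 0.5 * ln((1 - epsilon_t) / epsilon_t), which is not written out in these notes; the function name `vote_weight` is made up.

```python
import math

def vote_weight(epsilon):
    """alpha_t = 0.5 * ln((1 - epsilon_t) / epsilon_t), for 0 < epsilon_t < 1."""
    return 0.5 * math.log((1.0 - epsilon) / epsilon)

epsilon_1 = 3 / 10                # three of ten samples misclassified
alpha_1 = vote_weight(epsilon_1)
print(round(alpha_1, 4))          # 0.4236

# A lower error rate gives a larger vote weight.
print(vote_weight(0.1) > vote_weight(0.3))  # True
```

A classifier with an error rate of exactly 0.5 gets a weight of zero under this formula, so a rule that is no better than guessing has no say in the final vote.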
The weak classifier is a black box, so you can use whatever method you have at hand.
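The black-box idea can be sketched as a boosting loop that takes the weak learner as a parameter. This is a minimal sketch and not the package linked below: it assumes labels of +1 and -1 and the standard AdaBoost formulas, and the names `adaboost`, `train_stump`, `make_stump` and the toy data are all made up for illustration.

```python
import math

def make_stump(thr, sign):
    """A one-dimensional threshold rule: predict sign if x > thr, else -sign."""
    return lambda x: sign if x > thr else -sign

def train_stump(xs, ys, weights):
    """Hypothetical weak learner: the threshold rule with the lowest weighted error."""
    best_err, best_rule = None, None
    for thr in sorted(set(xs)):
        for sign in (1, -1):
            rule = make_stump(thr, sign)
            err = sum(w for x, y, w in zip(xs, ys, weights) if rule(x) != y)
            if best_err is None or err < best_err:
                best_err, best_rule = err, rule
    return best_rule

def adaboost(xs, ys, weak_learner, rounds):
    """Boost any weak_learner(xs, ys, weights) that returns a function h(x) in {+1, -1}."""
    n = len(xs)
    weights = [1.0 / n] * n                 # round 1: uniform distribution
    rules, alphas = [], []
    for _ in range(rounds):
        h = weak_learner(xs, ys, weights)   # the black box
        eps = sum(w for x, y, w in zip(xs, ys, weights) if h(x) != y)
        if eps >= 0.5:
            break                           # no better than chance, so stop
        eps = max(eps, 1e-10)               # guard against a perfect rule
        alpha = 0.5 * math.log((1 - eps) / eps)
        rules.append(h)
        alphas.append(alpha)
        # Reweight: y * h(x) is +1 when correct and -1 when wrong.
        weights = [w * math.exp(-alpha * y * h(x))
                   for x, y, w in zip(xs, ys, weights)]
        z = sum(weights)
        weights = [w / z for w in weights]

    def h_final(x):
        total = sum(a * h(x) for h, a in zip(rules, alphas))
        return 1 if total >= 0 else -1
    return h_final

# Toy data that no single threshold rule can separate.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

H = adaboost(xs, ys, train_stump, rounds=3)
errors = sum(H(x) != y for x, y in zip(xs, ys))
print(errors)  # 0 on this toy data
```

Swapping `train_stump` for any other function with the same signature leaves `adaboost` unchanged, which is the point of treating the weak classifier as a black box.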
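Finally, here is an AdaBoost package (or rather a class) that I implemented myself in Python, for reference. If you find any mistakes, please don't hesitate to point them out.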
https://gitcafe.com/NeighborhoodGuo/Ada_boosting.git
Reference:
2.Boosting resources collected by Lyon