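scikit-learn, also written sklearn, is an open-source machine learning toolkit for Python. It builds on
Python numerical libraries such as NumPy, SciPy and Matplotlib to provide efficient implementations, and it covers almost all mainstream machine learning algorithms.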
http://scikit-learn.org/stable/index.html
https://sklearn.apachecn.org/
Install the required packages:
pip install numpy pandas matplotlib scikit-learn graphviz scipy jupyter
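One thing the pip command does not cover: the graphviz package on PyPI is only a Python wrapper, and drawing the tree diagram also needs the Graphviz program itself installed on your system. A quick check, assuming the executable is named dot and is expected to be on PATH:

```python
import shutil

# Look for the Graphviz "dot" executable on PATH.
# pip only installs the Python wrapper, not this program.
dot_path = shutil.which("dot")

if dot_path is None:
    print("Graphviz 'dot' not found; install Graphviz itself to render the tree diagram")
else:
    print("Graphviz found at", dot_path)
```

If the check fails, install Graphviz through your operating system's package manager or the official installer, then reopen the terminal.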
This example runs in Jupyter; just copy it into a Jupyter notebook and run it.
# -*- coding:utf-8 -*-
from sklearn import tree
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
import pandas as pd
import graphviz

wine = load_wine()
print(wine.data.shape)
print(wine.target)

# If wine were a single table, it would look like this
# (Jupyter only displays it when it is the last expression in a cell):
pd.concat([pd.DataFrame(wine.data), pd.DataFrame(wine.target)], axis=1)

print(wine.feature_names)
print(wine.target_names)

# Hold out 30% of the samples for testing
Xtrain, Xtest, Ytrain, Ytest = train_test_split(wine.data, wine.target, test_size=0.3)
print(Xtrain.shape)
print(Xtest.shape)

clf = tree.DecisionTreeClassifier(criterion="entropy")
clf = clf.fit(Xtrain, Ytrain)
score = clf.score(Xtest, Ytest)  # returns the prediction accuracy
print(score)

# Display labels for the 13 features, in the same order as wine.feature_names
feature_name = ['alcohol', 'malic acid', 'ash', 'alcalinity of ash', 'magnesium',
                'total phenols', 'flavanoids', 'nonflavanoid phenols', 'proanthocyanins',
                'color intensity', 'hue', 'OD280/OD315 of diluted wines', 'proline']

dot_data = tree.export_graphviz(clf,
                                feature_names=feature_name,
                                class_names=["Gin", "Sherry", "Vermouth"],
                                filled=True,
                                rounded=True)
graph = graphviz.Source(dot_data)
graph
Output (the accuracy on the last line varies between runs, because the train/test split is random):
(178, 13)
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2]
['alcohol', 'malic_acid', 'ash', 'alcalinity_of_ash', 'magnesium', 'total_phenols', 'flavanoids', 'nonflavanoid_phenols', 'proanthocyanins', 'color_intensity', 'hue', 'od280/od315_of_diluted_wines', 'proline']
['class_0' 'class_1' 'class_2']
(124, 13)
(54, 13)
0.9629629629629629
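The shapes follow from the 70/30 split of the 178 samples (124 for training, 54 for testing), but the score depends on which samples land in each part. A minimal sketch of one way to make a run repeatable, assuming you are free to pass a fixed seed (the value 42 and the helper name fit_and_score are arbitrary choices for illustration, not part of the original example). It also prints feature_importances_ to show which features the tree relied on:

```python
from sklearn import tree
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

wine = load_wine()


def fit_and_score(seed):
    """Split, train and score with a fixed seed (hypothetical helper)."""
    Xtrain, Xtest, Ytrain, Ytest = train_test_split(
        wine.data, wine.target, test_size=0.3, random_state=seed
    )
    clf = tree.DecisionTreeClassifier(criterion="entropy", random_state=seed)
    clf.fit(Xtrain, Ytrain)
    return clf, Xtrain.shape, Xtest.shape, clf.score(Xtest, Ytest)


clf, train_shape, test_shape, score = fit_and_score(42)
print(train_shape, test_shape)  # sizes of the training and test parts
print(score)                    # same value on every run with the same seed

# Rank features by how much each one contributed to the tree's splits
ranked = sorted(
    zip(wine.feature_names, clf.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

A different seed gives a different split and usually a slightly different score, so a single number like the one above is best read as one sample, not as the model's exact accuracy.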
If you don't have Jupyter, see: http://www.javashuo.com/article/p-nnxitinc-ng.html
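Outside a notebook, a bare graph expression displays nothing. As a rough alternative that needs neither Jupyter nor Graphviz, scikit-learn's tree.export_text can print the fitted tree as indented text. This sketch is not taken from the linked article. For brevity it trains on the full dataset, and random_state=0 is an arbitrary choice:

```python
from sklearn import tree
from sklearn.datasets import load_wine

wine = load_wine()

# Train on all samples; this is only to have a fitted tree to print
clf = tree.DecisionTreeClassifier(criterion="entropy", random_state=0)
clf.fit(wine.data, wine.target)

# Plain-text view of the tree: one line per split, leaves show the predicted class
text_view = tree.export_text(clf, feature_names=list(wine.feature_names))
print(text_view)
```

If the Graphviz program is installed, the graphviz package's Source.render method can also write the diagram to an image file from a plain Python script.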
Machine learning can't do without it, hehe!