第一個機器學習scikit-learn可視化例子

時間 2020-03-19

標籤第一個機器學習 scikit learn 可視化例子简体版

原文原文鏈接

scikit-learn，又寫做sklearn，是一個開源的基於python語言的機器學習工具包。它經過NumPy, SciPy和
Matplotlib等python數值計算的庫實現高效的算法應用，而且涵蓋了幾乎全部主流機器學習算法。
http://scikit-learn.org/stable/index.htmlhtml

https://sklearn.apachecn.org/python

安裝必要的包：算法

pip install numpy pandas matplotlib scikit-learn  graphviz  scipy jupyter

本例在jupyter裏運行，直接複製到jupyter裏運行便可。apache

# -*- coding:utf-8 -*-
from sklearn import tree
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split

wine = load_wine()
print(wine.data.shape)
print(wine.target)
#若是wine是一張表，應該長這樣：
import pandas as pd
pd.concat([pd.DataFrame(wine.data),pd.DataFrame(wine.target)],axis=1)
print(wine.feature_names)
print(wine.target_names)
Xtrain, Xtest, Ytrain, Ytest = train_test_split(wine.data,wine.target,test_size=0.3)
print(Xtrain.shape)
print(Xtest.shape)

clf = tree.DecisionTreeClassifier(criterion="entropy")
clf = clf.fit(Xtrain, Ytrain)
score = clf.score(Xtest, Ytest) #返回預測的準確度
print(score)

feature_name = ['酒精','蘋果酸','灰','灰的鹼性','鎂','總酚','類黃酮','非黃烷類酚類','花青素','顏色強度','色調','od280/od315稀釋葡萄酒','脯氨酸']

import graphviz
dot_data = tree.export_graphviz(clf
                               ,feature_names= feature_name
                               ,class_names=["琴酒","雪莉","貝爾摩德"]
                               ,filled=True
                               ,rounded=True
                               )
graph = graphviz.Source(dot_data)
graph

運行結果：機器學習

(178, 13)
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2]
['alcohol', 'malic_acid', 'ash', 'alcalinity_of_ash', 'magnesium', 'total_phenols', 'flavanoids', 'nonflavanoid_phenols', 'proanthocyanins', 'color_intensity', 'hue', 'od280/od315_of_diluted_wines', 'proline']
['class_0' 'class_1' 'class_2']
(124, 13)
(54, 13)
0.9629629629629629