ML - Neural Networks

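Main topics covered:

The past and present of neural networks
How neural networks work, and nonlinear rectification
Tuning neural network model parameters
Training a handwritten digit recognition model with a neural network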

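I. The Past and Present of Neural Networks

Neural networks are nothing new. As early as 1943, the American neurophysiologist Warren McCulloch and the mathematician Walter Pitts proposed the first abstract model of a brain neuron, known as the M-P model (McCulloch-Pitts neuron, MCP).

1. The origin of neural networks

Neurons are interconnected nerve cells in the brain that process and transmit chemical and electrical signals. Interestingly, a neuron has two usual working states, excitation and inhibition, which closely mirrors the "1" and "0" of a computer. A neuron can therefore be described as a logic gate with a binary output: when incoming nerve impulses raise the membrane potential above a threshold, the cell becomes excited and sends an impulse out along its axon; when the incoming impulses leave the membrane potential below the threshold, the cell stays inhibited and emits no impulse.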
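The threshold behaviour described above can be sketched in a few lines of Python. This is only an illustration: the function name mcp_neuron, the unit weights, and the choice of "greater than or equal to" for the threshold test are assumptions made for the example, not part of the original M-P formulation as stated here.

```python
import numpy as np

def mcp_neuron(inputs, weights, threshold):
    """Return 1 (excited) if the weighted input sum reaches the threshold, else 0 (inhibited)."""
    total = np.dot(inputs, weights)
    return 1 if total >= threshold else 0

# Enumerate the four binary input pairs: (0,0), (0,1), (1,0), (1,1)
pairs = [(a, b) for a in (0, 1) for b in (0, 1)]

# With unit weights, a threshold of 2 behaves like AND and a threshold of 1 like OR
and_outputs = [mcp_neuron(p, [1, 1], threshold=2) for p in pairs]
or_outputs = [mcp_neuron(p, [1, 1], threshold=1) for p in pairs]

print(and_outputs)  # [0, 0, 0, 1]
print(or_outputs)   # [0, 1, 1, 1]
```

Changing only the threshold turns the same unit from one logic gate into another, which is why the neuron can be treated as a binary logic gate.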

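2. The godfather of neural networks: Geoffrey Hinton

Geoffrey Hinton and others put forward the backpropagation algorithm (BP), which solved the complex computations required by two-layer neural networks and set off a new wave of interest in the field.

II. How Neural Networks Work and How to Use Them

1. Nonlinear rectification in neural networks

Mathematically, if every hidden layer only computes a weighted sum, the result is no different from an ordinary linear model. To make the model more powerful than a linear model, one more processing step is needed.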
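The claim that purely linear hidden layers collapse into a single linear model can be checked numerically. The sketch below uses random matrices as stand-ins for layer weights (W1, W2 and the shapes are arbitrary choices for illustration, and bias terms are left out for brevity).

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))    # 5 samples with 3 features
W1 = rng.normal(size=(3, 4))   # weights of a hidden layer with 4 nodes
W2 = rng.normal(size=(4, 2))   # weights of an output layer with 2 nodes

# Two stacked weighted sums are equivalent to one weighted sum with weights W1 @ W2
two_linear_layers = (x @ W1) @ W2
single_linear_layer = x @ (W1 @ W2)
linear_layers_collapse = np.allclose(two_linear_layers, single_linear_layer)
print(linear_layers_collapse)  # True

# Applying relu between the layers breaks this equivalence
with_relu = np.maximum(x @ W1, 0) @ W2
relu_matches_linear = np.allclose(with_relu, single_linear_layer)
print(relu_matches_linear)
```

Only the version with a nonlinearity between the layers can represent something a single linear model cannot.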

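That step is to apply a nonlinearity to the output of each hidden layer: either a rectifying nonlinearity, known as relu (rectified linear unit), or the hyperbolic tangent (tangens hyperbolicus), known as tanh. A plot shows the two functions directly: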

# Import numpy
import numpy as np
# Import the plotting tool
import matplotlib.pyplot as plt

# Generate 200 evenly spaced values between -5 and 5
line=np.linspace(-5,5,200)

# Plot the two nonlinearities
plt.plot(line,np.tanh(line),label='tanh')
plt.plot(line,np.maximum(line,0),label='relu')

# Place the legend automatically
plt.legend(loc='best')

plt.xlabel('x')
plt.ylabel('relu(x) and tanh(x)')

plt.show()

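[Result analysis] The tanh function squashes the feature values x into the interval from -1 to 1: strongly negative values of x map close to -1 and large positive values map close to 1. The relu function simply discards every x below 0 and replaces it with 0. Both nonlinearities simplify the sample features so that the neural network can learn complex, nonlinear datasets.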

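2. Neural network parameter settings

# Import the MLP neural network
from sklearn.neural_network import MLPClassifier
# Import the wine dataset
from sklearn.datasets import load_wine
# Import the dataset splitting tool
from sklearn.model_selection import train_test_split

# Keep only the first two features so the result can be plotted in 2D
wine=load_wine()
X=wine.data[:,:2]
y=wine.target

# Split the dataset
X_train,X_test,y_train,y_test=train_test_split(X,y,random_state=0)

# Define and train the classifier
mlp=MLPClassifier(solver='lbfgs')
mlp.fit(X_train,y_train)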

MLPClassifier(activation='relu', alpha=0.0001, batch_size='auto', beta_1=0.9,
       beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=(100,), learning_rate='constant',
       learning_rate_init=0.001, max_iter=200, momentum=0.9,
       n_iter_no_change=10, nesterovs_momentum=True, power_t=0.5,
       random_state=None, shuffle=True, solver='lbfgs', tol=0.0001,
       validation_fraction=0.1, verbose=False, warm_start=False)

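Let's look at what the key parameters mean:

alpha plays the same role as alpha in linear models: it is the L2 penalty term that controls the strength of regularization. Its default value is 0.0001.

hidden_layer_sizes defaults to (100,), which means the model has a single hidden layer with 100 nodes. Setting hidden_layer_sizes to [10,10] gives the model two hidden layers with 10 nodes each.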
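One way to see what hidden_layer_sizes does is to inspect the shapes of the fitted weight matrices in the coefs_ attribute. The snippet below is a small sketch on the same two wine features; the variable name mlp_demo and the fixed random_state are choices made for this example only.

```python
from sklearn.datasets import load_wine
from sklearn.neural_network import MLPClassifier

wine = load_wine()
X, y = wine.data[:, :2], wine.target  # 2 input features, 3 classes

# Two hidden layers with 10 nodes each
mlp_demo = MLPClassifier(solver='lbfgs', hidden_layer_sizes=[10, 10], random_state=0)
mlp_demo.fit(X, y)

# One weight matrix per pair of consecutive layers: input->hidden1->hidden2->output
weight_shapes = [w.shape for w in mlp_demo.coefs_]
print(weight_shapes)  # [(2, 10), (10, 10), (10, 3)]
```

The first dimension of each matrix is the size of the layer feeding in and the second is the size of the layer receiving, so the list mirrors the network architecture.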

# Import the plotting tools
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

# Use differently colored regions for the different classes
cmap_light=ListedColormap(['#FFAAAA','#AAFFAA','#AAAAFF'])
cmap_bold=ListedColormap(['#FF0000','#00FF00','#0000FF'])

x_min,x_max=X_train[:,0].min()-1,X_train[:,0].max()+1
y_min,y_max=X_train[:,1].min()-1,X_train[:,1].max()+1

xx,yy=np.meshgrid(np.arange(x_min,x_max,.02),np.arange(y_min,y_max,.02))

Z=mlp.predict(np.c_[xx.ravel(),yy.ravel()])

Z=Z.reshape(xx.shape)

plt.figure()
plt.pcolormesh(xx,yy,Z,cmap=cmap_light)

# Draw the samples as a scatter plot
plt.scatter(X[:,0],X[:,1],c=y,edgecolor='k',s=60)
plt.xlim(xx.min(),xx.max())
plt.ylim(yy.min(),yy.max())

plt.title('MLPClassifier:solver=lbfgs')

plt.show()

Next, let's reduce the number of nodes in the hidden layer, say to 10, and see what happens.

# Set the number of nodes in the hidden layer to 10
mlp_10=MLPClassifier(solver='lbfgs',hidden_layer_sizes=[10])
mlp_10.fit(X_train,y_train)

Z10=mlp_10.predict(np.c_[xx.ravel(),yy.ravel()])

Z10=Z10.reshape(xx.shape)

plt.figure()
plt.pcolormesh(xx,yy,Z10,cmap=cmap_light)

# Draw the samples in X as a scatter plot
plt.scatter(X[:,0],X[:,1],c=y,edgecolor='k',s=60)
plt.xlim(xx.min(),xx.max())
plt.ylim(yy.min(),yy.max())

plt.title("MLPClassifier:nodes=10")

plt.show()

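[Result analysis] With the default relu activation, the number of nodes in a hidden layer determines the maximum number of straight-line segments in the decision boundary; the larger this number, the smoother the boundary looks. Besides adding nodes to a single hidden layer, there are two other ways to make the boundary finer: add more hidden layers, or change the activation parameter to tanh.

Now let's give the MLP classifier more hidden layers, increasing the count to 2.

# Use two hidden layers with 10 nodes each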
mlp_2L=MLPClassifier(solver='lbfgs',hidden_layer_sizes=[10,10])
mlp_2L.fit(X_train,y_train)

Z2L=mlp_2L.predict(np.c_[xx.ravel(),yy.ravel()])

Z2L=Z2L.reshape(xx.shape)

plt.figure()
plt.pcolormesh(xx,yy,Z2L,cmap=cmap_light)

# Draw the samples in X as a scatter plot
plt.scatter(X[:,0],X[:,1],c=y,edgecolor='k',s=60)
plt.xlim(xx.min(),xx.max())
plt.ylim(yy.min(),yy.max())

plt.title("MLPClassifier:layers=2")

plt.show()

Next, let's try activation='tanh'.

# Set the activation function to tanh
mlp_tanh=MLPClassifier(solver='lbfgs',hidden_layer_sizes=[10,10],activation='tanh')
mlp_tanh.fit(X_train,y_train)

Z2=mlp_tanh.predict(np.c_[xx.ravel(),yy.ravel()])

Z2=Z2.reshape(xx.shape)

plt.figure()
plt.pcolormesh(xx,yy,Z2,cmap=cmap_light)

# Draw the samples in X as a scatter plot
plt.scatter(X[:,0],X[:,1],c=y,edgecolor='k',s=60)
plt.xlim(xx.min(),xx.max())
plt.ylim(yy.min(),yy.max())

plt.title("MLPClassifier:layers=2 with tanh")

plt.show()

Adjust alpha to control model complexity:

# Change the model's alpha parameter
mlp_alpha=MLPClassifier(solver='lbfgs',hidden_layer_sizes=[10,10],activation='tanh',alpha=1)
mlp_alpha.fit(X_train,y_train)

Z3=mlp_alpha.predict(np.c_[xx.ravel(),yy.ravel()])

Z3=Z3.reshape(xx.shape)

plt.figure()
plt.pcolormesh(xx,yy,Z3,cmap=cmap_light)

# Draw the samples in X as a scatter plot
plt.scatter(X[:,0],X[:,1],c=y,edgecolor='k',s=60)
plt.xlim(xx.min(),xx.max())
plt.ylim(yy.min(),yy.max())

plt.title("MLPClassifier:alpha=1")

plt.show()

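So far we have four ways to adjust the complexity of the model: first, change the number of nodes in each hidden layer; second, change the number of hidden layers; third, change the activation function; fourth, change alpha to alter the strength of regularization.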
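The four settings tried above can also be compared by accuracy rather than by eye. The loop below is a rough sketch, not part of the original walkthrough: the configs dictionary, the fixed random_state=0, and the larger max_iter (added so that lbfgs has more room to converge) are all assumptions of this example.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

wine = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    wine.data[:, :2], wine.target, random_state=0)

# The same four variations used for the decision-boundary plots
configs = {
    'nodes=10': dict(hidden_layer_sizes=[10]),
    'layers=2': dict(hidden_layer_sizes=[10, 10]),
    'layers=2, tanh': dict(hidden_layer_sizes=[10, 10], activation='tanh'),
    'layers=2, tanh, alpha=1': dict(hidden_layer_sizes=[10, 10], activation='tanh', alpha=1),
}

scores = {}
for name, params in configs.items():
    model = MLPClassifier(solver='lbfgs', random_state=0, max_iter=1000, **params)
    model.fit(X_train, y_train)
    # Store (training accuracy, test accuracy) for each configuration
    scores[name] = (model.score(X_train, y_train), model.score(X_test, y_test))
    print('{:<26} train={:.3f} test={:.3f}'.format(name, *scores[name]))
```

A large gap between training and test accuracy for a configuration is a hint that it is more complex than the data supports.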

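[Note] In a neural network, the weights are generated randomly before the model starts learning, and different initial weights can lead to a very different fitted model. So unless random_state is specified, the decision boundary will differ from run to run even when all other parameters are identical, and rerunning the code above will produce different plots. This is usually not a concern: as long as the model's complexity stays the same, prediction accuracy is typically affected only slightly.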
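The effect of random_state can be checked directly. In this sketch the helper fit_predict is a hypothetical name introduced for the example; it trains the same two-layer model with a given seed and returns its test predictions.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

wine = load_wine()
X_train, X_test, y_train, y_test = train_test_split(
    wine.data[:, :2], wine.target, random_state=0)

def fit_predict(seed):
    """Train a two-layer MLP with the given seed and return its test-set predictions."""
    model = MLPClassifier(solver='lbfgs', hidden_layer_sizes=[10, 10], random_state=seed)
    model.fit(X_train, y_train)
    return model.predict(X_test)

# The same seed gives the same initial weights and therefore the same predictions
same_seed_agree = np.array_equal(fit_predict(0), fit_predict(0))
print(same_seed_agree)  # True

# Different seeds may disagree on some samples; report the fraction that agree
agreement = np.mean(fit_predict(0) == fit_predict(1))
print('agreement between seeds 0 and 1: {:.2f}'.format(agreement))
```

Fixing random_state is therefore mainly a reproducibility measure for plots and reported scores.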

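III. Neural Network Example: Handwritten Digit Recognition

Training an image recognizer on the MNIST dataset is to learning neural networks what writing "hello world" is to a beginning programmer: a very basic rite of passage.

1. Using the MNIST dataset

MNIST is a large dataset commonly used to train image processing systems. It contains 70000 images of handwritten digits, 60000 for training and 10000 for testing, and it is also widely used in machine learning for training and testing models. MNIST was derived from the original NIST datasets: half of its training set and half of its test set come from NIST's training dataset, and the other halves come from NIST's test dataset.

Next, we use scikit-learn's fetch_mldata to fetch the MNIST dataset with the following code: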

# Import the dataset fetching tool
from sklearn.datasets import fetch_mldata

# Load the MNIST handwritten digit dataset
mnist=fetch_mldata('MNIST original')
mnist

E:\Anaconda\envs\mytensorflow\lib\site-packages\sklearn\utils\deprecation.py:77: DeprecationWarning: Function fetch_mldata is deprecated; fetch_mldata was deprecated in version 0.20 and will be removed in version 0.22
  warnings.warn(msg, category=DeprecationWarning)
E:\Anaconda\envs\mytensorflow\lib\site-packages\sklearn\utils\deprecation.py:77: DeprecationWarning: Function mldata_filename is deprecated; mldata_filename was deprecated in version 0.20 and will be removed in version 0.22
  warnings.warn(msg, category=DeprecationWarning)



---------------------------------------------------------------------------

TimeoutError                              Traceback (most recent call last)

<ipython-input-25-c42d12ebe31a> in <module>()
      3 
      4 # Load the MNIST handwritten digit dataset
----> 5 mnist=fetch_mldata('MNIST original')
      6 mnist


E:\Anaconda\envs\mytensorflow\lib\site-packages\sklearn\utils\deprecation.py in wrapped(*args, **kwargs)
     76         def wrapped(*args, **kwargs):
     77             warnings.warn(msg, category=DeprecationWarning)
---> 78             return fun(*args, **kwargs)
     79 
     80         wrapped.__doc__ = self._update_doc(wrapped.__doc__)


E:\Anaconda\envs\mytensorflow\lib\site-packages\sklearn\datasets\mldata.py in fetch_mldata(dataname, target_name, data_name, transpose_data, data_home)
    131         urlname = MLDATA_BASE_URL % quote(dataname)
    132         try:
--> 133             mldata_url = urlopen(urlname)
    134         except HTTPError as e:
    135             if e.code == 404:


E:\Anaconda\envs\mytensorflow\lib\urllib\request.py in urlopen(url, data, timeout, cafile, capath, cadefault, context)
    221     else:
    222         opener = _opener
--> 223     return opener.open(url, data, timeout)
    224 
    225 def install_opener(opener):


E:\Anaconda\envs\mytensorflow\lib\urllib\request.py in open(self, fullurl, data, timeout)
    524             req = meth(req)
    525 
--> 526         response = self._open(req, data)
    527 
    528         # post-process response


E:\Anaconda\envs\mytensorflow\lib\urllib\request.py in _open(self, req, data)
    542         protocol = req.type
    543         result = self._call_chain(self.handle_open, protocol, protocol +
--> 544                                   '_open', req)
    545         if result:
    546             return result


E:\Anaconda\envs\mytensorflow\lib\urllib\request.py in _call_chain(self, chain, kind, meth_name, *args)
    502         for handler in handlers:
    503             func = getattr(handler, meth_name)
--> 504             result = func(*args)
    505             if result is not None:
    506                 return result


E:\Anaconda\envs\mytensorflow\lib\urllib\request.py in http_open(self, req)
   1344 
   1345     def http_open(self, req):
-> 1346         return self.do_open(http.client.HTTPConnection, req)
   1347 
   1348     http_request = AbstractHTTPHandler.do_request_


E:\Anaconda\envs\mytensorflow\lib\urllib\request.py in do_open(self, http_class, req, **http_conn_args)
   1319             except OSError as err: # timeout error
   1320                 raise URLError(err)
-> 1321             r = h.getresponse()
   1322         except:
   1323             h.close()


E:\Anaconda\envs\mytensorflow\lib\http\client.py in getresponse(self)
   1329         try:
   1330             try:
-> 1331                 response.begin()
   1332             except ConnectionError:
   1333                 self.close()


E:\Anaconda\envs\mytensorflow\lib\http\client.py in begin(self)
    295         # read until we get a non-100 response
    296         while True:
--> 297             version, status, reason = self._read_status()
    298             if status != CONTINUE:
    299                 break


E:\Anaconda\envs\mytensorflow\lib\http\client.py in _read_status(self)
    256 
    257     def _read_status(self):
--> 258         line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
    259         if len(line) > _MAXLINE:
    260             raise LineTooLong("status line")


E:\Anaconda\envs\mytensorflow\lib\socket.py in readinto(self, b)
    584         while True:
    585             try:
--> 586                 return self._sock.recv_into(b)
    587             except timeout:
    588                 self._timeout_occurred = True


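TimeoutError: [WinError 10060] A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond.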

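Loading the MNIST dataset with fetch_mldata can fail with the error shown above; for a fix, see: reference document
Rerunning the code then gives:

# Import the dataset fetching tool
from sklearn.datasets import fetch_mldata

# Load the MNIST handwritten digit dataset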
mnist=fetch_mldata('MNIST original')
mnist

{'COL_NAMES': ['label', 'data'],
 'DESCR': 'mldata.org dataset: mnist-original',
 'data': array([[0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        ..., 
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0]], dtype=uint8),
 'target': array([ 0.,  0.,  0., ...,  9.,  9.,  9.])}
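As the deprecation warning above announces, fetch_mldata was removed in scikit-learn 0.22, so the call fails on current versions. A possible replacement is fetch_openml, sketched below. The wrapper load_mnist and its fetch parameter are hypothetical helpers introduced here so the loading logic can be exercised without a network connection; 'mnist_784' is the name of the MNIST copy hosted on OpenML, and the first call downloads the data.

```python
import numpy as np
from sklearn.datasets import fetch_openml

def load_mnist(fetch=fetch_openml):
    """Fetch MNIST and return (data, target) as float arrays.

    The fetch argument defaults to fetch_openml and exists only so that a
    stand-in function can be substituted when no network is available.
    """
    bunch = fetch('mnist_784', version=1, as_frame=False)
    data = np.asarray(bunch.data, dtype=np.float64)
    # OpenML returns the labels as strings such as '5'; convert them to numbers
    target = np.asarray(bunch.target).astype(np.float64)
    return data, target

# Usage (requires a network connection on the first call):
# data, target = load_mnist()
# print(data.shape)
```

Unlike the 'MNIST original' copy, the OpenML version is not sorted by label, which does not matter here because train_test_split shuffles the samples anyway.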

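print("Number of samples: {}, number of features: {}".format(mnist.data.shape[0],mnist.data.shape[1]))

Number of samples: 70000, number of features: 784

[Result analysis] The dataset has 70000 samples with 784 features each. That is because each sample stores the pixel values of a 28x28-pixel image of a handwritten digit, so the number of features is 28x28=784.

Before training the MLP, the data needs some preprocessing. The features are grayscale values from 0 to 255; to make them better suited to modeling, we divide every value by 255 so that all values lie between 0 and 1, and then use the familiar train_test_split function to split the data into a training set and a test set.

# Build the training and test datasets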
X=mnist.data/255.
y=mnist.target
X_train,X_test,y_train,y_test=train_test_split(X,y,train_size=5000,test_size=1000,random_state=62)

To keep training time manageable, we use only 5000 samples for training and 1000 samples for testing. To select the same samples on every run, we set random_state to 62.
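The scaling and subsampling steps can be tried on synthetic data without downloading anything. In the sketch below, fake_pixels and fake_labels are made-up stand-ins for MNIST, and the sample counts are scaled down (700 samples, 500 for training, 100 for testing) to keep memory use small.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(62)
# Synthetic stand-in for MNIST: 700 "images" of 784 grayscale values in 0..255
fake_pixels = rng.integers(0, 256, size=(700, 784), dtype=np.uint8)
fake_labels = rng.integers(0, 10, size=700)

# Dividing by 255. converts to float and maps every value into [0, 1]
X_scaled = fake_pixels / 255.

# Integer train_size/test_size select exactly that many samples; the rest are unused
X_tr, X_te, y_tr, y_te = train_test_split(
    X_scaled, fake_labels, train_size=500, test_size=100, random_state=62)

print(X_scaled.min() >= 0.0, X_scaled.max() <= 1.0)  # True True
print(X_tr.shape, X_te.shape)  # (500, 784) (100, 784)
```

Passing integers to train_size and test_size is what allows the two subsets together to be smaller than the full dataset.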

2. Training the MLP neural network

# Use two hidden layers with 100 nodes each
mlp_hw=MLPClassifier(solver='lbfgs',hidden_layer_sizes=[100,100],activation='relu',alpha=1e-5,random_state=62)

# Train the neural network on the training data
mlp_hw.fit(X_train,y_train)

print('Test set score: {:.2f}%'.format(mlp_hw.score(X_test,y_test)*100))

Test set score: 93.60%
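A single accuracy figure hides which digits get confused with each other. The sketch below shows one way to look closer using a confusion matrix. To stay runnable offline it swaps MNIST for scikit-learn's bundled load_digits dataset (8x8 images with pixel values from 0 to 16), so its numbers are not comparable to the 93.60% above; the variable names are choices made for this example.

```python
from sklearn.datasets import load_digits
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Offline stand-in for MNIST: 1797 images of 8x8 pixels, values 0..16
digits = load_digits()
X_digits = digits.data / 16.
X_tr, X_te, y_tr, y_te = train_test_split(X_digits, digits.target, random_state=62)

# Same network settings as the MNIST model above
clf = MLPClassifier(solver='lbfgs', hidden_layer_sizes=[100, 100],
                    activation='relu', alpha=1e-5, random_state=62)
clf.fit(X_tr, y_tr)

digit_score = clf.score(X_te, y_te)
print('Test set score: {:.2f}%'.format(digit_score * 100))

# Rows are true digits, columns are predicted digits; off-diagonal cells are mistakes
cm = confusion_matrix(y_te, clf.predict(X_te))
print(cm)
```

The same two lines that build and print the confusion matrix would apply unchanged to mlp_hw with the MNIST X_test and y_test.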

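3. Using the model to recognize digits

Note that because the image is only 28x28 pixels, it looks blurry when enlarged.

# Import the image processing tool
from PIL import Image
# Open the image and convert it to 32-bit floating-point grayscale
image=Image.open('8.png').convert('F')

# Resize the image to 28x28
image=image.resize((28,28))
arr=[]
# Use each pixel as a feature of the sample to predict, inverting it so that ink is bright
for i in range(28):
    for j in range(28):
        pixel=1.0-float(image.getpixel((j,i)))/255.
        arr.append(pixel)

# There is only one sample, so reshape it into a single row
arr1=np.array(arr).reshape(1,-1)

# Recognize the digit
print("The digit in the image is: {:.0f}".format(mlp_hw.predict(arr1)[0]))

The digit in the image is: 8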

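Image.convert('F') converts the picture to a 32-bit floating-point grayscale image, meaning each pixel is represented with 32 bits, where 0 stands for black and 255 for white. Each pixel value is then divided by 255 and subtracted from 1.0, so that the input matches the dataset's 0-to-1 scale with bright digits on a dark background.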
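The pixel-by-pixel loop can also be written as a single NumPy expression. The function image_to_features below is a hypothetical helper sketched for this purpose; it assumes, like the loop above, a dark digit on a light background. The demo uses a blank white image created in memory instead of '8.png' so that it runs without any file.

```python
import numpy as np
from PIL import Image

def image_to_features(image):
    """Convert a PIL image to a (1, 784) array scaled and inverted like the loop version."""
    small = image.convert('F').resize((28, 28))
    # np.asarray gives a (rows, columns) array, so flattening it is row-major,
    # the same order as looping over i (rows) and then j (columns)
    pixels = np.asarray(small, dtype=np.float64)
    return (1.0 - pixels / 255.).reshape(1, -1)

# A blank white image stands in for a real digit picture
demo_image = Image.new('L', (56, 56), 255)
features = image_to_features(demo_image)
print(features.shape)  # (1, 784)
```

With a trained model, mlp_hw.predict(image_to_features(Image.open('8.png'))) would then take the place of the loop.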
