Deep Learning (7): U-Net Principles and a Keras Implementation for Retinal Vessel Segmentation in Medical Images

Date: 2019-11-11


Original author: aircraft

Original link: https://www.cnblogs.com/DOMLX/p/9780786.html

Is anyone's company hiring interns for C++ development / image processing / OpenGL / OpenCV / Halcon? Please take me along QAQ

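Recently Professor Qin asked me to look into combining deep learning with finger-vein recognition, and I used this project as my stepping stone. I managed to extract finger-vein texture features with U-Net, and the retinal vessel segmentation work described here helped a great deal along the way, so I am sharing it with you. I would love to write a separate post about the finger-vein work too, but I am not allowed to, hhhhhhh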

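DRIVE dataset download (Baidu Cloud): https://pan.baidu.com/s/1C_1ikDwexB0hZvOwMSeDtw
Extraction code: follow the official account at the bottom of the page, add the editor on WeChat, and send the article title to receive it.

U-Net + Keras retinal vessel segmentation source code, Python 2.7 version (Baidu Cloud): https://pan.baidu.com/s/1C_1ikDwexB0hZvOwMSeDtw
Extraction code: obtained the same way as above.

U-Net + Keras retinal vessel segmentation source code, Python 3.6 version (Baidu Cloud): https://pan.baidu.com/s/1rAf6wuWGCswuBfkDivxjyQ
Extraction code: obtained the same way as above.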

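Fully Convolutional Networks

The well-known FCN needs little introduction here; there is a very good blog post about it at http://www.cnblogs.com/gujianhan/p/6030639.html.
Still, I recommend reading the paper itself to deepen your understanding.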

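Medical image segmentation frameworks

Medical image segmentation mainly follows two frameworks: one based on CNNs and one based on FCNs. Both use a neural network to perform semantic segmentation.

So what is semantic segmentation? It has nothing to do with splitting Chinese text into phrases by meaning; the term has its own definition in image processing.

Semantic image segmentation means that the machine automatically partitions an image into regions and recognizes what each region contains. For example, given a photo of a person riding a motorcycle, the machine should output a label map in which the person is marked in red, the motorcycle in green, and the background in black.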
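To make "one label per pixel" concrete, here is a minimal NumPy sketch. The class ids (0, 1, 2) and the palette are assumptions chosen to mirror the motorcycle example; they do not come from any particular dataset.

```python
import numpy as np

# Hypothetical class ids and colors, mirroring the example in the text.
PALETTE = np.array([
    [0, 0, 0],      # 0: background -> black
    [255, 0, 0],    # 1: person     -> red
    [0, 255, 0],    # 2: motorcycle -> green
], dtype=np.uint8)

def colorize(label_map):
    """Turn an (H, W) array of class ids into an (H, W, 3) RGB image."""
    return PALETTE[label_map]

# A toy 3x4 "segmentation result": one class id per pixel.
labels = np.array([
    [0, 1, 1, 0],
    [0, 1, 2, 2],
    [0, 0, 2, 2],
])
rgb = colorize(labels)
print(rgb.shape)
```

A segmentation network produces the `labels` array; the coloring is only for display.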

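So segmentation matters to image understanding the way punctuating the sentences matters when reading a classical text: it has to come first.

Before deep learning took off, there were already many image segmentation techniques. One of the better known is a graph partitioning method called "Normalized cut", or "N-cut" for short.

N-cut relies on formulas for the connection weights, which I won't go into here. Its main idea is to weigh the pairwise relationships between pixels and, according to a given threshold, split the image in two.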

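CNN-based framework

The idea is simple: classify every pixel of the image. Take a patch around each pixel, treat the patch as an image, and feed it to the network for training. For example:

One paper published at NIPS takes this approach: Ciresan D, Giusti A, Gambardella L M, et al. Deep neural networks segment neuronal membranes in electron microscopy images[C]//Advances in neural information processing systems. 2012: 2843-2851.

This is a binary classification problem: every pixel labeled 0 is a negative sample and every pixel labeled 1 is a positive sample.

This kind of network has two obvious drawbacks:
1. Too much redundancy. Every pixel needs a patch centered on itself, so the patches of two neighboring pixels are almost identical. This redundancy makes training very slow.
2. Receptive field and localization accuracy cannot both be had. A large receptive field requires more downsampling in the pooling layers that follow, which lowers localization accuracy; a small receptive field lowers classification accuracy.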
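The redundancy in drawback 1 is easy to quantify. The sketch below is illustrative only: the 33x33 patch size and the helper names `patch_at` and `shared_fraction` are assumptions, not taken from the paper or the project.

```python
import numpy as np

def patch_at(img, row, col, size):
    """Return the size x size patch centered on (row, col); size must be odd."""
    half = size // 2
    return img[row - half: row + half + 1, col - half: col + half + 1]

def shared_fraction(size):
    """Fraction of pixels shared by the patches of two horizontally adjacent centers."""
    return (size - 1) / size

rng = np.random.default_rng(0)
img = rng.random((64, 64))

a = patch_at(img, 32, 32, 33)
b = patch_at(img, 32, 33, 33)   # center moved one pixel to the right

# Every column of `a` except the first reappears in `b`, shifted by one.
assert np.array_equal(a[:, 1:], b[:, :-1])
print(shared_fraction(33))
```

With a 33x33 patch, 32 of the 33 columns are recomputed for the neighboring pixel, which is the cost a fully convolutional network avoids by sharing computation across the whole image.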

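CNNs have existed for a long time, but for years their success was limited by the size of the available training sets and of the networks themselves; the boom only started with the work of Krizhevsky et al. Applying CNNs to biomedical images runs into two difficulties. First, CNNs are typically used for classification, whereas biomedical tasks usually care about localization tasks such as segmentation. Second, medical images are hard to obtain at that scale.

The earlier way around both difficulties was a sliding window: for each pixel to be classified, feed the network a local neighborhood around it. This has two benefits: it performs localization, and because a neighborhood is taken around every pixel, it greatly increases the amount of training data. It also has two drawbacks. First, the patches taken by the sliding window overlap heavily, which makes it slow (the analysis in the FCN paper shows that both the forward and the backward pass are slowed down). Second, the network has to trade local accuracy against context: larger patches need more pooling layers, which reduces localization accuracy, while small patches let the network see only a little context. A common approach today is to combine features from several layers (as FCN does).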

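FCN-based framework

In medical image processing there is one very widely used architecture: U-Net. Its structure is as follows:

The contracting path repeats the same unit: two 3x3 convolutions, each followed by a ReLU, and then a max pooling with stride 2 for downsampling. At every downsampling step the number of feature channels is doubled. Each step of the expansive path consists of an upsampling (a 2x2 up-convolution) that halves the number of feature channels, a concatenation with the corresponding feature map from the contracting path, and two 3x3 convolutions, each followed by a ReLU. Because every unpadded convolution loses border pixels, the contracting-path feature map has to be cropped before it is concatenated. A final 1x1 convolution maps each 64-component feature vector to a class label. The whole network has 23 convolutional layers.

As you can see, it is a fully convolutional network: both input and output are images, and there are no fully connected layers. The shallower, high-resolution layers solve the problem of pixel localization, while the deeper layers solve the problem of pixel classification.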
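The channel and size bookkeeping can be checked with a few lines of plain Python. This is a sketch under stated assumptions: the 572x572 input, four pooling steps, and 64 base channels come from the original U-Net paper rather than from this post, and `trace_unet` is a hypothetical helper name.

```python
def trace_unet(input_size=572, depth=4, base_channels=64):
    """Trace spatial size and channel count through a U-Net with unpadded 3x3 convs.

    Returns (output_size, final_channels, n_conv_layers).
    """
    size, channels, n_convs = input_size, base_channels, 0

    # Contracting path: two 3x3 convs (each trims 2 pixels), then 2x2 max pooling.
    for _ in range(depth):
        size -= 4
        n_convs += 2
        size //= 2          # pooling halves the resolution
        channels *= 2       # the next level doubles the feature channels

    # Bottleneck: two more 3x3 convs at the lowest resolution.
    size -= 4
    n_convs += 2

    # Expansive path: 2x2 up-convolution (halves channels), concat, two 3x3 convs.
    for _ in range(depth):
        size *= 2
        channels //= 2
        n_convs += 1        # the up-convolution itself
        size -= 4
        n_convs += 2

    n_convs += 1            # final 1x1 convolution to class scores
    return size, channels, n_convs

print(trace_unet())  # (388, 64, 23)
```

The count of 23 includes the four up-convolutions and the final 1x1 convolution, and the last feature map has 64 channels, which is the 64-component vector the 1x1 convolution maps to class labels.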

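Now that we understand the U-Net architecture, let's look at how to use it for medical image segmentation.

Retinal vessel segmentation with U-Net + Keras

Original author's README (in English): https://github.com/orobix/retina-unet#retina-blood-vessel-segmentation-with-a-convolution-neural-network-u-net

For the environment setup, see this blog post: 2018 latest: installing tensorflow1.4 (GPU/CPU) + cuda8.0 + cudnn8.0-v6 + keras on win10, and fixing CUDA installation failures and tensorflow import errors

On Linux the environment is the same, but you will have to find the setup instructions yourself.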

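1. Introduction

To make later detection and classification of the retinal vessels easier, the first step is to segment the vessels in the fundus image as completely as possible, so that subsequent operations can work on the vessels alone.

The code uses the DRIVE dataset, which consists of a training set and a test set. Figure 1 shows the fundus images.

Figure 1: Fundus images from the DRIVE training set

The strength of DRIVE is that it contains not only manually segmented vessel images (in the manual folder, see Figure 2) but also images of the eye outline (in the mask folder, see Figure 3).

Figure 2: Manually annotated vessel images from the DRIVE training set

Figure 3: Eye outline images from the DRIVE training set

The weakness of DRIVE is obvious from the images above: the training set has only 20 images, which is very little data.

So, to get better segmentation results, we need to preprocess these 20 images in a way that increases the amount of data.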

2. Dependencies

- numpy >= 1.11.1
- Keras >= 2.1.0
- PIL >=1.1.7
- opencv >=2.4.10
- h5py >=2.6.0
- configparser >=3.5.0b2
- scikit-learn >= 0.17.1

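3. Reading and saving data

The training set and the test set each contain only 20 fundus images (tif format). The first step is to generate data files that make later processing easier, so we build one data file each for the fundus images, the manually annotated vessel images, and the eye outline masks. The files use the HDF5 format; for an introduction to HDF5, see the CSDN blog post "HDF5 Quick Start Guide" (HDF5快速上手全攻略).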
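The code in this post calls two helpers, `write_hdf5` and `load_hdf5`, without showing them. The sketch below is one plausible way to write them with h5py; the dataset name "image" and the `key` parameter are assumptions, not something the post confirms.

```python
import os
import tempfile

import h5py
import numpy as np

def write_hdf5(arr, outfile, key="image"):
    """Store one array in an HDF5 file under a single dataset name (assumed: "image")."""
    with h5py.File(outfile, "w") as f:
        f.create_dataset(key, data=arr, dtype=arr.dtype)

def load_hdf5(infile, key="image"):
    """Read the whole dataset back into memory as a NumPy array."""
    with h5py.File(infile, "r") as f:
        return f[key][()]

# Round-trip a small dummy array through a temporary file.
dummy = np.arange(24, dtype=np.uint8).reshape(2, 3, 4)
path = os.path.join(tempfile.mkdtemp(), "demo.hdf5")
write_hdf5(dummy, path)
restored = load_hdf5(path)
print(restored.shape, restored.dtype)
```

Storing each array in its own file keeps the later loading code to a single call per array.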

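4. Network analysis

U-Net can do semantic segmentation with very little data: our retinal vessel segmentation is trained on just 20 images and still gives good results. Extracting retinal vessels, finger veins, or fingerprints with U-Net is a per-pixel binary classification problem. You may be thinking: isn't binary classification easy for today's neural networks? What is there left to say?

It is true that binary classification is no longer hard, provided there is enough training data. The U-Net in this post, however, needs only twenty images as its dataset, and that is where its advantage lies.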

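5. Implementation

First of all, the data needs some preprocessing.

Step 1

Read the images into arrays and arrange them in the tensor layout our network expects, with grayscale conversion as part of the preparation (the loading code below still reads every color channel). The data is first written to HDF5 files and then read back from those files when training starts.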

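import os
import numpy as np
from PIL import Image

# DRIVE geometry: 20 RGB images per split, each 584 pixels high and 565 wide
Nimgs = 20
channels = 3
height = 584
width = 565
# The directory variables (original_imgs_train, dataset_path, ...) and write_hdf5
# are defined elsewhere in the project.

# Load each image, its ground truth, and its border mask into the matching arrays
def get_datasets(imgs_dir,groundTruth_dir,borderMasks_dir,train_test="null"):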
    imgs = np.empty((Nimgs,height,width,channels))
    groundTruth = np.empty((Nimgs,height,width))
    border_masks = np.empty((Nimgs,height,width))
    for path, subdirs, files in os.walk(imgs_dir): #list all files, directories in the path
        for i in range(len(files)):
            #original
            print("original image: " +files[i])
            img = Image.open(imgs_dir+files[i])
            imgs[i] = np.asarray(img)
            #corresponding ground truth
            groundTruth_name = files[i][0:2] + "_manual1.gif"
            print("ground truth name: " + groundTruth_name)
            g_truth = Image.open(groundTruth_dir + groundTruth_name)
            groundTruth[i] = np.asarray(g_truth)
            #corresponding border masks
            border_masks_name = ""
            if train_test=="train":
                border_masks_name = files[i][0:2] + "_training_mask.gif"
            elif train_test=="test":
                border_masks_name = files[i][0:2] + "_test_mask.gif"
            else:
                raise ValueError("train_test must be 'train' or 'test'")
            print("border masks name: " + border_masks_name)
            b_mask = Image.open(borderMasks_dir + border_masks_name)
            border_masks[i] = np.asarray(b_mask)

    print("imgs max: " +str(np.max(imgs)))
    print("imgs min: " +str(np.min(imgs)))
    assert(np.max(groundTruth)==255 and np.max(border_masks)==255)
    assert(np.min(groundTruth)==0 and np.min(border_masks)==0)
    print("ground truth and border masks are correctly within pixel value range 0-255 (black-white)")
    #reshaping for my standard tensors
    imgs = np.transpose(imgs,(0,3,1,2))
    assert(imgs.shape == (Nimgs,channels,height,width))
    groundTruth = np.reshape(groundTruth,(Nimgs,1,height,width))
    border_masks = np.reshape(border_masks,(Nimgs,1,height,width))
    assert(groundTruth.shape == (Nimgs,1,height,width))
    assert(border_masks.shape == (Nimgs,1,height,width))
    return imgs, groundTruth, border_masks

if not os.path.exists(dataset_path):
    os.makedirs(dataset_path)
#getting the training datasets
imgs_train, groundTruth_train, border_masks_train = get_datasets(original_imgs_train,groundTruth_imgs_train,borderMasks_imgs_train,"train")
print("saving train datasets")
write_hdf5(imgs_train, dataset_path + "DRIVE_dataset_imgs_train.hdf5")
write_hdf5(groundTruth_train, dataset_path + "DRIVE_dataset_groundTruth_train.hdf5")
write_hdf5(border_masks_train,dataset_path + "DRIVE_dataset_borderMasks_train.hdf5")

#getting the testing datasets
imgs_test, groundTruth_test, border_masks_test = get_datasets(original_imgs_test,groundTruth_imgs_test,borderMasks_imgs_test,"test")
print("saving test datasets")
write_hdf5(imgs_test,dataset_path + "DRIVE_dataset_imgs_test.hdf5")
write_hdf5(groundTruth_test, dataset_path + "DRIVE_dataset_groundTruth_test.hdf5")
write_hdf5(border_masks_test,dataset_path + "DRIVE_dataset_borderMasks_test.hdf5")
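The loading code above moves every array into a channels-first layout before saving. A small NumPy sketch of that step, using dummy zero arrays in place of real DRIVE images (the batch size of 2 is arbitrary):

```python
import numpy as np

n_imgs, height, width, channels = 2, 584, 565, 3

# Images as PIL/NumPy deliver them: (N, H, W, C)
imgs = np.zeros((n_imgs, height, width, channels))
imgs[0, 10, 20, 2] = 1.0          # mark one value in the last color channel

# Move the channel axis to position 1: (N, C, H, W)
imgs_cf = np.transpose(imgs, (0, 3, 1, 2))

# Single-channel masks only need a new axis of length 1
masks = np.zeros((n_imgs, height, width))
masks_cf = np.reshape(masks, (n_imgs, 1, height, width))

print(imgs_cf.shape, masks_cf.shape)
```

The marked value moves from index `[0, 10, 20, 2]` to `[0, 2, 10, 20]`, which is a quick way to confirm that the axes were permuted and not merely reshaped.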

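Step 2

Next, the image data loaded into memory for training is enhanced: histogram equalization, standardization (zero mean, unit variance), and rescaling of the pixel values to the range 0-1. Even so, the data is still far too little to train on. So when taking patches from each image, in addition to stepping across the image patch by patch in the usual way, we also sample patches at random positions inside the image. The patches overlap in places, but to the network each one is still a new input, and it can extract more features from them. This step is really data augmentation on a limited dataset, a standard technique in neural network training.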
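As a rough illustration of the standardization and 0-1 rescaling, here is a NumPy-only sketch. It is not the project's `my_PreProc`: histogram equalization is left out, the helper name `standardize_and_rescale` is hypothetical, and it assumes no image is completely constant.

```python
import numpy as np

def standardize_and_rescale(imgs):
    """Standardize a batch (N, 1, H, W), then rescale each image to [0, 1]."""
    imgs = imgs.astype(np.float64)
    # Zero mean, unit variance over the whole batch
    normed = (imgs - imgs.mean()) / imgs.std()
    out = np.empty_like(normed)
    for i in range(normed.shape[0]):
        lo, hi = normed[i].min(), normed[i].max()
        out[i] = (normed[i] - lo) / (hi - lo)   # per-image min-max scaling
    return out

rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(3, 1, 8, 8))
prepared = standardize_and_rescale(batch)
print(prepared.min(), prepared.max())
```

After this step every image spans exactly 0 to 1, regardless of how bright or dark the original was.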

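During this process we can also tile some of the small random patches together to take a look.

Random patches from the original images:

Corresponding masks:

Part of the code that prepares the training data: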

def get_data_training(DRIVE_train_imgs_original,
                      DRIVE_train_groudTruth,
                      patch_height,
                      patch_width,
                      N_subimgs,
                      inside_FOV):
    train_imgs_original = load_hdf5(DRIVE_train_imgs_original)
    train_masks = load_hdf5(DRIVE_train_groudTruth) #masks always the same
    # visualize(group_images(train_imgs_original[0:20,:,:,:],5),'imgs_train')#.show()  #check original imgs train


    train_imgs = my_PreProc(train_imgs_original)    # histogram equalization, standardization, rescale pixel values to 0-1
    train_masks = train_masks/255.

    train_imgs = train_imgs[:,:,9:574,:]  #cut bottom and top so now it is 565*565
    train_masks = train_masks[:,:,9:574,:]  #cut bottom and top so now it is 565*565
    data_consistency_check(train_imgs,train_masks)

    #check masks are within 0-1
    assert(np.min(train_masks)==0 and np.max(train_masks)==1)

    print("\ntrain images/masks shape:")
    print(train_imgs.shape)
    print("train images range (min-max): " +str(np.min(train_imgs)) +' - '+str(np.max(train_imgs)))
    print("train masks are within 0-1\n")

    #extract the TRAINING patches from the full images
    patches_imgs_train, patches_masks_train = extract_random(train_imgs,train_masks,patch_height,patch_width,N_subimgs,inside_FOV)
    data_consistency_check(patches_imgs_train, patches_masks_train)

    print("\ntrain PATCHES images/masks shape:")
    print(patches_imgs_train.shape)
    print("train PATCHES images range (min-max): " +str(np.min(patches_imgs_train)) +' - '+str(np.max(patches_imgs_train)))

    return patches_imgs_train, patches_masks_train#, patches_imgs_test, patches_masks_test
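The function above delegates the actual sampling to `extract_random`, which is not shown. The sketch below shows the general idea under simplifying assumptions: it ignores the `inside_FOV` option, does not balance the number of patches per image, and uses a hypothetical name, `extract_random_patches`.

```python
import numpy as np

def extract_random_patches(imgs, masks, patch_h, patch_w, n_patches, seed=0):
    """Sample aligned random patches from images (N, C, H, W) and masks (N, 1, H, W)."""
    assert imgs.shape[0] == masks.shape[0] and imgs.shape[2:] == masks.shape[2:]
    n_imgs, n_ch, img_h, img_w = imgs.shape
    rng = np.random.default_rng(seed)

    patches = np.empty((n_patches, n_ch, patch_h, patch_w), dtype=imgs.dtype)
    patch_masks = np.empty((n_patches, 1, patch_h, patch_w), dtype=masks.dtype)
    for k in range(n_patches):
        i = rng.integers(n_imgs)                      # which image to sample from
        top = rng.integers(0, img_h - patch_h + 1)    # top-left corner, kept in bounds
        left = rng.integers(0, img_w - patch_w + 1)
        patches[k] = imgs[i, :, top:top + patch_h, left:left + patch_w]
        patch_masks[k] = masks[i, :, top:top + patch_h, left:left + patch_w]
    return patches, patch_masks

# Using the image itself as its "mask" makes the alignment easy to verify.
demo_imgs = np.random.default_rng(1).random((4, 1, 64, 64))
p, m = extract_random_patches(demo_imgs, demo_imgs.copy(), 48, 48, 10)
print(p.shape, m.shape)
```

The important property is that the image patch and the mask patch are cut with the same indices, so every training input keeps its matching label.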

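Step 3

Build the network in Keras following the U-Net architecture. If the Keras function syntax is unfamiliar, see this blog post: Deep Learning (6): Learning commonly used Keras functions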

from keras.models import Model
from keras.layers import Input, Conv2D, MaxPooling2D, UpSampling2D, Dropout, concatenate, core

def get_unet(n_ch,patch_height,patch_width):
    # All tensors are channels-first: (batch, channels, height, width).
    # data_format is also passed to the pooling and upsampling layers, so the
    # model does not depend on the global Keras image_data_format setting.
    inputs = Input(shape=(n_ch,patch_height,patch_width))
    # Contracting path, level 1
    conv1 = Conv2D(32, (3, 3), activation='relu', padding='same',data_format='channels_first')(inputs)
    conv1 = Dropout(0.2)(conv1)
    conv1 = Conv2D(32, (3, 3), activation='relu', padding='same',data_format='channels_first')(conv1)
    pool1 = MaxPooling2D((2, 2),data_format='channels_first')(conv1)
    # Contracting path, level 2
    conv2 = Conv2D(64, (3, 3), activation='relu', padding='same',data_format='channels_first')(pool1)
    conv2 = Dropout(0.2)(conv2)
    conv2 = Conv2D(64, (3, 3), activation='relu', padding='same',data_format='channels_first')(conv2)
    pool2 = MaxPooling2D((2, 2),data_format='channels_first')(conv2)
    # Bottom of the U
    conv3 = Conv2D(128, (3, 3), activation='relu', padding='same',data_format='channels_first')(pool2)
    conv3 = Dropout(0.2)(conv3)
    conv3 = Conv2D(128, (3, 3), activation='relu', padding='same',data_format='channels_first')(conv3)

    # Expansive path: upsample, then concatenate with the matching encoder features along the channel axis
    up1 = UpSampling2D(size=(2, 2),data_format='channels_first')(conv3)
    up1 = concatenate([conv2,up1],axis=1)
    conv4 = Conv2D(64, (3, 3), activation='relu', padding='same',data_format='channels_first')(up1)
    conv4 = Dropout(0.2)(conv4)
    conv4 = Conv2D(64, (3, 3), activation='relu', padding='same',data_format='channels_first')(conv4)
    # Upsample again and concatenate with the level-1 features
    up2 = UpSampling2D(size=(2, 2),data_format='channels_first')(conv4)
    up2 = concatenate([conv1,up2], axis=1)
    conv5 = Conv2D(32, (3, 3), activation='relu', padding='same',data_format='channels_first')(up2)
    conv5 = Dropout(0.2)(conv5)
    conv5 = Conv2D(32, (3, 3), activation='relu', padding='same',data_format='channels_first')(conv5)
    # 1x1 convolution to two class scores per pixel
    conv6 = Conv2D(2, (1, 1), activation='relu',padding='same',data_format='channels_first')(conv5)
    # Flatten the spatial dimensions and put the class axis last: (H*W, 2)
    conv6 = core.Reshape((2,patch_height*patch_width))(conv6)
    conv6 = core.Permute((2,1))(conv6)
    # Softmax over the two classes of each pixel
    conv7 = core.Activation('softmax')(conv6)

    model = Model(inputs=inputs, outputs=conv7)

    # sgd = SGD(lr=0.01, decay=1e-6, momentum=0.3, nesterov=False)
    model.compile(optimizer='sgd', loss='categorical_crossentropy',metrics=['accuracy'])

    return model
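Because the model ends with a reshape to (2, H*W), a permute to (H*W, 2), and a softmax, the training targets need that same (N, H*W, 2) layout. A minimal NumPy sketch of the conversion follows; the function name `masks_to_targets` and the column order (0 = background, 1 = vessel) are assumptions.

```python
import numpy as np

def masks_to_targets(masks):
    """Convert binary masks (N, 1, H, W) into one-hot targets (N, H*W, 2)."""
    n, c, h, w = masks.shape
    assert c == 1
    flat = masks.reshape(n, h * w).astype(np.int64)   # one 0/1 label per pixel
    # Row 0 of the identity matrix encodes background, row 1 encodes vessel
    return np.eye(2, dtype=np.float32)[flat]

demo_mask = np.array([[[[0, 1],
                        [1, 0]]]])
targets = masks_to_targets(demo_mask)
print(targets.shape)
```

Each pixel becomes one row holding a single 1, which is the form `categorical_crossentropy` expects for a two-class softmax output.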