Table of Contents
Introduction
Prerequisites
What Is U-Net?
U-Net Architecture
Kaggle Data Science Bowl 2018 Challenge
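Introduction
Computer vision is a field of artificial intelligence that trains computers to interpret and understand the visual world. Using digital images from cameras and videos together with deep learning models, machines can accurately identify and classify objects and then react to what they see.
Over the past few years, deep learning has driven rapid progress in computer vision. In this post I want to discuss one particular computer vision task called segmentation. Researchers have proposed many ways to solve this problem, but I will discuss one specific architecture, U-Net, which uses a fully convolutional network model for the task.
We will use U-Net to build a first solution to the Kaggle Data Science Bowl 2018 challenge.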
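Prerequisites
This post assumes that the reader is already familiar with the basic concepts of machine learning and convolutional networks, and has some working knowledge of ConvNets using Python and the Keras library.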
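What Is Segmentation?
The goal of segmentation is to partition an image into perceptually coherent parts. There are two types of segmentation:
Semantic segmentation (pixel-level prediction based on labeled classes)
Instance segmentation (object detection and object identification)
In this post, we will focus mainly on semantic segmentation.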
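To make the distinction concrete, here is a minimal sketch in plain Python. It assumes a made-up 4x5 binary mask, and `label_instances` is a hypothetical helper written for this illustration, not part of the article's code. The semantic mask only says whether a pixel is foreground (1) or background (0); the instance labeling additionally gives each 4-connected blob its own integer ID.

```python
from collections import deque

def label_instances(mask):
    """Assign a distinct positive ID to each 4-connected group of 1-pixels."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    next_id = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 1 and labels[r][c] == 0:
                # Found an unlabeled foreground pixel: start a new object
                next_id += 1
                labels[r][c] = next_id
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        inside = 0 <= ny < rows and 0 <= nx < cols
                        if inside and mask[ny][nx] == 1 and labels[ny][nx] == 0:
                            labels[ny][nx] = next_id
                            queue.append((ny, nx))
    return labels, next_id

# Semantic mask: every foreground pixel carries the same class value (toy data)
semantic_mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
]

# Instance labels: the two blobs receive different IDs
instance_labels, num_objects = label_instances(semantic_mask)
for row in instance_labels:
    print(row)
print("objects found:", num_objects)
```

The model built in this post predicts only the first kind of output: one foreground/background value per pixel.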
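What Is U-Net?
U-Net was created in 2015 as a CNN developed specifically for biomedical image segmentation. Today it is a very popular end-to-end encoder-decoder network for semantic segmentation. It has a distinctive U-shaped structure made up of a contracting path and an expanding path.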
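U-Net Architecture
The U-Net downsampling path consists of 4 blocks, each with the following layers:
3x3 CONV (ReLU, with batch normalization and dropout)
3x3 CONV (ReLU, with batch normalization and dropout)
2x2 max pooling
As we move down through these blocks, the number of feature maps doubles, starting at 64, then 128, 256, and 512.
The bottleneck consists of 2 CONV layers with BN and dropout.
Like the downsampling path, the upsampling path consists of 4 blocks, each with the following layers:
Transposed convolution (deconvolution) layer
Concatenation with the corresponding feature map from the contracting path
3x3 CONV (ReLU, with BN and dropout)
3x3 CONV (ReLU, with BN and dropout)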
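The block structure above can be traced with a few lines of arithmetic. The sketch below is an illustration under stated assumptions: a 128x128 input, 'same' padding so convolutions keep the spatial size, and a bottleneck that continues the doubling to 1024 channels (the text above does not give the bottleneck width). `unet_shapes` is a hypothetical helper, not part of the article's code.

```python
def unet_shapes(size=128, base_filters=64, depth=4):
    """Return (stage, height/width, channels) tuples for a U-Net with 'same' padding."""
    shapes = []
    filters = base_filters
    # Contracting path: the two 3x3 convs keep the size, 2x2 max pooling halves it
    for level in range(1, depth + 1):
        shapes.append((f"down{level}", size, filters))
        size //= 2
        filters *= 2
    # Bottleneck: assumed to continue the doubling pattern
    shapes.append(("bottleneck", size, filters))
    # Expanding path: the transposed conv doubles the size and halves the channels;
    # the listed channel count is the output of the block's two 3x3 convs
    for level in range(depth, 0, -1):
        size *= 2
        filters //= 2
        shapes.append((f"up{level}", size, filters))
    return shapes

shapes = unet_shapes()
for name, hw, ch in shapes:
    print(f"{name:>10}: {hw}x{hw}x{ch}")
```

Each `up` stage ends at the same spatial size as the matching `down` stage, which is what makes the concatenation with the contracting-path feature map possible.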
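Kaggle Data Science Bowl 2018 Challenge
The main task of this challenge is to detect cell nuclei in images. By automating nucleus detection, you can help unlock cures faster. Identifying nuclei is the starting point for most analyses, because most of the human body's 30 trillion cells contain a nucleus full of DNA, the genetic code that programs each cell. Identifying nuclei lets researchers identify every cell in a sample, and by measuring how cells respond to various treatments, researchers can understand the underlying biological processes.
Sample image, target, and approach
We will use U-Net, a CNN designed specifically for segmentation tasks, to generate image masks automatically.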
Import all the necessary packages and modules
import os
import sys
import random
import warnings

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from tqdm import tqdm
from itertools import chain
from skimage.io import imread, imshow, imread_collection, concatenate_images
from skimage.transform import resize
from skimage.morphology import label
from keras.models import Model, load_model
# Import the layers from keras.layers directly; the keras.layers.core,
# .convolutional, .pooling and .merge submodules are missing in recent Keras releases
from keras.layers import Input, Dropout, Lambda
from keras.layers import Conv2D, Conv2DTranspose
from keras.layers import MaxPooling2D
from keras.layers import concatenate
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras import backend as K
import tensorflow as tf

IMG_WIDTH = 128
IMG_HEIGHT = 128
IMG_CHANNELS = 3
TRAIN_PATH = './U_NET/train/'
TEST_PATH = './U_NET/validation/'

warnings.filterwarnings('ignore', category=UserWarning, module='skimage')

seed = 42
# Call the seeding functions; assigning to them (random.seed = seed)
# would overwrite the functions without seeding anything
random.seed(seed)
np.random.seed(seed)
Collect the file names for our training and test data
train_ids = next(os.walk(TRAIN_PATH))[1]
test_ids = next(os.walk(TEST_PATH))[1]
Create image masks of size 128 x 128 (black images)
print('Getting and resizing training images ... ')
X_train = np.zeros((len(train_ids), IMG_HEIGHT, IMG_WIDTH, IMG_CHANNELS), dtype=np.uint8)
# np.bool was removed from recent NumPy releases; the built-in bool is equivalent
Y_train = np.zeros((len(train_ids), IMG_HEIGHT, IMG_WIDTH, 1), dtype=bool)

# Resize our training images to 128 x 128
# Flush stdout so the message above is shown before the progress bar starts
# Using tqdm allows us to create progress bars
sys.stdout.flush()
for n, id_ in tqdm(enumerate(train_ids), total=len(train_ids)):
    path = TRAIN_PATH + id_
    img = imread(path + '/images/' + id_ + '.png')[:, :, :IMG_CHANNELS]
    img = resize(img, (IMG_HEIGHT, IMG_WIDTH), mode='constant', preserve_range=True)
    X_train[n] = img
    mask = np.zeros((IMG_HEIGHT, IMG_WIDTH, 1), dtype=bool)
    # Take all masks associated with this image and combine them into one single mask
    for mask_file in next(os.walk(path + '/masks/'))[2]:
        mask_ = imread(path + '/masks/' + mask_file)
        mask_ = np.expand_dims(resize(mask_, (IMG_HEIGHT, IMG_WIDTH), mode='constant',
                                      preserve_range=True), axis=-1)
        mask = np.maximum(mask, mask_)
    # Y_train[n] is now the single combined mask for this image
    Y_train[n] = mask

# Get and resize test images
X_test = np.zeros((len(test_ids), IMG_HEIGHT, IMG_WIDTH, IMG_CHANNELS), dtype=np.uint8)
sizes_test = []
print('Getting and resizing test images ... ')
sys.stdout.flush()
# Here we resize our test images, remembering the original sizes
for n, id_ in tqdm(enumerate(test_ids), total=len(test_ids)):
    path = TEST_PATH + id_
    img = imread(path + '/images/' + id_ + '.png')[:, :, :IMG_CHANNELS]
    sizes_test.append([img.shape[0], img.shape[1]])
    img = resize(img, (IMG_HEIGHT, IMG_WIDTH), mode='constant', preserve_range=True)
    X_test[n] = img
print('Done!')
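The mask-merging step in the training loop relies on an element-wise maximum: each image comes with one mask file per nucleus, and taking the maximum over all of them yields a single foreground/background mask. Here is a small dependency-free sketch of that idea; the 2x3 masks and the `merge_masks` helper are made up for illustration, with 255 standing in for foreground.

```python
def merge_masks(masks):
    """Combine per-object masks into one mask via an element-wise maximum."""
    rows, cols = len(masks[0]), len(masks[0][0])
    merged = [[0] * cols for _ in range(rows)]  # start from an all-black mask
    for mask in masks:
        for r in range(rows):
            for c in range(cols):
                merged[r][c] = max(merged[r][c], mask[r][c])
    return merged

# Two toy per-nucleus masks covering different pixels
nucleus_a = [[255, 0, 0],
             [0,   0, 0]]
nucleus_b = [[0, 0, 0],
             [0, 0, 255]]

combined = merge_masks([nucleus_a, nucleus_b])
print(combined)
```

Because the merged mask no longer separates one nucleus from another, it is a target for semantic segmentation rather than instance segmentation.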
Build the U-Net model
def my_iou_metric(label, pred):
    # iou_metric_batch is not defined in this article; it must be provided
    # before the model is compiled.
    # tf.numpy_function replaces tf.py_func, which is not available in TensorFlow 2.
    metric_value = tf.numpy_function(iou_metric_batch, [label, pred], tf.float32)
    return metric_value

# Build U-Net model
# Note we keep our layers in variables so that we can concatenate or stack them
# This is required so that we can re-create our U-Net model
# This implementation uses 16 to 256 filters, ELU activations and dropout,
# a lighter variant of the architecture described above
inputs = Input((IMG_HEIGHT, IMG_WIDTH, IMG_CHANNELS))
s = Lambda(lambda x: x / 255) (inputs)

c1 = Conv2D(16, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (s)
c1 = Dropout(0.1) (c1)
c1 = Conv2D(16, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c1)
p1 = MaxPooling2D((2, 2)) (c1)

c2 = Conv2D(32, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (p1)
c2 = Dropout(0.1) (c2)
c2 = Conv2D(32, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c2)
p2 = MaxPooling2D((2, 2)) (c2)

c3 = Conv2D(64, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (p2)
c3 = Dropout(0.2) (c3)
c3 = Conv2D(64, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c3)
p3 = MaxPooling2D((2, 2)) (c3)

c4 = Conv2D(128, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (p3)
c4 = Dropout(0.2) (c4)
c4 = Conv2D(128, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c4)
p4 = MaxPooling2D(pool_size=(2, 2)) (c4)

# Bottleneck
c5 = Conv2D(256, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (p4)
c5 = Dropout(0.3) (c5)
c5 = Conv2D(256, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c5)

# Expanding path: upsample, concatenate with the matching contracting-path output, convolve
u6 = Conv2DTranspose(128, (2, 2), strides=(2, 2), padding='same') (c5)
u6 = concatenate([u6, c4])
c6 = Conv2D(128, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (u6)
c6 = Dropout(0.2) (c6)
c6 = Conv2D(128, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c6)

u7 = Conv2DTranspose(64, (2, 2), strides=(2, 2), padding='same') (c6)
u7 = concatenate([u7, c3])
c7 = Conv2D(64, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (u7)
c7 = Dropout(0.2) (c7)
c7 = Conv2D(64, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c7)

u8 = Conv2DTranspose(32, (2, 2), strides=(2, 2), padding='same') (c7)
u8 = concatenate([u8, c2])
c8 = Conv2D(32, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (u8)
c8 = Dropout(0.1) (c8)
c8 = Conv2D(32, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c8)

u9 = Conv2DTranspose(16, (2, 2), strides=(2, 2), padding='same') (c8)
u9 = concatenate([u9, c1], axis=3)
c9 = Conv2D(16, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (u9)
c9 = Dropout(0.1) (c9)
c9 = Conv2D(16, (3, 3), activation='elu', kernel_initializer='he_normal', padding='same') (c9)

# Note our output is effectively a mask of 128 x 128
outputs = Conv2D(1, (1, 1), activation='sigmoid') (c9)

model = Model(inputs=[inputs], outputs=[outputs])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=[my_iou_metric])
model.summary()
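The metric wrapper above calls `iou_metric_batch`, which this article does not define. The sketch below is not that function; it only illustrates the basic quantity such a metric is built on, intersection over union (IoU) between a ground-truth mask and a predicted mask. The helper name `iou` and the toy masks are assumptions made for this example, and how the article's helper aggregates scores over a batch is not shown here.

```python
def iou(true_mask, pred_mask):
    """Intersection over union of two binary masks given as nested lists of 0/1."""
    intersection = 0
    union = 0
    for true_row, pred_row in zip(true_mask, pred_mask):
        for t, p in zip(true_row, pred_row):
            if t and p:
                intersection += 1
            if t or p:
                union += 1
    # Two empty masks agree perfectly, so treat 0/0 as 1.0
    return intersection / union if union else 1.0

truth = [[1, 1, 0],
         [0, 1, 0]]
pred = [[1, 0, 0],
        [0, 1, 1]]

# 2 pixels are foreground in both masks, 4 are foreground in at least one
score = iou(truth, pred)
print(score)
```

A score of 1.0 means the predicted mask matches the ground truth exactly, and a score of 0.0 means the two masks share no foreground pixel.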
Train our model
model_path = "./nuclei_finder_unet_1.h5"
# Save the model only when the validation loss improves
checkpoint = ModelCheckpoint(model_path,
                             monitor="val_loss",
                             mode="min",
                             save_best_only=True,
                             verbose=1)
# Stop training once the validation loss has not improved for 5 epochs
earlystop = EarlyStopping(monitor='val_loss',
                          min_delta=0,
                          patience=5,
                          verbose=1,
                          restore_best_weights=True)
# Fit our model, holding out 10% of the training data for validation
results = model.fit(X_train, Y_train, validation_split=0.1,
                    batch_size=16, epochs=10,
                    callbacks=[earlystop, checkpoint])
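To see what `patience=5` and `min_delta=0` mean in practice, here is a simplified model of the early-stopping rule in plain Python. It is a sketch of the idea, not Keras's implementation, and the list of validation losses is invented for the example.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 0-based, for per-epoch validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    wait = 0  # number of consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Training ran to the end without triggering the stop
    return len(val_losses) - 1, best_epoch

# Invented losses: the best value is reached at epoch 2, then no further improvement
losses = [0.40, 0.30, 0.25, 0.26, 0.27, 0.26, 0.28, 0.29, 0.30, 0.31]
stop_epoch, best_epoch = early_stopping_epoch(losses)
print("stopped after epoch", stop_epoch, "best epoch", best_epoch)
```

With `restore_best_weights=True`, the model ends up with the weights from the best epoch rather than from the epoch at which training stopped.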
Generate predictions for the validation data
# Predict on training and validation data
# Note that our custom my_iou_metric must be passed in when loading the model
model = load_model('./nuclei_finder_unet_1.h5',
                   custom_objects={'my_iou_metric': my_iou_metric})
# the first 90% was used for training
preds_train = model.predict(X_train[:int(X_train.shape[0]*0.9)], verbose=1)
# the last 10% used as validation
preds_val = model.predict(X_train[int(X_train.shape[0]*0.9):], verbose=1)
#preds_test = model.predict(X_test, verbose=1)
# Threshold predictions
preds_train_t = (preds_train > 0.5).astype(np.uint8)
preds_val_t = (preds_val > 0.5).astype(np.uint8)
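The slicing above mirrors how `validation_split=0.1` works: Keras holds out the last fraction of the samples, so the first 90% were used for training and the final 10% for validation. The sketch below reproduces that index arithmetic and the 0.5 thresholding in plain Python; the dataset size of 20 and both helper names are hypothetical.

```python
def split_index(num_samples, train_fraction=0.9):
    """Index separating the leading training portion from the trailing validation portion."""
    return int(num_samples * train_fraction)

def threshold(probabilities, cutoff=0.5):
    """Turn per-pixel probabilities into a 0/1 mask (strictly greater than the cutoff)."""
    return [[1 if p > cutoff else 0 for p in row] for row in probabilities]

num_samples = 20  # hypothetical dataset size
cut = split_index(num_samples)
sample_ids = list(range(num_samples))
train_part = sample_ids[:cut]  # samples the model was fitted on
val_part = sample_ids[cut:]    # samples held out for validation

# A toy 2x2 sigmoid output
binary = threshold([[0.1, 0.7],
                    [0.5, 0.93]])
print(len(train_part), len(val_part), binary)
```

Note that a probability of exactly 0.5 is mapped to background, because the comparison is strict.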
Show our predicted masks on the training data
# Pick a random index within the training predictions instead of hard-coding the upper bound
ix = random.randint(0, len(preds_train_t) - 1)
plt.figure(figsize=(20,20))
# Our original training image
plt.subplot(131)
imshow(X_train[ix])
plt.title("Image")
# Our original combined mask
plt.subplot(132)
imshow(np.squeeze(Y_train[ix]))
plt.title("Mask")
# The mask our U-Net model predicts
plt.subplot(133)
imshow(np.squeeze(preds_train_t[ix] > 0.5))
plt.title("Predictions")
plt.show()
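Finally, the dataset and the complete code are available here:
Dataset: https://www.kaggle.com/c/data-science-bowl-2018
Code for this article: https://github.com/bhaveshgoyal27/mediumblogs/blob/master/U-Net.ipynb
Author: Bhavesh Goyal
Translated by the deephub translation team
This article is shared from the WeChat official account DeepHub IMBA (deephub-imba).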