Original link: http://www.one2know.cn/keras3/git
Five pre-trained models in Keras Applications + a brief overview of h5py
- Keras's applications module provides Keras models with pre-trained weights, which can be used for prediction, feature extraction, and fine-tuning. The parameters of the following models are introduced below: Xception, VGG16, VGG19, ResNet50, InceptionV3. All of these models (except Xception) are compatible with both Theano and TensorFlow, and are built automatically according to the image dimension ordering set in ~/.keras/keras.json. For example, if you set data_format="channels_last", the loaded model is constructed with TensorFlow's dimension ordering, i.e. (height, width, channels). Official download location for the models: https://github.com/fchollet/deep-learning-models/releases
- Differences between th and tf: Keras provides two backends, Theano and TensorFlow. Most of their functionality is wrapped uniformly by the backend module, but there are still significant differences, and sometimes you need to pay attention to which backend Keras is running on. The main conflict is dim_ordering, i.e. the dimension ordering. For a 224x224 color image, Theano's ordering is (3, 224, 224), with the channel dimension first, while TensorFlow's ordering is (224, 224, 3), with the channel dimension last. In terms of data format names, "channels_last" corresponds to the old "tf" and "channels_first" to the old "th". For a 128x128 RGB image, "channels_first" organizes the data as (3, 128, 128), while "channels_last" organizes it as (128, 128, 3).
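A quick way to check which convention your installation uses is shown in this minimal sketch (the image file name is only a placeholder):
from keras import backend as K
from keras.preprocessing import image
# Reads the data_format configured in ~/.keras/keras.json
print(K.image_data_format())   # 'channels_last' or 'channels_first'
img = image.load_img('example.jpg', target_size=(128, 128))   # placeholder image path
x = image.img_to_array(img)    # img_to_array follows K.image_data_format() by default
print(x.shape)                 # (128, 128, 3) under channels_last, (3, 128, 128) under channels_first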
- notop models: versions without the final 3 fully connected layers, released specifically for fine-tuning; you load the convolutional base and add your own classifier on top (a fine-tuning sketch appears later in this section).
- A brief look at h5py: Keras's pre-trained models are stored in HDF5 format, with the .h5 extension. An h5py.File behaves like a Python dict, so we can inspect all of its keys. Input:
import h5py
f = h5py.File('.../notop.h5', 'r')   # path to a downloaded notop weights file
View the keys:
f.attrs['nb_layers']   # attribute stored in older Keras weight files: number of layers
f.keys()               # the File object works like a dict, so we can list its keys
See what each layer group contains:
for name in f:
    print(name)
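To go one level deeper and see the actual weight datasets inside each layer group, a minimal sketch along the same lines (the file path is the same placeholder as above):
import h5py
with h5py.File('.../notop.h5', 'r') as f:   # placeholder path to a notop weights file
    # visititems walks every group and dataset in the file recursively
    def show(name, obj):
        if isinstance(obj, h5py.Dataset):
            print(name, obj.shape, obj.dtype)
    f.visititems(show)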
- Official example: ImageNet classification with ResNet50, identifying the species of an elephant:
from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from keras.applications.resnet50 import preprocess_input, decode_predictions
import numpy as np

model = ResNet50(weights=r'..\Model\resnet50_weights_tf_dim_ordering_tf_kernels.h5')

img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))  # the model expects input shape (224, 224, 3)
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

preds = model.predict(x)
print('Predicted:', decode_predictions(preds, top=3)[0])
Output:
Predicted: [('n02504458', 'African_elephant', 0.603124), ('n02504013', 'Indian_elephant', 0.334439), ('n01871265', 'tusker', 0.062180385)]
- The five models
1. Xception: available only with the TensorFlow backend; currently this model only supports the channels_last dimension ordering (height, width, channels). Default input image size is 299x299.
keras.applications.xception.Xception(include_top=True,weights='imagenet',input_tensor=None, input_shape=None,pooling=None, classes=1000)
2. VGG16: available with both the Theano and TensorFlow backends, and accepts both channels_first and channels_last input dimension orderings. Default input image size is 224x224.
keras.applications.vgg16.VGG16(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
3. VGG19: available with both the Theano and TensorFlow backends, and accepts both channels_first and channels_last input dimension orderings. Default input image size is 224x224.
keras.applications.vgg19.VGG19(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
4. ResNet50: available with both the Theano and TensorFlow backends, and accepts both channels_first and channels_last input dimension orderings. Default input image size is 224x224.
keras.applications.resnet50.ResNet50(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
5. InceptionV3: available with both the Theano and TensorFlow backends, and accepts both channels_first and channels_last input dimension orderings. Default input image size is 299x299.
keras.applications.inception_v3.InceptionV3(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
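All five constructors share these parameters. For instance, include_top=False combined with pooling='avg' turns a model into a feature extractor; a minimal sketch (downloads the notop ImageNet weights on first use; the image path is a placeholder reused from the example above):
from keras.applications.vgg16 import VGG16, preprocess_input
from keras.preprocessing import image
import numpy as np

# include_top=False drops the 3 fully connected layers; pooling='avg' adds a
# global average pooling layer so the output is a single feature vector.
model = VGG16(include_top=False, weights='imagenet', pooling='avg')

img = image.load_img('elephant.jpg', target_size=(224, 224))
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
features = model.predict(x)
print(features.shape)   # (1, 512) for VGG16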
A walkthrough of the keras-applications VGG16 source:
- VGG16's default input data format is channels_last.
from __future__ import print_function
import numpy as np
import warnings
from keras.models import Model
from keras.layers import Flatten, Dense, Input, Conv2D
from keras.layers import MaxPooling2D, GlobalMaxPooling2D, GlobalAveragePooling2D
from keras.preprocessing import image
from keras.utils import layer_utils
from keras.utils.data_utils import get_file
from keras import backend as K
from keras.applications.imagenet_utils import decode_predictions
# decode_predictions(y_pred) returns the 5 highest-probability classes: (class name, semantic concept, probability)
from keras.applications.imagenet_utils import preprocess_input
# preprocess_input(x) preprocesses the image so its encoding follows the expected convention (RGB, BGR and the like)
from keras_applications.imagenet_utils import _obtain_input_shape
# determines an appropriate input shape, roughly like imread in OpenCV turning an image into an array
from keras.engine.topology import get_source_inputs

WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5'
WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'

def VGG16(include_top=True, weights='imagenet',
          input_tensor=None, input_shape=None,
          pooling=None, classes=1000):
    # Check that the weights and classes settings are consistent
    if weights not in {'imagenet', None}:
        raise ValueError('The `weights` argument should be either '
                         '`None` (random initialization) or `imagenet` '
                         '(pre-training on ImageNet).')
    if weights == 'imagenet' and include_top and classes != 1000:
        raise ValueError('If using `weights` as imagenet with `include_top`'
                         ' as true, `classes` should be 1000')

    # Set the image size, similar to a transform in caffe
    # Determine proper input shape
    input_shape = _obtain_input_shape(input_shape,
                                      default_size=224,
                                      min_size=48,                          # smallest height/width the model accepts
                                      data_format=K.image_data_format(),    # data format in use
                                      require_flatten=include_top)          # whether a Flatten layer connects to the classifier

    # Simple handling of the input
    if input_tensor is None:
        img_input = Input(shape=input_shape)  # Input here is the Keras layer, usable for conversion
    else:
        if not K.is_keras_tensor(input_tensor):
            img_input = Input(tensor=input_tensor, shape=input_shape)
        else:
            img_input = input_tensor
        # If the input is a tensor, two steps are needed:
        # first check whether it is a Keras tensor with is_keras_tensor,
        # then call get_source_inputs(input_tensor)

    # Define the network structure (the "prototxt" part)
    # Block 1
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)

    # Block 2
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)

    # Block 3
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)

    # Block 4
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)

    # Block 5
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)

    if include_top:
        # Classification block
        x = Flatten(name='flatten')(x)
        x = Dense(4096, activation='relu', name='fc1')(x)
        x = Dense(4096, activation='relu', name='fc2')(x)
        x = Dense(classes, activation='softmax', name='predictions')(x)
    else:
        if pooling == 'avg':
            x = GlobalAveragePooling2D()(x)
        elif pooling == 'max':
            x = GlobalMaxPooling2D()(x)

    # Ensure that the model takes into account
    # any potential predecessors of `input_tensor`.
    if input_tensor is not None:
        inputs = get_source_inputs(input_tensor)
        # get_source_inputs returns the list of input tensors needed for the computation.
        # If the input is a tensor, two steps are needed:
        # first check whether it is a Keras tensor with is_keras_tensor,
        # then call get_source_inputs(input_tensor)
    else:
        inputs = img_input

    # Create model.
    model = Model(inputs, x, name='vgg16')

    # Load weights
    if weights == 'imagenet':
        if include_top:
            weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels.h5',
                                    WEIGHTS_PATH,
                                    cache_subdir='models')
        else:
            weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5',
                                    WEIGHTS_PATH_NO_TOP,
                                    cache_subdir='models')
        model.load_weights(weights_path)
        if K.backend() == 'theano':
            layer_utils.convert_all_kernels_in_model(model)

        if K.image_data_format() == 'channels_first':
            if include_top:
                maxpool = model.get_layer(name='block5_pool')
                shape = maxpool.output_shape[1:]
                dense = model.get_layer(name='fc1')
                layer_utils.convert_dense_weights_data_format(dense, shape, 'channels_first')

            if K.backend() == 'tensorflow':
                warnings.warn('You are using the TensorFlow backend, yet you '
                              'are using the Theano '
                              'image data format convention '
                              '(`image_data_format="channels_first"`). '
                              'For best performance, set '
                              '`image_data_format="channels_last"` in '
                              'your Keras config '
                              'at ~/.keras/keras.json.')
    return model

if __name__ == '__main__':
    model = VGG16(include_top=True, weights='imagenet')

    img_path = 'elephant.jpg'
    img = image.load_img(img_path, target_size=(224, 224))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)
    print('Input image shape:', x.shape)

    preds = model.predict(x)
    print('Predicted:', decode_predictions(preds))
    # decode_predictions returns the 5 highest-probability classes: (class name, semantic concept, probability)
Output:
Input image shape: (1, 224, 224, 3)
Predicted: [[('n02504458', 'African_elephant', 0.62728244), ('n02504013', 'Indian_elephant', 0.19092941), ('n01871265', 'tusker', 0.18166111), ('n02437312', 'Arabian_camel', 4.5080957e-05), ('n07802026', 'hay', 1.7709652e-05)]]
- Using a locally downloaded model: modify the download code. Comment out the following two lines:
WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5'
WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'
Modify the following two lines:
weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels.h5', WEIGHTS_PATH, cache_subdir='models')
weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5',WEIGHTS_PATH_NO_TOP,cache_subdir='models')
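One possible modification (a sketch; the local paths are placeholders for wherever you saved the .h5 files) is to bypass get_file entirely and point load_weights at the local copies:
# Instead of downloading via get_file, use the locally saved weight files.
# The paths below are placeholders for your own download location.
if include_top:
    weights_path = r'..\Model\vgg16_weights_tf_dim_ordering_tf_kernels.h5'
else:
    weights_path = r'..\Model\vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'
model.load_weights(weights_path)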
- A few new utilities used in these layers
from keras.applications.imagenet_utils import decode_predictions
decode_predictions(y_pred) returns the 5 highest-probability classes: (class name, semantic concept, probability).
from keras.applications.imagenet_utils import preprocess_input
Preprocessing: makes the image encoding follow the expected convention (RGB, BGR and the like): preprocess_input(x).
from keras.applications.imagenet_utils import _obtain_input_shape
Determines an appropriate input shape, roughly like imread in OpenCV turning an image into an array.
(1) decode_predictions is used on the final output and is quite handy: print('Predicted:', decode_predictions(preds));
(2) preprocess_input changes the encoding: preprocess_input(x);
(3) _obtain_input_shape is similar to a transform in caffe: at prediction time the input image needs some preprocessing.
input_shape = _obtain_input_shape(input_shape, default_size=224, min_size=48, data_format=K.image_data_format(), include_top=include_top)
min_size=48: the smallest height/width the model accepts; data_format=K.image_data_format(): the data format in use.
- When include_top=True:
fc_model = VGG16(include_top=True)
notop_model = VGG16(include_top=False)
When fine-tuning with VGG16, notop_model is the model without the fully connected layers, to which you then add your own layers. For the complete network, fc_model adds the following to finish the structure:
x = Flatten(name='flatten')(x)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(classes, activation='softmax', name='predictions')(x)
After the pooling layer comes a Flatten layer to reshape the data, then two Dense layers, and finally a Dense layer with softmax. A fine-tuning sketch along these lines is shown below.
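A minimal fine-tuning sketch, assuming a hypothetical 10-class task (num_classes and the frozen/trainable split are illustrative choices, not part of the original post):
from keras.applications.vgg16 import VGG16
from keras.models import Model
from keras.layers import Flatten, Dense

num_classes = 10   # hypothetical number of classes for the new task

# Convolutional base without the 3 fully connected layers
base = VGG16(include_top=False, weights='imagenet', input_shape=(224, 224, 3))

# Rebuild the classification block on top of block5_pool, as in the source above
x = Flatten(name='flatten')(base.output)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(num_classes, activation='softmax', name='predictions')(x)

model = Model(base.input, x, name='vgg16_finetune')

# Freeze the pre-trained convolutional layers and train only the new top
for layer in base.layers:
    layer.trainable = False
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])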
- Converting from the channels_first to the channels_last format
maxpool = model.get_layer(name='block5_pool')   # model.get_layer() retrieves a layer object by name or index
shape = maxpool.output_shape[1:]                # the output shape of the block5_pool layer
dense = model.get_layer(name='fc1')
layer_utils.convert_dense_weights_data_format(dense, shape, 'channels_first')
convert_dense_weights_data_format
When porting convnet weights from one data format to another, if the convnet contains a Flatten layer (applied to the last convolutional feature map) followed by a Dense layer, the weights of that Dense layer should be updated to reflect the new dimension ordering.
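A rough numpy illustration of why this is needed (the shapes match VGG16's 7x7x512 final feature map and 4096-unit fc1; this is a sketch of the idea, not the library's implementation):
import numpy as np

# Final feature map of VGG16 before Flatten: 512 channels of 7x7; fc1 has 4096 units.
channels, height, width, units = 512, 7, 7, 4096

# Dense kernel whose rows follow a channels_first flatten order: row index = (c, h, w)
kernel_cf = np.random.rand(channels * height * width, units)

# Regroup the rows into channels_last flatten order: row index = (h, w, c)
kernel_cl = (kernel_cf.reshape(channels, height, width, units)
                      .transpose(1, 2, 0, 3)
                      .reshape(height * width * channels, units))

# Same weights, just reordered rows so the Dense layer lines up with the new Flatten order.
assert kernel_cf.shape == kernel_cl.shape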