Original link: http://www.one2know.cn/keras3/git
Five pre-trained models in Keras Applications + a brief overview of h5py
- Keras's applications module provides Keras models with pre-trained weights, which can be used for prediction, feature extraction, and fine-tuning. The parameters of the following models are introduced below: Xception, VGG16, VGG19, ResNet50, InceptionV3. All of these models (except Xception) are compatible with both Theano and TensorFlow, and are built automatically according to the image dimension ordering set in ~/.keras/keras.json. For example, if you set data_format="channels_last", the loaded model is constructed with TensorFlow's dimension ordering, i.e. (height, width, channels). Official download location for the models: https://github.com/fchollet/deep-learning-models/releases
- Differences between th and tf: Keras provides two backends, Theano and TensorFlow. Most of their functionality is wrapped uniformly by the backend module, but there are still significant differences, and sometimes you need to pay attention to which backend Keras is running on. The main conflict is dim_ordering, i.e. the dimension ordering. For a 224x224 color image, Theano's ordering is (3, 224, 224), with the channel dimension first, while TensorFlow's ordering is (224, 224, 3), with the channel dimension last. In terms of data format names, "channels_last" corresponds to the old "tf" and "channels_first" to the old "th". For a 128x128 RGB image, "channels_first" organizes the data as (3, 128, 128), while "channels_last" organizes it as (128, 128, 3).
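A quick way to check which convention your installation uses is shown in this minimal sketch (the image file name is only a placeholder):
from keras import backend as K
from keras.preprocessing import image
# Reads the data_format configured in ~/.keras/keras.json
print(K.image_data_format())   # 'channels_last' or 'channels_first'
img = image.load_img('example.jpg', target_size=(128, 128))   # placeholder image path
x = image.img_to_array(img)    # img_to_array follows K.image_data_format() by default
print(x.shape)                 # (128, 128, 3) under channels_last, (3, 128, 128) under channels_first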
- notop models: versions without the final 3 fully connected layers, released specifically for fine-tuning; you load the convolutional base and add your own classifier on top (a fine-tuning sketch appears later in this section).
- A brief look at h5py: Keras's pre-trained models are stored in HDF5 format, with the .h5 extension. An h5py.File behaves like a Python dict, so we can inspect all of its keys. Input:
import h5py
f = h5py.File('.../notop.h5', 'r')   # path to a downloaded notop weights file
View the keys:
f.attrs['nb_layers']   # attribute stored in older Keras weight files: number of layers
f.keys()               # the File object works like a dict, so we can list its keys
See what each layer group contains:
for name in f:
    print(name)
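To go one level deeper and see the actual weight datasets inside each layer group, a minimal sketch along the same lines (the file path is the same placeholder as above):
import h5py
with h5py.File('.../notop.h5', 'r') as f:   # placeholder path to a notop weights file
    # visititems walks every group and dataset in the file recursively
    def show(name, obj):
        if isinstance(obj, h5py.Dataset):
            print(name, obj.shape, obj.dtype)
    f.visititems(show)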
- Official example: ImageNet classification with ResNet50, identifying the species of an elephant:
from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from keras.applications.resnet50 import preprocess_input, decode_predictions
import numpy as np

model = ResNet50(weights=r'..\Model\resnet50_weights_tf_dim_ordering_tf_kernels.h5')

img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))  # the model expects input shape (224, 224, 3)
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

preds = model.predict(x)
print('Predicted:', decode_predictions(preds, top=3)[0])
Output:
Predicted: [('n02504458', 'African_elephant', 0.603124), ('n02504013', 'Indian_elephant', 0.334439), ('n01871265', 'tusker', 0.062180385)]
- The five models
1. Xception: available only with the TensorFlow backend; currently this model only supports the channels_last dimension ordering (height, width, channels). Default input image size is 299x299.
keras.applications.xception.Xception(include_top=True,weights='imagenet',input_tensor=None, input_shape=None,pooling=None, classes=1000)
2. VGG16: available with both the Theano and TensorFlow backends, and accepts both channels_first and channels_last input dimension orderings. Default input image size is 224x224.
keras.applications.vgg16.VGG16(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
3. VGG19: available with both the Theano and TensorFlow backends, and accepts both channels_first and channels_last input dimension orderings. Default input image size is 224x224.
keras.applications.vgg19.VGG19(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
4. ResNet50: available with both the Theano and TensorFlow backends, and accepts both channels_first and channels_last input dimension orderings. Default input image size is 224x224.
keras.applications.resnet50.ResNet50(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
5. InceptionV3: available with both the Theano and TensorFlow backends, and accepts both channels_first and channels_last input dimension orderings. Default input image size is 299x299.
keras.applications.inception_v3.InceptionV3(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
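All five constructors share these parameters. For instance, include_top=False combined with pooling='avg' turns a model into a feature extractor; a minimal sketch (downloads the notop ImageNet weights on first use; the image path is a placeholder reused from the example above):
from keras.applications.vgg16 import VGG16, preprocess_input
from keras.preprocessing import image
import numpy as np

# include_top=False drops the 3 fully connected layers; pooling='avg' adds a
# global average pooling layer so the output is a single feature vector.
model = VGG16(include_top=False, weights='imagenet', pooling='avg')

img = image.load_img('elephant.jpg', target_size=(224, 224))
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
features = model.predict(x)
print(features.shape)   # (1, 512) for VGG16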
A walkthrough of the keras-applications VGG16 source:
- VGG16's default input data format is channels_last.
from __future__ import print_function
import numpy as np
import warnings
from keras.models import Model
from keras.layers import Flatten, Dense, Input, Conv2D
from keras.layers import MaxPooling2D, GlobalMaxPooling2D, GlobalAveragePooling2D
from keras.preprocessing import image
from keras.utils import layer_utils
from keras.utils.data_utils import get_file
from keras import backend as K
from keras.applications.imagenet_utils import decode_predictions
# decode_predictions(y_pred) returns the 5 highest-probability classes: (class name, semantic concept, probability)
from keras.applications.imagenet_utils import preprocess_input
# preprocess_input(x) preprocesses the image so its encoding follows the expected convention (RGB, BGR and the like)
from keras_applications.imagenet_utils import _obtain_input_shape
# determines an appropriate input shape, roughly like imread in OpenCV turning an image into an array
from keras.engine.topology import get_source_inputs

WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5'
WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'

def VGG16(include_top=True, weights='imagenet',
          input_tensor=None, input_shape=None,
          pooling=None, classes=1000):
    # Check that the weights and classes settings are consistent
    if weights not in {'imagenet', None}:
        raise ValueError('The `weights` argument should be either '
                         '`None` (random initialization) or `imagenet` '
                         '(pre-training on ImageNet).')
    if weights == 'imagenet' and include_top and classes != 1000:
        raise ValueError('If using `weights` as imagenet with `include_top`'
                         ' as true, `classes` should be 1000')

    # Set the image size, similar to a transform in caffe
    # Determine proper input shape
    input_shape = _obtain_input_shape(input_shape,
                                      default_size=224,
                                      min_size=48,                          # smallest height/width the model accepts
                                      data_format=K.image_data_format(),    # data format in use
                                      require_flatten=include_top)          # whether a Flatten layer connects to the classifier

    # Simple handling of the input
    if input_tensor is None:
        img_input = Input(shape=input_shape)  # Input here is the Keras layer, usable for conversion
    else:
        if not K.is_keras_tensor(input_tensor):
            img_input = Input(tensor=input_tensor, shape=input_shape)
        else:
            img_input = input_tensor
        # If the input is a tensor, two steps are needed:
        # first check whether it is a Keras tensor with is_keras_tensor,
        # then call get_source_inputs(input_tensor)

    # Define the network structure (the "prototxt" part)
    # Block 1
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)

    # Block 2
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)

    # Block 3
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)

    # Block 4
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)

    # Block 5
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)

    if include_top:
        # Classification block
        x = Flatten(name='flatten')(x)
        x = Dense(4096, activation='relu', name='fc1')(x)
        x = Dense(4096, activation='relu', name='fc2')(x)
        x = Dense(classes, activation='softmax', name='predictions')(x)
    else:
        if pooling == 'avg':
            x = GlobalAveragePooling2D()(x)
        elif pooling == 'max':
            x = GlobalMaxPooling2D()(x)

    # Ensure that the model takes into account
    # any potential predecessors of `input_tensor`.
    if input_tensor is not None:
        inputs = get_source_inputs(input_tensor)
        # get_source_inputs returns the list of input tensors needed for the computation.
        # If the input is a tensor, two steps are needed:
        # first check whether it is a Keras tensor with is_keras_tensor,
        # then call get_source_inputs(input_tensor)
    else:
        inputs = img_input

    # Create model.
    model = Model(inputs, x, name='vgg16')

    # Load weights
    if weights == 'imagenet':
        if include_top:
            weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels.h5',
                                    WEIGHTS_PATH,
                                    cache_subdir='models')
        else:
            weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5',
                                    WEIGHTS_PATH_NO_TOP,
                                    cache_subdir='models')
        model.load_weights(weights_path)
        if K.backend() == 'theano':
            layer_utils.convert_all_kernels_in_model(model)

        if K.image_data_format() == 'channels_first':
            if include_top:
                maxpool = model.get_layer(name='block5_pool')
                shape = maxpool.output_shape[1:]
                dense = model.get_layer(name='fc1')
                layer_utils.convert_dense_weights_data_format(dense, shape, 'channels_first')

            if K.backend() == 'tensorflow':
                warnings.warn('You are using the TensorFlow backend, yet you '
                              'are using the Theano '
                              'image data format convention '
                              '(`image_data_format="channels_first"`). '
                              'For best performance, set '
                              '`image_data_format="channels_last"` in '
                              'your Keras config '
                              'at ~/.keras/keras.json.')
    return model

if __name__ == '__main__':
    model = VGG16(include_top=True, weights='imagenet')

    img_path = 'elephant.jpg'
    img = image.load_img(img_path, target_size=(224, 224))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)
    print('Input image shape:', x.shape)

    preds = model.predict(x)
    print('Predicted:', decode_predictions(preds))
    # decode_predictions returns the 5 highest-probability classes: (class name, semantic concept, probability)
Output:
Input image shape: (1, 224, 224, 3)
Predicted: [[('n02504458', 'African_elephant', 0.62728244), ('n02504013', 'Indian_elephant', 0.19092941), ('n01871265', 'tusker', 0.18166111), ('n02437312', 'Arabian_camel', 4.5080957e-05), ('n07802026', 'hay', 1.7709652e-05)]]
- Using a locally downloaded model: modify the download code. Comment out the following two lines:
WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5'
WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'
Modify the following two lines:
weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels.h5', WEIGHTS_PATH, cache_subdir='models')
weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5',WEIGHTS_PATH_NO_TOP,cache_subdir='models')
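One possible modification (a sketch; the local paths are placeholders for wherever you saved the .h5 files) is to bypass get_file entirely and point load_weights at the local copies:
# Instead of downloading via get_file, use the locally saved weight files.
# The paths below are placeholders for your own download location.
if include_top:
    weights_path = r'..\Model\vgg16_weights_tf_dim_ordering_tf_kernels.h5'
else:
    weights_path = r'..\Model\vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'
model.load_weights(weights_path)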
- A few new utilities used in these layers
from keras.applications.imagenet_utils import decode_predictions
decode_predictions(y_pred) returns the 5 highest-probability classes: (class name, semantic concept, probability).
from keras.applications.imagenet_utils import preprocess_input
Preprocessing: makes the image encoding follow the expected convention (RGB, BGR and the like): preprocess_input(x).
from keras.applications.imagenet_utils import _obtain_input_shape
Determines an appropriate input shape, roughly like imread in OpenCV turning an image into an array.
(1) decode_predictions is used on the final output and is quite handy: print('Predicted:', decode_predictions(preds));
(2) preprocess_input changes the encoding: preprocess_input(x);
(3) _obtain_input_shape is similar to a transform in caffe: at prediction time the input image needs some preprocessing.
input_shape = _obtain_input_shape(input_shape, default_size=224, min_size=48, data_format=K.image_data_format(), include_top=include_top)
min_size=48: the smallest height/width the model accepts; data_format=K.image_data_format(): the data format in use.
- When include_top=True:
fc_model = VGG16(include_top=True)
notop_model = VGG16(include_top=False)
When fine-tuning with VGG16, notop_model is the model without the fully connected layers, to which you then add your own layers. For the complete network, fc_model adds the following to finish the structure:
x = Flatten(name='flatten')(x)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(classes, activation='softmax', name='predictions')(x)
After the pooling layer comes a Flatten layer to reshape the data, then two Dense layers, and finally a Dense layer with softmax. A fine-tuning sketch along these lines is shown below.
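A minimal fine-tuning sketch, assuming a hypothetical 10-class task (num_classes and the frozen/trainable split are illustrative choices, not part of the original post):
from keras.applications.vgg16 import VGG16
from keras.models import Model
from keras.layers import Flatten, Dense

num_classes = 10   # hypothetical number of classes for the new task

# Convolutional base without the 3 fully connected layers
base = VGG16(include_top=False, weights='imagenet', input_shape=(224, 224, 3))

# Rebuild the classification block on top of block5_pool, as in the source above
x = Flatten(name='flatten')(base.output)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(num_classes, activation='softmax', name='predictions')(x)

model = Model(base.input, x, name='vgg16_finetune')

# Freeze the pre-trained convolutional layers and train only the new top
for layer in base.layers:
    layer.trainable = False
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])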
- Converting from the channels_first to the channels_last format
maxpool = model.get_layer(name='block5_pool')   # model.get_layer() retrieves a layer object by name or index
shape = maxpool.output_shape[1:]                # the output shape of the block5_pool layer
dense = model.get_layer(name='fc1')
layer_utils.convert_dense_weights_data_format(dense, shape, 'channels_first')
convert_dense_weights_data_format
When porting convnet weights from one data format to another, if the convnet contains a Flatten layer (applied to the last convolutional feature map) followed by a Dense layer, the weights of that Dense layer should be updated to reflect the new dimension ordering.
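A rough numpy illustration of why this is needed (the shapes match VGG16's 7x7x512 final feature map and 4096-unit fc1; this is a sketch of the idea, not the library's implementation):
import numpy as np

# Final feature map of VGG16 before Flatten: 512 channels of 7x7; fc1 has 4096 units.
channels, height, width, units = 512, 7, 7, 4096

# Dense kernel whose rows follow a channels_first flatten order: row index = (c, h, w)
kernel_cf = np.random.rand(channels * height * width, units)

# Regroup the rows into channels_last flatten order: row index = (h, w, c)
kernel_cl = (kernel_cf.reshape(channels, height, width, units)
                      .transpose(1, 2, 0, 3)
                      .reshape(height * width * channels, units))

# Same weights, just reordered rows so the Dense layer lines up with the new Flatten order.
assert kernel_cf.shape == kernel_cl.shape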