TFLearn
Install TFLearn with pip3:
pip3 install tflearn --user
Installing collected packages: tflearn
Successfully installed tflearn-0.3.2
In this example, we will predict the survival probability of passengers on the Titanic.
In the dataset, the information recorded for each passenger is as follows:
VARIABLE DESCRIPTIONS:
survived        Survived (0 = No; 1 = Yes)
pclass          Passenger Class (1 = 1st; 2 = 2nd; 3 = 3rd)
name            Name
sex             Sex
age             Age
sibsp           Number of Siblings/Spouses Aboard
parch           Number of Parents/Children Aboard
ticket          Ticket Number
fare            Passenger Fare
There are nine fields in total. We split them into a label and input data: the label is whether the passenger survived (1 = survived, 0 = did not), leaving eight input fields. Of these, we judge that the name and the ticket number (whose information is already captured by the fare) are of no use for predicting a passenger's survival probability, so we discard them during preprocessing.
The dataset is stored in CSV format. CSV, short for Comma-Separated Values, stores tabular data as plain text, so the file can be opened directly with a text editor or with Excel. First, load the data into memory.
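For instance, here is a quick way to peek at the first few rows; this is a minimal sketch using only the standard library, and it assumes the dataset has already been downloaded as titanic_dataset.csv (as the full script below does):

import csv

# Print the first three rows of the CSV to inspect its structure.
with open('titanic_dataset.csv') as f:
    reader = csv.reader(f)
    for row in list(reader)[:3]:
        print(row)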
Use the load_csv() function to read the data from the CSV file into a Python list. Its target_column argument specifies the column id of our label, and the function returns a tuple (data, labels). Then, as described above, we discard the name and ticket-number fields from the input and convert the sex field to a number: 0 for male, 1 for female.
TFLearn computes with tensors, so each net here is a Tensor, exactly as in TensorFlow. This also means we can write any part of the network ourselves with raw TensorFlow functions, implementing features that the TFLearn library does not provide. The fully connected layer's W (weights_init) and b (bias_init) can be specified explicitly; the defaults are W: 'truncated_normal' and b: 'zeros'. Its activation parameter defaults to 'linear'.
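For example, a minimal sketch (assuming TFLearn 0.3.x) of overriding those defaults on a single layer:

import tflearn

# Override the defaults noted above: weights_init='truncated_normal',
# bias_init='zeros', activation='linear'.
net = tflearn.input_data(shape=[None, 6])
net = tflearn.fully_connected(net, 32,
                              activation='relu',
                              weights_init='xavier',
                              bias_init='zeros')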
tflearn.DNN is a model wrapper provided by TFLearn: it bundles a lot of functionality together. We hand it a net structure and get back a model object, then call that object's training, prediction, saving, and other methods. The DNN class has three attributes (member variables): trainer, predictor, and session. In fit(), n_epoch=10 means the whole training set is used 10 times, and batch_size=16 means each parameter update is computed from 16 samples.
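As a minimal sketch of the wrapper's interface (TFLearn 0.3.x assumed; the file name is hypothetical):

import tflearn

net = tflearn.input_data(shape=[None, 6])
net = tflearn.fully_connected(net, 2, activation='softmax')
net = tflearn.regression(net)

model = tflearn.DNN(net)        # wraps trainer, predictor and session
model.save('titanic.tflearn')   # persist the (here still untrained) weights
model.load('titanic.tflearn')   # restore them into the session later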
Finally, we use the trained model to make predictions:
import numpy as np
import tflearn

# Download the Titanic dataset
from tflearn.datasets import titanic
titanic.download_dataset('titanic_dataset.csv')

# Load CSV file, indicate that the first column represents labels
from tflearn.data_utils import load_csv
data, labels = load_csv('titanic_dataset.csv', target_column=0,
                        categorical_labels=True, n_classes=2)

# Preprocessing function
def preprocess(data, columns_to_ignore):
    # Sort by descending id and delete columns
    for id in sorted(columns_to_ignore, reverse=True):
        [r.pop(id) for r in data]
    for i in range(len(data)):
        # Converting 'sex' field to float (id is 1 after removing labels column)
        data[i][1] = 1. if data[i][1] == 'female' else 0.
    return np.array(data, dtype=np.float32)

# Ignore 'name' and 'ticket' columns (id 1 & 6 of data array)
to_ignore = [1, 6]

# Preprocess data
data = preprocess(data, to_ignore)

# Build neural network
net = tflearn.input_data(shape=[None, 6])
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 32)
net = tflearn.fully_connected(net, 2, activation='softmax')
net = tflearn.regression(net)

# Define model
model = tflearn.DNN(net)
# Start training (apply gradient descent algorithm)
model.fit(data, labels, n_epoch=10, batch_size=16, show_metric=True)

# Let's create some data for DiCaprio and Winslet
dicaprio = [3, 'Jack Dawson', 'male', 19, 0, 0, 'N/A', 5.0000]
winslet = [1, 'Rose DeWitt Bukater', 'female', 17, 1, 2, 'N/A', 100.0000]
# Preprocess data
dicaprio, winslet = preprocess([dicaprio, winslet], to_ignore)
# Predict surviving chances (class 1 results)
pred = model.predict([dicaprio, winslet])
print("DiCaprio Surviving Rate:", pred[0][1])
print("Winslet Surviving Rate:", pred[1][1])
Training samples: 1309
Validation samples: 0
--
successfully opened CUDA library libcublas.so.10.0 locally
Training Step: 82  | total loss: 0.65318 | time: 3.584s
| Adam | epoch: 001 | loss: 0.65318 - acc: 0.6781 -- iter: 1309/1309
--
Training Step: 164  | total loss: 0.63713 | time: 1.298s
| Adam | epoch: 002 | loss: 0.63713 - acc: 0.6687 -- iter: 1309/1309
--
Training Step: 246  | total loss: 0.55357 | time: 1.354s
| Adam | epoch: 003 | loss: 0.55357 - acc: 0.7219 -- iter: 1309/1309
--
Training Step: 328  | total loss: 0.56566 | time: 1.312s
| Adam | epoch: 004 | loss: 0.56566 - acc: 0.7091 -- iter: 1309/1309
--
Training Step: 410  | total loss: 0.48417 | time: 1.311s
| Adam | epoch: 005 | loss: 0.48417 - acc: 0.7854 -- iter: 1309/1309
--
Training Step: 492  | total loss: 0.56114 | time: 1.300s
| Adam | epoch: 006 | loss: 0.56114 - acc: 0.7463 -- iter: 1309/1309
--
Training Step: 574  | total loss: 0.51057 | time: 1.289s
| Adam | epoch: 007 | loss: 0.51057 - acc: 0.7988 -- iter: 1309/1309
--
Training Step: 656  | total loss: 0.56562 | time: 1.312s
| Adam | epoch: 008 | loss: 0.56562 - acc: 0.7551 -- iter: 1309/1309
--
Training Step: 738  | total loss: 0.52883 | time: 1.324s
| Adam | epoch: 009 | loss: 0.52883 - acc: 0.7654 -- iter: 1309/1309
--
Training Step: 820  | total loss: 0.50510 | time: 1.340s
| Adam | epoch: 010 | loss: 0.50510 - acc: 0.7687 -- iter: 1309/1309
--
DiCaprio Surviving Rate: 0.17452878
Winslet Surviving Rate: 0.938663
Our model finishes training with an overall accuracy of 76.87%, which means it predicts the correct outcome (survived or not) for about 77% of the passengers.
DiCaprio plays the male lead and Winslet the female lead, and the predictions for the two look fairly accurate.
Keras

Mastering Keras can greatly improve development efficiency and your understanding of network architectures. Its advantages are described below.
pip3 install keras --user
Successfully installed keras-2.2.4
After installation, start python3 and check the result: if importing keras prints "Using TensorFlow backend" below the import, Keras is installed and is using TensorFlow as its backend.
import keras
Using TensorFlow backend.
ModuleNotFoundError: No module named 'numpy.core._multiarray_umath'
ImportError: numpy.core.multiarray failed to import
There is a small problem here: the numpy package needs to be upgraded.
pip3 install -i https://pypi.tuna.tsinghua.edu.cn/simple --upgrade numpy --user
Successfully installed numpy-1.16.3
After that, Keras imports successfully:
import keras
Using TensorFlow backend.
The core data structure of Keras is the model, a way to organize network layers. There are two kinds of models: the Sequential model and the Model (functional) model. A Sequential model is a stack of network layers arranged in order; it has a single input and a single output, with connections only between adjacent layers, making it the simplest kind of model.
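As a minimal sketch (assuming Keras 2.2.x), a small Sequential model is built by stacking layers in order:

from keras.models import Sequential
from keras.layers import Dense

# A stack of layers: single input, single output, adjacent connections only.
model = Sequential()
model.add(Dense(32, activation='relu', input_shape=(8,)))
model.add(Dense(2, activation='softmax'))
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.summary()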
Keras is a high-level neural network API written in Python that can run on top of TensorFlow, CNTK, or Theano. It was developed with a focus on enabling fast experimentation: being able to go from idea to result with the least possible delay is key to doing good research.
Use Keras if you need a deep learning library with the following qualities:
Keras is compatible with Python 2.7–3.6.
User friendliness. Keras is an API designed for human beings, not machines. It puts user experience front and center. Keras follows best practices for reducing cognitive load: it offers consistent and simple APIs, minimizes the number of user actions required for common use cases, and provides clear, actionable feedback upon user error.
Modularity. A model is understood as a sequence or a graph of standalone, fully configurable modules that can be plugged together with as few restrictions as possible. In particular, neural layers, loss functions, optimizers, initialization schemes, activation functions, and regularization schemes are all standalone modules that can be combined to build new models.
Easy extensibility. New modules are simple to add (as new classes and functions), and existing modules provide ample examples. Being able to easily create new modules that increase expressiveness makes Keras well suited for advanced research.
Based on Python. Keras has no separate configuration files in a custom format. Models are defined in Python code, which is compact, easy to debug, and easy to extend.
In the examples directory of the Keras repository, you can find example models for real datasets.
Datasets downloaded by Keras are stored in the following directory.
On Windows: C:\Users\user_name\.keras\datasets
On Linux it is ~/.keras/datasets, where the home directory is /home/user_name for a regular user and /root for root.
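For example (a minimal sketch): loading CIFAR-10 through keras.datasets downloads the archive once and caches it in that directory, so later calls read from disk.

from keras.datasets import cifar10

# First call downloads to ~/.keras/datasets; later calls use the cache.
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print(x_train.shape)  # (50000, 32, 32, 3)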
Pre-trained models downloaded by Keras are stored in the following directories (for root and for a regular user, respectively):
/root/.keras/models
/home/user_name/.keras/models
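For instance, a minimal sketch: instantiating a model from keras.applications with weights='imagenet' downloads the weight file into that directory on first use.

from keras.applications import ResNet50

# First call downloads the weights to ~/.keras/models and caches them.
model = ResNet50(weights='imagenet')
model.summary()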
# https://github.com/keras-team/keras/tree/master/examples/cifar10_cnn.py
"""
#Trains a ResNet on the CIFAR10 dataset.

ResNet v1:
[Deep Residual Learning for Image Recognition](https://arxiv.org/pdf/1512.03385.pdf)

ResNet v2:
[Identity Mappings in Deep Residual Networks](https://arxiv.org/pdf/1603.05027.pdf)

Model|n|200-epoch accuracy|Original paper accuracy |sec/epoch GTX1080Ti
:------------|--:|-------:|-----------------------:|---:
ResNet20 v1| 3| 92.16 %| 91.25 %|35
ResNet32 v1| 5| 92.46 %| 92.49 %|50
ResNet44 v1| 7| 92.50 %| 92.83 %|70
ResNet56 v1| 9| 92.71 %| 93.03 %|90
ResNet110 v1| 18| 92.65 %| 93.39+-.16 %|165
ResNet164 v1| 27| - %| 94.07 %| -
ResNet1001 v1|N/A| - %| 92.39 %| -

Model|n|200-epoch accuracy|Original paper accuracy |sec/epoch GTX1080Ti
:------------|--:|-------:|-----------------------:|---:
ResNet20 v2| 2| - %| - %|---
ResNet32 v2|N/A| NA %| NA %| NA
ResNet44 v2|N/A| NA %| NA %| NA
ResNet56 v2| 6| 93.01 %| NA %|100
ResNet110 v2| 12| 93.15 %| 93.63 %|180
ResNet164 v2| 18| - %| 94.54 %| -
ResNet1001 v2|111| - %| 95.08+-.14 %| -
"""

# %matplotlib inline
# %config InlineBackend.figure_format = 'svg'

# measure elapsed wall-clock time
import timeit
start = timeit.default_timer()

import keras
from keras.layers import Dense, Conv2D, BatchNormalization, Activation
from keras.layers import AveragePooling2D, Input, Flatten
from keras.optimizers import Adam
from keras.callbacks import ModelCheckpoint, LearningRateScheduler
from keras.callbacks import ReduceLROnPlateau
from keras.preprocessing.image import ImageDataGenerator
from keras.regularizers import l2
from keras import backend as K
from keras.models import Model
from keras.datasets import cifar10
import numpy as np
import os

# Training parameters
batch_size = 32  # orig paper trained all networks with batch_size=128
epochs = 10
data_augmentation = True
num_classes = 10

# Subtracting pixel mean improves accuracy
subtract_pixel_mean = True

# Model parameter
# ----------------------------------------------------------------------------
#           |      | 200-epoch | Orig Paper| 200-epoch | Orig Paper| sec/epoch
# Model     |  n   | ResNet v1 | ResNet v1 | ResNet v2 | ResNet v2 | GTX1080Ti
#           |v1(v2)| %Accuracy | %Accuracy | %Accuracy | %Accuracy | v1 (v2)
# ----------------------------------------------------------------------------
# ResNet20  | 3 (2)| 92.16     | 91.25     | -----     | -----     | 35 (---)
# ResNet32  | 5(NA)| 92.46     | 92.49     | NA        | NA        | 50 ( NA)
# ResNet44  | 7(NA)| 92.50     | 92.83     | NA        | NA        | 70 ( NA)
# ResNet56  | 9 (6)| 92.71     | 93.03     | 93.01     | NA        | 90 (100)
# ResNet110 |18(12)| 92.65     | 93.39+-.16| 93.15     | 93.63     | 165(180)
# ResNet164 |27(18)| -----     | 94.07     | -----     | 94.54     | ---(---)
# ResNet1001| (111)| -----     | 92.39     | -----     | 95.08+-.14| ---(---)
# ---------------------------------------------------------------------------
n = 3

# Model version
# Orig paper: version = 1 (ResNet v1), Improved ResNet: version = 2 (ResNet v2)
version = 1

# Computed depth from supplied model parameter n
if version == 1:
    depth = n * 6 + 2
elif version == 2:
    depth = n * 9 + 2

# Model name, depth and version
model_type = 'ResNet%dv%d' % (depth, version)

# Load the CIFAR10 data.
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# Input image dimensions.
input_shape = x_train.shape[1:]

# Normalize data.
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255

# If subtract pixel mean is enabled
if subtract_pixel_mean:
    x_train_mean = np.mean(x_train, axis=0)
    x_train -= x_train_mean
    x_test -= x_train_mean

print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
print('y_train shape:', y_train.shape)

# Convert class vectors to binary class matrices.
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)


def lr_schedule(epoch):
    """Learning Rate Schedule

    Learning rate is scheduled to be reduced after 80, 120, 160, 180 epochs.
    Called automatically every epoch as part of callbacks during training.

    # Arguments
        epoch (int): The number of epochs

    # Returns
        lr (float32): learning rate
    """
    lr = 1e-3
    if epoch > 180:
        lr *= 0.5e-3
    elif epoch > 160:
        lr *= 1e-3
    elif epoch > 120:
        lr *= 1e-2
    elif epoch > 80:
        lr *= 1e-1
    print('Learning rate: ', lr)
    return lr


def resnet_layer(inputs,
                 num_filters=16,
                 kernel_size=3,
                 strides=1,
                 activation='relu',
                 batch_normalization=True,
                 conv_first=True):
    """2D Convolution-Batch Normalization-Activation stack builder

    # Arguments
        inputs (tensor): input tensor from input image or previous layer
        num_filters (int): Conv2D number of filters
        kernel_size (int): Conv2D square kernel dimensions
        strides (int): Conv2D square stride dimensions
        activation (string): activation name
        batch_normalization (bool): whether to include batch normalization
        conv_first (bool): conv-bn-activation (True) or
            bn-activation-conv (False)

    # Returns
        x (tensor): tensor as input to the next layer
    """
    conv = Conv2D(num_filters,
                  kernel_size=kernel_size,
                  strides=strides,
                  padding='same',
                  kernel_initializer='he_normal',
                  kernel_regularizer=l2(1e-4))

    x = inputs
    if conv_first:
        x = conv(x)
        if batch_normalization:
            x = BatchNormalization()(x)
        if activation is not None:
            x = Activation(activation)(x)
    else:
        if batch_normalization:
            x = BatchNormalization()(x)
        if activation is not None:
            x = Activation(activation)(x)
        x = conv(x)
    return x


def resnet_v1(input_shape, depth, num_classes=10):
    """ResNet Version 1 Model builder [a]

    Stacks of 2 x (3 x 3) Conv2D-BN-ReLU
    Last ReLU is after the shortcut connection.
    At the beginning of each stage, the feature map size is halved
    (downsampled) by a convolutional layer with strides=2, while the number
    of filters is doubled. Within each stage, the layers have the same
    number of filters and the same feature map sizes.
    Features maps sizes:
    stage 0: 32x32, 16
    stage 1: 16x16, 32
    stage 2:  8x8,  64
    The number of parameters is approx the same as Table 6 of [a]:
    ResNet20 0.27M
    ResNet32 0.46M
    ResNet44 0.66M
    ResNet56 0.85M
    ResNet110 1.7M

    # Arguments
        input_shape (tensor): shape of input image tensor
        depth (int): number of core convolutional layers
        num_classes (int): number of classes (CIFAR10 has 10)

    # Returns
        model (Model): Keras model instance
    """
    if (depth - 2) % 6 != 0:
        raise ValueError('depth should be 6n+2 (eg 20, 32, 44 in [a])')
    # Start model definition.
    num_filters = 16
    num_res_blocks = int((depth - 2) / 6)

    inputs = Input(shape=input_shape)
    x = resnet_layer(inputs=inputs)
    # Instantiate the stack of residual units
    for stack in range(3):
        for res_block in range(num_res_blocks):
            strides = 1
            if stack > 0 and res_block == 0:  # first layer but not first stack
                strides = 2  # downsample
            y = resnet_layer(inputs=x,
                             num_filters=num_filters,
                             strides=strides)
            y = resnet_layer(inputs=y,
                             num_filters=num_filters,
                             activation=None)
            if stack > 0 and res_block == 0:  # first layer but not first stack
                # linear projection residual shortcut connection to match
                # changed dims
                x = resnet_layer(inputs=x,
                                 num_filters=num_filters,
                                 kernel_size=1,
                                 strides=strides,
                                 activation=None,
                                 batch_normalization=False)
            x = keras.layers.add([x, y])
            x = Activation('relu')(x)
        num_filters *= 2

    # Add classifier on top.
    # v1 does not use BN after last shortcut connection-ReLU
    x = AveragePooling2D(pool_size=8)(x)
    y = Flatten()(x)
    outputs = Dense(num_classes,
                    activation='softmax',
                    kernel_initializer='he_normal')(y)

    # Instantiate model.
    model = Model(inputs=inputs, outputs=outputs)
    return model


def resnet_v2(input_shape, depth, num_classes=10):
    """ResNet Version 2 Model builder [b]

    Stacks of (1 x 1)-(3 x 3)-(1 x 1) BN-ReLU-Conv2D, also known as the
    bottleneck layer.
    First shortcut connection per layer is 1 x 1 Conv2D.
    Second and onwards shortcut connection is identity.
    At the beginning of each stage, the feature map size is halved
    (downsampled) by a convolutional layer with strides=2, while the number
    of filter maps is doubled. Within each stage, the layers have the same
    number of filters and the same filter map sizes.
    Features maps sizes:
    conv1  : 32x32,  16
    stage 0: 32x32,  64
    stage 1: 16x16, 128
    stage 2:  8x8,  256

    # Arguments
        input_shape (tensor): shape of input image tensor
        depth (int): number of core convolutional layers
        num_classes (int): number of classes (CIFAR10 has 10)

    # Returns
        model (Model): Keras model instance
    """
    if (depth - 2) % 9 != 0:
        raise ValueError('depth should be 9n+2 (eg 56 or 110 in [b])')
    # Start model definition.
    num_filters_in = 16
    num_res_blocks = int((depth - 2) / 9)

    inputs = Input(shape=input_shape)
    # v2 performs Conv2D with BN-ReLU on input before splitting into 2 paths
    x = resnet_layer(inputs=inputs,
                     num_filters=num_filters_in,
                     conv_first=True)

    # Instantiate the stack of residual units
    for stage in range(3):
        for res_block in range(num_res_blocks):
            activation = 'relu'
            batch_normalization = True
            strides = 1
            if stage == 0:
                num_filters_out = num_filters_in * 4
                if res_block == 0:  # first layer and first stage
                    activation = None
                    batch_normalization = False
            else:
                num_filters_out = num_filters_in * 2
                if res_block == 0:  # first layer but not first stage
                    strides = 2  # downsample

            # bottleneck residual unit
            y = resnet_layer(inputs=x,
                             num_filters=num_filters_in,
                             kernel_size=1,
                             strides=strides,
                             activation=activation,
                             batch_normalization=batch_normalization,
                             conv_first=False)
            y = resnet_layer(inputs=y,
                             num_filters=num_filters_in,
                             conv_first=False)
            y = resnet_layer(inputs=y,
                             num_filters=num_filters_out,
                             kernel_size=1,
                             conv_first=False)
            if res_block == 0:
                # linear projection residual shortcut connection to match
                # changed dims
                x = resnet_layer(inputs=x,
                                 num_filters=num_filters_out,
                                 kernel_size=1,
                                 strides=strides,
                                 activation=None,
                                 batch_normalization=False)
            x = keras.layers.add([x, y])

        num_filters_in = num_filters_out

    # Add classifier on top.
    # v2 has BN-ReLU before Pooling
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = AveragePooling2D(pool_size=8)(x)
    y = Flatten()(x)
    outputs = Dense(num_classes,
                    activation='softmax',
                    kernel_initializer='he_normal')(y)

    # Instantiate model.
    model = Model(inputs=inputs, outputs=outputs)
    return model


if version == 2:
    model = resnet_v2(input_shape=input_shape, depth=depth)
else:
    model = resnet_v1(input_shape=input_shape, depth=depth)

model.compile(loss='categorical_crossentropy',
              optimizer=Adam(lr=lr_schedule(0)),
              metrics=['accuracy'])
model.summary()
print(model_type)

# Prepare model saving directory.
save_dir = os.path.join(os.getcwd(), 'saved_models')
model_name = 'cifar10_%s_model.{epoch:03d}.h5' % model_type
if not os.path.isdir(save_dir):
    os.makedirs(save_dir)
filepath = os.path.join(save_dir, model_name)

# Prepare callbacks for model saving and for learning rate adjustment.
checkpoint = ModelCheckpoint(filepath=filepath,
                             monitor='val_acc',
                             verbose=1,
                             save_best_only=True)

lr_scheduler = LearningRateScheduler(lr_schedule)

lr_reducer = ReduceLROnPlateau(factor=np.sqrt(0.1),
                               cooldown=0,
                               patience=5,
                               min_lr=0.5e-6)

callbacks = [checkpoint, lr_reducer, lr_scheduler]

# Run training, with or without data augmentation.
if not data_augmentation:
    print('Not using data augmentation.')
    model.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=epochs,
              validation_data=(x_test, y_test),
              shuffle=True,
              callbacks=callbacks)
else:
    print('Using real-time data augmentation.')
    # This will do preprocessing and realtime data augmentation:
    datagen = ImageDataGenerator(
        # set input mean to 0 over the dataset
        featurewise_center=False,
        # set each sample mean to 0
        samplewise_center=False,
        # divide inputs by std of dataset
        featurewise_std_normalization=False,
        # divide each input by its std
        samplewise_std_normalization=False,
        # apply ZCA whitening
        zca_whitening=False,
        # epsilon for ZCA whitening
        zca_epsilon=1e-06,
        # randomly rotate images in the range (deg 0 to 180)
        rotation_range=0,
        # randomly shift images horizontally
        width_shift_range=0.1,
        # randomly shift images vertically
        height_shift_range=0.1,
        # set range for random shear
        shear_range=0.,
        # set range for random zoom
        zoom_range=0.,
        # set range for random channel shifts
        channel_shift_range=0.,
        # set mode for filling points outside the input boundaries
        fill_mode='nearest',
        # value used for fill_mode = "constant"
        cval=0.,
        # randomly flip images
        horizontal_flip=True,
        # randomly flip images
        vertical_flip=False,
        # set rescaling factor (applied before any other transformation)
        rescale=None,
        # set function that will be applied on each input
        preprocessing_function=None,
        # image data format, either "channels_first" or "channels_last"
        data_format=None,
        # fraction of images reserved for validation (strictly between 0 and 1)
        validation_split=0.0)

    # Compute quantities required for featurewise normalization
    # (std, mean, and principal components if ZCA whitening is applied).
    datagen.fit(x_train)

    # Fit the model on the batches generated by datagen.flow().
    model.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size),
                        steps_per_epoch=x_train.shape[0],
                        validation_data=(x_test, y_test),
                        epochs=epochs, verbose=1, workers=4,
                        callbacks=callbacks)

# Score trained model.
scores = model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])

# output elapsed time
end = timeit.default_timer()
tdf = end - start
timeh = tdf // 3600
timem = (tdf % 3600) // 60  # minutes within the last hour
times = tdf % 60
print("use time:", int(timeh), "h", int(timem), "m", times, "s")
Running it directly produces an error:
python3 cifar10_cnn.py
ValueError: steps_per_epoch=None is only valid for a generator based on the keras.utils.Sequence class. Please specify steps_per_epoch or use the keras.utils.Sequence class.
This is caused by version churn: the parameters of some functions have changed between releases.
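To check which release you are on before patching (a trivial sketch):

import keras

# In this release, fit_generator with a plain (non-Sequence) generator
# requires an explicit steps_per_epoch, hence the error above.
print(keras.__version__)  # e.g. 2.2.4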
The only change needed is to replace the following call in cifar10_resnet.py:
model.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size),
                    validation_data=(x_test, y_test),
                    epochs=epochs, verbose=1, workers=4,
                    callbacks=callbacks)
with:
model.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size),
                    steps_per_epoch=x_train.shape[0],
                    validation_data=(x_test, y_test),
                    epochs=epochs, verbose=1, workers=4,
                    callbacks=callbacks)
Using TensorFlow backend.
x_train shape: (50000, 32, 32, 3)
50000 train samples
10000 test samples
y_train shape: (50000, 1)
Learning rate:  0.001
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            (None, 32, 32, 3)    0
__________________________________________________________________________________________________
conv2d_1 (Conv2D)               (None, 32, 32, 16)   448         input_1[0][0]
__________________________________________________________________________________________________
batch_normalization_1 (BatchNor (None, 32, 32, 16)   64          conv2d_1[0][0]
__________________________________________________________________________________________________
activation_1 (Activation)       (None, 32, 32, 16)   0           batch_normalization_1[0][0]
__________________________________________________________________________________________________
conv2d_2 (Conv2D)               (None, 32, 32, 16)   2320        activation_1[0][0]
__________________________________________________________________________________________________
batch_normalization_2 (BatchNor (None, 32, 32, 16)   64          conv2d_2[0][0]
__________________________________________________________________________________________________
activation_2 (Activation)       (None, 32, 32, 16)   0           batch_normalization_2[0][0]
__________________________________________________________________________________________________
conv2d_3 (Conv2D)               (None, 32, 32, 16)   2320        activation_2[0][0]
__________________________________________________________________________________________________
batch_normalization_3 (BatchNor (None, 32, 32, 16)   64          conv2d_3[0][0]
__________________________________________________________________________________________________
add_1 (Add)                     (None, 32, 32, 16)   0           activation_1[0][0]
                                                                 batch_normalization_3[0][0]
__________________________________________________________________________________________________
activation_3 (Activation)       (None, 32, 32, 16)   0           add_1[0][0]
__________________________________________________________________________________________________
conv2d_4 (Conv2D)               (None, 32, 32, 16)   2320        activation_3[0][0]
__________________________________________________________________________________________________
batch_normalization_4 (BatchNor (None, 32, 32, 16)   64          conv2d_4[0][0]
__________________________________________________________________________________________________
activation_4 (Activation)       (None, 32, 32, 16)   0           batch_normalization_4[0][0]
__________________________________________________________________________________________________
conv2d_5 (Conv2D)               (None, 32, 32, 16)   2320        activation_4[0][0]
__________________________________________________________________________________________________
batch_normalization_5 (BatchNor (None, 32, 32, 16)   64          conv2d_5[0][0]
__________________________________________________________________________________________________
add_2 (Add)                     (None, 32, 32, 16)   0           activation_3[0][0]
                                                                 batch_normalization_5[0][0]
__________________________________________________________________________________________________
activation_5 (Activation)       (None, 32, 32, 16)   0           add_2[0][0]
__________________________________________________________________________________________________
conv2d_6 (Conv2D)               (None, 32, 32, 16)   2320        activation_5[0][0]
__________________________________________________________________________________________________
batch_normalization_6 (BatchNor (None, 32, 32, 16)   64          conv2d_6[0][0]
__________________________________________________________________________________________________
activation_6 (Activation)       (None, 32, 32, 16)   0           batch_normalization_6[0][0]
__________________________________________________________________________________________________
conv2d_7 (Conv2D)               (None, 32, 32, 16)   2320        activation_6[0][0]
__________________________________________________________________________________________________
batch_normalization_7 (BatchNor (None, 32, 32, 16)   64          conv2d_7[0][0]
__________________________________________________________________________________________________
add_3 (Add)                     (None, 32, 32, 16)   0           activation_5[0][0]
                                                                 batch_normalization_7[0][0]
__________________________________________________________________________________________________
activation_7 (Activation)       (None, 32, 32, 16)   0           add_3[0][0]
__________________________________________________________________________________________________
conv2d_8 (Conv2D)               (None, 16, 16, 32)   4640        activation_7[0][0]
__________________________________________________________________________________________________
batch_normalization_8 (BatchNor (None, 16, 16, 32)   128         conv2d_8[0][0]
__________________________________________________________________________________________________
activation_8 (Activation)       (None, 16, 16, 32)   0           batch_normalization_8[0][0]
__________________________________________________________________________________________________
conv2d_9 (Conv2D)               (None, 16, 16, 32)   9248        activation_8[0][0]
__________________________________________________________________________________________________
conv2d_10 (Conv2D)              (None, 16, 16, 32)   544         activation_7[0][0]
__________________________________________________________________________________________________
batch_normalization_9 (BatchNor (None, 16, 16, 32)   128         conv2d_9[0][0]
__________________________________________________________________________________________________
add_4 (Add)                     (None, 16, 16, 32)   0           conv2d_10[0][0]
                                                                 batch_normalization_9[0][0]
__________________________________________________________________________________________________
activation_9 (Activation)       (None, 16, 16, 32)   0           add_4[0][0]
__________________________________________________________________________________________________
conv2d_11 (Conv2D)              (None, 16, 16, 32)   9248        activation_9[0][0]
__________________________________________________________________________________________________
batch_normalization_10 (BatchNo (None, 16, 16, 32)   128         conv2d_11[0][0]
__________________________________________________________________________________________________
activation_10 (Activation)      (None, 16, 16, 32)   0           batch_normalization_10[0][0]
__________________________________________________________________________________________________
conv2d_12 (Conv2D)              (None, 16, 16, 32)   9248        activation_10[0][0]
__________________________________________________________________________________________________
batch_normalization_11 (BatchNo (None, 16, 16, 32)   128         conv2d_12[0][0]
__________________________________________________________________________________________________
add_5 (Add)                     (None, 16, 16, 32)   0           activation_9[0][0]
                                                                 batch_normalization_11[0][0]
__________________________________________________________________________________________________
activation_11 (Activation)      (None, 16, 16, 32)   0           add_5[0][0]
__________________________________________________________________________________________________
conv2d_13 (Conv2D)              (None, 16, 16, 32)   9248        activation_11[0][0]
__________________________________________________________________________________________________
batch_normalization_12 (BatchNo (None, 16, 16, 32)   128         conv2d_13[0][0]
__________________________________________________________________________________________________
activation_12 (Activation)      (None, 16, 16, 32)   0           batch_normalization_12[0][0]
__________________________________________________________________________________________________
conv2d_14 (Conv2D)              (None, 16, 16, 32)   9248        activation_12[0][0]
__________________________________________________________________________________________________
batch_normalization_13 (BatchNo (None, 16, 16, 32)   128         conv2d_14[0][0]
__________________________________________________________________________________________________
add_6 (Add)                     (None, 16, 16, 32)   0           activation_11[0][0]
                                                                 batch_normalization_13[0][0]
__________________________________________________________________________________________________
activation_13 (Activation)      (None, 16, 16, 32)   0           add_6[0][0]
__________________________________________________________________________________________________
conv2d_15 (Conv2D)              (None, 8, 8, 64)     18496       activation_13[0][0]
__________________________________________________________________________________________________
batch_normalization_14 (BatchNo (None, 8, 8, 64)     256         conv2d_15[0][0]
__________________________________________________________________________________________________
activation_14 (Activation)      (None, 8, 8, 64)     0           batch_normalization_14[0][0]
__________________________________________________________________________________________________
conv2d_16 (Conv2D)              (None, 8, 8, 64)     36928       activation_14[0][0]
__________________________________________________________________________________________________
conv2d_17 (Conv2D)              (None, 8, 8, 64)     2112        activation_13[0][0]
__________________________________________________________________________________________________
batch_normalization_15 (BatchNo (None, 8, 8, 64)     256         conv2d_16[0][0]
__________________________________________________________________________________________________
add_7 (Add)                     (None, 8, 8, 64)     0           conv2d_17[0][0]
                                                                 batch_normalization_15[0][0]
__________________________________________________________________________________________________
activation_15 (Activation)      (None, 8, 8, 64)     0           add_7[0][0]
__________________________________________________________________________________________________
conv2d_18 (Conv2D)              (None, 8, 8, 64)     36928       activation_15[0][0]
__________________________________________________________________________________________________
batch_normalization_16 (BatchNo (None, 8, 8, 64)     256         conv2d_18[0][0]
__________________________________________________________________________________________________
activation_16 (Activation)      (None, 8, 8, 64)     0           batch_normalization_16[0][0]
__________________________________________________________________________________________________
conv2d_19 (Conv2D)              (None, 8, 8, 64)     36928       activation_16[0][0]
__________________________________________________________________________________________________
batch_normalization_17 (BatchNo (None, 8, 8, 64)     256         conv2d_19[0][0]
__________________________________________________________________________________________________
add_8 (Add)                     (None, 8, 8, 64)     0           activation_15[0][0]
                                                                 batch_normalization_17[0][0]
__________________________________________________________________________________________________
activation_17 (Activation)      (None, 8, 8, 64)     0           add_8[0][0]
__________________________________________________________________________________________________
conv2d_20 (Conv2D)              (None, 8, 8, 64)     36928       activation_17[0][0]
__________________________________________________________________________________________________
batch_normalization_18 (BatchNo (None, 8, 8, 64)     256         conv2d_20[0][0]
__________________________________________________________________________________________________
activation_18 (Activation)      (None, 8, 8, 64)     0           batch_normalization_18[0][0]
__________________________________________________________________________________________________
conv2d_21 (Conv2D)              (None, 8, 8, 64)     36928       activation_18[0][0]
__________________________________________________________________________________________________
batch_normalization_19 (BatchNo (None, 8, 8, 64)     256         conv2d_21[0][0]
__________________________________________________________________________________________________
add_9 (Add)                     (None, 8, 8, 64)     0           activation_17[0][0]
                                                                 batch_normalization_19[0][0]
__________________________________________________________________________________________________
activation_19 (Activation)      (None, 8, 8, 64)     0           add_9[0][0]
__________________________________________________________________________________________________
average_pooling2d_1 (AveragePoo (None, 1, 1, 64)     0           activation_19[0][0]
__________________________________________________________________________________________________
flatten_1 (Flatten)             (None, 64)           0           average_pooling2d_1[0][0]
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 10)           650         flatten_1[0][0]
==================================================================================================
Total params: 274,442
Trainable params: 273,066
Non-trainable params: 1,376
__________________________________________________________________________________________________
ResNet20v1
Using real-time data augmentation.
Epoch 1/10
Learning rate:  0.001
successfully opened CUDA library libcublas.so.10.0 locally
50000/50000 [==============================] - 11286s 226ms/step - loss: 0.7185 - acc: 0.8164 - val_loss: 0.7312 - val_acc: 0.8302
Training a single epoch takes about 3 hours (the log shows 50,000 steps per epoch at roughly 226 ms per step), and the test-set accuracy after it is 83.02%. The example calls for 200 training epochs; the Jetson Nano's 0.5 TFLOPS of compute is too weak for training a model of this size, so given the amount of computation we gave up on it.
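As a rough back-of-the-envelope check (assuming every epoch takes about as long as the measured first one):

# Estimate the cost of the full 200-epoch run on the Jetson Nano.
seconds_per_epoch = 11286              # measured for epoch 1 above
total_seconds = 200 * seconds_per_epoch
print(total_seconds / 3600, 'hours')   # ~627 hours
print(total_seconds / 86400, 'days')   # ~26 days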