In the project we are studying, the model has two main modes: an inference mode used for prediction and a training mode used for learning. Inference mode means we feed a single image into an already-trained model, and the network computes its analysis of that image.
In this section we start from demo.ipynb to see how a trained Mask R-CNN model runs inference on one input image and produces its outputs, i.e., how the inference mode works.
First comes the configuration. All settings are collected in the Config class, so a custom configuration is just a subclass that overrides the relevant attributes. The demo uses the COCO pre-trained model, so we reuse its settings, but since we want to detect a single image we need to override a couple of batch-related values:
```python
# InferenceConfig inherits from CocoConfig (which itself extends Config);
# its purpose is to hold the configuration while overriding a few attributes.
class InferenceConfig(coco.CocoConfig):
    # Set batch size to 1 since we'll be running inference on
    # one image at a time. Batch size = GPU_COUNT * IMAGES_PER_GPU
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

config = InferenceConfig()
config.display()
```
The printed configuration is as follows:
```
Configurations:
BACKBONE                       resnet101
BACKBONE_STRIDES               [4, 8, 16, 32, 64]
BATCH_SIZE                     1
BBOX_STD_DEV                   [0.1 0.1 0.2 0.2]
COMPUTE_BACKBONE_SHAPE         None
DETECTION_MAX_INSTANCES        100
DETECTION_MIN_CONFIDENCE       0.7
DETECTION_NMS_THRESHOLD        0.3
FPN_CLASSIF_FC_LAYERS_SIZE     1024
GPU_COUNT                      1
GRADIENT_CLIP_NORM             5.0
IMAGES_PER_GPU                 1
IMAGE_CHANNEL_COUNT            3
IMAGE_MAX_DIM                  1024
IMAGE_META_SIZE                93
IMAGE_MIN_DIM                  800
IMAGE_MIN_SCALE                0
IMAGE_RESIZE_MODE              square
IMAGE_SHAPE                    [1024 1024 3]
LEARNING_MOMENTUM              0.9
LEARNING_RATE                  0.001
LOSS_WEIGHTS                   {'rpn_class_loss': 1.0, 'rpn_bbox_loss': 1.0, 'mrcnn_class_loss': 1.0, 'mrcnn_bbox_loss': 1.0, 'mrcnn_mask_loss': 1.0}
MASK_POOL_SIZE                 14
MASK_SHAPE                     [28, 28]
MAX_GT_INSTANCES               100
MEAN_PIXEL                     [123.7 116.8 103.9]
MINI_MASK_SHAPE                (56, 56)
NAME                           coco
NUM_CLASSES                    81
POOL_SIZE                      7
POST_NMS_ROIS_INFERENCE        1000
POST_NMS_ROIS_TRAINING         2000
PRE_NMS_LIMIT                  6000
ROI_POSITIVE_RATIO             0.33
RPN_ANCHOR_RATIOS              [0.5, 1, 2]
RPN_ANCHOR_SCALES              (32, 64, 128, 256, 512)
RPN_ANCHOR_STRIDE              1
RPN_BBOX_STD_DEV               [0.1 0.1 0.2 0.2]
RPN_NMS_THRESHOLD              0.7
RPN_TRAIN_ANCHORS_PER_IMAGE    256
STEPS_PER_EPOCH                1000
TOP_DOWN_PYRAMID_SIZE          256
TRAIN_BN                       False
TRAIN_ROIS_PER_IMAGE           200
USE_MINI_MASK                  True
USE_RPN_ROIS                   True
VALIDATION_STEPS               50
WEIGHT_DECAY                   0.0001
```
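As a quick sanity check on these settings, the total number of anchors the RPN evaluates follows from BACKBONE_STRIDES, RPN_ANCHOR_RATIOS, and IMAGE_SHAPE. The sketch below is my own back-of-envelope calculation, assuming one anchor set per feature-map cell (RPN_ANCHOR_STRIDE = 1):

```python
# Anchors per FPN level for a 1024x1024 input, per the config above:
# 5 pyramid levels with strides [4, 8, 16, 32, 64], 3 aspect ratios per cell.
image_dim = 1024
strides = [4, 8, 16, 32, 64]
ratios_per_cell = 3  # len(RPN_ANCHOR_RATIOS)

total = 0
for s in strides:
    feat = image_dim // s                  # feature-map side length at this level
    total += feat * feat * ratios_per_cell # anchors contributed by this level

print(total)  # 261888
```

This is where the num_anchors dimension in the RPN outputs below comes from.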
We first build the model object and then load the pre-trained weight file. At the end I visualize the model summary, but it is really long, so I commented it out. The computation graph is built at construction time according to the mode argument; the inference network discussed in this section is the graph built when mode is set to "inference".
```python
# Create model object in inference mode.
model = modellib.MaskRCNN(mode="inference", model_dir=MODEL_DIR, config=config)

# Load weights trained on MS-COCO
model.load_weights(COCO_MODEL_PATH, by_name=True)
# model.keras_model.summary()
```
```python
# Load a random image from the images folder
file_names = next(os.walk(IMAGE_DIR))[2]  # os.walk returns an iterator, so next() fetches its first value
image = skimage.io.imread(os.path.join(IMAGE_DIR, random.choice(file_names)))
print(image.shape)

# Run detection
results = model.detect([image], verbose=1)

# Visualize results
r = results[0]
visualize.display_instances(image, r['rois'], r['masks'], r['class_ids'],
                            class_names, r['scores'])
```
Reading an image and calling the model's detect method yields the results, which are then visualized with the helper function:
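The results list holds one dict per input image. The sketch below uses a hand-made stand-in for r (the values are invented; only the keys and shape relationships follow what detect returns) to show how the fields relate to each other:

```python
import numpy as np

# Hypothetical results dict mimicking one element of model.detect()'s output
r = {
    'rois':      np.array([[10, 20, 110, 220], [50, 60, 150, 260]]),  # (y1, x1, y2, x2) per instance
    'class_ids': np.array([1, 18]),
    'scores':    np.array([0.99, 0.87]),
    'masks':     np.zeros((1024, 1024, 2), dtype=bool),  # one boolean mask per instance, stacked on the last axis
}
class_names = {1: 'person', 18: 'dog'}  # subset of the COCO label map, for illustration

for i, class_id in enumerate(r['class_ids']):
    y1, x1, y2, x2 = r['rois'][i]
    mask_px = int(r['masks'][:, :, i].sum())  # pixel count of instance i's mask
    print(f"{class_names[class_id]}: score={r['scores'][i]:.2f}, "
          f"box=({y1},{x1},{y2},{x2}), mask px={mask_px}")
```

Note that the instance axis is shared: the i-th row of rois, the i-th class_id and score, and the i-th mask slice all describe the same detected object.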
The forward logic of inference is shown in the figure below; let us take a quick look at its computation flow.
rpn_class: [batch, num_anchors, 2]
rpn_bbox: [batch, num_anchors, (dy, dx, log(dh), log(dw))]
rpn_rois: [IMAGES_PER_GPU, num_rois, (y1, x1, y2, x2)]
mrcnn_class_logits: [batch, num_rois, NUM_CLASSES] classifier logits (before softmax)
mrcnn_class: [batch, num_rois, NUM_CLASSES] classifier probabilities
mrcnn_bbox (deltas): [batch, num_rois, NUM_CLASSES, (dy, dx, log(dh), log(dw))]
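To make the delta encoding concrete, here is a minimal numpy sketch (my own, not the project's in-graph code) of how a (dy, dx, log(dh), log(dw)) refinement is applied to boxes in (y1, x1, y2, x2) form:

```python
import numpy as np

def apply_box_deltas(boxes, deltas):
    """Apply (dy, dx, log(dh), log(dw)) refinements to (y1, x1, y2, x2) boxes."""
    # Convert corner form to center/size form.
    h = boxes[:, 2] - boxes[:, 0]
    w = boxes[:, 3] - boxes[:, 1]
    cy = boxes[:, 0] + 0.5 * h
    cx = boxes[:, 1] + 0.5 * w
    # Shift the center by a fraction of the box size; scale the size exponentially.
    cy = cy + deltas[:, 0] * h
    cx = cx + deltas[:, 1] * w
    h = h * np.exp(deltas[:, 2])
    w = w * np.exp(deltas[:, 3])
    # Convert back to corner form.
    y1 = cy - 0.5 * h
    x1 = cx - 0.5 * w
    return np.stack([y1, x1, y1 + h, x1 + w], axis=1)

boxes = np.array([[0.0, 0.0, 10.0, 10.0]])
print(apply_box_deltas(boxes, np.zeros((1, 4))))      # zero deltas leave the box unchanged
print(apply_box_deltas(boxes, np.array([[0.1, 0.0, 0.0, 0.0]])))  # dy=0.1 shifts y by 0.1*h
```

The log parameterization keeps predicted heights and widths positive, and the normalized offsets make the regression targets scale-invariant.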
Finally, we expect the network to output the following tensors:
# num_anchors:    number of anchors generated per image
# num_rois:       number of proposal regions kept per image,
#                 set by POST_NMS_ROIS_TRAINING or POST_NMS_ROIS_INFERENCE
# num_detections: number of final detection boxes per image,
#                 set by DETECTION_MAX_INSTANCES
# detections:  [batch, num_detections, (y1, x1, y2, x2, class_id, score)]
# mrcnn_class: [batch, num_rois, NUM_CLASSES] classifier probabilities
# mrcnn_bbox:  [batch, num_rois, NUM_CLASSES, (dy, dx, log(dh), log(dw))]
# mrcnn_mask:  [batch, num_detections, MASK_POOL_SIZE, MASK_POOL_SIZE, NUM_CLASSES]
# rpn_rois:    [batch, num_rois, (y1, x1, y2, x2, class_id, score)]
# rpn_class:   [batch, num_anchors, 2]
# rpn_bbox:    [batch, num_anchors, 4]
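The step from num_anchors down to num_rois and num_detections relies on non-maximum suppression (RPN_NMS_THRESHOLD and DETECTION_NMS_THRESHOLD in the config above). As a reference, a minimal greedy NMS in numpy looks like this; it is a sketch of the standard algorithm, not the project's in-graph TensorFlow implementation:

```python
import numpy as np

def nms(boxes, scores, threshold=0.3):
    """Greedy non-max suppression over (y1, x1, y2, x2) boxes.
    Returns indices of kept boxes, highest score first."""
    y1, x1, y2, x2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    areas = (y2 - y1) * (x2 - x1)
    order = scores.argsort()[::-1]  # process boxes from highest to lowest score
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        # Intersection of the current box with all remaining boxes.
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        inter = np.maximum(0.0, yy2 - yy1) * np.maximum(0.0, xx2 - xx1)
        iou = inter / (areas[i] + areas[order[1:]] - inter)
        # Drop boxes overlapping the kept box beyond the threshold.
        order = order[1:][iou <= threshold]
    return keep

boxes = np.array([[0.0, 0.0, 10.0, 10.0],
                  [1.0, 1.0, 11.0, 11.0],    # heavily overlaps the first box
                  [20.0, 20.0, 30.0, 30.0]]) # disjoint from both
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores, threshold=0.3))  # [0, 2]
```

The same routine, with threshold swapped between 0.7 and 0.3, covers both the RPN proposal stage and the final detection stage conceptually.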
We will explain the meaning of each of these tensors in the source-code analysis that follows.