Project repository: Mask_RCNN
Language and frameworks: Python 3, Keras, and TensorFlow
Python 3.4, TensorFlow 1.3, Keras 2.0.8; for the remaining dependencies see requirements.txt
Base network: Feature Pyramid Network (FPN) with a ResNet101 backbone
The main model files are the following:
demo.ipynb is the easiest way to start. It shows an example of using a model pre-trained on MS COCO to segment objects in your own images. It includes code to run object detection and instance segmentation on arbitrary images.
train_shapes.ipynb shows how to train Mask R-CNN on your own dataset. This notebook introduces a toy dataset (Shapes) to demonstrate training on a new dataset.
(model.py, utils.py, config.py): These files contain the main Mask R-CNN implementation.
The following files help you inspect and understand the model:
inspect_data.ipynb This notebook visualizes the different pre-processing steps to prepare the training data.
inspect_model.ipynb This notebook goes in depth into the steps performed to detect and segment objects. It provides visualizations of every step of the pipeline.
inspect_weights.ipynb This notebook inspects the weights of a trained model and looks for anomalies and odd patterns.
Visualizations:
The figure below shows the output of the RPN, i.e. the proposals, containing both positive and negative anchors:
The next figure shows the box coordinates before they are fed to the refinement regressor (the first-stage proposals) alongside the final output boxes:
Mask visualization:
Activations of different layers:
Histograms of the weight distributions:
The author provides weights pre-trained on COCO so that we can bootstrap our own training quickly; the relevant code lives in samples/coco/coco.py. We can either import that file as a package into our own code, or invoke it directly from the command line as follows:
# Train a new model starting from pre-trained COCO weights
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=coco

# Train a new model starting from ImageNet weights
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=imagenet

# Continue training a model that you had trained earlier
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=/path/to/weights.h5

# Continue training the last model you trained. This will find
# the last trained weights in the model directory.
python3 samples/coco/coco.py train --dataset=/path/to/coco/ --model=last
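The import route looks roughly like this, mirroring what demo.ipynb does; the weights file, log directory, and image path below are placeholders:

# A minimal sketch of the "import as a package" route, following demo.ipynb;
# "logs", "mask_rcnn_coco.h5", and "your_image.jpg" are placeholders.
import sys
import skimage.io
import mrcnn.model as modellib

sys.path.append("samples/coco")  # make coco.py importable from the repo root
import coco

class InferenceConfig(coco.CocoConfig):
    # detect() expects a batch of GPU_COUNT * IMAGES_PER_GPU images.
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

model = modellib.MaskRCNN(mode="inference", config=InferenceConfig(),
                          model_dir="logs")
model.load_weights("mask_rcnn_coco.h5", by_name=True)

image = skimage.io.imread("your_image.jpg")
r = model.detect([image], verbose=1)[0]  # keys: rois, masks, class_ids, scores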
To evaluate the last saved model on the COCO data:
# Run COCO evaluation on the last trained model
python3 samples/coco/coco.py evaluate --dataset=/path/to/coco/ --model=last
The relevant arguments are, again, documented in samples/coco/coco.py.
The author recommends a blog post about the balloon color splash sample, which walks through the whole flow from annotating images, through training on them, to applying the result in a small demo; its source code is also part of this repository, see samples/balloon.
To train on our own data we need to subclass and adapt two classes:
Config
This class contains the default configuration. Subclass it and modify the attributes you need to change.
Dataset
This class provides a consistent way to work with any dataset, so one class can handle different datasets. It allows you to use new datasets for training without having to change the code of the model, which keeps the model files themselves untouched. It also supports loading multiple datasets at the same time, which is useful if the objects you want to detect are not all available in one dataset.
For usage examples, see these four files (a minimal subclassing sketch follows the list):
samples/shapes/train_shapes.ipynb,
samples/coco/coco.py,
samples/balloon/balloon.py,
samples/nucleus/nucleus.py.
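As a sketch of the pattern those files follow, something like the outline below; the class and attribute names come from mrcnn's API, but the "custom" dataset with its single "object" class is hypothetical:

# A minimal sketch of the two subclasses; the "custom" dataset and its
# single "object" class are hypothetical, not from the repo.
from mrcnn.config import Config
from mrcnn import utils

class CustomConfig(Config):
    NAME = "custom"
    IMAGES_PER_GPU = 1      # small batches for a modest GPU
    NUM_CLASSES = 1 + 1     # background + 1 object class
    STEPS_PER_EPOCH = 100

class CustomDataset(utils.Dataset):
    def load_custom(self, dataset_dir, subset):
        # Register the class, then register every image with its annotations,
        # e.g. self.add_image("custom", image_id=..., path=..., ...)
        self.add_class("custom", 1, "object")

    def load_mask(self, image_id):
        # Must return (masks: [H, W, instance_count] bool array,
        #              class_ids: [instance_count] int array).
        pass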
For simplicity and extensibility, the author made a few small departures from the paper, summarized below:
Image Resizing: the image resizing in this project differs from the original paper's. Taking COCO as an example, the author resizes images to 1024×1024. To keep the aspect ratio unchanged (so the semantics are unaffected), the images are zero-padded to a 1:1 aspect ratio instead of being cropped or naively interpolated.
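The idea, as a standalone sketch (the repo implements this, with extra bookkeeping, in mrcnn.utils.resize_image; this simplified version assumes an HxWx3 image):

# A simplified sketch of aspect-preserving resize + zero padding.
import numpy as np
from skimage.transform import resize

def resize_with_pad(image, target=1024):
    # Scale so the longer side equals target, preserving the aspect ratio.
    h, w = image.shape[:2]
    scale = target / max(h, w)
    image = resize(image, (round(h * scale), round(w * scale)),
                   preserve_range=True).astype(image.dtype)
    # Zero-pad the shorter side so the result is a square target x target.
    pad_h = target - image.shape[0]
    pad_w = target - image.shape[1]
    padding = ((pad_h // 2, pad_h - pad_h // 2),
               (pad_w // 2, pad_w - pad_w // 2),
               (0, 0))
    return np.pad(image, padding, mode="constant", constant_values=0)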
Bounding Boxes: besides the mask labels, the dataset annotations also include GT box labels. To unify and simplify processing, the author discards the box labels and works entirely from the masks, generating for each instance a box that tightly covers all of its labeled pixels and using it as the GT box. This greatly simplifies image augmentation: many preprocessing operations, such as rotations, become cumbersome as soon as GT boxes have to be transformed along with the image, whereas regenerating the boxes from the masks keeps preprocessing straightforward.
To verify how the computed GT boxes differ from the original ones, the author compared the two:
We found that ~2% of bounding boxes differed by 1px or more, ~0.05% differed by 5px or more, and only 0.01% differed by 10px or more.
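The computation itself is simple; here is a sketch for a single instance mask (the repo's mrcnn.utils.extract_bboxes does the batched version):

# Derive a GT box from one binary instance mask.
import numpy as np

def bbox_from_mask(mask):
    # mask: [H, W] boolean array for one instance.
    # Returns (y1, x1, y2, x2) covering all labeled pixels, end-exclusive.
    rows = np.any(mask, axis=1)
    cols = np.any(mask, axis=0)
    y1, y2 = np.where(rows)[0][[0, -1]]
    x1, x2 = np.where(cols)[0][[0, -1]]
    return y1, x1, y2 + 1, x2 + 1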
Learning Rate: the paper uses a learning rate of 0.02, which the author found to be too large, often causing gradient explosions, especially with small batch sizes. He offers two conjectures: the difference may come from Caffe and TensorFlow using different strategies to aggregate gradients across multiple GPUs (sum vs. mean across batches and GPUs); or the paper's team may have used gradient clipping to sidestep the explosions, although the author notes that the clipping he applied did not have a decisive effect.
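In Keras terms the two remedies amount to the single line below; the values are illustrative only (the project keeps its actual learning rate and clip norm in config.py):

# Illustrative values: a learning rate well below the paper's 0.02,
# plus gradient clipping via Keras' clipnorm.
from keras.optimizers import SGD

optimizer = SGD(lr=0.001, momentum=0.9, clipnorm=5.0)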
1. Install dependencies
pip3 install -r requirements.txt
2. Clone this repository
3. Run setup from the repository root directory
python3 setup.py install
4. Download pre-trained COCO weights (mask_rcnn_coco.h5) from the releases page.
5. (Optional) To train or test on MS COCO, install pycocotools from one of these repos. They are forks of the original pycocotools with fixes for Python 3 and Windows (the official repo doesn't seem to be active anymore).
最後,做者展現了幾個使用了本框架的工程,這裏再也不引用。