[CVPR2017] Weakly Supervised Cascaded Convolutional Networks: Paper Notes

Date: 2020-05-21


Original paper:

https://www.csee.umbc.edu/~hpirsiav/papers/cascade_cvpr17.pdf

Weakly Supervised Cascaded Convolutional Networks, Ali Diba, Vivek Sharma, Ali Pazandeh, Hamed Pirsiavash and Luc Van Gool

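Highlights

Stacking multiple tasks (classification, segmentation) improves the accuracy of weakly supervised detection of multiple objects
Using segmentation to filter out a clean set of proposals gives more robust results
A fairly robust loss is designed for the weakly supervised segmentation task

It considers only the global classification result and the higher-confidence regions
The loss weights direct attention to the parts that most need it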

Related Work

One of the most common approaches [7] consists of the following steps:

generates object proposals,
extracts features from the proposals,
applies multiple instance learning (MIL) to the features and finds the box labels from the weak bag (image) labels.
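
The three steps above can be sketched in a few lines of NumPy. This is only an illustration of the generic MIL idea, not the method of [7] or of this paper: the max-pooling aggregation, the sigmoid cross-entropy loss, and the function names (`mil_image_scores`, `mil_loss`, `infer_box_labels`) are all assumptions made for the example.

```python
import numpy as np

def mil_image_scores(proposal_scores):
    """Aggregate per-proposal class scores (P x C) into image-level scores (C,).

    Max pooling over proposals is an assumed aggregation rule: the bag (image)
    is positive for a class if at least one proposal scores high for it.
    """
    return proposal_scores.max(axis=0)

def mil_loss(proposal_scores, image_labels):
    """Sigmoid cross-entropy between aggregated scores and the weak image labels."""
    logits = mil_image_scores(proposal_scores)
    probs = 1.0 / (1.0 + np.exp(-logits))
    eps = 1e-12  # avoids log(0)
    per_class = image_labels * np.log(probs + eps) + (1 - image_labels) * np.log(1 - probs + eps)
    return float(-per_class.mean())

def infer_box_labels(proposal_scores, image_labels):
    """For every class present in the image, pick the index of the top-scoring proposal."""
    return {int(c): int(proposal_scores[:, c].argmax()) for c in np.flatnonzero(image_labels)}
```

Here `proposal_scores` would come from a classifier applied to the proposal features, and `image_labels` is the binary image-level label vector, the only supervision available in the weakly supervised setting.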

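Difficulty of weakly supervised object detection: it is very sensitive to initialization, and a poor initialization can trap the network in a local optimum. The main remedies are: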

improving the initialization [31, 9, 28, 29]
regularizing the optimization strategy [4, 5, 7]
[17] use an iterative self-learning strategy that adds progressively harder samples to a small set of initial samples
[15] use a convex relaxation of the soft-max loss

The majority of previous works [25, 32] use a large collection of noisy object proposals to train their object detector. In contrast, our method focuses on only a small, clean collection of object proposals that is far more reliable, robust, and computationally efficient, and gives better performance.

Method

Two-stage: proposal generation and image classification (conv1 through conv5, global pooling) + multiple instance learning (two fc layers, score layer)

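1. image classification: a CNN with global average pooling (GAP), as introduced in [36]. The weights of the classification fc layer are used as weights on the output of the last convolutional layer, and the weighted sum over all channels is the class activation map. This step also produces a classification loss, L_GAP.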
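
A minimal NumPy sketch of the class activation map computation, assuming a feature tensor of shape (K, H, W) from the last convolutional layer and an fc weight matrix of shape (C, K) with no bias term. The function names are hypothetical and the shapes are chosen only for illustration.

```python
import numpy as np

def class_activation_map(features, fc_weights, class_idx):
    """Weighted sum of the K feature channels using the fc weights of one class.

    features:   (K, H, W) output of the last convolutional layer
    fc_weights: (C, K) weights of the fc layer that follows GAP
    returns:    (H, W) class activation map for class_idx
    """
    return np.tensordot(fc_weights[class_idx], features, axes=([0], [0]))

def gap_logits(features, fc_weights):
    """Classification logits: global average pooling followed by the fc layer."""
    pooled = features.mean(axis=(1, 2))  # (K,)
    return fc_weights @ pooled           # (C,)
```

Because pooling and the fc layer are both linear, the spatial mean of the map for a class equals that class's logit, which is why the map can be read as a per-location contribution to the classification score.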

[36] B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba. Learning deep features for discriminative localization. In CVPR, 2016.

2. multiple instance learning

Proposal: EdgeBoxes [37] is used to generate an initial set of object proposals. Then we threshold the class activation map [36] to come up with a mask. Finally, we choose the initial boxes with the largest overlap with the mask.
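
The selection step can be sketched as follows. The notes do not say how the threshold is set or how overlap is measured, so this example assumes a threshold relative to the map maximum (`rel_threshold`, a hypothetical parameter that presumes a positive peak) and an IoU between the box area and the mask pixels. Boxes are assumed to be integer (x1, y1, x2, y2) with exclusive upper bounds.

```python
import numpy as np

def cam_to_mask(cam, rel_threshold=0.2):
    """Binary mask of the pixels whose activation reaches a fraction of the map maximum."""
    return cam >= rel_threshold * cam.max()

def mask_overlap(box, mask):
    """IoU between a box region and the foreground pixels of the mask."""
    x1, y1, x2, y2 = box
    inter = mask[y1:y2, x1:x2].sum()
    union = (x2 - x1) * (y2 - y1) + mask.sum() - inter
    return inter / union if union > 0 else 0.0

def select_proposals(boxes, mask, top_k=1):
    """Return the indices of the top_k boxes by overlap, plus all overlap values."""
    overlaps = np.array([mask_overlap(b, mask) for b in boxes])
    order = np.argsort(-overlaps, kind="stable")  # descending overlap
    return order[:top_k], overlaps
```

In practice the candidate `boxes` would be the EdgeBoxes output; only the few boxes that agree best with the mask are passed on to the MIL stage.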

Three-stage: more information about the objects’ boundary learned in a segmentation task can lead to acquisition of a better appearance model and then better object localization.

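Main idea: the segmentation supervision signal helps improve localization accuracy.
Weak segmentation supervision: the mask obtained from the previous stage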

Experimental Results

PASCAL VOC 2007

+3.3% classification compared with [18]
+1.6% correct localization compared with [27]
+0.6% compared with [6]

PASCAL VOC 2010

+3.3% compared with [6]

PASCAL VOC 2012

+8.8% compared with [18]

ILSVRC 2013

+5.5% compared with [18]

Object detection training

PASCAL VOC 2007 test set: Faster RCNN trained on the pseudo ground-truth (GT) bounding boxes generated by our cascaded networks performs slightly better than our transferred model (+0.3%).

[6] H. Bilen and A. Vedaldi. Weakly supervised deep detection networks. In CVPR, 2016.

[18] D. Li, J.-B. Huang, Y. Li, S. Wang, and M.-H. Yang. Weakly supervised object localization with progressive domain adaptation. In CVPR, 2016.

[27] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In ICLR, 2015.
