First, the project address from the original author: https://github.com/yangxue0827/R2CNN_FPN_Tensorflow
Let's start with the input data. The input must be a dataset in tfrecord format; fortunately the author already provides the corresponding code in the project, though it requires the raw data in VOC format. In an earlier note I saved code that turns plain images plus txt annotations into a VOC-format dataset (http://www.cnblogs.com/fourmi/p/8947342.html). Once that dataset has been generated, the batches are set up. BatchSize is set to 1, which is also known as online learning (https://blog.csdn.net/ycheng_sjtu/article/details/49804041) and can apparently hurt convergence. Below is the batch-generation code, with comments.
def next_batch(dataset_name, batch_size, shortside_len, is_training):
    if dataset_name not in ['tianchi', 'spacenet', 'pascal', 'coco']:
        raise ValueError('dataSet name must be in pascal or coco')
    if is_training:
        pattern = os.path.join('../data/tfrecords', dataset_name + '_train.tfrecord')
    else:
        pattern = os.path.join('../data/tfrecords', dataset_name + '_test.tfrecord')
    print('tfrecord path is -->', os.path.abspath(pattern))
    # check whether the tfrecord files can actually be found
    filename_tensorlist = tf.train.match_filenames_once(pattern)
    # tf.train.string_input_producer packs all the files we need into an internal
    # tf queue; file readers then pull filenames from this queue. Note that its
    # shuffle argument defaults to True.
    filename_queue = tf.train.string_input_producer(filename_tensorlist)
    # process and transform the image for data augmentation; returns the file name,
    # the coordinates with labels, and the number of objects
    img_name, img, gtboxes_and_label, num_obs = read_and_prepocess_single_img(filename_queue, shortside_len,
                                                                              is_training=is_training)
    # produce the batch here: queue capacity 100, 16 threads, dynamic padding
    img_name_batch, img_batch, gtboxes_and_label_batch, num_obs_batch = \
        tf.train.batch([img_name, img, gtboxes_and_label, num_obs],
                       batch_size=batch_size,
                       capacity=100,
                       num_threads=16,
                       dynamic_pad=True)
    return img_name_batch, img_batch, gtboxes_and_label_batch, num_obs_batch
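As a quick usage sketch (the 'tianchi' dataset and shortside_len=600 here are placeholder assumptions, not values from the post): tf.train.batch depends on queue runners, which must be started before any data can be pulled, and match_filenames_once needs the local-variable initializer.

img_name_batch, img_batch, gtboxes_and_label_batch, num_obs_batch = \
    next_batch(dataset_name='tianchi', batch_size=1, shortside_len=600, is_training=True)

init_op = tf.group(tf.global_variables_initializer(), tf.local_variables_initializer())
with tf.Session() as sess:
    sess.run(init_op)
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    name, img = sess.run([img_name_batch, img_batch])   # one batch of data
    coord.request_stop()
    coord.join(threads)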
The coordinates obtained above have the form (x0, y0, x1, y1, x2, y2, x3, y3); the author then converts them to (x_c, y_c, h, w), i.e. the box center plus its height and width, using an OpenCV function:
rect1 = cv2.minAreaRect(box)  # get the minimum-area bounding rectangle
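A minimal sketch of this conversion, modeled on the repo's back_forward_convert (the function name below is only illustrative, and the exact output ordering in the repo may differ):

import cv2
import numpy as np

def quad_to_rotated_rect(quad):
    # quad: [x0, y0, x1, y1, x2, y2, x3, y3] -> [x_c, y_c, w, h, theta]
    box = np.int0(np.array(quad)).reshape([4, 2])
    rect = cv2.minAreaRect(box)          # returns ((x_c, y_c), (w, h), theta)
    (x_c, y_c), (w, h), theta = rect
    return [x_c, y_c, w, h, theta]

print(quad_to_rotated_rect([0, 0, 10, 0, 10, 5, 0, 5]))  # axis-aligned example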
There is an interesting function here whose job is to wrap a plain Python function as a TensorFlow op:
gtboxes_and_label = tf.py_func(back_forward_convert,
                               inp=[tf.squeeze(gtboxes_and_label_batch, 0)],
                               Tout=tf.float32)  # tf.squeeze removes dimension 0 (the batch dim)
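For reference, tf.py_func turns a numpy-level function into a graph op. A tiny self-contained sketch (the function and values are invented for illustration):

import numpy as np
import tensorflow as tf

def double_boxes(boxes):          # a plain numpy function
    return (boxes * 2.0).astype(np.float32)

boxes_t = tf.constant([[1., 2., 3., 4.]], dtype=tf.float32)
doubled = tf.py_func(double_boxes, inp=[boxes_t], Tout=tf.float32)

with tf.Session() as sess:
    print(sess.run(doubled))      # [[2. 4. 6. 8.]]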
In this project, the R2CNN network consists of three major structures that echo the paper: share-net, RPN, and Fast R-CNN.
First, a word about share-net; here is the code to get a feel for it.
_, share_net = get_network_byname(net_name=cfgs.NET_NAME,   # NET_NAME = resnet_v1_101
                                  inputs=img_batch,
                                  num_classes=None,
                                  is_training=True,
                                  output_stride=None,
                                  global_pool=False,
                                  spatial_squeeze=False)
Quite clearly, a resnet is used here for feature extraction (whereas the paper uses the Faster R-CNN setup). All the tuning parameters for the resnet_v1_101 network can be changed in config_res101.py, and the network structure itself is defined in resnet_v1.py. Shown here is the definition of resnet_v1_101; other networks can be called as well.
def resnet_v1_101(inputs,
                  num_classes=None,
                  is_training=True,
                  global_pool=True,
                  output_stride=None,
                  spatial_squeeze=True,
                  reuse=None,
                  scope='resnet_v1_101'):
    """ResNet-101 model of [1]. See resnet_v1() for arg and return description."""
    blocks = [
        resnet_v1_block('block1', base_depth=64, num_units=3, stride=2),
        resnet_v1_block('block2', base_depth=128, num_units=4, stride=2),
        resnet_v1_block('block3', base_depth=256, num_units=23, stride=2),
        resnet_v1_block('block4', base_depth=512, num_units=3, stride=1),
    ]
    return resnet_v1(inputs, blocks, num_classes, is_training,
                     global_pool=global_pool, output_stride=output_stride,
                     include_root_block=True, spatial_squeeze=spatial_squeeze,
                     reuse=reuse, scope=scope)

resnet_v1_101.default_image_size = resnet_v1.default_image_size
Comparing the share-net code against the resnet structure makes it quite clear, and that's really all there is to it; it almost feels like squandering a treasure...
Next, the RPN part of the code. RPN is a classic by now, but I have only been studying deep learning for a short while and cannot yet claim a solid understanding, so here are some blog posts from other experts that we can all learn from: https://blog.csdn.net/jiongnima/article/details/79781792, https://blog.csdn.net/happyflyy/article/details/54917514
# ***********************************************************************************************
# *                                            rpn                                             *
# ***********************************************************************************************
rpn = build_rpn.RPN(net_name=cfgs.NET_NAME,
                    inputs=img_batch,
                    gtboxes_and_label=gtboxes_and_label_minAreaRectangle,
                    is_training=True,
                    share_head=cfgs.SHARE_HEAD,  # whether to share the head; set to False here
                    share_net=share_net,         # the resnet_v1_101 features passed in
                    stride=cfgs.STRIDE,          # STRIDE = [4, 8, 16, 32, 64]
                    anchor_ratios=cfgs.ANCHOR_RATIOS,  # ANCHOR_RATIOS = [1 / 3., 1., 3.0]
                    anchor_scales=cfgs.ANCHOR_SCALES,  # ANCHOR_SCALES = [1.]
                    scale_factors=cfgs.SCALE_FACTORS,  # SCALE_FACTORS = [10., 10., 5., 5., 5.]
                    base_anchor_size_list=cfgs.BASE_ANCHOR_SIZE_LIST,  # P2, P3, P4, P5, P6
                    level=cfgs.LEVEL,
                    top_k_nms=cfgs.RPN_TOP_K_NMS,
                    rpn_nms_iou_threshold=cfgs.RPN_NMS_IOU_THRESHOLD,  # 0.7
                    max_proposals_num=cfgs.MAX_PROPOSAL_NUM,
                    rpn_iou_positive_threshold=cfgs.RPN_IOU_POSITIVE_THRESHOLD,
                    rpn_iou_negative_threshold=cfgs.RPN_IOU_NEGATIVE_THRESHOLD,
                    # iou >= 0.7 is a positive box, iou < 0.3 is negative
                    rpn_mini_batch_size=cfgs.RPN_MINIBATCH_SIZE,
                    rpn_positives_ratio=cfgs.RPN_POSITIVE_RATE,
                    remove_outside_anchors=False,  # whether to remove anchors outside the image
                    rpn_weight_decay=cfgs.WEIGHT_DECAY[cfgs.NET_NAME])
rpn_proposals_boxes, rpn_proposals_scores = rpn.rpn_proposals()  # rpn_score shape: [300, ]

rpn_location_loss, rpn_classification_loss = rpn.rpn_losses()
rpn_total_loss = rpn_classification_loss + rpn_location_loss
Intuitively the RPN code has three main parts: building and initializing the RPN network, generating the proposals together with their text/non-text scores, and defining the loss, which splits into a regression loss and a classification loss. The key step is proposal generation, which starts from anchors; this code uses five anchor sizes (32, 64, 128, 256, 512). First a feature pyramid is built: the sliding-window features start from resnet_v1_101/block4, which serves as P5, and one pooling of P5 gives P6; then block4, block3 and block2 of resnet_v1_101 each go through an upsample-conv-add-conv step in turn, forming the remaining pyramid levels. The result is a set of feature maps at several scales (P2, P3, P4, P5, P6, where P6 is obtained by max-pooling P5). Anchors are generated per pyramid level, each level's feature map using its own anchor base size: (P2, 32), (P3, 64), (P4, 128), (P5, 256), (P6, 512).
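Here is a minimal sketch of that top-down pathway. It is an assumption-laden condensation, not the repo's exact code: the function and scope names are illustrative, and the 256-channel width and nearest-neighbor resize are taken from typical FPN implementations.

import tensorflow as tf
import tensorflow.contrib.slim as slim

def build_feature_pyramid(C2, C3, C4, C5):
    # C2..C5 are assumed to be the outputs of resnet block1..block4
    pyramid = {}
    pyramid['P5'] = slim.conv2d(C5, num_outputs=256, kernel_size=[1, 1], scope='build_P5')
    # P6 comes from max-pooling P5
    pyramid['P6'] = slim.max_pool2d(pyramid['P5'], kernel_size=[2, 2], stride=2, scope='build_P6')
    for level, C in zip([4, 3, 2], [C4, C3, C2]):
        up = tf.image.resize_nearest_neighbor(pyramid['P%d' % (level + 1)],
                                              tf.shape(C)[1:3])               # upsample
        lateral = slim.conv2d(C, num_outputs=256, kernel_size=[1, 1],
                              scope='lateral_P%d' % level)                    # conv
        merged = up + lateral                                                 # add
        pyramid['P%d' % level] = slim.conv2d(merged, num_outputs=256,
                                             kernel_size=[3, 3],
                                             scope='smooth_P%d' % level)      # conv
    return pyramid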
Anchor generation involves a base_anchor and anchor_scales: the base anchor is first resized according to anchor_scales, then stretched to the aspect ratios in anchor_ratios, giving several anchor shapes to choose from at each location. Multiplying the feature-map grid positions by the stride gives the anchor centers, and the code documented below then produces the final anchors.
:return: anchors of shape [w * h * len(anchor_scales) * len(anchor_ratios), 4]  # [y_center, x_center, h, w]
The number and layout of the generated anchors can be read off directly from this docstring.
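As a concrete illustration, here is a minimal numpy sketch of this per-level generation. The helper name and the tiny 4x4 feature map are assumptions for demonstration; the repo's actual implementation lives in its anchor utilities.

import numpy as np

def make_anchors(base_size, scales, ratios, feat_h, feat_w, stride):
    # enumerate (h, w) shapes: scale the base anchor, then stretch by each ratio
    sizes = base_size * np.array(scales)
    hs = np.concatenate([sizes * np.sqrt(r) for r in ratios])
    ws = np.concatenate([sizes / np.sqrt(r) for r in ratios])
    # anchor centers: feature-map grid coordinates times the stride
    y_c, x_c = np.meshgrid(np.arange(feat_h) * stride,
                           np.arange(feat_w) * stride, indexing='ij')
    y_c, x_c = y_c.reshape(-1, 1), x_c.reshape(-1, 1)
    # pair every center with every (h, w) shape
    anchors = np.hstack([np.repeat(y_c, len(hs), axis=0),
                         np.repeat(x_c, len(hs), axis=0),
                         np.tile(hs, feat_h * feat_w).reshape(-1, 1),
                         np.tile(ws, feat_h * feat_w).reshape(-1, 1)])
    return anchors.astype(np.float32)   # [y_center, x_center, h, w]

# e.g. the P2 level: base size 32, stride 4, on an assumed 4x4 feature map
anchors_p2 = make_anchors(32, [1.], [1 / 3., 1., 3.0], feat_h=4, feat_w=4, stride=4)
print(anchors_p2.shape)   # (4*4*3, 4) = (48, 4)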
With the anchors in hand, the next step is the definition of the RPN network itself; the code is as follows:
def rpn_net(self):
    rpn_encode_boxes_list = []
    rpn_scores_list = []
    with tf.variable_scope('rpn_net'):
        with slim.arg_scope([slim.conv2d], weights_regularizer=slim.l2_regularizer(self.rpn_weight_decay)):
            for level in self.level:
                if self.share_head:
                    reuse_flag = None if level == 'P2' else True
                    scope_list = ['conv2d_3x3', 'rpn_classifier', 'rpn_regressor']
                else:
                    reuse_flag = None
                    scope_list = ['conv2d_3x3_' + level, 'rpn_classifier_' + level, 'rpn_regressor_' + level]
                rpn_conv2d_3x3 = slim.conv2d(inputs=self.feature_pyramid[level],
                                             num_outputs=256,
                                             kernel_size=[3, 3],
                                             stride=1,
                                             scope=scope_list[0],
                                             reuse=reuse_flag)
                rpn_box_scores = slim.conv2d(rpn_conv2d_3x3,
                                             num_outputs=2 * self.num_of_anchors_per_location,
                                             kernel_size=[1, 1],
                                             stride=1,
                                             scope=scope_list[1],
                                             activation_fn=None,
                                             reuse=reuse_flag)
                rpn_encode_boxes = slim.conv2d(rpn_conv2d_3x3,
                                               num_outputs=4 * self.num_of_anchors_per_location,
                                               kernel_size=[1, 1],
                                               stride=1,
                                               scope=scope_list[2],
                                               activation_fn=None,
                                               reuse=reuse_flag)
                rpn_box_scores = tf.reshape(rpn_box_scores, [-1, 2])
                rpn_encode_boxes = tf.reshape(rpn_encode_boxes, [-1, 4])
                rpn_scores_list.append(rpn_box_scores)
                rpn_encode_boxes_list.append(rpn_encode_boxes)
            rpn_all_encode_boxes = tf.concat(rpn_encode_boxes_list, axis=0)
            rpn_all_boxes_scores = tf.concat(rpn_scores_list, axis=0)
    return rpn_all_encode_boxes, rpn_all_boxes_scores
"with tf.variable_scope('rpn_net'):" opens the scope. On top of each feature-pyramid level, a 3x3 convolution produces rpn_conv2d_3x3, after which the path forks: one 1x1 branch does classification (the text/non-text score) and the other does regression (predicting the four box coordinates). The per-level classification and regression outputs are then concatenated into two tensors (all scores and all encoded boxes). That is the entire job of the RPN network?!! Afterwards the top-scoring boxes are kept, NMS with an IoU threshold of 0.7 is applied to get the surviving indices, and those indices pick out the boxes (the top performers), returning the proposals along with their scores. So much for proposals; after the sketch below comes the RPN loss.
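A minimal sketch of that selection step, condensing what rpn_proposals does rather than copying it; the helper name and the top_k/max_num defaults here are assumptions, and the boxes are assumed to be decoded [N, 4] tensors:

def select_proposals(decode_boxes, scores, top_k=12000, nms_iou=0.7, max_num=300):
    # keep the top-k highest-scoring boxes first
    scores, top_k_indices = tf.nn.top_k(scores, k=top_k)
    decode_boxes = tf.gather(decode_boxes, top_k_indices)
    # NMS: drop boxes that overlap a higher-scoring box with IoU above the threshold
    valid_indices = tf.image.non_max_suppression(decode_boxes, scores,
                                                 max_output_size=max_num,
                                                 iou_threshold=nms_iou)
    return tf.gather(decode_boxes, valid_indices), tf.gather(scores, valid_indices)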
On to the definition of rpn_losses. Here we go.
def rpn_losses(self):
    with tf.variable_scope('rpn_losses'):
        minibatch_indices, minibatch_anchor_matched_gtboxes, object_mask, minibatch_labels_one_hot = \
            self.make_minibatch(self.anchors)

        minibatch_anchors = tf.gather(self.anchors, minibatch_indices)
        minibatch_encode_boxes = tf.gather(self.rpn_encode_boxes, minibatch_indices)
        minibatch_boxes_scores = tf.gather(self.rpn_scores, minibatch_indices)

        # encode gtboxes
        minibatch_encode_gtboxes = encode_and_decode.encode_boxes(unencode_boxes=minibatch_anchor_matched_gtboxes,
                                                                  reference_boxes=minibatch_anchors,
                                                                  scale_factors=self.scale_factors)

        positive_anchors_in_img = draw_box_with_color(self.img_batch,
                                                      minibatch_anchors * tf.expand_dims(object_mask, 1),
                                                      text=tf.shape(tf.where(tf.equal(object_mask, 1.0)))[0])
        negative_mask = tf.cast(tf.logical_not(tf.cast(object_mask, tf.bool)), tf.float32)
        negative_anchors_in_img = draw_box_with_color(self.img_batch,
                                                      minibatch_anchors * tf.expand_dims(negative_mask, 1),
                                                      text=tf.shape(tf.where(tf.equal(object_mask, 0.0)))[0])
        minibatch_decode_boxes = encode_and_decode.decode_boxes(encode_boxes=minibatch_encode_boxes,
                                                                reference_boxes=minibatch_anchors,
                                                                scale_factors=self.scale_factors)
        tf.summary.image('/positive_anchors', positive_anchors_in_img)
        tf.summary.image('/negative_anchors', negative_anchors_in_img)
        top_k_scores, top_k_indices = tf.nn.top_k(minibatch_boxes_scores[:, 1], k=5)
        top_detections_in_img = draw_box_with_color(self.img_batch,
                                                    tf.gather(minibatch_decode_boxes, top_k_indices),
                                                    text=tf.shape(top_k_scores)[0])
        tf.summary.image('/top_5', top_detections_in_img)

        # losses
        with tf.variable_scope('rpn_location_loss'):
            location_loss = losses.l1_smooth_losses(predict_boxes=minibatch_encode_boxes,
                                                    gtboxes=minibatch_encode_gtboxes,
                                                    object_weights=object_mask)
            slim.losses.add_loss(location_loss)  # add smooth l1 loss to losses collection

        with tf.variable_scope('rpn_classification_loss'):
            classification_loss = slim.losses.softmax_cross_entropy(logits=minibatch_boxes_scores,
                                                                    onehot_labels=minibatch_labels_one_hot)

        return location_loss, classification_loss
As can be seen above, rpn_loss operates on a minibatch. So what exactly is a minibatch? Inside make_minibatch a function called "rpn_find_positive_negative_samples" is invoked.
# The docstring of this function:
'''
assign anchors targets: object or background.
:param anchors: [valid_num_of_anchors, 4]. use N to represent valid_num_of_anchors
:return: labels. anchors_matched_gtboxes, object_mask
         labels shape is [N, ]. positive is 1, negative is 0, ignored is -1
         anchor_matched_gtboxes. each anchor's gtbox (only positive box has gtbox). shape is [N, 4]
         object_mask. tf.float32. 1.0 represent box is object, 0.0 is others. shape is [N, ]
'''
An IoU matrix between the anchors and the gtboxes is computed first. For each anchor (each row of the matrix) the maximum IoU is found and compared against 0.7; anchors above the threshold are positive. In addition, the per-column maximum mask is summed along axis 1 so that, for every gtbox, the anchor matching it best is also marked positive:
labels = tf.ones(shape=[tf.shape(anchors)[0], ], dtype=tf.float32) * (-1)  # [N, ] # ignored is -1

positives2 = tf.reduce_sum(tf.cast(tf.equal(ious, max_iou_each_column), tf.float32), axis=1)
positives = tf.logical_or(positives1, tf.cast(positives2, tf.bool))

labels += 2 * tf.cast(positives, tf.float32)  # Now, positive is 1, ignored and background is -1
After these few lines, positives are labeled 1 and everything else is still -1. The arithmetic looks cryptic at first (labels = -1 plus 2 times a 0/1 mask?), but it is exactly a one-to-one update: labels start as all -1, and adding 2 wherever the positives mask is 1 flips just those entries from -1 to +1.
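To make that concrete, a tiny numpy demonstration (values invented for illustration):

import numpy as np

labels = np.full(5, -1.0)                        # start: everything ignored (-1)
positives = np.array([1, 0, 0, 1, 0], float)     # hypothetical positive mask
labels += 2 * positives
print(labels)                                    # [ 1. -1. -1.  1. -1.]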
matchs = tf.cast(tf.argmax(ious, axis=1), tf.int32)
anchors_matched_gtboxes = tf.gather(gtboxes, matchs)  # [N, 4]
Using the code above, each anchor is matched to its best-scoring ground-truth box. Finding the negatives works much the same way; the code is pasted below so you can compare the two.
negatives = tf.less(max_iou_each_row, self.rpn_iou_negative_threshold)
negatives = tf.logical_and(negatives, tf.greater_equal(max_iou_each_row, 0.1))

labels = labels + tf.cast(negatives, tf.float32)  # [N, ] positive is >=1.0, negative is 0, ignored is -1.0
'''
Need to note: when opsitive, labels may >= 1.0.
Because, when all the iou < 0.7, we set anchors having max iou each column as positive.
these anchors may have iou < 0.3.
In the begining, labels is [-1, -1, -1...-1]
then anchors having iou < 0.3 as well as are max iou each column will be +1.0.
when decide negatives, because of iou < 0.3, they add 1.0 again. So, the final result will be 2.0
So, when opsitive, labels may in [1.0, 2.0]. that is labels >= 1.0
'''
positives = tf.cast(tf.greater_equal(labels, 1.0), tf.float32)
ignored = tf.cast(tf.equal(labels, -1.0), tf.float32) * -1

labels = positives + ignored
object_mask = tf.cast(positives, tf.float32)  # 1.0 is object, 0.0 is others
That concludes the walkthrough of the RPN portion of the code.
Next comes Fast R-CNN, the final part.
# ***********************************************************************************************
# *                                         Fast RCNN                                          *
# ***********************************************************************************************
fast_rcnn = build_fast_rcnn1.FastRCNN(feature_pyramid=rpn.feature_pyramid,
                                      rpn_proposals_boxes=rpn_proposals_boxes,
                                      rpn_proposals_scores=rpn_proposals_scores,
                                      img_shape=tf.shape(img_batch),
                                      roi_size=cfgs.ROI_SIZE,
                                      roi_pool_kernel_size=cfgs.ROI_POOL_KERNEL_SIZE,
                                      scale_factors=cfgs.SCALE_FACTORS,
                                      gtboxes_and_label=gtboxes_and_label,
                                      gtboxes_and_label_minAreaRectangle=gtboxes_and_label_minAreaRectangle,
                                      fast_rcnn_nms_iou_threshold=cfgs.FAST_RCNN_NMS_IOU_THRESHOLD,
                                      fast_rcnn_maximum_boxes_per_img=100,
                                      fast_rcnn_nms_max_boxes_per_class=cfgs.FAST_RCNN_NMS_MAX_BOXES_PER_CLASS,
                                      show_detections_score_threshold=cfgs.FINAL_SCORE_THRESHOLD,  # show detections which score >= 0.6
                                      num_classes=cfgs.CLASS_NUM,
                                      fast_rcnn_minibatch_size=cfgs.FAST_RCNN_MINIBATCH_SIZE,
                                      fast_rcnn_positives_ratio=cfgs.FAST_RCNN_POSITIVE_RATE,
                                      fast_rcnn_positives_iou_threshold=cfgs.FAST_RCNN_IOU_POSITIVE_THRESHOLD,
                                      # iou > 0.5 is positive, iou < 0.5 is negative
                                      use_dropout=cfgs.USE_DROPOUT,
                                      weight_decay=cfgs.WEIGHT_DECAY[cfgs.NET_NAME],
                                      is_training=True,
                                      level=cfgs.LEVEL)

fast_rcnn_decode_boxes, fast_rcnn_score, num_of_objects, detection_category, \
fast_rcnn_decode_boxes_rotate, fast_rcnn_score_rotate, num_of_objects_rotate, detection_category_rotate = \
    fast_rcnn.fast_rcnn_predict()

fast_rcnn_location_loss, fast_rcnn_classification_loss, \
fast_rcnn_location_rotate_loss, fast_rcnn_classification_rotate_loss = fast_rcnn.fast_rcnn_loss()
fast_rcnn_total_loss = fast_rcnn_location_loss + fast_rcnn_classification_loss + \
                       fast_rcnn_location_rotate_loss + fast_rcnn_classification_rotate_loss
First take a look at the code below, which defines the Fast R-CNN head.
def fast_rcnn_net(self):
    with tf.variable_scope('fast_rcnn_net'):
        with slim.arg_scope([slim.fully_connected], weights_regularizer=slim.l2_regularizer(self.weight_decay)):
            flatten_rois_features = slim.flatten(self.fast_rcnn_all_level_rois)

            net = slim.fully_connected(flatten_rois_features, 1024, scope='fc_1')
            if self.use_dropout:
                net = slim.dropout(net, keep_prob=0.5, is_training=self.is_training, scope='dropout')
            net = slim.fully_connected(net, 1024, scope='fc_2')

            fast_rcnn_scores = slim.fully_connected(net, self.num_classes + 1, activation_fn=None,
                                                    scope='classifier')
            fast_rcnn_encode_boxes = slim.fully_connected(net, self.num_classes * 4, activation_fn=None,
                                                          scope='regressor')
            if DEBUG:
                print_tensors(fast_rcnn_encode_boxes, 'fast_rcnn_encode_bxes')

    with tf.variable_scope('fast_rcnn_net_rotate'):
        with slim.arg_scope([slim.fully_connected], weights_regularizer=slim.l2_regularizer(self.weight_decay)):
            flatten_rois_features_rotate = slim.flatten(self.fast_rcnn_all_level_rois)

            net_rotate = slim.fully_connected(flatten_rois_features_rotate, 1024, scope='fc_1')
            if self.use_dropout:
                net_rotate = slim.dropout(net_rotate, keep_prob=0.5, is_training=self.is_training, scope='dropout')
            net_rotate = slim.fully_connected(net_rotate, 1024, scope='fc_2')

            fast_rcnn_scores_rotate = slim.fully_connected(net_rotate, self.num_classes + 1, activation_fn=None,
                                                           scope='classifier')
            fast_rcnn_encode_boxes_rotate = slim.fully_connected(net_rotate, self.num_classes * 5, activation_fn=None,
                                                                 scope='regressor')

    return fast_rcnn_encode_boxes, fast_rcnn_scores, fast_rcnn_encode_boxes_rotate, fast_rcnn_scores_rotate
The definition uses fully connected layers. Pay attention to this line:
flatten_rois_features = slim.flatten(self.fast_rcnn_all_level_rois)
self.fast_rcnn_all_level_rois obtains the regions of interest from the feature maps. Roughly, the process is: find the pyramid level each rpn proposal belongs to, extract its coordinates and normalize them, use the normalized coordinates to crop the corresponding region from that level's feature map, and finish with a max-pooling step. A minimal sketch follows.
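This sketch assumes pixel-space boxes [ymin, xmin, ymax, xmax] and a single feature map; the helper name, roi_size=14 and the 2x2 pooling are illustrative stand-ins for the repo's cfgs values:

import tensorflow as tf
import tensorflow.contrib.slim as slim

def get_rois(feature_map, boxes, img_h, img_w, roi_size=14, pool_kernel=2):
    # boxes: [N, 4] in pixels; img_h and img_w are floats
    ymin, xmin, ymax, xmax = tf.unstack(boxes, axis=1)
    # normalize the coordinates, as tf.image.crop_and_resize expects boxes in [0, 1]
    normalized_boxes = tf.stack([ymin / img_h, xmin / img_w,
                                 ymax / img_h, xmax / img_w], axis=1)
    # crop the matching region out of the feature map and resize it to roi_size
    rois = tf.image.crop_and_resize(feature_map, normalized_boxes,
                                    box_ind=tf.zeros_like(ymin, dtype=tf.int32),
                                    crop_size=[roi_size, roi_size])
    # the final max-pooling step
    return slim.max_pool2d(rois, [pool_kernel, pool_kernel], stride=pool_kernel)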
self.fast_rcnn_encode_boxes, self.fast_rcnn_scores, \
self.fast_rcnn_encode_boxes_rotate, self.fast_rcnn_scores_rotate = self.fast_rcnn_net()
fast_rcnn_encode_boxes and fast_rcnn_scores both come from fast_rcnn_net, which is a fully connected network. From the ROI boxes and scores obtained above, the Fast R-CNN proposals can be derived:
def fast_rcnn_proposals_rotate(self, decode_boxes, scores):
    '''
    mutilclass NMS
    :param decode_boxes: [N, num_classes*5]
    :param scores: [N, num_classes+1]
    :return:
        detection_boxes : [-1, 5]
        scores : [-1, ]
    '''
    with tf.variable_scope('fast_rcnn_proposals'):
        category = tf.argmax(scores, axis=1)
        object_mask = tf.cast(tf.not_equal(category, 0), tf.float32)

        decode_boxes = decode_boxes * tf.expand_dims(object_mask, axis=1)  # make background box is [0 0 0 0, 0]
        scores = scores * tf.expand_dims(object_mask, axis=1)

        decode_boxes = tf.reshape(decode_boxes, [-1, self.num_classes, 5])  # [N, num_classes, 5]
        decode_boxes_list = tf.unstack(decode_boxes, axis=1)
        score_list = tf.unstack(scores[:, 1:], axis=1)

        after_nms_boxes = []
        after_nms_scores = []
        category_list = []
        for per_class_decode_boxes, per_class_scores in zip(decode_boxes_list, score_list):
            valid_indices = nms_rotate.nms_rotate(decode_boxes=per_class_decode_boxes,
                                                  scores=per_class_scores,
                                                  iou_threshold=self.fast_rcnn_nms_iou_threshold,
                                                  max_output_size=self.fast_rcnn_nms_max_boxes_per_class,
                                                  use_angle_condition=False,
                                                  angle_threshold=15,
                                                  use_gpu=cfgs.ROTATE_NMS_USE_GPU)
            after_nms_boxes.append(tf.gather(per_class_decode_boxes, valid_indices))
            after_nms_scores.append(tf.gather(per_class_scores, valid_indices))
            tmp_category = tf.gather(category, valid_indices)
            category_list.append(tmp_category)

        all_nms_boxes = tf.concat(after_nms_boxes, axis=0)
        all_nms_scores = tf.concat(after_nms_scores, axis=0)
        all_category = tf.concat(category_list, axis=0)
        all_nms_boxes = boxes_utils.clip_boxes_to_img_boundaries_five(all_nms_boxes, img_shape=self.img_shape)
        print('all_nms_boxes:', all_nms_boxes)

        scores_large_than_threshold_indices = \
            tf.reshape(tf.where(tf.greater(all_nms_scores, self.show_detections_score_threshold)), [-1])

        all_nms_boxes = tf.gather(all_nms_boxes, scores_large_than_threshold_indices)
        all_nms_scores = tf.gather(all_nms_scores, scores_large_than_threshold_indices)
        all_category = tf.gather(all_category, scores_large_than_threshold_indices)

        return all_nms_boxes, all_nms_scores, tf.shape(all_nms_boxes)[0], all_category
Next the loss function is defined; its form is broadly similar to the RPN one, so I won't elaborate further.
fast_rcnn_total_loss = fast_rcnn_location_loss + fast_rcnn_classification_loss + \
                       fast_rcnn_location_rotate_loss + fast_rcnn_classification_rotate_loss
Here are a few test result images: boxes without numbers are the ground-truth labels, boxes with numbers are predictions, shown both without and with the rotation angle.
And so, after several days of tinkering, this is more or less done. Reading through the code clarifies the entire pipeline and really helps organize one's thinking, especially for understanding how a paper gets turned into a working implementation. My own level is still very limited, though, and many details of this expert's code still escape me, which feels a bit like squandering good material. The road ahead is long; onward we go!!!
2018-04-29 09:44:21