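YOLO, short for You Only Look Once, is an object detection algorithm based on deep convolutional neural networks. YOLO v3 is the third version of YOLO, with faster and more accurate detection (dated April 8, 2018).
Source code for this article: https://github.com/SpikeKing/keras-yolo3-detection
Feel free to follow my GitHub: https://github.com/SpikeKing
YOLO v3 already provides model parameters trained on the COCO (Common Objects in Context) dataset, which can be used directly for object detection. The model is 248 MB. Download: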
wget https://pjreddie.com/media/files/yolov3.weights
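Convert the model parameters to Keras model parameters (the converted model is 248.6 MB). The command below expects yolov3.weights to be in model_data/. Convert: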
python convert.py -w yolov3.cfg model_data/yolov3.weights model_data/yolo_weights.h5
Plot the network structure:
from keras.utils import plot_model
plot_model(model, to_file='./model_data/model.png', show_shapes=True, show_layer_names=True)  # network diagram
COCO contains 80 classes:
person
bicycle, car, motorbike, aeroplane, bus, train, truck, boat
traffic light, fire hydrant, stop sign, parking meter, bench
bird, cat, dog, horse, sheep, cow, elephant, bear, zebra, giraffe
backpack, umbrella, handbag, tie, suitcase
frisbee, skis, snowboard, sports ball, kite, baseball bat, baseball glove, skateboard, surfboard, tennis racket
bottle, wine glass, cup, fork, knife, spoon, bowl
banana, apple, sandwich, orange, broccoli, carrot, hot dog, pizza, donut, cake
chair, sofa, pottedplant, bed, diningtable, toilet, tvmonitor
laptop, mouse, remote, keyboard, cell phone
microwave, oven, toaster, sink, refrigerator
book, clock, vase, scissors, teddy bear, hair drier, toothbrush
YOLO uses 9 anchors by default, covering three scales with 3 anchors per scale, as follows:
10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326
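To make the "9 anchors, 3 per scale" layout concrete, here is a minimal sketch that parses the anchor line above into (width, height) pairs and splits them into three groups. The helper names `parse_anchors` and `group_by_scale` are made up for this illustration, and grouping consecutive triples by size is an assumption; which output layer uses which group is decided inside the model code, which is not shown here.

```python
# Anchor string as listed above: nine width,height pairs.
ANCHORS_TEXT = "10,13, 16,30, 33,23, 30,61, 62,45, 59,119, 116,90, 156,198, 373,326"


def parse_anchors(text):
    """Turn 'w,h, w,h, ...' into a list of (width, height) tuples."""
    values = [float(x) for x in text.split(",")]  # float() ignores surrounding spaces
    return list(zip(values[0::2], values[1::2]))


def group_by_scale(anchors, per_scale=3):
    """Split anchors (ordered small to large) into consecutive groups (assumed layout)."""
    return [anchors[i:i + per_scale] for i in range(0, len(anchors), per_scale)]


anchors = parse_anchors(ANCHORS_TEXT)
groups = group_by_scale(anchors)
for label, group in zip(("small", "medium", "large"), groups):
    print(label, group)
```

This mirrors what `_get_anchors` does with NumPy's `reshape(-1, 2)`, just in plain Python.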
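Constructor of the YOLO detector class: self.class_names and self.anchors hold the classes and anchors read from file; self.sess is the TensorFlow session; self.model_image_size is the detection image size, to which the original image is resized with its aspect ratio preserved and the remaining area padded; self.generate() is the parameter flow, which outputs the boxes, scores (confidences), and classes. Source: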
import os

import numpy as np
from keras import backend as K


class YOLO(object):
    def __init__(self):
        self.anchors_path = 'configs/yolo_anchors.txt'  # anchors file
        self.model_path = 'model_data/yolo_weights.h5'  # model file
        self.classes_path = 'configs/coco_classes.txt'  # classes file
        self.score = 0.3  # confidence threshold
        # self.iou = 0.45
        self.iou = 0.20  # overlap (IoU) threshold
        self.class_names = self._get_class()  # load the class names
        self.anchors = self._get_anchors()  # load the anchors
        self.sess = K.get_session()
        self.model_image_size = (416, 416)  # fixed size or (None, None), hw
        self.boxes, self.scores, self.classes = self.generate()

    def _get_class(self):
        classes_path = os.path.expanduser(self.classes_path)
        with open(classes_path) as f:
            class_names = f.readlines()
        class_names = [c.strip() for c in class_names]
        return class_names

    def _get_anchors(self):
        anchors_path = os.path.expanduser(self.anchors_path)
        with open(anchors_path) as f:
            anchors = f.readline()
        anchors = [float(x) for x in anchors.split(',')]
        return np.array(anchors).reshape(-1, 2)
Parameter flow: outputs the boxes, scores (confidences), and classes. The network is built with yolo_body and the parameters are loaded into it as yolo_model. Source:
    def generate(self):
        model_path = os.path.expanduser(self.model_path)  # expand ~
        assert model_path.endswith('.h5'), 'Keras model or weights must be a .h5 file.'
        num_anchors = len(self.anchors)  # number of anchors
        num_classes = len(self.class_names)  # number of classes
        # Load the model parameters
        self.yolo_model = yolo_body(Input(shape=(None, None, 3)), 3, num_classes)
        self.yolo_model.load_weights(model_path)
        print('{} model, {} anchors, and {} classes loaded.'.format(model_path, num_anchors, num_classes))
        # A different color for each class of box
        hsv_tuples = [(float(x) / len(self.class_names), 1., 1.)
                      for x in range(len(self.class_names))]  # evenly spaced hues
        self.colors = list(map(lambda x: colorsys.hsv_to_rgb(*x), hsv_tuples))
        self.colors = list(map(lambda x: (int(x[0] * 255), int(x[1] * 255), int(x[2] * 255)), self.colors))  # RGB
        np.random.seed(10101)
        np.random.shuffle(self.colors)
        np.random.seed(None)
        # Filter the boxes according to the detection parameters
        self.input_image_shape = K.placeholder(shape=(2,))
        boxes, scores, classes = yolo_eval(self.yolo_model.output, self.anchors, len(self.class_names),
                                           self.input_image_shape, score_threshold=self.score, iou_threshold=self.iou)
        return boxes, scores, classes
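The two thresholds passed to yolo_eval (score_threshold and iou_threshold) control how candidate boxes are filtered. The internals of yolo_eval are not shown in this article, so the following is only a generic sketch of the usual idea: drop low-confidence boxes, then apply greedy non-maximum suppression (NMS). The functions `iou` and `filter_boxes` are hypothetical names for this illustration, the sample boxes are made up, and the sketch ignores classes for simplicity.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (top, left, bottom, right)."""
    top, left = max(a[0], b[0]), max(a[1], b[1])
    bottom, right = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, bottom - top) * max(0.0, right - left)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_boxes(boxes, scores, score_threshold=0.3, iou_threshold=0.20):
    """Return indices of kept boxes: score filter first, then greedy NMS."""
    order = sorted(
        (i for i, s in enumerate(scores) if s >= score_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


# Made-up sample data: two near-duplicates, one separate box, one weak box.
boxes = [(10, 10, 110, 110), (12, 12, 112, 112), (200, 200, 260, 260), (0, 0, 50, 50)]
scores = [0.90, 0.80, 0.75, 0.10]
print(filter_boxes(boxes, scores))  # [0, 2]
```

With the defaults above (0.3 and 0.20, matching self.score and self.iou), the near-duplicate box and the low-confidence box are removed. A lower IoU threshold suppresses overlapping boxes more aggressively.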
The detection method detect_image
Step 1, image preprocessing:
if self.model_image_size != (None, None):  # 416x416, 416 = 32 * 13; must be a multiple of 32, since the coarsest scale divides the input by 32
    assert self.model_image_size[0] % 32 == 0, 'Multiples of 32 required'
    assert self.model_image_size[1] % 32 == 0, 'Multiples of 32 required'
    boxed_image = letterbox_image(image, tuple(reversed(self.model_image_size)))  # resize and pad the image
else:
    new_image_size = (image.width - (image.width % 32), image.height - (image.height % 32))
    boxed_image = letterbox_image(image, new_image_size)
image_data = np.array(boxed_image, dtype='float32')
print('detector size {}'.format(image_data.shape))
image_data /= 255.  # scale to 0~1
image_data = np.expand_dims(image_data, 0)  # add the batch dimension (one extra axis)
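The letterbox_image helper is not listed in this article, so here is a rough sketch of the geometry that "resize with the same aspect ratio, then pad" implies, plus the rounding down to a multiple of 32 used in the else branch. The names `letterbox_geometry` and `floor_to_multiple` are invented for this illustration, centering the image inside the padding is an assumption, and only the arithmetic is shown (no actual pixels are touched).

```python
def letterbox_geometry(image_size, target_size):
    """Compute the resized size and paste offset for aspect-preserving padding.

    Both sizes are (width, height). Returns ((new_w, new_h), (offset_x, offset_y)).
    """
    iw, ih = image_size
    tw, th = target_size
    scale = min(tw / iw, th / ih)  # largest scale at which the image still fits
    new_w, new_h = int(iw * scale), int(ih * scale)
    offset = ((tw - new_w) // 2, (th - new_h) // 2)  # assumed: image is centered
    return (new_w, new_h), offset


def floor_to_multiple(value, base=32):
    """Largest multiple of `base` that does not exceed `value`."""
    return value - (value % base)


print(letterbox_geometry((1280, 720), (416, 416)))     # ((416, 234), (0, 91))
print(floor_to_multiple(1280), floor_to_multiple(720))  # 1280 704
print(416 // 32)                                        # 13
```

Note that self.model_image_size is stored as (height, width), while the call in step 1 passes tuple(reversed(...)), i.e. (width, height), which is the order used in this sketch.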
Step 2, feed the data: the image and the image size.
out_boxes, out_scores, out_classes = self.sess.run(
    [self.boxes, self.scores, self.classes],
    feed_dict={
        self.yolo_model.input: image_data,
        self.input_image_shape: [image.size[1], image.size[0]],  # (height, width)
        K.learning_phase(): 0  # inference mode
    })
Step 3, draw the boxes: the border thickness is set automatically, and the boxes and class labels are drawn with Pillow.
font = ImageFont.truetype(font='font/FiraMono-Medium.otf',
                          size=np.floor(3e-2 * image.size[1] + 0.5).astype('int32'))  # font
thickness = (image.size[0] + image.size[1]) // 512  # border thickness
for i, c in reversed(list(enumerate(out_classes))):
    predicted_class = self.class_names[c]  # class
    box = out_boxes[i]  # box
    score = out_scores[i]  # confidence
    label = '{} {:.2f}'.format(predicted_class, score)  # label
    draw = ImageDraw.Draw(image)  # drawing context
    # Note: textsize was removed in Pillow 10; newer versions provide textbbox instead.
    label_size = draw.textsize(label, font)  # size of the label text
    top, left, bottom, right = box
    top = max(0, np.floor(top + 0.5).astype('int32'))
    left = max(0, np.floor(left + 0.5).astype('int32'))
    bottom = min(image.size[1], np.floor(bottom + 0.5).astype('int32'))
    right = min(image.size[0], np.floor(right + 0.5).astype('int32'))
    print(label, (left, top), (right, bottom))  # box corners
    if top - label_size[1] >= 0:  # label position
        text_origin = np.array([left, top - label_size[1]])
    else:
        text_origin = np.array([left, top + 1])
    # My kingdom for a good redistributable image drawing library.
    for t in range(thickness):  # draw the box, one pixel-wide rectangle per step
        draw.rectangle(
            [left + t, top + t, right - t, bottom - t],
            outline=self.colors[c])
    draw.rectangle(  # label background
        [tuple(text_origin), tuple(text_origin + label_size)],
        fill=self.colors[c])
    draw.text(text_origin, label, fill=(0, 0, 0), font=font)  # label text
    del draw
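The coordinate handling in step 3 is easy to get wrong, because boxes come back as (top, left, bottom, right) while Pillow's image.size is (width, height). The following sketch isolates that logic in plain Python. The helper names `clip_box` and `label_origin` and the sample numbers are made up for illustration; the rules themselves (round to nearest, clamp to the image, put the label above the box when it fits) follow the code above.

```python
import math


def clip_box(box, image_size):
    """Round a (top, left, bottom, right) float box and clamp it to the image.

    image_size is (width, height), matching Pillow's Image.size.
    """
    width, height = image_size
    top, left, bottom, right = box
    top = max(0, math.floor(top + 0.5))
    left = max(0, math.floor(left + 0.5))
    bottom = min(height, math.floor(bottom + 0.5))
    right = min(width, math.floor(right + 0.5))
    return top, left, bottom, right


def label_origin(top, left, label_height):
    """Place the label above the box if it fits, otherwise just below the top edge."""
    if top - label_height >= 0:
        return left, top - label_height
    return left, top + 1


box = clip_box((-4.2, 15.6, 250.4, 700.9), image_size=(640, 480))
print(box)                                           # (0, 16, 250, 640)
print(label_origin(box[0], box[1], label_height=14))  # (16, 1)
print((640 + 480) // 512)                             # border thickness: 2
```

For a 640x480 image the thickness formula gives 2, so the box is drawn as two nested one-pixel rectangles.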
Use the YOLO detector to detect an image:
def detect_img_for_test(yolo):
    img_path = './dataset/a4386X6Te9ajq866zgOtWKLx18XGW.jpg'
    image = Image.open(img_path)
    r_image = yolo.detect_image(image)
    r_image.show()
    yolo.close_session()


if __name__ == '__main__':
    detect_img_for_test(YOLO())
Result:
OK, that's all! Enjoy it!