Exploring the Segmentation Annotations in the COCO Dataset

COCO (Common Objects in Context) is a dataset widely used in computer vision (CV). Its annotations include categories, bounding boxes, segmentation masks, and more.

Download page

COCO

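Of the downloads, Train, Val, and Test contain the images; the labels are in Annotations.

The annotations folder contains:

  • captions: image descriptions
  • instances: bounding boxes and segmentation
  • person_keypoints: keypoints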

annotations

File analyzed here: instances_val2017.json, roughly 20 MB.


Five fields

The JSON file contains five top-level fields:

['info', 'licenses', 'images', 'annotations', 'categories']
(dataset info, licenses, images, annotations, and the category list)

images has 5000 entries and annotations has 36781. Each annotation record corresponds to one bounding box (bbox), so there are more annotations than images.
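
To see how one image can carry several annotations, the records can be tallied by image_id. This is only a minimal sketch on a few hand-made records (the ids below are invented for illustration and are not taken from the real file):

```python
from collections import Counter

# Hypothetical records in the same shape as the COCO JSON.
images = [{"id": 1}, {"id": 2}, {"id": 3}]
annotations = [
    {"id": 10, "image_id": 1},
    {"id": 11, "image_id": 1},
    {"id": 12, "image_id": 2},
    {"id": 13, "image_id": 1},
]

# Count how many annotations point at each image id.
per_image = Counter(ann["image_id"] for ann in annotations)

# Images that never appear in annotations have no labelled objects.
unlabelled = [img["id"] for img in images if img["id"] not in per_image]

print(per_image)
print(unlabelled)
```

The same two lines of counting should work unchanged on the images and annotations lists loaded from instances_val2017.json.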

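info and licenses carry little useful information.

images holds the per-image information, including:

  • File name: file_name
  • Download URLs: coco_url and flickr_url
  • Image height and width: height and width
  • Image ID: id, referenced by the annotations

For example: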

{'license': 4, 'file_name': '000000397133.jpg', 'coco_url': 'http://images.cocodataset.org/val2017/000000397133.jpg', 'height': 427, 'width': 640, 'date_captured': '2013-11-14 17:02:52', 'flickr_url': 'http://farm7.staticflickr.com/6116/6255196340_da26cf2c9e_z.jpg', 'id': 397133}
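
Since annotations refer to images by id, it is convenient to index the image records by that key. A minimal sketch, using only the record shown above; note that the file name pattern (the id zero-padded to 12 digits) is an observation from this one record, not a documented guarantee:

```python
# The sample record from above, reduced to the fields used here.
images = [
    {"id": 397133, "file_name": "000000397133.jpg", "height": 427, "width": 640},
]

# Index the records by id for constant-time lookup.
image_by_id = {img["id"]: img for img in images}

record = image_by_id[397133]
print(record["file_name"], record["width"], record["height"])

# Assumption: the file name is the id padded to 12 digits plus ".jpg".
guessed_name = "{:012d}.jpg".format(record["id"])
print(guessed_name == record["file_name"])
```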

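categories holds the category information for COCO's 80 classes, including:

  • Supercategory: supercategory (person in the example below)
  • Category ID: id, referenced by the annotations
  • Category name: name

[Info] Number of categories: 80
[Info] Category: {'supercategory': 'person', 'id': 1, 'name': 'person'}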
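
In the full file the 80 category ids are not consecutive (they run up to 90), so code that needs a dense class index usually builds its own mapping. A minimal sketch on a three-record excerpt; only the person record appears in this post, while the bicycle and dog records are quoted from memory of the COCO category list and should be checked against the file:

```python
# Excerpt of categories; the last two records are assumptions to verify.
categories = [
    {"supercategory": "person", "id": 1, "name": "person"},
    {"supercategory": "vehicle", "id": 2, "name": "bicycle"},
    {"supercategory": "animal", "id": 18, "name": "dog"},
]

# category_id -> human-readable name
name_by_id = {c["id"]: c["name"] for c in categories}

# category_id -> dense index 0..N-1, in ascending id order
index_by_id = {cid: i for i, cid in enumerate(sorted(name_by_id))}

# supercategory -> list of category names
by_super = {}
for c in categories:
    by_super.setdefault(c["supercategory"], []).append(c["name"])

print(name_by_id)
print(index_by_id)
print(by_super)
```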

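annotations holds the annotation records, including:

  • Segmentation: segmentation
  • Segmentation type: iscrowd, whether the annotation marks a crowd of objects
  • Area: area
  • Image ID: image_id, matching id in images
  • Category ID: category_id, matching id in categories
  • Bounding box: bbox
  • Annotation ID: id

The rest of this post focuses on the contents of annotations.

[Info] Number of annotations: 36781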
[Info] 標註: {'segmentation': [[510.66, 423.01, 511.72, 420.03, 510.45, 416.0, 510.34, 413.02, 510.77, 410.26, 510.77, 407.5, 510.34, 405.16, 511.51, 402.83, 511.41, 400.49, 510.24, 398.16, 509.39, 397.31, 504.61, 399.22, 502.17, 399.64, 500.89, 401.66, 500.47, 402.08, 499.09, 401.87, 495.79, 401.98, 490.59, 401.77, 488.79, 401.77, 485.39, 398.58, 483.9, 397.31, 481.56, 396.35, 478.48, 395.93, 476.68, 396.03, 475.4, 396.77, 473.92, 398.79, 473.28, 399.96, 473.49, 401.87, 474.56, 403.47, 473.07, 405.59, 473.39, 407.71, 476.68, 409.41, 479.23, 409.73, 481.56, 410.69, 480.4, 411.85, 481.35, 414.93, 479.86, 418.65, 477.32, 420.03, 476.04, 422.58, 479.02, 422.58, 480.29, 423.01, 483.79, 419.93, 486.66, 416.21, 490.06, 415.57, 492.18, 416.85, 491.65, 420.24, 492.82, 422.9, 493.56, 424.39, 496.43, 424.6, 498.02, 423.01, 498.13, 421.31, 497.07, 420.03, 497.07, 415.15, 496.33, 414.51, 501.1, 411.96, 502.06, 411.32, 503.02, 415.04, 503.33, 418.12, 501.1, 420.24, 498.98, 421.63, 500.47, 424.39, 505.03, 423.32, 506.2, 421.31, 507.69, 419.5, 506.31, 423.32, 510.03, 423.01, 510.45, 423.01]], 'area': 702.1057499999998, 'iscrowd': 0, 'image_id': 289343, 'bbox': [473.07, 395.93, 38.65, 28.67], 'category_id': 18, 'id': 1768}
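
The bbox field is stored as [x, y, width, height], with (x, y) the top-left corner. One way to convince yourself is to recompute the box from the polygon. The sketch below uses four extreme vertices picked from the sample polygon above; the helper names are my own, not part of any COCO tooling:

```python
def polygon_to_bbox(flat):
    """Return [x, y, width, height] enclosing a flat [x0, y0, x1, y1, ...] list."""
    xs = flat[0::2]
    ys = flat[1::2]
    x_min, y_min = min(xs), min(ys)
    return [x_min, y_min, max(xs) - x_min, max(ys) - y_min]


def xywh_to_xyxy(bbox):
    """Convert [x, y, width, height] to corner form [x1, y1, x2, y2]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


# Leftmost, topmost, rightmost and bottommost vertices of the sample polygon.
extremes = [473.07, 405.59, 478.48, 395.93, 511.72, 420.03, 496.43, 424.6]

# Round to two decimals, the precision used in the annotation file.
bbox = [round(v, 2) for v in polygon_to_bbox(extremes)]
corners = [round(v, 2) for v in xywh_to_xyxy(bbox)]

print(bbox)
print(corners)
```

The recomputed box should agree with the 'bbox' value in the sample annotation, which suggests the box is simply the tight rectangle around the polygon.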

Source code:

import json
import os

# Assumption: ROOT_DIR is the project root that contains the datasets folder.
ROOT_DIR = os.path.dirname(os.path.abspath(__file__))


def load_json():
    val_file = os.path.join(ROOT_DIR, 'datasets', 'annotations', 'instances_val2017.json')
    # The whole file is a single JSON object, so parse it in one go.
    with open(val_file, encoding='utf-8') as f:
        coco_dict = json.load(f)
    print('Keys: {}'.format(coco_dict.keys()))

    info = coco_dict['info']
    licenses = coco_dict['licenses']
    images = coco_dict['images']
    annotations = coco_dict['annotations']
    categories = coco_dict['categories']

    print('-' * 50)
    print('[Info] info: {}'.format(info))  # dataset info
    print('-' * 50)
    print('[Info] licenses: {}'.format(licenses))  # 8 licenses
    print('-' * 50)
    print('[Info] Number of images: {}'.format(len(images)))  # image count
    print('[Info] Image: {}'.format(images[0]))  # first image record
    print('-' * 50)
    print('[Info] Number of annotations: {}'.format(len(annotations)))  # annotation count
    print('[Info] Annotation: {}'.format(annotations[0]))  # first annotation record
    print('-' * 50)
    print('[Info] Number of categories: {}'.format(len(categories)))  # category count
    print('[Info] Category: {}'.format(categories[0]))  # first category record

    return images, annotations

Segmentation

Reference: the COCO API demo.

Details:

When iscrowd is 0, the segmentation is a list of polygons:

Data:

'segmentation': [[510.66, 423.01, 511.72, 420.03, 510.45, 416.0, 510.34, 413.02, 510.77, 410.26, ...]]

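The values are read in pairs, each pair being one (x, y) vertex of the polygon. Note that image coordinates put the origin (0, 0) at the top-left corner, while matplotlib axes by default have y pointing up, so the plotted polygon appears vertically flipped relative to the original image.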
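
The paired layout can be unpacked with plain slicing, and once the vertices are available the enclosed area follows from the shoelace formula. A minimal sketch on a made-up rectangle; the helper names are hypothetical, and whether this value matches the annotation's area field exactly is not checked here:

```python
def to_vertices(flat):
    """Group a flat [x0, y0, x1, y1, ...] list into (x, y) tuples."""
    return list(zip(flat[0::2], flat[1::2]))


def polygon_area(flat):
    """Area enclosed by the polygon, via the shoelace formula."""
    pts = to_vertices(flat)
    total = 0.0
    for i, (x_i, y_i) in enumerate(pts):
        # Next vertex, wrapping around to close the polygon.
        x_j, y_j = pts[(i + 1) % len(pts)]
        total += x_i * y_j - x_j * y_i
    return abs(total) / 2.0


# Hypothetical 4 x 3 rectangle, in the same flat layout as 'segmentation'.
rect = [0, 0, 4, 0, 4, 3, 0, 3]

vertices = to_vertices(rect)
area = polygon_area(rect)
print(vertices)
print(area)
```

When plotting such vertices with matplotlib, calling ax.invert_yaxis() on the axes is one way to get the same orientation as the image.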

[Figure: original image]

Source code:

import matplotlib.pyplot as plt
import numpy as np
from matplotlib.collections import PatchCollection
from matplotlib.patches import Polygon


def draw_polygon(seg):
    print('[Info] Data format: {}'.format(seg))
    patches = []
    fig, ax = plt.subplots()

    # Only the first polygon of the annotation is drawn.
    flat = seg[0]
    # Scale into the default [0, 1] axes range, leaving some margin.
    max_value = max(flat) * 1.3
    flat = [v / max_value for v in flat]
    # Regroup the flat list into (x, y) vertex pairs.
    poly = np.array(flat).reshape((-1, 2))
    patches.append(Polygon(poly, closed=True))  # closed polygon
    p = PatchCollection(patches, cmap='jet', alpha=0.4)
    colors = 100 * np.random.rand(1)
    p.set_array(colors)

    ax.add_collection(p)
    plt.show()

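When iscrowd is 1, the segmentation is a pixel mask stored as a list of run lengths, with pixels visited column by column, i.e. columns are filled first.

Data: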

{'counts': [272, 2, 4, 4, 4, 4, 2, 9, 1, 2, ...], 'size': [240, 320]}

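The runs follow column order: the first count is a run of background pixels, the second a run of foreground pixels, and so on, alternating.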
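
A tiny worked example makes the column-first ordering easier to follow. The sketch below decodes a made-up 2 x 3 mask in plain Python (the counts are invented, and decode_rle is a hypothetical helper, not a pycocotools function):

```python
def decode_rle(counts, height, width):
    """Expand alternating background/foreground runs into a 2-D mask."""
    flat = []
    value = 0  # the first run is background
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value
    # flat is column-major: pixel (row, col) sits at index col * height + row.
    return [[flat[col * height + row] for col in range(width)]
            for row in range(height)]


# 2 background, 3 foreground, 1 background = 6 pixels = 2 rows x 3 columns.
mask = decode_rle([2, 3, 1], height=2, width=3)
for row in mask:
    print(row)
# prints:
# [0, 1, 1]
# [0, 1, 0]
```

The first column is entirely background, the second entirely foreground, and the third starts with the last foreground pixel before the final background run.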

[Figure: original image]

Source code:

import matplotlib.pyplot as plt
import numpy as np


def draw_rle(seg):
    print('[Info] Data format: {}'.format(seg))
    rle = seg['counts']
    h, w = seg['size']
    mask = np.zeros(h * w)
    n = 0
    val = 0  # runs alternate, starting with background (0)
    for num in rle:
        mask[n:n + num] = val
        n += num
        val = 1 - val
    # Runs are stored column by column, hence Fortran (column-major) order.
    mask = mask.reshape((h, w), order='F')
    plt.imshow(mask)
    plt.show()
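
Going the other way, a binary mask can be turned back into the same counts/size structure. This is a rough sketch of the inverse operation in plain Python, assuming the mask is a list of rows of 0/1 values; encode_rle is a hypothetical helper and not the encoder used by the COCO tools:

```python
def encode_rle(mask):
    """Encode a 2-D 0/1 mask as column-major alternating run lengths."""
    height, width = len(mask), len(mask[0])
    # Walk the pixels column by column, top to bottom.
    flat = [mask[row][col] for col in range(width) for row in range(height)]

    counts = []
    current, run = 0, 0  # runs always start with background
    for value in flat:
        if value == current:
            run += 1
        else:
            # Close the current run (possibly of length 0) and switch value.
            counts.append(run)
            current, run = value, 1
    counts.append(run)
    return {"counts": counts, "size": [height, width]}


encoded = encode_rle([[0, 1, 1],
                      [0, 1, 0]])
print(encoded)

# A mask that starts with foreground gets a leading background run of 0.
leading = encode_rle([[1, 0],
                      [1, 1]])
print(leading)
```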

OK, that's all!
