Deep Learning Datasets: An Introduction and Converting Between Them

Pascal VOC & COCO

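- The annotations for the COCO detection dataset are stored in .json files. For example, the annotations for the 2017 validation split are stored in instances_val2017.json. Its contents look like this: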
{"info": {"description": "This is stable 1.0 version of the 2017 MS COCO dataset.",
          "url": "http://mscoco.org",
          "version": "1.0",
          "year": 2017,
          "contributor": "Microsoft COCO group",
          "date_created": "2017-11-11 02:11:36.777541"},
"images": [{"license": 2,
            "file_name": "000000289343.jpg",
            "coco_url": "http://images.cocodataset.org/val2017/000000289343.jpg",
            "height": 640,
            "width": 529,
            "date_captured": "2013-11-15 00:35:14",
            "flickr_url": "http://farm5.staticflickr.com/4029/4669549715_7db3735de0_z.jpg",
            "id": 289343},
           ...
           {"license": 1,
            "file_name": "000000329219.jpg",
            "coco_url": "http://images.cocodataset.org/val2017/000000329219.jpg",
            "height": 427,
            "width": 640,
            "date_captured": "2013-11-14 19:21:56",
            "flickr_url": "http://farm9.staticflickr.com/8104/8505307842_465524a6a6_z.jpg",
            "id": 329219},
           ...],
"annotations": [{"segmentation": [[510.66,423.01,511.72,420.03,510.45,416.0,510...,423.01]],
                 "area": 702.1057499999998,
                 "iscrowd": 0,
                 "image_id": 289343,
                 "bbox": [473.07,395.93,38.65,28.67],
                 "category_id": 18,
                 "id": 1768},
                ...
                {"segmentation": [[304.09,266.18,308.95,263.56,313.06,262.81,...,266.55]],
                 "area": 4290.290900000001,
                 "iscrowd": 0,
                 "image_id": 329219,
                 "bbox": [297.73,252.34,60.21,108.45],
                 "category_id": 18,
                 "id": 8032}],
"licenses": [{"url": "http://creativecommons.org/licenses/by-nc-sa/2.0/",
              "id": 1,
              "name": "Attribution-NonCommercial-ShareAlike License"},
             ...
             {"url": "http://www.usa.gov/copyright.shtml",
              "id": 8,
              "name": "United States Government Work"}],
"categories": [{"supercategory": "person", "id": 1, "name": "person"},
               ...
               {"supercategory": "indoor", "id": 90, "name": "toothbrush"}]}
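A file with this layout can be read with the standard json module. The sketch below is not part of the original post: the helper names load_coco and group_boxes_by_image are hypothetical, and the in-memory sample only reuses values from the excerpt above. It groups annotations by image_id and turns each COCO bbox, stored as [x, y, width, height], into corner form.

```python
import json
from collections import defaultdict


def load_coco(path):
    """Read a COCO-style annotation file from disk (hypothetical helper)."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def group_boxes_by_image(coco):
    """Map image_id -> list of (category_id, [xmin, ymin, xmax, ymax])."""
    boxes = defaultdict(list)
    for ann in coco["annotations"]:
        x, y, w, h = ann["bbox"]  # COCO stores [x, y, width, height]
        boxes[ann["image_id"]].append((ann["category_id"], [x, y, x + w, y + h]))
    return dict(boxes)


# Tiny in-memory stand-in built from values in the excerpt above.
sample = {
    "images": [
        {"id": 289343, "file_name": "000000289343.jpg", "height": 640, "width": 529},
    ],
    "annotations": [
        {"id": 1768, "image_id": 289343, "category_id": 18,
         "bbox": [473.07, 395.93, 38.65, 28.67], "iscrowd": 0},
    ],
}
boxes = group_boxes_by_image(sample)
```

For a real file, `group_boxes_by_image(load_coco("instances_val2017.json"))` would be the equivalent call, assuming the file sits in the working directory.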
Contents of a COCO annotation file:

For example, the training set file instances_train2014.json:

{"info": {"description": "This is stable 1.0 version of the 2014 MS COCO dataset.", 
          "url": "http://mscoco.org", 
          "version": "1.0", 
          "year": 2014, 
          "contributor": "Microsoft COCO group", 
          "date_created": "2015-01-27 09:11:52.357475"}, 
"images": [{"license": 5, 
            "file_name": "COCO_train2014_000000057870.jpg", 
            "coco_url": "http://mscoco.org/images/57870", 
            "height": 480, 
            "width": 640, 
            "date_captured": "2013-11-14 16:28:13", 
            "flickr_url": "http://farm4.staticflickr.com/3153/2970773875_164f0c0b83_z.jpg", 
            "id": 57870},#   image_id
           {"license": 5, 
            "file_name": "COCO_train2014_000000384029.jpg",
            "coco_url": "http://mscoco.org/images/384029", 
            "height": 429, "width": 640, 
            "date_captured": "2013-11-14 16:29:45",
            "flickr_url": "http://farm3.staticflickr.com/2422/3577229611_3a3235458a_z.jpg", 
            "id": 384029}, 
           {"license": 1,
            "file_name": "COCO_train2014_000000222016.jpg",
            "coco_url": "http://mscoco.org/images/222016",
            "height": 640,
            "width": 480,
            "date_captured": "2013-11-14 16:37:59",
            "flickr_url": "http://farm2.staticflickr.com/1431/1118526611_09172475e5_z.jpg",
            "id": 222016},
           {"license": 4, 
            "file_name": "COCO_train2014_000000475546.jpg", 
            "coco_url": "http://mscoco.org/images/475546",
            "height": 375,
            "width": 500, 
            "date_captured": "2013-11-25 21:20:23", 
            "flickr_url": "http://farm1.staticflickr.com/167/423175046_6cd9d0205a_z.jpg", 
            "id": 475546}],
"licenses": [{"url": "http://creativecommons.org/licenses/by-nc-sa/2.0/", 
              "id": 1, 
              "name": "Attribution-NonCommercial-ShareAlike License"}, 
             {"url": "http://creativecommons.org/licenses/by-nc/2.0/", 
              "id": 2, 
              "name": "Attribution-NonCommercial License"},
             {"url": "http://creativecommons.org/licenses/by-nc-nd/2.0/", 
              "id": 3, 
              "name": "Attribution-NonCommercial-NoDerivs License"},
             {"url": "http://creativecommons.org/licenses/by/2.0/",
              "id": 4, 
              "name": "Attribution License"}, 
             {"url": "http://creativecommons.org/licenses/by-sa/2.0/", 
              "id": 5,
              "name": "Attribution-ShareAlike License"}, 
             {"url": "http://creativecommons.org/licenses/by-nd/2.0/", 
              "id": 6, 
              "name": "Attribution-NoDerivs License"},
             {"url": "http://flickr.com/commons/usage/", 
              "id": 7, 
              "name": "No known copyright restrictions"},
             {"url": "http://www.usa.gov/copyright.shtml",
              "id": 8,
              "name": "United States Government Work"}],
"annotations": [{"segmentation": [[312.29, 562.89, 402.25, 232.61, 560.32, 300.72, 571.89]],
                 "area": 54652.9556,
                 "iscrowd": 0,
                 "image_id": 480023,
                 "bbox": [116.95, 305.86, 285.3, 266.03],
                 "category_id": 58,
                 "id": 86}, # this id is the annotation id; an image can have more than one annotation, so each annotation gets its own number
                {"segmentation": [[252.46, 208.17, 267.96, 210.11, 208.45]], 
                 "area": 421.47274999999996,
                 "iscrowd": 0,
                 "image_id": 50518, 
                 "bbox": [245.54, 208.17, 40.14, 19.1], 
                 "category_id": 58, 
                 "id": 89}, 
                {"segmentation": [[349.66, 143.56, 344.19,  131.38, 352.94, 139.19, 355.13, 139.97, 354.5, 144.34]],
                 "area": 292.12984999999935,
                 "iscrowd": 0, 
                 "image_id": 497261, 
                 "bbox": [343.72, 112.63, 17.66, 31.71], 
                 "category_id": 1, 
                 "id": 2232195}, 
                {"segmentation": {"counts": [69901, 4, 21, 2,470, 12, 468, 13, 467, 12, 468, 12, 468, 12, 469, 10, 471, 8, 474, 4, 73630], "size": [480, 640]}, 
                 "area": 2846, 
                 "iscrowd": 1,
                 "image_id": 554752, 
                 "bbox": [145, 275, 341, 53],
                 "category_id": 1, 
                 "id": 900100554752}, 
                {"segmentation": {"counts": [70375, 8, 415, 12, 411, 391, 34, 391, 34, 391, 35, 149], "size": [425, 640]},
                 "area": 7298,
                 "iscrowd": 1, 
                 "image_id": 350724, 
                 "bbox": [165, 216, 474, 152], 
                 "category_id": 62,
                 "id": 906200350724},  
                {"segmentation": {"counts": [99015, 6, 352, 8,  349, 8, 75781], "size": [359, 640]},
                 "area": 6478, 
                 "iscrowd": 1, 
                 "image_id": 554743,
                 "bbox": [275, 207, 153, 148],
                 "category_id": 1, 
                 "id": 900100554743},
                {"segmentation": {"counts": [97214, 1, 425,  4, 6531], "size": [427, 640]},
                 "area": 3489,
                 "iscrowd": 1,
                 "image_id": 95999,
                 "bbox": [227, 260, 397, 82],
                 "category_id": 1,
                 "id": 900100095999}],
"categories": [{"supercategory": "person", "id": 1, "name": "person"}, # 80 categories in total
               {"supercategory": "vehicle", "id": 2, "name": "bicycle"},
               {"supercategory": "vehicle", "id": 3, "name": "car"},
               {"supercategory": "vehicle", "id": 4, "name": "motorcycle"},
               {"supercategory": "vehicle", "id": 5, "name": "airplane"},
               {"supercategory": "vehicle", "id": 6, "name": "bus"}, 
               {"supercategory": "vehicle", "id": 7, "name": "train"},
               {"supercategory": "vehicle", "id": 8, "name": "truck"},
               {"supercategory": "vehicle", "id": 9, "name": "boat"}, 
               {"supercategory": "outdoor", "id": 10, "name": "traffic light"},
               {"supercategory": "outdoor", "id": 11, "name": "fire hydrant"}, 
               {"supercategory": "outdoor", "id": 13, "name": "stop sign"}, 
               {"supercategory": "outdoor", "id": 14, "name": "parking meter"}, 
               {"supercategory": "outdoor", "id": 15, "name": "bench"}, 
               {"supercategory": "animal", "id": 16, "name": "bird"}, 
               {"supercategory": "animal", "id": 17, "name": "cat"}, 
               {"supercategory": "animal", "id": 18, "name": "dog"}, 
               {"supercategory": "animal", "id": 19, "name": "horse"}, 
               {"supercategory": "animal", "id": 20, "name": "sheep"}, 
               {"supercategory": "animal", "id": 21, "name": "cow"}, 
               {"supercategory": "animal", "id": 22, "name": "elephant"},
               {"supercategory": "animal", "id": 23, "name": "bear"}, 
               {"supercategory": "animal", "id": 24, "name": "zebra"}, 
               {"supercategory": "animal", "id": 25, "name": "giraffe"},
               {"supercategory": "accessory", "id": 27, "name": "backpack"},
               {"supercategory": "accessory", "id": 28, "name": "umbrella"}, 
               {"supercategory": "accessory", "id": 31, "name": "handbag"}, 
               {"supercategory": "accessory", "id": 32, "name": "tie"}, 
               {"supercategory": "accessory", "id": 33, "name": "suitcase"}, 
               {"supercategory": "sports", "id": 34, "name": "frisbee"},
               {"supercategory": "sports", "id": 35, "name": "skis"},
               {"supercategory": "sports", "id": 36, "name": "snowboard"},
               {"supercategory": "sports", "id": 37, "name": "sports ball"}, 
               {"supercategory": "sports", "id": 38, "name": "kite"},
               {"supercategory": "sports", "id": 39, "name": "baseball bat"}, 
               {"supercategory": "sports", "id": 40, "name": "baseball glove"},
               {"supercategory": "sports", "id": 41, "name": "skateboard"}, 
               {"supercategory": "sports", "id": 42, "name": "surfboard"},
               {"supercategory": "sports", "id": 43, "name": "tennis racket"}, 
               {"supercategory": "kitchen", "id": 44, "name": "bottle"}, 
               {"supercategory": "kitchen", "id": 46, "name": "wine glass"}, 
               {"supercategory": "kitchen", "id": 47, "name": "cup"},
               {"supercategory": "kitchen", "id": 48, "name": "fork"}, 
               {"supercategory": "kitchen", "id": 49, "name": "knife"},
               {"supercategory": "kitchen", "id": 50, "name": "spoon"}, 
               {"supercategory": "kitchen", "id": 51, "name": "bowl"},
               {"supercategory": "food", "id": 52, "name": "banana"}, 
               {"supercategory": "food", "id": 53, "name": "apple"}, 
               {"supercategory": "food", "id": 54, "name": "sandwich"}, 
               {"supercategory": "food", "id": 55, "name": "orange"}, 
               {"supercategory": "food", "id": 56, "name": "broccoli"}, 
               {"supercategory": "food", "id": 57, "name": "carrot"},
               {"supercategory": "food", "id": 58, "name": "hot dog"}, 
               {"supercategory": "food", "id": 59, "name": "pizza"}, 
               {"supercategory": "food", "id": 60, "name": "donut"},
               {"supercategory": "food", "id": 61, "name": "cake"}, 
               {"supercategory": "furniture", "id": 62, "name": "chair"},
               {"supercategory": "furniture", "id": 63, "name": "couch"},
               {"supercategory": "furniture", "id": 64, "name": "potted plant"}, 
               {"supercategory": "furniture", "id": 65, "name": "bed"}, 
               {"supercategory": "furniture", "id": 67, "name": "dining table"}, 
               {"supercategory": "furniture", "id": 70, "name": "toilet"},
               {"supercategory": "electronic", "id": 72, "name": "tv"}, 
               {"supercategory": "electronic", "id": 73, "name": "laptop"},
               {"supercategory": "electronic", "id": 74, "name": "mouse"}, 
               {"supercategory": "electronic", "id": 75, "name": "remote"}, 
               {"supercategory": "electronic", "id": 76, "name": "keyboard"}, 
               {"supercategory": "electronic", "id": 77, "name": "cell phone"}, 
               {"supercategory": "appliance", "id": 78, "name": "microwave"},
               {"supercategory": "appliance", "id": 79, "name": "oven"}, 
               {"supercategory": "appliance", "id": 80, "name": "toaster"},
               {"supercategory": "appliance", "id": 81, "name": "sink"},
               {"supercategory": "appliance", "id": 82, "name": "refrigerator"}, 
               {"supercategory": "indoor", "id": 84, "name": "book"}, 
               {"supercategory": "indoor", "id": 85, "name": "clock"},
               {"supercategory": "indoor", "id": 86, "name": "vase"}, 
               {"supercategory": "indoor", "id": 87, "name": "scissors"}, 
               {"supercategory": "indoor", "id": 88, "name": "teddy bear"}, 
               {"supercategory": "indoor", "id": 89, "name": "hair drier"}, 
               {"supercategory": "indoor", "id": 90, "name": "toothbrush"}]}
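The annotations with "iscrowd": 1 above store their mask as run lengths ("counts") plus an image size given as [height, width], instead of a polygon. As a minimal sketch, assuming the uncompressed COCO convention (runs alternate between background and foreground, start with background, and walk the image column by column), a decoder could look like this. The function name is hypothetical, and because the counts in the excerpt appear to be abridged, the example uses a tiny made-up mask.

```python
def decode_uncompressed_rle(counts, size):
    """Expand uncompressed COCO-style RLE into a height x width 0/1 mask."""
    height, width = size
    flat = []
    value = 0  # runs alternate 0, 1, 0, 1, ... and start with background
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value
    if len(flat) != height * width:
        raise ValueError("run lengths do not cover the full image")
    # The runs are column-major: pixel (row r, col c) is flat[c * height + r].
    return [[flat[c * height + r] for c in range(width)] for r in range(height)]


# Hypothetical 2 x 3 mask: 2 background pixels, 3 foreground, 1 background.
mask = decode_uncompressed_rle([2, 3, 1], [2, 3])
area = sum(sum(row) for row in mask)
```

The number of foreground pixels computed this way corresponds to the "area" field of a crowd annotation.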
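The category list has 80 entries, but the ids run from 1 to 90 with gaps (12, 26, 29 and others are missing). A classifier head with 80 outputs therefore needs a mapping from category id to a contiguous index. The following is one possible sketch; build_category_maps is a hypothetical helper name and the short list only reuses a few entries from the table above.

```python
def build_category_maps(categories):
    """Return (id_to_index, index_to_name) for a COCO 'categories' list."""
    ordered = sorted(categories, key=lambda c: c["id"])
    id_to_index = {c["id"]: i for i, c in enumerate(ordered)}
    index_to_name = [c["name"] for c in ordered]
    return id_to_index, index_to_name


# A few entries copied from the category table above.
categories = [
    {"supercategory": "person", "id": 1, "name": "person"},
    {"supercategory": "outdoor", "id": 11, "name": "fire hydrant"},
    {"supercategory": "outdoor", "id": 13, "name": "stop sign"},
    {"supercategory": "indoor", "id": 90, "name": "toothbrush"},
]
id_to_index, index_to_name = build_category_maps(categories)
```

Applied to the full list, the same function would yield indices 0 to 79.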

PASCAL-VOC2012

- Detection annotation for each image:

<annotation>
    <folder>VOC2012</folder>
    <filename>2007_000027.jpg</filename>
    <source>
        <database>The VOC2007 Database</database>
        <annotation>PASCAL VOC2007</annotation>
        <image>flickr</image>
    </source>
    <size>
        <width>486</width>
        <height>500</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>person</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>174</xmin>
            <ymin>101</ymin>
            <xmax>349</xmax>
            <ymax>351</ymax>
        </bndbox>
        <part>
            <name>head</name>
            <bndbox>
                <xmin>169</xmin>
                <ymin>104</ymin>
                <xmax>209</xmax>
                <ymax>146</ymax>
            </bndbox>
        </part>
        <part>
            <name>hand</name>
            <bndbox>
                <xmin>278</xmin>
                <ymin>210</ymin>
                <xmax>297</xmax>
                <ymax>233</ymax>
            </bndbox>
        </part>
        <part>
            <name>foot</name>
            <bndbox>
                <xmin>273</xmin>
                <ymin>333</ymin>
                <xmax>297</xmax>
                <ymax>354</ymax>
            </bndbox>
        </part>
        <part>
            <name>foot</name>
            <bndbox>
                <xmin>319</xmin>
                <ymin>307</ymin>
                <xmax>340</xmax>
                <ymax>326</ymax>
            </bndbox>
        </part>
    </object>
</annotation>
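An annotation file like the one above can be parsed with the standard library. This is a minimal sketch under the assumption that every object has a bndbox child; parse_voc_xml is a hypothetical helper name, and the embedded XML is a shortened copy of the example above. Only top-level object elements are read, so the nested part boxes (head, hand, foot) are skipped.

```python
import xml.etree.ElementTree as ET


def parse_voc_xml(xml_text):
    """Extract image size and object boxes from a VOC annotation string."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    record = {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
        "objects": [],
    }
    # findall("object") matches direct children only, so <part> boxes are ignored.
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        record["objects"].append({
            "name": obj.findtext("name"),
            "difficult": int(obj.findtext("difficult", default="0")),
            "bbox": [int(float(box.findtext(tag)))
                     for tag in ("xmin", "ymin", "xmax", "ymax")],
        })
    return record


# Shortened copy of the annotation shown above.
VOC_SAMPLE = """
<annotation>
    <filename>2007_000027.jpg</filename>
    <size><width>486</width><height>500</height><depth>3</depth></size>
    <object>
        <name>person</name>
        <difficult>0</difficult>
        <bndbox><xmin>174</xmin><ymin>101</ymin><xmax>349</xmax><ymax>351</ymax></bndbox>
        <part>
            <name>head</name>
            <bndbox><xmin>169</xmin><ymin>104</ymin><xmax>209</xmax><ymax>146</ymax></bndbox>
        </part>
    </object>
</annotation>
"""
record = parse_voc_xml(VOC_SAMPLE)
```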

The VOC2012 dataset has 20 classes (21 counting background), as follows:
- Person: person
- Animal: bird, cat, cow, dog, horse, sheep
- Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
- Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor
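Converting VOC detection boxes to COCO-style annotations mostly means changing the box from corner form to [x, y, width, height] and replacing the class name with a numeric id. The sketch below makes several assumptions that are not in the original post: class names are spelled as they appear in VOC XML files (for example "diningtable"), category ids are assigned 1 to 20 in alphabetical order, the function name is hypothetical, and the one-pixel offset of VOC's inclusive coordinates is ignored.

```python
# Class names as spelled in VOC XML files, in alphabetical order.
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]


def voc_objects_to_coco(image_id, objects, first_ann_id=1):
    """Turn VOC objects (name + [xmin, ymin, xmax, ymax]) into COCO-style dicts."""
    annotations = []
    for offset, obj in enumerate(objects):
        xmin, ymin, xmax, ymax = obj["bbox"]
        width, height = xmax - xmin, ymax - ymin
        annotations.append({
            "id": first_ann_id + offset,          # annotation id, unique per box
            "image_id": image_id,
            "category_id": VOC_CLASSES.index(obj["name"]) + 1,  # assumed 1-based ids
            "bbox": [xmin, ymin, width, height],  # COCO uses [x, y, w, h]
            "area": width * height,               # box area, no mask available
            "iscrowd": 0,
            "segmentation": [],
        })
    return annotations


# The person box from the XML example above; image_id 27 is a made-up value.
coco_anns = voc_objects_to_coco(27, [{"name": "person", "bbox": [174, 101, 349, 351]}])
```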


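  • Segmentation: not every image in VOC2012 is used for segmentation. Each image in the segmentation challenge comes with the original image plus two PNG label images, one for class segmentation and one for object segmentation. In class segmentation, every object in the ground-truth image is filled with the color assigned to its class, one color for each of the 20 classes; in the standard VOC palette, for example, aeroplanes are dark red (128, 0, 0) and bicycles are green (0, 128, 0). Object segmentation only has to give each distinct object in an image its own color, and the choice of color is arbitrary.
  • Note the difference between the two kinds of label image:
    * SegmentationClass: labels each pixel with its class;
    * SegmentationObject: labels each pixel with the object it belongs to.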
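The class colors in SegmentationClass follow a fixed palette. The sketch below reproduces the bit-interleaving scheme used by the VOC devkit colormap as I understand it; treat it as an illustration and not as the devkit code itself. The names voc_colormap and color_to_class are hypothetical. The resulting lookup table maps an RGB triple back to a class index in 0-20.

```python
def voc_colormap(n=256):
    """Build the VOC-style palette as a list of (r, g, b) tuples."""
    def bit(value, idx):
        return (value >> idx) & 1

    palette = []
    for i in range(n):
        r = g = b = 0
        c = i
        for j in range(8):
            # Spread the bits of the index over the three channels,
            # most significant output bit first.
            r |= bit(c, 0) << (7 - j)
            g |= bit(c, 1) << (7 - j)
            b |= bit(c, 2) << (7 - j)
            c >>= 3
        palette.append((r, g, b))
    return palette


palette = voc_colormap()
# Index 0 is background, 1-20 are the object classes in alphabetical order.
color_to_class = {palette[i]: i for i in range(21)}
```

With this table, a pixel read as (128, 0, 0) maps to class 1 (aeroplane) and (192, 128, 128) to class 15 (person); index 255, the boundary or "void" label, comes out as (224, 224, 192).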


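  • Because not every VOC2012 image is used for segmentation, txt files mark which images can be; a program can use train.txt to select them. The train and val lists together contain 2913 images.
  • The PNGs in SegmentationClass are used for class segmentation. If an image contains two kinds of object, say a person and an aeroplane, each is drawn in the color of its class. Loaded as RGB, these PNGs are three-channel color images, unlike the single-channel grayscale maps seen earlier, so a pixel's class is given by its RGB color and not by a value in 0-20. (The files themselves are palette-indexed PNGs, so a palette-aware reader returns the class index directly.)
  • The PNGs in SegmentationObject only separate the different objects in the image; they do not label the class each object belongs to.
  • In the last step, the .mat results produced by fc8 are converted into the final .png segmentation images. It is not obvious from these which class each color stands for; the mapping can be worked out from the RGB values of the colors in create_labels.py.
  • Deep Learning Image Segmentation (Part 2): How to Build Your Own PASCAL-VOC2012 Dataset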
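Selecting the segmentation images through train.txt can be sketched as follows. The directory names (ImageSets/Segmentation, JPEGImages, SegmentationClass) follow the usual VOC2012 layout, which the original post does not spell out, so they are assumptions here, as are the helper names.

```python
import os


def read_split_ids(split_file):
    """Return the image ids listed in a split file, one id per line."""
    with open(split_file, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def segmentation_pairs(voc_root, split="train"):
    """Pair every listed image with its SegmentationClass label PNG."""
    split_file = os.path.join(voc_root, "ImageSets", "Segmentation", split + ".txt")
    pairs = []
    for image_id in read_split_ids(split_file):
        pairs.append((
            os.path.join(voc_root, "JPEGImages", image_id + ".jpg"),
            os.path.join(voc_root, "SegmentationClass", image_id + ".png"),
        ))
    return pairs
```

Calling `segmentation_pairs(voc_root, "train")` and `segmentation_pairs(voc_root, "val")` on a full VOC2012 copy should together give the 2913 image and label pairs mentioned above.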

Image Annotation Tools

Labelme
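Labelme is suited to building datasets for image segmentation tasks. It comes from the project https://github.com/wkentaro/labelme. The tool covers the basics of segmentation annotation: on save, it writes the information about each object to a JSON file such as https://github.com/wkentaro/labelme/blob/master/static/apc2016_obj3.json. It also provides a function for converting the JSON file into a label image.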
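To move Labelme polygons toward the COCO layout shown earlier, each polygon needs a flattened point list, a bounding box and an area. The sketch below assumes the JSON has a top-level "shapes" list whose entries carry "label" and "points" (a list of [x, y] pairs); field names can differ between Labelme versions, so check them against your own files. The function names and the sample shape are hypothetical.

```python
def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def labelme_shape_to_coco(shape):
    """Convert one Labelme polygon into COCO-style segmentation, bbox and area."""
    points = shape["points"]
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return {
        "label": shape["label"],
        # COCO polygons are flat lists: [x1, y1, x2, y2, ...]
        "segmentation": [[coord for point in points for coord in point]],
        "bbox": [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)],
        "area": polygon_area(points),
    }


# Made-up rectangle standing in for a parsed Labelme file.
labelme_doc = {
    "shapes": [
        {"label": "cat", "points": [[10, 10], [30, 10], [30, 20], [10, 20]]},
    ],
}
converted = [labelme_shape_to_coco(s) for s in labelme_doc["shapes"]]
```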

labelImg
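labelImg is suited to building datasets for image detection tasks. It comes from the project https://github.com/tzutalin/labelImg. Its label-saving feature and the "Next Image" and "Prev Image" controls make it convenient to use. The XML files it saves are in the same format as the ImageNet dataset, that is, PASCAL VOC format.

yolo_mark
yolo_mark is suited to building datasets for image detection tasks. It comes from the project https://github.com/AlexeyAB/Yolo_mark. It is an image annotation tool open-sourced by the yolo2 team to make it easier for others to train yolo2 models for their own tasks. It runs on both Linux and Windows and depends on the OpenCV library.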
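YOLO-style label files hold one line per box: a class index followed by the box center and size, all divided by the image dimensions. As a sketch of converting a VOC corner box into that form (the function names are hypothetical and the six-decimal formatting is just a choice made here):

```python
def voc_box_to_yolo(box, image_width, image_height):
    """Convert [xmin, ymin, xmax, ymax] in pixels to normalized center/size."""
    xmin, ymin, xmax, ymax = box
    x_center = (xmin + xmax) / 2.0 / image_width
    y_center = (ymin + ymax) / 2.0 / image_height
    width = (xmax - xmin) / image_width
    height = (ymax - ymin) / image_height
    return x_center, y_center, width, height


def yolo_line(class_index, box, image_width, image_height):
    """Format one label line: '<class> <x_center> <y_center> <width> <height>'."""
    values = voc_box_to_yolo(box, image_width, image_height)
    return " ".join([str(class_index)] + ["%.6f" % v for v in values])


# Made-up box in a 400 x 200 image.
line = yolo_line(0, [100, 50, 300, 150], 400, 200)
```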

KITTI and Cityscapes Overview

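KITTI was created jointly by the Karlsruhe Institute of Technology in Germany and the Toyota Technological Institute at Chicago, and is currently the largest international benchmark dataset for computer vision algorithms in autonomous driving scenes. It is used to evaluate object detection (motor vehicles, non-motorized vehicles, pedestrians and so on), object tracking, road segmentation and other computer vision techniques in an in-vehicle setting.

KITTI contains real image data collected in urban, rural and highway scenes, with up to 15 cars and 30 pedestrians per image and varying degrees of occlusion. In the KITTI dataset, object detection has three tracks (cars, pedestrians and cyclists), object tracking has two tracks (cars and pedestrians), and road segmentation has four tracks: the three scenes urban unmarked, urban marked and urban multiple marked, plus urban road, the average over those three.

Cityscapes, driven mainly by Mercedes-Benz, provides an image segmentation dataset for driverless-car environments. It is used to evaluate how well vision algorithms understand the semantics of urban scenes. Cityscapes contains street scenes from 50 cities with different settings, backgrounds and seasons, and provides 5000 finely annotated images, 20000 coarsely annotated images and 30 annotated classes. Algorithm performance is scored with the PASCAL VOC standard intersection-over-union (IoU) metric. Cityscapes has two evaluation settings, fine and coarse: the former provides the 5000 finely annotated images, and the latter provides those 5000 plus the 20000 coarsely annotated images.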
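The IoU score for a class is TP / (TP + FP + FN), counted over pixels. The sketch below computes it on flat lists of class indices; it is an illustration of the metric and not the official Cityscapes or VOC evaluation code. The ignore label 255 and the function names are assumptions, and predictions are expected to lie in 0..num_classes-1.

```python
def per_class_iou(pred, target, num_classes, ignore_label=255):
    """Return a list of IoU values (None for classes that never occur)."""
    tp = [0] * num_classes
    fp = [0] * num_classes
    fn = [0] * num_classes
    for p, t in zip(pred, target):
        if t == ignore_label:
            continue  # unlabeled pixels do not count
        if p == t:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p where it is not present
            fn[t] += 1  # missed a pixel of class t
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        ious.append(tp[c] / denom if denom else None)
    return ious


def mean_iou(ious):
    """Average over the classes that have a defined IoU."""
    valid = [v for v in ious if v is not None]
    return sum(valid) / len(valid) if valid else 0.0


# Made-up six-pixel example with two classes and one ignored pixel.
ious = per_class_iou([0, 0, 1, 1, 1, 0], [0, 1, 1, 1, 0, 255], num_classes=2)
miou = mean_iou(ious)
```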
