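Caffe is really fussy: the data has to be in LMDB or LevelDB format. If your data is images, the convert_imageset.cpp tool that ships with Caffe will do the job, but if it is not images, you have to write your own program. I am not a computer science major, so how could I make sense of the source code? I Baidu'd it in earnest and got nowhere, so I Googled it. As the saying goes, "for domestic matters ask Baidu, for foreign matters ask Google", and the ancients did not deceive me. In the Caffe Google group I found this link: http://deepdish.io/2015/04/28/creating-lmdb-in-python/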
The code is as follows:
    import numpy as np
    import lmdb
    import caffe

    N = 1000

    # Let's pretend this is interesting data
    X = np.zeros((N, 3, 32, 32), dtype=np.uint8)
    y = np.zeros(N, dtype=np.int64)

    # We need to prepare the database for the size. We'll set it 10 times
    # greater than what we theoretically need. There is little drawback to
    # setting this too big. If you still run into problem after raising
    # this, you might want to try saving fewer entries in a single
    # transaction.
    map_size = X.nbytes * 10

    env = lmdb.open('mylmdb', map_size=map_size)

    with env.begin(write=True) as txn:
        # txn is a Transaction object
        for i in range(N):
            datum = caffe.proto.caffe_pb2.Datum()
            datum.channels = X.shape[1]
            datum.height = X.shape[2]
            datum.width = X.shape[3]
            datum.data = X[i].tobytes()  # or .tostring() if numpy < 1.9
            datum.label = int(y[i])
            str_id = '{:08}'.format(i)

            # The encode is only essential in Python 3
            txn.put(str_id.encode('ascii'), datum.SerializeToString())
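This is Python code for converting data to LMDB. However, when I processed my data with it and then ran Caffe, I got a std::bad_alloc error. After a hard struggle and a lot of reading, I found where the problems were: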
1. Caffe's data format is four-dimensional by default, (n_samples, n_channels, height, width), so I had to reshape my data into that layout.
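2. The last line, txn.put(str_id.encode('ascii'), datum.SerializeToString()), must be there. At first I thought Python 2 did not need it and left it out, and I kept getting errors. Only later did I realize the line is mandatory: the code comment only says that the .encode('ascii') part is optional in Python 2, while the txn.put call itself is what writes each record.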
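3. If you get the error mdb_put: MDB_MAP_FULL: Environment mapsize limit reached, it is because lmdb's default map_size is fairly small. I changed the default map_size in lmdb/cffi.py to 1099511627776 (that is, 1 TiB), though I am not sure that is the right way to do it. Then I changed map_size = X.nbytes in the Python program above to map_size = X.nbytes * 10, and it worked! Since map_size is passed explicitly to lmdb.open, the second change is probably the one that matters.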
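The three points above can be sketched together for non-image data. This is only a minimal sketch, assuming each sample is a 1-D feature vector stored as uint8. The helper names `to_caffe_4d`, `make_key`, and `estimate_map_size` are hypothetical (they are not part of Caffe or lmdb), and putting the features on the channel axis with height = width = 1 is my assumption, not something the linked post prescribes.

```python
import numpy as np


def to_caffe_4d(features):
    """Reshape (n_samples, n_features) into (n_samples, n_features, 1, 1).

    Assumption: each feature becomes one channel, with height = width = 1.
    """
    features = np.asarray(features)
    if features.ndim != 2:
        raise ValueError("expected a 2-D array of shape (n_samples, n_features)")
    n_samples, n_features = features.shape
    return features.reshape(n_samples, n_features, 1, 1)


def make_key(index):
    """Zero-padded record key as bytes, same pattern as the script above."""
    return '{:08}'.format(index).encode('ascii')


def estimate_map_size(array, factor=10):
    """Raw payload size in bytes times a safety factor (10x in the script above)."""
    return array.nbytes * factor


raw = np.arange(12, dtype=np.uint8).reshape(4, 3)  # 4 samples, 3 features each
X = to_caffe_4d(raw)

print(X.shape)               # (4, 3, 1, 1)
print(make_key(7))           # b'00000007'
print(estimate_map_size(X))  # 120
```

The reshaped `X` can then go through the same loop as the script above, with `estimate_map_size(X)` passed as `map_size` to `lmdb.open`. For floating-point features, the `Datum` message also has a repeated `float_data` field that may be a better fit than the byte-valued `data` field.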
While searching, I also came across Python programs for writing LevelDB, at https://github.com/BVLC/caffe/issues/745 and http://stackoverflow.com/questions/32707393/whats-caffes-input-format
A Python program for writing HDF5 is here: http://stackoverflow.com/questions/31774953/test-labels-for-regression-caffe-float-not-allowed/31808324#31808324
References:
1. http://stackoverflow.com/questions/30983213/how-to-use-1-dim-vector-as-input-for-caffe/30991590#30991590
2. On the size of lmdb's map_size: https://github.com/BVLC/caffe/issues/1298 and http://stackoverflow.com/questions/31820976/lmdb-increase-map-size