在衆多NoSQL數據庫,MongoDB是一個優秀的產品。其官方介紹以下:
MongoDB (from "humongous") is a scalable, high-performance, open source, document-oriented database.html
看起來,十分誘人!值得說明的是,MongoDB的document是以BSON(Binary JSON)格式存儲的,徹底支持Schema Free。這對地理空間數據是十分友好的。由於有著名的GeoJSON可供使用。另外OGR庫也支持將Geometry類型導出爲JSON格式。mongodb
本文將嘗試使用OGR庫把Shapefile導入到MongoDB存儲,而後創建空間索引,進行空間查詢。數據庫
著名的Foursquare使用了MongoDB數據庫。json
MongoDB+Python+Pymongo+GDAL for Python數組
關於MongoDB和Python安裝,本文不作介紹。關於GDAL for Python的安裝,可見個人另外一篇博文:http://blog.3sdn.net/311.html服務器
在繼續本文以前,請先啓動你的MongoDB服務器。本文默認採用以下服務器參數:
Server:localhost
Post:27017
Database Name:gisdbapp
這裏我直接提供代碼,代碼中已經有比較詳盡的註釋了。代碼基本源於「引文1」,只是作了些改動,將MongoDB的Geometry的存儲格式由wkt改爲json。你可直接複製並運行下面的代碼,固然須要修改一下Shapefile路徑和MongoDB服務器相關參數。性能
import os
import sys
import json
from pymongo import json_util
from pymongo.connection import Connection
from progressbar import ProgressBar
from osgeo import ogr測試
def shp2mongodb(shape_path, mongodb_server, mongodb_port, mongodb_db, mongodb_collection, append, query_filter):
"""Convert a shapefile to a mongodb collection"""
print ‘Converting a shapefile to a mongodb collection ‘
driver = ogr.GetDriverByName(‘ESRI Shapefile’)
print ‘Opening the shapefile %s…’ % shape_path
ds = driver.Open(shape_path, 0)
if ds is None:
print ‘Can not open’, ds
sys.exit(1)
lyr = ds.GetLayer()
totfeats = lyr.GetFeatureCount()
lyr.SetAttributeFilter(query_filter)
print ‘Starting to load %s of %s features in shapefile %s to MongoDB…’ % (lyr.GetFeatureCount(), totfeats, lyr.GetName())
print ‘Opening MongoDB connection to server %s:%i…’ % (mongodb_server, mongodb_port)
connection = Connection(mongodb_server, mongodb_port)
print ‘Getting database %s’ % mongodb_db
db = connection[mongodb_db]
print ‘Getting the collection %s’ % mongodb_collection
collection = db[mongodb_collection]
if append == False:
print ‘Removing features from the collection…’
collection.remove({})
print ‘Starting loading features…’
# define the progressbar
pbar = ProgressBar(maxval=lyr.GetFeatureCount()).start()
k=0
# iterate the features and access its attributes (including geometry) to store them in MongoDb
feat = lyr.GetNextFeature()
while feat:
mongofeat = {}
geom = feat.GetGeometryRef()
mongogeom = geom.ExportToJson()
# store the geometry data with json format
mongofeat['geom'] = json.loads(mongogeom,object_hook=json_util.object_hook)
# iterate the feature’s fields to get its values and store them in MongoDb
feat_defn = lyr.GetLayerDefn()
for i in range(feat_defn.GetFieldCount()):
value = feat.GetField(i)
if isinstance(value, str):
value = unicode(value, "gb2312")
field = feat.GetFieldDefnRef(i)
fieldname = field.GetName()
mongofeat[fieldname] = value
# insert the feature in the collection
collection.insert(mongofeat)
feat.Destroy()
feat = lyr.GetNextFeature()
k = k + 1
pbar.update(k)
pbar.finish()
print ‘%s features loaded in MongoDb from shapefile.’ % lyr.GetFeatureCount()
input_shape = ‘/home/evan/data/map/res4_4m/XianCh_point.shp’
mongodb_server = ‘localhost’
mongodb_port = 27017
mongodb_db = ‘gisdb’
mongodb_collection = ‘xqpoint’
filter = 」spa
print ‘Importing data to mongodb…’
shp2mongodb(input_shape, mongodb_server, mongodb_port, mongodb_db, mongodb_collection, False, filter)
在MongoDB的Shell中執行:
>db.xqpoint.findOne()
結果以下:
{
"_id" : ObjectId("4dc82e7f7de36a5ceb000000"),
"PERIMETER" : 0,
"NAME" : "漠河縣",
"PYNAME" : "Mohe Xian",
"AREA" : 0,
"ADCODE93" : 232723,
"CNTYPT_ID" : 31,
"CNTYPT_" : 1,
"geom" : {
"type" : "Point",
"coordinates" : [
122.53233,
52.968872
]
},
"ID" : 1031,
"PN" : 1,
"CLASS" : "AI"
}
這即是一個document,使用JSON格式,一目瞭然。其中的"geom"即爲Geometry類型的數據,即地理空間數據,也是採用JSON格式存儲,這樣後續的空間索引與空間查詢將十分方便。
MongoDB原生地支持了空間索引與空間查詢,這一點比PostgreSQL方便,再也不須要使用PostGIS進行空間擴展了。至於性能,我還沒測試,在此不敢妄加評論。
>db.xqpoint.ensureIndex({‘geom.coordinates’:’2d’})
是否是十分簡單?其它參數及用法請自行查看MongoDB手冊。
>db.xqpoint.find({"geom.coordinates":[122.53233,52.968872]})
便可查詢到上述「莫河縣」這個點。固然,像這種精確查詢,實際應用並很少。實際應用的空間查詢大多爲範圍查詢。MongoDB支持鄰域查詢($near),和範圍查詢($within)。
>db.xqpoint.find({"geom.coordinates":{$near:[122,52]}})
上述查詢語句查詢點[122,52]附近的點,MongoDB默認返回附近的100個點,並按距離排序。你也能夠用limit()指定返回的結果數量, 如:>db.xqpoint.find({"geom.coordinates":{$near:[122,52]}}).limit(5)
另外,你也能夠指定一個最大距離,只查詢這個距離內的點。
>db.xqpoint.find({"geom.coordinates":{$near:[122,52],$maxDistance:5}}).limit(5)
MongoDB的find()方法可很方便的進行查詢,同時MongoDB也提供了geoNear命令,用於鄰域查詢。
>db.runCommand({geoNear:"xqpoint",near:[122,56],num:2})
上述語句用於查詢[122,56]點附近的點,並只返回2個點。結果以下:
{
"ns" : "gisdb.xqpoint",
"near" : "1110011000111101111100010000011000111101111100010000",
"results" : [
{
"dis" : 3.077515616588727,
"obj" : {
"_id" : ObjectId("4dc82e7f7de36a5ceb000000"),
"PERIMETER" : 0,
"NAME" : "漠河縣",
"PYNAME" : "Mohe Xian",
"AREA" : 0,
"ADCODE93" : 232723,
"CNTYPT_ID" : 31,
"CNTYPT_" : 1,
"geom" : {
"type" : "Point",
"coordinates" : [
122.53233,
52.968872
]
},
"ID" : 1031,
"PN" : 1,
"CLASS" : "AI"
}
},
{
"dis" : 4.551319677334594,
"obj" : {
"_id" : ObjectId("4dc82e7f7de36a5ceb000001"),
"PERIMETER" : 0,
"NAME" : "塔河縣",
"PYNAME" : "Tahe Xian",
"AREA" : 0,
"ADCODE93" : 232722,
"CNTYPT_ID" : 66,
"CNTYPT_" : 2,
"geom" : {
"type" : "Point",
"coordinates" : [
124.7058,
52.340332
]
},
"ID" : 1059,
"PN" : 1,
"CLASS" : "AI"
}
}
],
"stats" : {
"time" : 0,
"btreelocs" : 85,
"nscanned" : 85,
"objectsLoaded" : 4,
"avgDistance" : 3.814417646961661,
"maxDistance" : 4.551319677334594
},
"ok" : 1
}
固然,咱們也可附加條件查詢條件,如查詢[122,56]附近的且"PYNAME"爲"Tahe Xian"的點:
>db.runCommand({geoNear:"xqpoint",near:[122,56],num:2,query:{"PYNAME":"Tahe Xian"})
返回結果以下:
{
"ns" : "gisdb.xqpoint",
"near" : "1110011000111101111100010000011000111101111100010000",
"results" : [
{
"dis" : 4.551319677334594,
"obj" : {
"_id" : ObjectId("4dc82e7f7de36a5ceb000001"),
"PERIMETER" : 0,
"NAME" : "塔河縣",
"PYNAME" : "Tahe Xian",
"AREA" : 0,
"ADCODE93" : 232722,
"CNTYPT_ID" : 66,
"CNTYPT_" : 2,
"geom" : {
"type" : "Point",
"coordinates" : [
124.7058,
52.340332
]
},
"ID" : 1059,
"PN" : 1,
"CLASS" : "AI"
}
}
],
"stats" : {
"time" : 45,
"btreelocs" : 2095,
"nscanned" : 2096,
"objectsLoaded" : 2096,
"avgDistance" : 4.551319677334594,
"maxDistance" : 4.551319677334594
},
"ok" : 1
}
MongoDB的$within操做符支持的形狀有$box(矩形),$center(圓形),$polygon(多邊形,包括凹多邊形和凸多邊形)。全部的範圍查詢,默認是包含邊界的。
查詢一個矩形範圍,須要指定矩形的左下角和右上角兩個座標點,以下:
> box = [[80,40],[100,50]]
> db.xqpoint.find({"geom.coordinates":{$within:{$box:box}}})
查詢一個圓形範圍,須要指定圓心座標和半徑,以下:
> center = [80,44]
> radius =5
> db.xqpoint.find({"geom.coordinates":{$within:{$center:[center,radius]}}})
查詢一個多邊形範圍,須要指定多邊形的各個頂點,能夠經過一個頂點數組或一系列點對象指定。其中,最後一個點是默認與第一個點鏈接的。以下:
> polygon1 = [[75,35],[80,35],[80,45],[60,40]]
> db.xqpoint.find({"geom.coordinates":{$within:{$polygon:polygon1}}})
或者
> polygon2 = {a:{75,35},b:{80,35},c:{80,45},d:{60,40}}
> db.xqpoint.find({"geom.coordinates":{$within:{$polygon:polygon2}}})
注意:MongoDB 1.9及以上版本才支持多邊形範圍查詢。
P.S. MongoDB還支持複合索引,球面模型(可簡單理解爲投影吧),多位置文檔(Multi-location Documents,即一個文檔中包括多個Geometry),可參見「引文2」或MongoDB手冊。
引文1:http://www.paolocorti.net/2009/12/06/using-mongodb-to-store-geographic-data/ 引文2:http://www.mongodb.org/display/DOCS/Geospatial+Indexing