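MongoDB is a document database that has risen rapidly in recent years. It is widely used in areas that do not need transactions but do need development flexibility and elastic scaling.
As enterprise demand for data mining grows, users may want to mine the data they store in MongoDB, but MongoDB's query syntax is fairly limited and does not meet the needs of that kind of analysis.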
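PostgreSQL is an open-source database that originated at the University of California, Berkeley. It has more than 20 years of history, is known for its stability and rich feature set, and is often called "the Oracle of the open-source world".
It has a large number of users across industries in China and abroad, such as Ping An Bank, Postal Savings Bank of China, China Mobile, Qunar, AutoNavi (Gaode), Cainiao, NASA, the Russian State Duma, and others.
PostgreSQL 9.6 added CPU-based parallel query execution. For mixed OLTP+OLAP workloads of up to 20 TB, PostgreSQL is a good choice.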
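PostgreSQL's FDW (foreign data wrapper) feature lets it connect to external data sources and use them as if they were local tables.
MongoDB Connector for BI is a product derived from PostgreSQL's FDW. It gives MongoDB users a rich SQL interface.
Besides MongoDB, PostgreSQL FDWs can connect to almost every kind of data source; the figure does not list all of them.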
For more on FDW, see
http://wiki.postgresql.org/wiki/Fdw
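This article walks through how to use the MongoDB BI Connector from a MongoDB user's point of view.
Because downloading the mongodb-bi package is very slow from within China, I have not verified the whole procedure myself. I use a document found on the internet as the blueprint and elaborate on each step.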
OS environment
[root@mongobihost raj]# lsb_release -a
Distributor ID: RedHatEnterpriseServer
Description:    Red Hat Enterprise Linux Server release 6.5 (Santiago)
Release:        6.5
[root@mongobihost raj]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 6.5 (Santiago)
Python version
[root@mongobihost raj]# which python
/usr/bin/python
[root@mongobihost raj]# python -V
Python 2.6.6
Download mongodb-bi-1.1.3-1-centos6-rpms.tar.bz2 and extract it
The archive contains PostgreSQL, the FDW interface, and the tools that convert a MongoDB schema into SQL, among other things.
[root@mongobihost bin]# cd /tmp/
[root@mongobihost tmp]# ls -ltr
mongodb-bi-schematools-1.1.3-1.el6.x86_64.rpm
mongodb-bi-libs-1.1.3-1.el6.x86_64.rpm
mongodb-bi-1.1.3-1.el6.x86_64.rpm
mongodb-bi-server-1.1.3-1.el6.x86_64.rpm      -- PostgreSQL server
mongodb-bi-contrib-1.1.3-1.el6.x86_64.rpm     -- PostgreSQL contrib
mongodb-bi-devel-1.1.3-1.el6.x86_64.rpm       -- PostgreSQL include files
mongodb-bi-multicorn-1.1.3-1.el6.x86_64.rpm   -- Python FDW development interface for PostgreSQL
mongodb-bi-pymongo-1.1.3-1.x86_64.rpm
mongodb-bi-fdw-1.1.3-1.noarch.rpm             -- PostgreSQL mongofdw, based on multicorn
mongodb-bi-1.1.3-1-centos6-rpms.tar.bz2
Install the RPMs
[root@mongobihost tmp]# rpm -ivh *.rpm --nodeps
Preparing...                ########################################### [100%]
        package mongodb-bi-libs-1.1.3-1.el6.x86_64 is already installed
        package mongodb-bi-1.1.3-1.el6.x86_64 is already installed
        package mongodb-bi-devel-1.1.3-1.el6.x86_64 is already installed
        package mongodb-bi-server-1.1.3-1.el6.x86_64 is already installed
        package mongodb-bi-contrib-1.1.3-1.el6.x86_64 is already installed
        package mongodb-bi-schematools-1.1.3-1.el6.x86_64 is already installed
        package mongodb-bi-pymongo-1.1.3-1.x86_64 is already installed
        package mongodb-bi-fdw-1.1.3-1.noarch is already installed
Install mongodb-bi-multicorn
The first attempt below fails because the file name is missing its .rpm extension; the second attempt succeeds.
[root@mongobihost tmp]# rpm -ivh mongodb-bi-multicorn-1.1.3-1.el6.x86_64 --nodeps
error: open of mongodb-bi-multicorn-1.1.3-1.el6.x86_64 failed: No such file or directory
[root@mongobihost tmp]# rpm -ivh mongodb-bi-multicorn-1.1.3-1.el6.x86_64.rpm --nodeps
Preparing...                ########################################### [100%]
   1:mongodb-bi-multicorn   ########################################### [100%]
After installation, check that Python's collections module works
NOTE: The Python version should be greater than 2.6; if it is not, upgrade Python first and then install the RPMs. One way to check is to start a Python shell and confirm that the "collections" module provides the "OrderedDict" class. For example:
python
Python 2.6.6 (r266:84292, Sep  4 2013, 07:46:00)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import collections
>>> od = collections.OrderedDict()
>>> od
OrderedDict()
Press Ctrl+D to exit.
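The same check can be scripted instead of typed into an interactive shell. The following is only a minimal sketch and is not part of the BI Connector tooling; the function name `has_ordered_dict` is made up for illustration. In stock CPython, `collections.OrderedDict` first appeared in 2.7 (and 3.1), so on an older interpreter it is worth running the check rather than assuming the class is there.

```python
import collections
import sys


def has_ordered_dict(module=collections):
    """Return True if the given module exposes an OrderedDict attribute."""
    return hasattr(module, "OrderedDict")


if __name__ == "__main__":
    # Report the interpreter version next to the result so the output
    # can be pasted into an installation log.
    version = ".".join(str(part) for part in sys.version_info[:3])
    if has_ordered_dict():
        print("Python %s: collections.OrderedDict is available" % version)
    else:
        print("Python %s: collections.OrderedDict is missing" % version)
```

Run the file with the same interpreter that the BI Connector packages will use (here `/usr/bin/python`), since a different Python on the PATH could give a different answer.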
Check the local MongoDB instance
mongo ${HOST}:${PORT}/admin -u mongoadmin -p $password
MongoDB shell version: 3.2.4
connecting to: mongobihost:27017/admin
Server has startup warnings:
2016-04-01T16:49:54.454-0700 I CONTROL  [initandlisten]
MongoDB Enterprise set01:PRIMARY> show dbs
admin        0.000GB
rajdb        1.210GB
abcdeconfig  0.015GB
abcdb        0.166GB
jiradb       0.026GB
local        1.199GB
exit;
Create the MongoDB BI user
This step corresponds to using CREATE SERVER and CREATE USER MAPPING FOR in PostgreSQL to define the foreign server and the user mapping (pointing at the MongoDB URL you supply).
See https://docs.mongodb.com/bi-connector/reference/mongobiuser/#bin.mongobiuser
[root@mongobihost bin]# mongobiuser create biuser mongodb://biuser:test@mongobihost.myhost.com:27017/admin
or
[root@mongobihost bin]# mongobiuser create biuser mongodb://mongobihost.myblog.com:27017/admin
Enter password:
2016-06-17T12:12:15.403-0700 creating user biuser
2016-06-17T12:12:15.408-0700 creating database buses
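The two command forms above differ in whether the password is embedded in the MongoDB URI. As a rough illustration of what such a URI carries, the sketch below splits it with Python 3's standard `urllib.parse`. It is not how `mongobiuser` parses its argument; the function name `describe_mongo_uri` is hypothetical, and the approach only handles a simple single-host URI like the ones shown here (not a comma-separated replica-set host list).

```python
from urllib.parse import urlsplit


def describe_mongo_uri(uri):
    """Split a single-host mongodb:// URI into its visible components."""
    parts = urlsplit(uri)
    return {
        "host": parts.hostname,
        "port": parts.port,
        # The path component names the database, "admin" in the examples.
        "database": parts.path.lstrip("/") or None,
        "user": parts.username,
        # Only report whether a password is present, never echo it.
        "has_inline_password": parts.password is not None,
    }


if __name__ == "__main__":
    for uri in (
        "mongodb://biuser:test@mongobihost.myhost.com:27017/admin",
        "mongodb://mongobihost.myblog.com:27017/admin",
    ):
        print(describe_mongo_uri(uri))
```

A URI with an inline password is typed on the command line in clear text, so it can end up in shell history; the second form, which prompts with "Enter password:", avoids that.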
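Check whether PostgreSQL has started
The MongoDB BI Connector changes some of PostgreSQL's default settings. For example, the port is changed to 27032, although you can change it to something else yourself.
The output below shows PostgreSQL listening on a Unix socket for port 27032. If you need it to listen on a TCP/IP port, edit postgresql.conf and restart the database.
[root@mongobihost bin]# netstat -an|grep PG
Active Internet connections (servers and established)
Proto RefCnt Flags       Type       State         I-Node  Path
unix  2      [ ACC ]     STREAM     LISTENING     1262987 /tmp/.s.PGSQL.27032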
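The socket file name in that output follows PostgreSQL's usual convention of `.s.PGSQL.<port>` inside the socket directory. A small sketch of the same check without `netstat` is shown below; the helper names are made up, and the `/tmp` default is an assumption taken from the output above (the directory is configurable in postgresql.conf).

```python
import os


def pg_socket_path(port, directory="/tmp"):
    """Build the Unix-domain socket path PostgreSQL uses for a port."""
    return "%s/.s.PGSQL.%d" % (directory.rstrip("/"), port)


def pg_socket_exists(port, directory="/tmp"):
    """Return True if the socket file for the given port is present."""
    return os.path.exists(pg_socket_path(port, directory))


if __name__ == "__main__":
    port = 27032  # port used by the BI Connector in this walkthrough
    print(pg_socket_path(port), "exists:", pg_socket_exists(port))
```

The presence of the file only suggests that a server was started on that port; connecting with `psql` is still the definitive test.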
Find the location of the PostgreSQL configuration file
(Using rpm -ql mongodb-bi-server is actually a better way to do this.)
[root@mongobihost tmp]# find / -name postgresql.conf
/var/lib/pgsql/9.4/data/postgresql.conf
Change the listen address so that PostgreSQL listens on all interfaces. Only then can your BI software reach PostgreSQL over the network.
vi /var/lib/pgsql/9.4/data/postgresql.conf
listen_addresses = '0.0.0.0'
Configure PostgreSQL's pg_hba.conf to allow connections to this PostgreSQL instance from any source IP
[root@mongobihost bin]# vi /var/lib/pgsql/9.4/data/pg_hba.conf
#** Add below content :
# IPv4 local connections:
host    all    all    0.0.0.0/0    md5
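The address column in that pg_hba.conf line is a CIDR range, and 0.0.0.0/0 matches every IPv4 client. The sketch below uses Python 3's standard `ipaddress` module to show the difference between that range and a narrower one. The function name and the 192.168.1.0/24 subnet are illustrative assumptions, not values from the original setup, and the code only models the address match, not the rest of PostgreSQL's rule evaluation.

```python
import ipaddress


def hba_address_matches(client_ip, cidr):
    """Return True if client_ip falls inside the CIDR range of a host rule."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(cidr)


if __name__ == "__main__":
    clients = ["10.1.2.3", "192.168.1.50"]
    for cidr in ("0.0.0.0/0", "192.168.1.0/24"):
        for client in clients:
            print(cidr, client, hba_address_matches(client, cidr))
```

If the BI tools all live on one known subnet, restricting the rule to that subnet is usually a safer choice than opening the server to every address.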
Restart PostgreSQL
pg_ctl restart -m fast -D /var/lib/pgsql/9.4/data
Use mongodrdl to export the collections that will take part in BI analysis as DDL for creating PostgreSQL foreign tables
mongodrdl -d rajdb -o rajdb.drdl -h mongobihost:27017 -u mongoadmin -p $password --authenticationDatabase admin
Note: 27017 is the MongoDB port.
2016-06-17T14:20:15.546-0700 Table "employee", column "sfg.sfgsf" has no types: ignoring column.
2016-06-17T14:20:15.546-0700 Table "employee", column "fgfs.gsdfgf" has no types: ignoring column.
2016-06-17T14:20:15.546-0700 Table "employee", column "fgsf.sgfgs" has no types: ignoring column.
2016-06-17T14:20:15.546-0700 Table "employee", column "sgss.srgs" has no types: ignoring column.
2016-06-17T14:20:16.123-0700 Table "emp_Pack_flat", column "rtgs.comments" has no types: ignoring column.
2016-06-17T14:20:16.972-0700 Table "customer_transaction", column "FValues" is an array that has no types: ignoring column.
2016-06-17T14:20:16.973-0700 Table "customer_transaction_Notes", column "Notes.enumValues" is an array that has no types: ignoring column.
2016-06-17T14:20:16.973-0700 Table "customer_transaction_SiteValues", column "F1z_v.fields.SiteAbbr.enumValues" is an array that has no types: ignoring column.
2016-06-17T14:20:16.973-0700 Table "customer_transaction_URL", column "URL.enumValues" is an array that has no types: ignoring column.
2016-06-17T14:20:16.974-0700 Table "customer_transaction_active", column "F1z_v.fields.active.enumValues" is an array that has no types: ignoring column.
2016-06-17T14:20:16.974-0700 Table "customer_transaction_active", column "colCur.enumValues" is an array that has no types: ignoring column.
2016-06-17T14:20:16.974-0700 Table "customer_transaction_active", column "colDiff.enumValues" is an array that has no types: ignoring column.
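The warnings above mean that some fields will not appear as columns in the generated tables, so it helps to know which ones were dropped. The sketch below groups the ignored columns by table. It assumes only the message wording visible in the log above; the function name `ignored_columns` is hypothetical, and the script is not part of mongodrdl.

```python
import re
from collections import defaultdict

# Matches both warning variants seen in the log:
#   ... column "x" has no types: ignoring column.
#   ... column "x" is an array that has no types: ignoring column.
WARNING = re.compile(r'Table "([^"]+)", column "([^"]+)"[^"]*ignoring column')


def ignored_columns(log_text):
    """Map each table name to the list of columns mongodrdl skipped."""
    skipped = defaultdict(list)
    for match in WARNING.finditer(log_text):
        skipped[match.group(1)].append(match.group(2))
    return dict(skipped)


if __name__ == "__main__":
    sample = (
        '2016-06-17T14:20:15.546-0700 Table "employee", column "sfg.sfgsf" '
        "has no types: ignoring column.\n"
        '2016-06-17T14:20:16.972-0700 Table "customer_transaction", column '
        '"FValues" is an array that has no types: ignoring column.\n'
    )
    for table, columns in ignored_columns(sample).items():
        print(table, columns)
```

Saving the mongodrdl output to a file and passing its contents to `ignored_columns` gives a per-table checklist of fields to review before building reports on the exported schema.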
Use mongobischema to import the DDL into PostgreSQL
# To import data into BI schema
[root@mongobihost bin]# mongobischema import biuser ./rajdb.drdl
Enter password:
2016-06-17T14:55:02.541-0700 creating table employee
2016-06-17T14:55:02.572-0700 creating table emp_Pac_fla
2016-06-17T14:55:02.579-0700 creating table customer_transaction
2016-06-17T14:55:02.588-0700 creating table customer_transaction_Notes
2016-06-17T14:55:02.597-0700 creating table customer_transaction_SiteVa
2016-06-17T14:55:02.606-0700 creating table customer_transaction_URL
2016-06-17T14:55:02.614-0700 creating table customer_transaction_active
# To look at the tables in the BI schema, run the statement below.
Check the imported foreign tables
[root@mongobihost]# mongobischema list biuser
Enter password:
employee
customer_transaction
customer_transaction_Notes
customer_transaction_SiteVa
customer_transaction_URL
customer_transaction_active
How to restart PostgreSQL (you can also use pg_ctl directly)
If you need to restart the BI Connector:
sudo service postgresql-9.4 stop
sudo service postgresql-9.4 start
or
pg_ctl restart -m fast -D /var/lib/pgsql/9.4/data
List the BI users (you can also look them up directly with SQL or the catalog views in PostgreSQL)
# mongobiuser list
Check that connections to PostgreSQL work
To check that things are okay on the PostgreSQL side:
psql -h localhost -p 27032 -U biuser
Password for user biuser:
psql (9.4.5 MongoDB BI Connector 1.1.3)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
Type "help" for help.
biuser=> \d
                        List of relations
 Schema |            Name             |     Type      | Owner
--------+-----------------------------+---------------+--------
 public | customer_transaction        | view          | biuser
 public | customer_transaction_Notes  | foreign table | biuser
 public | customer_transaction_SiteVa | view          | biuser
biuser=> select * from "customer_transaction" limit 1;
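Several of the imported relations have mixed-case names such as customer_transaction_Notes. PostgreSQL folds unquoted identifiers to lower case, so these names must be written in double quotes, as the query above does. The sketch below shows one way a script could build such a query safely; both helper names are made up for illustration, and the code only builds the SQL text without connecting to the database.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def select_sample(table, limit=1):
    """Build a query that fetches a few rows from a case-sensitive table."""
    return "select * from %s limit %d;" % (quote_ident(table), int(limit))


if __name__ == "__main__":
    for table in ("customer_transaction", "customer_transaction_Notes"):
        print(select_sample(table))
```

The printed statements can be pasted at the `biuser=>` prompt; without the quotes, a name like customer_transaction_Notes would be looked up as customer_transaction_notes and not be found.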
You can now connect your BI software to PostgreSQL to analyze the data stored in MongoDB.