Implementing MongoDB Replica Sets and Sharding

1. Overview

MongoDB is a typical document-oriented NoSQL database. Before diving in, here is a quick look at the main categories of NoSQL databases:

NoSQL data storage models

Key-value model:
    Data model: key-value storage
    Strengths: very fast lookups
    Weaknesses: data has no structure and is usually treated as strings or binary blobs
    Typical uses: content caching
    Examples: Redis, Dynamo

Column-family model:
    Data model: data is stored by column, keeping values of the same column together
    Strengths: fast lookups, good scalability, easy to distribute
    Weaknesses: features are quite limited compared with SQL
    Typical uses: distributed file systems and distributed storage
    Examples: Bigtable, Cassandra, HBase, Hypertable

Document model:
    Data model: like the key-value model, but the value points to structured data
    Strengths: loose data-format requirements; no schema has to be defined up front
    Weaknesses: query performance is modest and there is no unified query syntax
    Typical uses: web applications
    Examples: MongoDB, CouchDB

Graph model:
    Data model: graph structures
    Strengths: graph algorithms improve performance and fit specialized application scenarios
    Weaknesses: hard to distribute; features are domain-specific
    Typical uses: social networks, recommendation systems, relationship graphs
    Examples: Neo4j


This article covers MongoDB replica sets and sharding. First, why does MongoDB need sharding at all?

Ever-growing data sets and ever-higher throughput demands are a challenge for a single MongoDB server: a heavy query load can quickly exhaust the CPU, and a large data set can just as quickly outgrow a single node's storage. Eventually the working set no longer fits in RAM and puts enormous pressure on disk I/O. Database systems generally solve this in one of two ways: scaling up (bigger hardware) or scaling out (more machines).

Sharding is a scale-out solution. It distributes the data set across multiple nodes called shards, lowering the access pressure on any single node. Each shard is an independent database, and all the shards together form one logically complete database. Sharding therefore reduces both the volume of data operations and the amount of data each shard must store.
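As a toy illustration of that idea (a sketch, not MongoDB's actual code), range-based sharding can be modeled as chunks covering intervals of the shard-key space, each owned by one shard; routing a key is just finding the chunk whose interval contains it. The chunk bounds and shard names below are hypothetical:

```python
from bisect import bisect_right

# Toy model of range-based sharding: each chunk covers [lower, next lower)
# on the shard key and is owned by exactly one shard.
chunk_lower_bounds = [0, 50000, 100000, 150000]   # hypothetical chunk bounds
chunk_owner = ["s1", "s1", "s2", "s2"]            # shard owning each chunk

def route(shard_key_value):
    """Return the shard owning the chunk that contains this key value."""
    idx = bisect_right(chunk_lower_bounds, shard_key_value) - 1
    return chunk_owner[max(idx, 0)]

print(route(42))       # key falls in [0, 50000)       -> prints s1
print(route(120000))   # key falls in [100000, 150000) -> prints s2
```

In a real cluster, mongos plays the role of `route`, consulting chunk metadata held on the config servers.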


Building a MongoDB sharded cluster requires three roles:

Shard server: stores the actual data chunks. In production, each shard server role should be carried by a replica set of several machines to avoid a single point of failure.

Config server: stores the cluster metadata, including the chunk information.

Route server (mongos): the front-end router that clients connect to; it makes the whole cluster look like a single database, so front-end applications can use it transparently.


The diagram below shows how MongoDB sharding works:

[Figure: MongoDB sharding architecture (mongos routers, config servers, shards)]


2. Lab environment

   Three hosts:

   192.168.30.116  OS: CentOS 6.4 x86_64

   192.168.30.117  OS: CentOS 6.4 x86_64

   192.168.30.119  OS: CentOS 6.4 x86_64

   Role assignment (every host runs one member of each component):

   host            shard1 member  shard2 member  config server  mongos
   192.168.30.116  port 27017     port 27018     port 20000     port 30000
   192.168.30.117  port 27017     port 27018     port 20000     port 30000
   192.168.30.119  port 27017     port 27018     port 20000     port 30000



Now install MongoDB. Here we use the generic binary tarball:

[root@node1 ~]# tar zxvf mongodb-linux-x86_64-2.0.4.tgz -C /usr/local/
[root@node1 ~]# mv /usr/local/mongodb-linux-x86_64-2.0.4 /usr/local/mongodb
[root@node1 ~]# groupadd -r mongod
[root@node1 ~]# useradd -M -r -g mongod -d /mongodb/data -s /bin/false -c mongod mongod
[root@node1 ~]# mkdir /mongodb/data/shard11 -p
[root@node1 ~]# mkdir /mongodb/data/shard21 -p
[root@node1 ~]# mkdir /mongodb/data/config -p
[root@node1 ~]# mkdir -p /var/log/mongo
[root@node1 ~]# chown -R mongod.mongod /mongodb/data/ /var/log/mongo/
[root@node2 ~]# tar zxvf mongodb-linux-x86_64-2.0.4.tgz -C /usr/local/
[root@node2 ~]# mv /usr/local/mongodb-linux-x86_64-2.0.4 /usr/local/mongodb
[root@node2 ~]# groupadd -r mongod
[root@node2 ~]# useradd -M -r -g mongod -d /mongodb/data -s /bin/false -c mongod mongod
[root@node2 ~]# mkdir /mongodb/data/shard12 -p
[root@node2 ~]# mkdir /mongodb/data/shard22 -p
[root@node2 ~]# mkdir /mongodb/data/config -p
[root@node2 ~]# mkdir -p /var/log/mongo
[root@node2 ~]# chown -R mongod.mongod /mongodb/data/ /var/log/mongo/
[root@node3 ~]# tar zxvf mongodb-linux-x86_64-2.0.4.tgz -C /usr/local/
[root@node3 ~]# mv /usr/local/mongodb-linux-x86_64-2.0.4 /usr/local/mongodb
[root@node3 ~]# groupadd -r mongod
[root@node3 ~]# useradd -M -r -g mongod -d /mongodb/data -s /bin/false -c mongod mongod
[root@node3 ~]# mkdir /mongodb/data/shard13 -p
[root@node3 ~]# mkdir /mongodb/data/shard23 -p
[root@node3 ~]# mkdir /mongodb/data/config -p
[root@node3 ~]# mkdir -p /var/log/mongo
[root@node3 ~]# chown -R mongod.mongod /mongodb/data/ /var/log/mongo/


Configuring mongod


At startup, mongod reads its configuration either from command-line options or from a configuration file (such as /etc/mongod.conf). The two interfaces provide the same functionality, so administrators can choose whichever they prefer. To read options from a configuration file, start mongod with the --config or -f option pointing at the file.


Basic mongod configuration parameters


mongod's basic working properties can be set through a few common parameters.


fork={true|false}: whether to start mongod as a daemon; true means it moves itself into the background after starting.

bind_ip=IP: the IP address mongod listens on.

port=PORT: the port mongod listens on; the default is 27017.

quiet={true|false}: whether to run in quiet mode, minimizing the amount of logging; under normal conditions this should be true, and it should be set to false only while debugging.

dbpath=/PATH/TO/SOMEWHERE: where mongod stores its data, commonly /data/mongodb, /var/lib/mongodb, or /srv/mongodb.

logpath=/PATH/TO/SOMEFILE: the log file path, for example /var/log/mongodb/mongod.log; if no path is given, log messages go to standard output.

logappend={true|false}: whether mongod avoids overwriting the existing log file at startup; true means logs are appended rather than overwritten.

journal={true|false}: whether to enable journaling; journaling is the only way to guarantee the durability of writes for a mongod running in single-instance mode.
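Put together, the parameters above form a minimal standalone mongod configuration file. The paths and values below are illustrative examples, not part of the cluster built later:

```ini
fork = true
port = 27017
dbpath = /var/lib/mongodb
logpath = /var/log/mongodb/mongod.log
logappend = true
journal = true
quiet = true
```

Such a file is loaded with, for example, /usr/local/mongodb/bin/mongod -f /path/to/this/file.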




Security-related configuration parameters


A database server's security mechanisms guard against unauthorized access.


bind_ip=IP: the IP address mongod listens on; in production this usually needs to be an address that can accept requests from external clients, but only the addresses that actually need to accept client requests should be exposed; multiple addresses are separated by commas.

nounixsocket={true|false}: whether to disable MongoDB's Unix domain socket, which is enabled by default and used mainly for local communication.

auth={true|false}: whether to enable authentication; when enabled, remote clients must be authorized before they can access the server.
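For instance, the security options could be added to the same configuration file like this (the addresses are placeholders):

```ini
bind_ip = 192.168.30.116,127.0.0.1
nounixsocket = false
auth = true
```

With auth = true, an administrative user must be set up before remote clients can do anything useful.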




Configure the replica sets

[root@node1 ~]# mv /etc/mongod.conf /etc/mongod.conf.bak
[root@node1 ~]# vi /etc/mongod.conf
shardsvr = true
replSet = shard1
port = 27017
dbpath=/mongodb/data/shard11
oplogSize = 100
logpath=/var/log/mongo/shard11.log
logappend = true
fork = true
[root@node2 ~]# mv /etc/mongod.conf /etc/mongod.conf.bak
[root@node2 ~]# vi /etc/mongod.conf
shardsvr = true
replSet = shard1
port = 27017
dbpath=/mongodb/data/shard12
oplogSize = 100
logpath=/var/log/mongo/shard12.log
logappend = true
fork = true
[root@node3 ~]# mv /etc/mongod.conf /etc/mongod.conf.bak
[root@node3 ~]# vi /etc/mongod.conf
shardsvr = true
replSet = shard1
port = 27017
dbpath=/mongodb/data/shard13
oplogSize = 100
logpath=/var/log/mongo/shard13.log
logappend = true
fork = true


Start the replica sets

[root@node1 ~]# /usr/local/mongodb/bin/mongod -f /etc/mongod.conf
[root@node1 ~]# netstat -anptl | grep mong
tcp        0      0 0.0.0.0:28017               0.0.0.0:*                   LISTEN      5888/mongod
tcp        0      0 0.0.0.0:27017               0.0.0.0:*                   LISTEN      5888/mongod

Start shard1 on node2 and node3 in the same way.

Configure the replica set used for shard2:

[root@node1 ~]# vi /etc/mongod2.conf
shardsvr = true
replSet = shard2
port = 27018
dbpath=/mongodb/data/shard21
oplogSize = 100
logpath=/var/log/mongo/shard21.log
logappend = true
fork = true
[root@node2 ~]# vi /etc/mongod2.conf
shardsvr = true
replSet = shard2
port = 27018
dbpath=/mongodb/data/shard22
oplogSize = 100
logpath=/var/log/mongo/shard22.log
logappend = true
fork = true
[root@node3 ~]# vi /etc/mongod2.conf
shardsvr = true
replSet = shard2
port = 27018
dbpath=/mongodb/data/shard23
oplogSize = 100
logpath=/var/log/mongo/shard23.log
logappend = true
fork = true

Start shard2 on node1, node2, and node3:

[root@node1 ~]# /usr/local/mongodb/bin/mongod -f /etc/mongod2.conf
[root@node2 ~]# /usr/local/mongodb/bin/mongod -f /etc/mongod2.conf
[root@node3 ~]# /usr/local/mongodb/bin/mongod -f /etc/mongod2.conf

Connect to one member of each replica set and initiate the two sets:

[root@node1 ~]# mongo 192.168.30.116:27017
> config = {_id: 'shard1',members: [
                          {_id: 0, host: '192.168.30.116:27017'},
                          {_id: 1, host: '192.168.30.117:27017'},
                          {_id: 2, host: '192.168.30.119:27017'}]
}
> rs.initiate(config);
[root@node1 ~]# mongo 192.168.30.116:27018
> config = {_id: 'shard2',members: [
                          {_id: 0, host: '192.168.30.116:27018'},
                          {_id: 1, host: '192.168.30.117:27018'},
                          {_id: 2, host: '192.168.30.119:27018'}]
}
> rs.initiate(config);


At this point the two replica sets, i.e. the two shards, are ready.


Configure the three config servers:

[root@node1 ~]# vi /etc/config.conf
configsvr = true
port = 20000
dbpath=/mongodb/data/config
logpath=/var/log/mongo/config.log
logappend = true
fork = true
[root@node2 ~]# vi /etc/config.conf
configsvr = true
port = 20000
dbpath=/mongodb/data/config
logpath=/var/log/mongo/config.log
logappend = true
fork = true
[root@node3 ~]# vi /etc/config.conf
configsvr = true
port = 20000
dbpath=/mongodb/data/config
logpath=/var/log/mongo/config.log
logappend = true
fork = true
[root@node1 ~]# /usr/local/mongodb/bin/mongod -f /etc/config.conf
[root@node2 ~]# /usr/local/mongodb/bin/mongod -f /etc/config.conf
[root@node3 ~]# /usr/local/mongodb/bin/mongod -f /etc/config.conf

Configure and start mongos:

[root@node1 ~]# /usr/local/mongodb/bin/mongos --configdb 192.168.30.116:20000,192.168.30.117:20000,192.168.30.119:20000 --port 30000 --chunkSize 5 --logpath /mongodb/data/mongos.log --logappend --fork
[root@node2 ~]# /usr/local/mongodb/bin/mongos --configdb 192.168.30.116:20000,192.168.30.117:20000,192.168.30.119:20000 --port 30000 --chunkSize 5 --logpath /mongodb/data/mongos.log --logappend --fork
[root@node3 ~]# /usr/local/mongodb/bin/mongos --configdb 192.168.30.116:20000,192.168.30.117:20000,192.168.30.119:20000 --port 30000 --chunkSize 5 --logpath /mongodb/data/mongos.log --logappend --fork
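The long mongos command lines above can also be kept in a configuration file and launched with -f, mirroring the mongod setup. A sketch using the same values (the path /etc/mongos.conf is just an example):

```ini
configdb = 192.168.30.116:20000,192.168.30.117:20000,192.168.30.119:20000
port = 30000
chunkSize = 5
logpath = /mongodb/data/mongos.log
logappend = true
fork = true
```

Each node would then start its router with /usr/local/mongodb/bin/mongos -f /etc/mongos.conf.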

Check whether the mongod, config server, and mongos processes are running:

[root@node1 ~]# ss -antpl | grep mong
LISTEN     0      128           192.168.30.116:30000                    *:*      users:(("mongos",3627,14))
LISTEN     0      128           192.168.30.116:28017                    *:*      users:(("mongod",22036,6))
LISTEN     0      128           192.168.30.116:28018                    *:*      users:(("mongod",3287,8))
LISTEN     0      128           192.168.30.116:31000                    *:*      users:(("mongos",3627,13))
LISTEN     0      128           192.168.30.116:20000                    *:*      users:(("mongod",3487,6))
LISTEN     0      128           192.168.30.116:21000                    *:*      users:(("mongod",3487,7))
LISTEN     0      128           192.168.30.116:27017                    *:*      users:(("mongod",22036,7))
LISTEN     0      128           192.168.30.116:27018                    *:*      users:(("mongod",3287,7))

Everything has started successfully.

Connect to one of the mongos processes, switch to the admin database, and add the shards:

[root@node1 ~]# /usr/local/mongodb/bin/mongo 192.168.30.116:30000/admin
MongoDB shell version: 2.0.4
connecting to: 192.168.30.116:30000/admin
>db.runCommand( { addshard: "shard1/192.168.30.116:27017,192.168.30.117:27017,192.168.30.119:27017",name:"s1",maxsize:20480});
>db.runCommand( { addshard: "shard2/192.168.30.116:27018,192.168.30.117:27018,192.168.30.119:27018",name:"s2",maxsize:20480});

List the two shards that were added:

mongos> db.runCommand( { listshards : 1 } );
{
    "shards" : [
        {
            "_id" : "s1",
            "host" : "shard1/192.168.30.116:27017,192.168.30.117:27017,192.168.30.119:27017"
        },
        {
            "_id" : "s2",
            "host" : "shard2/192.168.30.116:27018,192.168.30.117:27018,192.168.30.119:27018"
        }
    ],
    "ok" : 1
}


Enable sharding for a database:

mongos> db.runCommand({enablesharding : "testdb"});
{ "ok" : 1 }


Specify the shard key for the collection:

mongos> db.runCommand({shardcollection :"testdb.user",key :{_id:1}});
{ "collectionsharded" : "testdb.user", "ok" : 1 }


Insert 200,000 documents to test whether the data is sharded:

mongos> use testdb
switched to db testdb
mongos> for(var i=1;i<=200000;i++) db.user.insert({age:i,name:"ljl",addr:"beijing",country:"china"})
mongos> db.user.stats()
{
    "sharded" : true,
    "flags" : 1,
    "ns" : "testdb.user",
    "count" : 70001,
    "numExtents" : 7,
    "size" : 6160096,
    "storageSize" : 11190272,
    "totalIndexSize" : 2289280,
    "indexSizes" : {
        "_id_" : 2289280
    },
    "avgObjSize" : 88.00011428408166,
    "nindexes" : 1,
    "nchunks" : 10,
    "shards" : {
        "s1" : {
            "ns" : "testdb.user",
            "count" : 0,
            "size" : 0,
            "storageSize" : 8192,
            "numExtents" : 1,
            "nindexes" : 1,
            "lastExtentSize" : 8192,
            "paddingFactor" : 1,
            "flags" : 1,
            "totalIndexSize" : 8176,
            "indexSizes" : {
                "_id_" : 8176
            },
            "ok" : 1
        },
        "s2" : {
            "ns" : "testdb.user",
            "count" : 70001,
            "size" : 6160096,
            "avgObjSize" : 88.00011428408166,
            "storageSize" : 11182080,
            "numExtents" : 6,
            "nindexes" : 1,
            "lastExtentSize" : 8388608,
            "paddingFactor" : 1,
            "flags" : 1,
            "totalIndexSize" : 2281104,
            "indexSizes" : {
                "_id_" : 2281104
            },
            "ok" : 1
        }
    },
    "ok" : 1
}


The stats confirm the collection is sharded ("sharded" : true, with 10 chunks across the two shards); the per-shard counts are a snapshot taken while the balancer was still migrating chunks between s2 and s1.

Next, simulate automatic failover by shutting down the shard1 member on node1:

[root@node1 ~]# /usr/local/mongodb/bin/mongod --shutdown --shardsvr --replSet shard1 --port 27017 --dbpath /mongodb/data/shard11
killing process with pid: 22036
[root@node1 ~]# /usr/local/mongodb/bin/mongo 192.168.30.116:30000/testdb
MongoDB shell version: 2.0.4
connecting to: 192.168.30.116:30000/testdb
mongos> db.user.find()
{ "_id" : ObjectId("5338092034594e87a8cb007b"), "age" : 1, "name" : "ljl", "addr" : "beijing", "country" : "china" }
{ "_id" : ObjectId("5338092034594e87a8cb007c"), "age" : 2, "name" : "ljl", "addr" : "beijing", "country" : "china" }
{ "_id" : ObjectId("5338092034594e87a8cb007d"), "age" : 3, "name" : "ljl", "addr" : "beijing", "country" : "china" }
{ "_id" : ObjectId("5338092034594e87a8cb007e"), "age" : 4, "name" : "ljl", "addr" : "beijing", "country" : "china" }
{ "_id" : ObjectId("5338092034594e87a8cb007f"), "age" : 5, "name" : "ljl", "addr" : "beijing", "country" : "china" }
{ "_id" : ObjectId("5338092034594e87a8cb0080"), "age" : 6, "name" : "ljl", "addr" : "beijing", "country" : "china" }
{ "_id" : ObjectId("5338092034594e87a8cb0081"), "age" : 7, "name" : "ljl", "addr" : "beijing", "country" : "china" }
{ "_id" : ObjectId("5338092034594e87a8cb0082"), "age" : 8, "name" : "ljl", "addr" : "beijing", "country" : "china" }
{ "_id" : ObjectId("5338092034594e87a8cb0083"), "age" : 9, "name" : "ljl", "addr" : "beijing", "country" : "china" }
{ "_id" : ObjectId("5338092034594e87a8cb0084"), "age" : 10, "name" : "ljl", "addr" : "beijing", "country" : "china" }
{ "_id" : ObjectId("5338092034594e87a8cb0085"), "age" : 11, "name" : "ljl", "addr" : "beijing", "country" : "china" }
{ "_id" : ObjectId("5338092034594e87a8cb0086"), "age" : 12, "name" : "ljl", "addr" : "beijing", "country" : "china" }
{ "_id" : ObjectId("5338092034594e87a8cb0087"), "age" : 13, "name" : "ljl", "addr" : "beijing", "country" : "china" }
{ "_id" : ObjectId("5338092034594e87a8cb0088"), "age" : 14, "name" : "ljl", "addr" : "beijing", "country" : "china" }
{ "_id" : ObjectId("5338092034594e87a8cb0089"), "age" : 15, "name" : "ljl", "addr" : "beijing", "country" : "china" }
{ "_id" : ObjectId("5338092034594e87a8cb008a"), "age" : 16, "name" : "ljl", "addr" : "beijing", "country" : "china" }
{ "_id" : ObjectId("5338092034594e87a8cb008b"), "age" : 17, "name" : "ljl", "addr" : "beijing", "country" : "china" }
{ "_id" : ObjectId("5338092034594e87a8cb008c"), "age" : 18, "name" : "ljl", "addr" : "beijing", "country" : "china" }
{ "_id" : ObjectId("5338092034594e87a8cb008d"), "age" : 19, "name" : "ljl", "addr" : "beijing", "country" : "china" }
{ "_id" : ObjectId("5338092034594e87a8cb008e"), "age" : 20, "name" : "ljl", "addr" : "beijing", "country" : "china" }
has more
mongos>

The query still returns data normally, so the automatic failover worked.


A few things to keep in mind when sharding:

Criteria for choosing a sharding key:

Where should the data be stored?

Where will the desired data be read from?


Basic rules:

the sharding key should be a primary key;

the sharding key should, as far as possible, avoid forcing queries across shards.
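The second rule, and the hot-spot risk of a monotonically increasing shard key such as the auto-generated _id used above, can be illustrated with a toy simulation (plain Python, not MongoDB code; the split point and hash scheme are made up):

```python
import hashlib

NUM_SHARDS = 2

def range_shard(key):
    # Range partitioning with a single hypothetical split point at 100000.
    return 0 if key < 100000 else 1

def hashed_shard(key):
    # Hash partitioning: hash the key, then take it modulo the shard count.
    digest = hashlib.md5(str(key).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

# 10,000 monotonically increasing keys, as produced by an ever-growing _id.
keys = range(100000, 110000)
range_hits = [range_shard(k) for k in keys]
hash_hits = [hashed_shard(k) for k in keys]

print(sum(range_hits))                   # 10000: every insert lands on shard 1
print(sum(hash_hits) / len(hash_hits))   # roughly 0.5: writes are spread evenly
```

MongoDB 2.4 and later support hashed shard keys natively; on the 2.0 series used here, the practical alternative is picking a key whose values are naturally spread out.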
