Our Ceph test environment has been in use inside the company for a while, mainly to create virtual machines on RBD block devices and to allocate extra block storage to them, and it has been quite stable. Most of the current configuration, however, is still Ceph's defaults; the only change is that the journal has been split out onto a separate partition. The plan now is to use Ceph tiering and SSDs for some optimization:
1. Write the journal to a dedicated SSD disk.
2. Build an SSD-backed pool and use it as a cache in front of the other pools, which requires Ceph tiering (see the sketch right after this list).
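As a rough idea of what step 2 involves, here is a minimal sketch of the cache-tier wiring using the standard ceph osd tier subcommands (these require a Ceph release that ships cache tiering, i.e. Firefly or newer). The pairing of pool names is my assumption: I reuse the sas pool as the backing pool and the ssd pool as the cache, matching the pools created later in this post.

# assumption: 'sas' is the backing pool, 'ssd' is the SSD-backed cache pool
$ ceph osd tier add sas ssd                  # attach the cache pool to the backing pool
$ ceph osd tier cache-mode ssd writeback     # absorb writes in the cache, flush to 'sas' later
$ ceph osd tier set-overlay sas ssd          # redirect client I/O for 'sas' through 'ssd'
$ ceph osd pool set ssd hit_set_type bloom   # track object hits so the tiering agent can flush/evict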
A quick web search turned up no articles describing this setup in practice, nor any numbers on how much performance it actually gains, so once the plan is implemented I will benchmark the following stages:
1. Ceph with the default installation.
2. Journal moved to a separate partition on an ordinary hard disk.
3. Journal moved to a dedicated SSD (see the journal-move sketch after this list).
4. With the SSD pool added.
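For stages 2 and 3, relocating an existing OSD's journal roughly follows the sketch below. The OSD id, the device path /dev/sdb1 and the use of the rbd pool for benchmarking are placeholders, and the init commands vary by distro and Ceph version; rados bench is just one simple way to get before/after numbers for each stage.

# hypothetical example for osd.0; /dev/sdb1 stands in for the new journal partition
$ service ceph stop osd.0                  # stop the OSD before touching its journal
$ ceph-osd -i 0 --flush-journal            # flush pending journal entries to the object store
# point the OSD at the new journal, e.g. in ceph.conf:
#   [osd.0]
#   osd journal = /dev/sdb1
$ ceph-osd -i 0 --mkjournal                # create the journal on the new partition
$ service ceph start osd.0

# rough before/after comparison for each stage
$ rados bench -p rbd 60 write --no-cleanup
$ rados bench -p rbd 60 seq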
For the CRUSH configuration, see this article: http://www.sebastien-han.fr/blog/2012/12/07/ceph-2-speed-storage-with-crush/
Roughly speaking, your infrastructure could be based on several types of servers:
Such a handy mechanism is possible with the help of the CRUSH map.
CRUSH stands for Controlled Replication Under Scalable Hashing:
For more details check the Ceph Official documentation.
What are we going to do?
Grab your current CRUSH map:
$ ceph osd getcrushmap -o ma-crush-map
$ crushtool -d ma-crush-map -o ma-crush-map.txt
For the sake of simplicity, let’s assume that you have 4 OSDs:
And here is the OSD tree:
$ ceph osd tree
dumped osdmap tree epoch 621
# id    weight  type name       up/down reweight
-1      12      pool default
-3      12              rack le-rack
-2      3                       host ceph-01
0       1                               osd.0   up      1
1       1                               osd.1   up      1
-4      3                       host ceph-02
2       1                               osd.2   up      1
3       1                               osd.3   up      1
Edit your CRUSH map:
# begin crush map

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3

# types
type 0 osd
type 1 host
type 2 rack
type 3 row
type 4 room
type 5 datacenter
type 6 pool

# buckets
host ceph-01 {
        id -2           # do not change unnecessarily
        # weight 3.000
        alg straw
        hash 0  # rjenkins1
        item osd.0 weight 1.000
        item osd.1 weight 1.000
}
host ceph-02 {
        id -4           # do not change unnecessarily
        # weight 3.000
        alg straw
        hash 0  # rjenkins1
        item osd.2 weight 1.000
        item osd.3 weight 1.000
}
rack le-rack {
        id -3           # do not change unnecessarily
        # weight 12.000
        alg straw
        hash 0  # rjenkins1
        item ceph-01 weight 2.000
        item ceph-02 weight 2.000
}
pool default {
        id -1           # do not change unnecessarily
        # weight 12.000
        alg straw
        hash 0  # rjenkins1
        item le-rack weight 4.000
}

# rules
rule data {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
rule metadata {
        ruleset 1
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
rule rbd {
        ruleset 2
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}

# end crush map
Now we have to add 2 new specific rules:
Add a bucket for the pool SSD:
pool ssd {
        id -5           # do not change unnecessarily
        alg straw
        hash 0  # rjenkins1
        item osd.0 weight 1.000
        item osd.1 weight 1.000
}
Add a rule for the newly created bucket:
rule ssd {
        ruleset 3
        type replicated
        min_size 1
        max_size 10
        step take ssd
        step choose firstn 0 type host
        step emit
}
Add a bucket for the pool SAS:
pool sas {
        id -6           # do not change unnecessarily
        alg straw
        hash 0  # rjenkins1
        item osd.2 weight 1.000
        item osd.3 weight 1.000
}
Add a rule for the newly created bucket:
rule sas {
        ruleset 4
        type replicated
        min_size 1
        max_size 10
        step take sas
        step choose firstn 0 type host
        step emit
}
Finally, recompile and inject the new CRUSH map:
$ crushtool -c ma-crush-map.txt -o ma-nouvelle-crush-map
$ ceph osd setcrushmap -i ma-nouvelle-crush-map
Create your 2 new pools:
$ rados mkpool ssd
successfully created pool ssd
$ rados mkpool sas
successfully created pool sas
Assign the rulesets to the pools:
ceph osd pool set ssd crush_ruleset 3
ceph osd pool set sas crush_ruleset 4
Check that the changes have been applied successfully:
$ ceph osd dump | grep -E 'ssd|sas'
pool 3 'ssd' rep size 2 crush_ruleset 3 object_hash rjenkins pg_num 128 pgp_num 128 last_change 21 owner 0
pool 4 'sas' rep size 2 crush_ruleset 4 object_hash rjenkins pg_num 128 pgp_num 128 last_change 23 owner 0
Just create some random files and put them into your object store:
$ dd if=/dev/zero of=ssd.pool bs=1M count=512 conv=fsync
$ dd if=/dev/zero of=sas.pool bs=1M count=512 conv=fsync
$ rados -p ssd put ssd.pool.object ssd.pool
$ rados -p sas put sas.pool.object sas.pool
Where are the PGs active?
$ ceph osd map ssd ssd.pool.object
osdmap e260 pool 'ssd' (3) object 'ssd.pool.object' -> pg 3.c5034eb8 (3.0) -> up [1,0] acting [1,0]
$ ceph osd map sas sas.pool.object
osdmap e260 pool 'sas' (4) object 'sas.pool.object' -> pg 4.9202e7ee (4.0) -> up [3,2] acting [3,2]
CRUSH rules! As you can see from this article, CRUSH allows you to perform amazing things. The CRUSH map can become very complex, but it brings a lot of flexibility! Happy CRUSH mapping ;-)