Integrated Deployment Based on Taobao's Open-Source Tair Distributed KV Storage Engine

1. Preface

  Tair backs the caching of almost all of Taobao's systems (Tair = Taobao Pair, where a Pair is a Key-Value pair). It ships with three storage engines: mdb (the default, similar to Memcache), rdb (similar to Redis), and ldb (a high-performance KV store). The first two are positioned as caches, while ldb targets persistent storage. Tair is a distributed system composed of a central control node (Config Server) and a set of service nodes (Data Servers). The Config Server manages and maintains the state of all Data Servers; each Data Server provides the actual data services and reports its own status to the Config Server via heartbeats. The Config Server is a lightweight control point and can run as a Master-Slave pair for reliability; all Data Servers are peers of equal standing. Persistent data lives on disk, and to guard against data loss from disk failure, Tair lets you configure the number of replicas and automatically places the different replicas of a piece of data on different hosts.

  This article records the detailed deployment steps (based on the trunk version) and reproduces an official architecture diagram; for more background on Tair, see the official wiki (http://code.taobao.org/p/tair/wiki/intro/).

    

 

2. Installing the Tair Server

  1. System environment: CentOS 6.5 (64-bit)

  2. The Tair server is written in C++; the steps below build it from source and make install it in a Linux environment.

  3. Install the dependency packages (thanks to the almighty yum); a combined command follows this list.

    1) The wiki says the automake, autoconf, and libtool packages are required. CentOS 6.5 already ships with them; if any are missing, install them via yum.

      Command: yum install libtool

    2) Install the boost-devel package.

      Command: yum install boost-devel

    3) Install gcc-c++.

      Command: yum install gcc-c++
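
    If you prefer a single step, the three packages can be installed in one command; this is just a convenience and assumes a stock CentOS 6.5 with yum configured:

# install all build dependencies at once (same packages as the individual commands above)
yum install -y libtool boost-devel gcc-c++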

  4. Fetch the tb-common-utils and Tair source code via SVN:

    SVN URL for tb-common-utils:  http://code.taobao.org/svn/tb-common-utils/trunk

    SVN URL for Tair:  http://code.taobao.org/svn/tair/trunk
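
    A minimal checkout sketch, assuming svn is installed and you are in the directory where the source should live:

# fetch both projects from the Taobao code repository
svn checkout http://code.taobao.org/svn/tb-common-utils/trunk tb-common-utils
svn checkout http://code.taobao.org/svn/tair/trunk tair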

  5. Copy the checked-out source into the current user's home directory and manually create a folder named "tairlib" to serve as the install path for the supporting libraries; the resulting layout looks like this:

    

  6. Build and install the tb-common-utils library; judging by the name, it is a common utility library developed internally at Taobao. The individual steps are listed below, followed by a consolidated sketch.

    1) In a terminal, enter the tb-common-utils folder and switch to the root account.

    2) Add the executable (x) permission to the build.sh script, otherwise you may get a "permission denied" error.

       Command: chmod +x build.sh

    3) Create the environment variable TBLIB_ROOT to tell the build where to install the supporting libraries, pointing it at the tairlib folder created earlier.

       Note that this variable is session-scoped; if you close the terminal in between, you will need to export it again.

       Command: export TBLIB_ROOT="/home/glf/tairlib"

    4) Run the build.sh script.

      

    5) With that, the tb-common-utils library is installed.
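
    Put together, the whole step looks roughly like this (a sketch assuming the source sits under /home/glf and you are already root):

cd /home/glf/tb-common-utils
chmod +x build.sh                        # make the build script executable
export TBLIB_ROOT="/home/glf/tairlib"    # session-scoped install prefix for the helper libraries
./build.sh                               # builds and installs the libraries into $TBLIB_ROOT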

      

  7. Build and install Tair (the sub-steps below are summarized in a sketch at the end).

    1) Change into the tair folder. If this is a freshly opened terminal window, switch to the root account again and recreate the TBLIB_ROOT environment variable with the same command as in step 6.3); the Tair build needs it as well.

    2) Add the executable (x) permission to the bootstrap.sh script, otherwise you may get a "permission denied" error.

      Command: chmod +x bootstrap.sh

    3) Run the bootstrap.sh script.

       

    4) Run configure.

    5) Run make.

    6) Run make install. On success Tair is installed into /root/tair_bin, and the server installation is complete.

       Because the root account was used, the install directory ends up under /root; installing as the current (non-root) user runs into permission problems during installation. If you know how to work around this, please let me know.
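
      The whole sequence as a sketch, under the same assumptions (root account, source under /home/glf, TBLIB_ROOT exported in the current session):

cd /home/glf/tair
export TBLIB_ROOT="/home/glf/tairlib"   # required again if this is a new terminal session
chmod +x bootstrap.sh
./bootstrap.sh                          # generate the configure script
./configure                             # configure the build
make                                    # compile
make install                            # installs into /root/tair_bin when run as root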

      

 

3. Configuring the Tair Servers

  1. Because real network environments get complicated, this article simulates a distributed cluster with different ports on a single machine; for a multi-machine deployment the ports can stay the same and only the IPs need to change.

 Name                     IP          Port
 Config Server (Master)   10.0.2.15   5198
 Config Server (Slave)    10.0.2.15   5200
 Data Server A            10.0.2.15   5191
 Data Server B            10.0.2.15   5192

    For how to set the IPs and ports, see the conf files below. Note that a Config Server's heartbeat port is Port+1:

    for example, with Port=5198 the heartbeat port defaults to 5199, so leave room when assigning the other ports and avoid collisions.

  2. Make 4 copies of the tair_bin folder and rename them as shown below. All subsequent configuration is done in these 4 copies; the original tair_bin is kept untouched (it took quite some effort to build :) ).

    tair_bin_cs1: the Config Server (Master) directory

    tair_bin_cs2: the Config Server (Slave) directory

    tair_bin_ds1: the Data Server A directory

    tair_bin_ds2: the Data Server B directory

    

 

  3. In each of the 4 copied directories, create (mkdir) data and logs folders to back the paths set in the configuration files (it is not certain this is strictly required; the services may create the directories from the conf paths automatically at startup). A sketch of steps 2 and 3 follows.
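
    A sketch of the copy-and-prepare step, assuming the original install lives in /root/tair_bin:

cd /root
for d in tair_bin_cs1 tair_bin_cs2 tair_bin_ds1 tair_bin_ds2; do
  cp -r tair_bin "$d"              # one copy per server role
  mkdir -p "$d/data" "$d/logs"     # directories referenced in the conf files
done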

  4. The etc directory of every server contains the following files (sample files created at install time); each file already explains its configuration options,

     and the wiki documents them as well: http://code.taobao.org/p/tair/wiki/deploy

     "configserver.conf.default"   (used by the Config Server)

     "dataserver.conf.default"     (used by the Data Server)

     "group.conf.default"          (used by the Config Server)

     "invalserver.conf.default"    (not used here)

  5. Configure the Config Servers

    1) In the etc directories of tair_bin_cs1 and tair_bin_cs2, rename "configserver.conf.default" to "configserver.conf" and "group.conf.default" to "group.conf"; these become the servers' live configuration files, as in the sketch below.
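
    A minimal sketch of the renaming (use cp instead of mv if you want to keep the sample files around):

for d in /root/tair_bin_cs1 /root/tair_bin_cs2; do
  mv "$d/etc/configserver.conf.default" "$d/etc/configserver.conf"
  mv "$d/etc/group.conf.default"        "$d/etc/group.conf"
done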

    2) Open tair_bin_cs1/etc/configserver.conf and configure it along the lines of the snippet below; the first config_server line is the master and the second is the slave.

      Change log_file, pid_file, group_file, and data_dir to absolute paths, and use the ifconfig command to find the dev_name and IP of the active network interface; the modified values are the ones shown in the snippet below.

      (A side note: dev_name defaults to eth0, but my machine's interface is actually eth1, which had me stuck for quite a while.)

#
# tair 2.3 --- configserver config
#

[public]
config_server=10.0.2.15:5198
config_server=10.0.2.15:5200

[configserver]
port=5198
log_file=/root/tair_bin_cs1/logs/config.log
pid_file=/root/tair_bin_cs1/logs/config.pid
log_level=warn
group_file=/root/tair_bin_cs1/etc/group.conf
data_dir=/root/tair_bin_cs1/data/data
dev_name=eth1

    3) Configure tair_bin_cs1/etc/group.conf along the lines of the snippet below; it mainly registers the IPs and ports of the Data Servers.

#group name
[group_1]
# data move is 1 means when some data serve down, the migrating will be start. 
# default value is 0
_data_move=0
#_min_data_server_count: when data servers left in a group less than this value, config server will stop serve for this group
#default value is copy count.
_min_data_server_count=1
#_plugIns_list=libStaticPlugIn.so
_build_strategy=1 #1 normal 2 rack 
_build_diff_ratio=0.6 #how much difference is allowd between different rack 
# diff_ratio =  |data_sever_count_in_rack1 - data_server_count_in_rack2| / max (data_sever_count_in_rack1, data_server_count_in_rack2)
# diff_ration must less than _build_diff_ratio
_pos_mask=65535  # 65535 is 0xffff  this will be used to gernerate rack info. 64 bit serverId & _pos_mask is the rack info, 
_copy_count=1
_bucket_number=1023
# accept ds strategy. 1 means accept ds automatically
_accept_strategy=1

# data center A
_server_list=10.0.2.15:5191
_server_list=10.0.2.15:5192

#quota info
_areaCapacity_list=0,1124000;

    4) The Config Server (Slave) is configured almost identically; just remember to adjust the paths and IPs in its conf files.

       configserver.conf, for reference:

#
# tair 2.3 --- configserver config
#

[public]
config_server=10.0.2.15:5198
config_server=10.0.2.15:5200
 
[configserver]
port=5200
log_file=/root/tair_bin_cs2/logs/config.log
pid_file=/root/tair_bin_cs2/logs/config.pid
log_level=warn
group_file=/root/tair_bin_cs2/etc/group.conf
data_dir=/root/tair_bin_cs2/data/data
dev_name=eth1

       group.conf, for reference:

#group name
[group_1]
# data move is 1 means when some data serve down, the migrating will be start. 
# default value is 0
_data_move=0
#_min_data_server_count: when data servers left in a group less than this value, config server will stop serve for this group
#default value is copy count.
_min_data_server_count=1
#_plugIns_list=libStaticPlugIn.so
_build_strategy=1 #1 normal 2 rack 
_build_diff_ratio=0.6 #how much difference is allowd between different rack 
# diff_ratio =  |data_sever_count_in_rack1 - data_server_count_in_rack2| / max (data_sever_count_in_rack1, data_server_count_in_rack2)
# diff_ration must less than _build_diff_ratio
_pos_mask=65535  # 65535 is 0xffff  this will be used to gernerate rack info. 64 bit serverId & _pos_mask is the rack info, 
_copy_count=1    
_bucket_number=1023
# accept ds strategy. 1 means accept ds automatically
_accept_strategy=1

# data center A
_server_list=10.0.2.15:5191
_server_list=10.0.2.15:5192 

#quota info
_areaCapacity_list=0,1124000;

    5) With that, both Config Servers are configured.

  6. Configure the Data Servers (mdb engine by default)

    1) In the etc directories of the 2 Data Servers, rename "dataserver.conf.default" to "dataserver.conf" to serve as the live configuration file.

    2) Open tair_bin_ds1/etc/dataserver.conf and adjust it along the lines of the snippet below; note that the two config_server lines in the [public] section must match the Config Server configuration exactly.

#
#  tair 2.3 --- tairserver config 
#

[public]
config_server=10.0.2.15:5198
config_server=10.0.2.15:5200

[tairserver]
#
#storage_engine:
#
# mdb 
# kdb
# ldb
#
storage_engine=mdb
local_mode=0
#
#mdb_type:
# mdb
# mdb_shm
#
mdb_type=mdb_shm

#
# if you just run 1 tairserver on a computer, you may ignore this option.
# if you want to run more than 1 tairserver on a computer, each tairserver must have their own "mdb_shm_path"
#
#
mdb_shm_path=/mdb_shm_path01

#tairserver listen port
port=5191
heartbeat_port=6191

process_thread_num=16
#
#mdb size in MB
#
slab_mem_size=1024
log_file=/root/tair_bin_ds1/logs/server.log
pid_file=/root/tair_bin_ds1/logs/server.pid
log_level=warn
dev_name=eth1
ulog_dir=/root/tair_bin_ds1/data/ulog
ulog_file_number=3
ulog_file_size=64
check_expired_hour_range=2-4
check_slab_hour_range=5-7
dup_sync=1

do_rsync=0
# much resemble json format
# one local cluster config and one or multi remote cluster config.
# {local:[master_cs_addr,slave_cs_addr,group_name,timeout_ms,queue_limit],remote:[...],remote:[...]}
rsync_conf={local:[10.0.0.1:5198,10.0.0.2:5198,group_local,2000,1000],remote:[10.0.1.1:5198,10.0.1.2:5198,group_remote,2000,3000]}
# if same data can be updated in local and remote cluster, then we need care modify time to
# reserve latest update when do rsync to each other.
rsync_mtime_care=0
# rsync data directory(retry_log/fail_log..)
rsync_data_dir=/root/tair_bin_ds1/data/remote
# max log file size to record failed rsync data, rotate to a new file when over the limit
rsync_fail_log_size=30000000
# whether do retry when rsync failed at first time
rsync_do_retry=0
# when doing retry,  size limit of retry log's memory use
rsync_retry_log_mem_size=100000000

[fdb]
# in MB
index_mmap_size=30
cache_size=256
bucket_size=10223
free_block_pool_size=8
data_dir=/root/tair_bin_ds1/data/fdb
fdb_name=tair_fdb

[kdb]
# in byte
map_size=10485760      # the size of the internal memory-mapped region
bucket_size=1048583    # the number of buckets of the hash table
record_align=128       # the power of the alignment of record size
data_dir=/root/tair_bin_ds1/data/kdb      # the directory of kdb's data

[ldb]
#### ldb manager config
## data dir prefix, db path will be data/ldbxx, "xx" means db instance index.
## so if ldb_db_instance_count = 2, then leveldb will init in
## /data/ldb1/ldb/, /data/ldb2/ldb/. We can mount each disk to
## data/ldb1, data/ldb2, so we can init each instance on each disk.
data_dir=/root/tair_bin_ds1/data/ldb
## leveldb instance count, buckets will be well-distributed to instances
ldb_db_instance_count=1
## whether load backup version when startup.
## backup version may be created to maintain some db data of specifid version.
ldb_load_backup_version=0
## whether support version strategy.
## if yes, put will do get operation to update existed items's meta info(version .etc),
## get unexist item is expensive for leveldb. set 0 to disable if nobody even care version stuff.
ldb_db_version_care=1
## time range to compact for gc, 1-1 means do no compaction at all
ldb_compact_gc_range = 3-6
## backgroud task check compact interval (s)
ldb_check_compact_interval = 120
## use cache count, 0 means NOT use cache,`ldb_use_cache_count should NOT be larger
## than `ldb_db_instance_count, and better to be a factor of `ldb_db_instance_count.
## each cache mdb's config depends on mdb's config item(mdb_type, slab_mem_size, etc)
ldb_use_cache_count=1
## cache stat can't report configserver, record stat locally, stat file size.
## file will be rotate when file size is over this.
ldb_cache_stat_file_size=20971520
## migrate item batch size one time (1M)
ldb_migrate_batch_size = 3145728
## migrate item batch count.
## real batch migrate items depends on the smaller size/count
ldb_migrate_batch_count = 5000
## comparator_type bitcmp by default
# ldb_comparator_type=numeric
## numeric comparator: special compare method for user_key sorting in order to reducing compact
## parameters for numeric compare. format: [meta][prefix][delimiter][number][suffix] 
## skip meta size in compare
# ldb_userkey_skip_meta_size=2
## delimiter between prefix and number 
# ldb_userkey_num_delimiter=:
####
## use blommfilter
ldb_use_bloomfilter=1
## use mmap to speed up random acess file(sstable),may cost much memory
ldb_use_mmap_random_access=0
## how many highest levels to limit compaction
ldb_limit_compact_level_count=0
## limit compaction ratio: allow doing one compaction every ldb_limit_compact_interval
## 0 means limit all compaction
ldb_limit_compact_count_interval=0
## limit compaction time interval
## 0 means limit all compaction
ldb_limit_compact_time_interval=0
## limit compaction time range, start == end means doing limit the whole day.
ldb_limit_compact_time_range=6-1
## limit delete obsolete files when finishing one compaction
ldb_limit_delete_obsolete_file_interval=5
## whether trigger compaction by seek
ldb_do_seek_compaction=0
## whether split mmt when compaction with user-define logic(bucket range, eg) 
ldb_do_split_mmt_compaction=0

#### following config effects on FastDump ####
## when ldb_db_instance_count > 1, bucket will be sharded to instance base on config strategy.
## current supported:
##  hash : just do integer hash to bucket number then module to instance, instance's balance may be
##         not perfect in small buckets set. same bucket will be sharded to same instance
##         all the time, so data will be reused even if buckets owned by server changed(maybe cluster has changed),
##  map  : handle to get better balance among all instances. same bucket may be sharded to different instance based
##         on different buckets set(data will be migrated among instances).
ldb_bucket_index_to_instance_strategy=map
## bucket index can be updated. this is useful if the cluster wouldn't change once started
## even server down/up accidently.
ldb_bucket_index_can_update=1
## strategy map will save bucket index statistics into file, this is the file's directory
ldb_bucket_index_file_dir=/root/tair_bin_ds1/data/bindex
## memory usage for memtable sharded by bucket when batch-put(especially for FastDump)
ldb_max_mem_usage_for_memtable=3221225472
####

#### leveldb config (Warning: you should know what you're doing.)
## one leveldb instance max open files(actually table_cache_ capacity, consider as working set, see `ldb_table_cache_size)
ldb_max_open_files=65535
## whether return fail when occure fail when init/load db, and
## if true, read data when compactiong will verify checksum
ldb_paranoid_check=0
## memtable size
ldb_write_buffer_size=67108864
## sstable size
ldb_target_file_size=8388608
## max file size in each level. level-n (n > 0): (n - 1) * 10 * ldb_base_level_size
ldb_base_level_size=134217728
## sstable's block size
# ldb_block_size=4096
## sstable cache size (override `ldb_max_open_files)
ldb_table_cache_size=1073741824
##block cache size
ldb_block_cache_size=16777216
## arena used by memtable, arena block size
#ldb_arenablock_size=4096
## key is prefix-compressed period in block,
## this is period length(how many keys will be prefix-compressed period)
# ldb_block_restart_interval=16
## specifid compression method (snappy only now)
# ldb_compression=1
## compact when sstables count in level-0 is over this trigger
ldb_l0_compaction_trigger=1
## write will slow down when sstables count in level-0 is over this trigger
## or sstables' filesize in level-0 is over trigger * ldb_write_buffer_size if ldb_l0_limit_write_with_count=0
ldb_l0_slowdown_write_trigger=32
## write will stop(wait until trigger down)
ldb_l0_stop_write_trigger=64
## when write memtable, max level to below maybe
ldb_max_memcompact_level=3
## read verify checksum
ldb_read_verify_checksums=0
## write sync log. (one write will sync log once, expensive)
ldb_write_sync=0
## bits per key when use bloom filter
#ldb_bloomfilter_bits_per_key=10
## filter data base logarithm. filterbasesize=1<<ldb_filter_base_logarithm
#ldb_filter_base_logarithm=12

    3) The other Data Server's configuration is almost identical; see the snippet below.

       Note: because multiple Data Servers run on one machine, mdb_shm_path must be changed to /mdb_shm_path02. (PS: I missed this at first, which caused the other dataserver to shut down automatically every time I started one.)

#
#  tair 2.3 --- tairserver config 
#

[public]
config_server=10.0.2.15:5198
config_server=10.0.2.15:5200

[tairserver]
#
#storage_engine:
#
# mdb 
# kdb
# ldb
#
storage_engine=mdb
local_mode=0
#
#mdb_type:
# mdb
# mdb_shm
#
mdb_type=mdb_shm

#
# if you just run 1 tairserver on a computer, you may ignore this option.
# if you want to run more than 1 tairserver on a computer, each tairserver must have their own "mdb_shm_path"
#
#
mdb_shm_path=/mdb_shm_path02

#tairserver listen port
port=5192
heartbeat_port=6192

process_thread_num=16
#
#mdb size in MB
#
slab_mem_size=1024
log_file=/root/tair_bin_ds2/logs/server.log
pid_file=/root/tair_bin_ds2/logs/server.pid
log_level=warn
dev_name=eth1
ulog_dir=/root/tair_bin_ds2/data/ulog
ulog_file_number=3
ulog_file_size=64
check_expired_hour_range=2-4
check_slab_hour_range=5-7
dup_sync=1

do_rsync=0
# much resemble json format
# one local cluster config and one or multi remote cluster config.
# {local:[master_cs_addr,slave_cs_addr,group_name,timeout_ms,queue_limit],remote:[...],remote:[...]}
rsync_conf={local:[10.0.0.1:5198,10.0.0.2:5198,group_local,2000,1000],remote:[10.0.1.1:5198,10.0.1.2:5198,group_remote,2000,3000]}
# if same data can be updated in local and remote cluster, then we need care modify time to
# reserve latest update when do rsync to each other.
rsync_mtime_care=0
# rsync data directory(retry_log/fail_log..)
rsync_data_dir=/root/tair_bin_ds2/data/remote
# max log file size to record failed rsync data, rotate to a new file when over the limit
rsync_fail_log_size=30000000
# whether do retry when rsync failed at first time
rsync_do_retry=0
# when doing retry,  size limit of retry log's memory use
rsync_retry_log_mem_size=100000000

[fdb]
# in MB
index_mmap_size=30
cache_size=256
bucket_size=10223
free_block_pool_size=8
data_dir=/root/tair_bin_ds2/data/fdb
fdb_name=tair_fdb

[kdb]
# in byte
map_size=10485760      # the size of the internal memory-mapped region
bucket_size=1048583    # the number of buckets of the hash table
record_align=128       # the power of the alignment of record size
data_dir=/root/tair_bin_ds2/data/kdb      # the directory of kdb's data

[ldb]
#### ldb manager config
## data dir prefix, db path will be data/ldbxx, "xx" means db instance index.
## so if ldb_db_instance_count = 2, then leveldb will init in
## /data/ldb1/ldb/, /data/ldb2/ldb/. We can mount each disk to
## data/ldb1, data/ldb2, so we can init each instance on each disk.
data_dir=/root/tair_bin_ds2/data/ldb
## leveldb instance count, buckets will be well-distributed to instances
ldb_db_instance_count=1
## whether load backup version when startup.
## backup version may be created to maintain some db data of specifid version.
ldb_load_backup_version=0
## whether support version strategy.
## if yes, put will do get operation to update existed items's meta info(version .etc),
## get unexist item is expensive for leveldb. set 0 to disable if nobody even care version stuff.
ldb_db_version_care=1
## time range to compact for gc, 1-1 means do no compaction at all
ldb_compact_gc_range = 3-6
## backgroud task check compact interval (s)
ldb_check_compact_interval = 120
## use cache count, 0 means NOT use cache,`ldb_use_cache_count should NOT be larger
## than `ldb_db_instance_count, and better to be a factor of `ldb_db_instance_count.
## each cache mdb's config depends on mdb's config item(mdb_type, slab_mem_size, etc)
ldb_use_cache_count=1
## cache stat can't report configserver, record stat locally, stat file size.
## file will be rotate when file size is over this.
ldb_cache_stat_file_size=20971520
## migrate item batch size one time (1M)
ldb_migrate_batch_size = 3145728
## migrate item batch count.
## real batch migrate items depends on the smaller size/count
ldb_migrate_batch_count = 5000
## comparator_type bitcmp by default
# ldb_comparator_type=numeric
## numeric comparator: special compare method for user_key sorting in order to reducing compact
## parameters for numeric compare. format: [meta][prefix][delimiter][number][suffix] 
## skip meta size in compare
# ldb_userkey_skip_meta_size=2
## delimiter between prefix and number 
# ldb_userkey_num_delimiter=:
####
## use blommfilter
ldb_use_bloomfilter=1
## use mmap to speed up random acess file(sstable),may cost much memory
ldb_use_mmap_random_access=0
## how many highest levels to limit compaction
ldb_limit_compact_level_count=0
## limit compaction ratio: allow doing one compaction every ldb_limit_compact_interval
## 0 means limit all compaction
ldb_limit_compact_count_interval=0
## limit compaction time interval
## 0 means limit all compaction
ldb_limit_compact_time_interval=0
## limit compaction time range, start == end means doing limit the whole day.
ldb_limit_compact_time_range=6-1
## limit delete obsolete files when finishing one compaction
ldb_limit_delete_obsolete_file_interval=5
## whether trigger compaction by seek
ldb_do_seek_compaction=0
## whether split mmt when compaction with user-define logic(bucket range, eg) 
ldb_do_split_mmt_compaction=0

#### following config effects on FastDump ####
## when ldb_db_instance_count > 1, bucket will be sharded to instance base on config strategy.
## current supported:
##  hash : just do integer hash to bucket number then module to instance, instance's balance may be
##         not perfect in small buckets set. same bucket will be sharded to same instance
##         all the time, so data will be reused even if buckets owned by server changed(maybe cluster has changed),
##  map  : handle to get better balance among all instances. same bucket may be sharded to different instance based
##         on different buckets set(data will be migrated among instances).
ldb_bucket_index_to_instance_strategy=map
## bucket index can be updated. this is useful if the cluster wouldn't change once started
## even server down/up accidently.
ldb_bucket_index_can_update=1
## strategy map will save bucket index statistics into file, this is the file's directory
ldb_bucket_index_file_dir=/root/tair_bin_ds2/data/bindex
## memory usage for memtable sharded by bucket when batch-put(especially for FastDump)
ldb_max_mem_usage_for_memtable=3221225472
####

#### leveldb config (Warning: you should know what you're doing.)
## one leveldb instance max open files(actually table_cache_ capacity, consider as working set, see `ldb_table_cache_size)
ldb_max_open_files=65535
## whether return fail when occure fail when init/load db, and
## if true, read data when compactiong will verify checksum
ldb_paranoid_check=0
## memtable size
ldb_write_buffer_size=67108864
## sstable size
ldb_target_file_size=8388608
## max file size in each level. level-n (n > 0): (n - 1) * 10 * ldb_base_level_size
ldb_base_level_size=134217728
## sstable's block size
# ldb_block_size=4096
## sstable cache size (override `ldb_max_open_files)
ldb_table_cache_size=1073741824
##block cache size
ldb_block_cache_size=16777216
## arena used by memtable, arena block size
#ldb_arenablock_size=4096
## key is prefix-compressed period in block,
## this is period length(how many keys will be prefix-compressed period)
# ldb_block_restart_interval=16
## specifid compression method (snappy only now)
# ldb_compression=1
## compact when sstables count in level-0 is over this trigger
ldb_l0_compaction_trigger=1
## write will slow down when sstables count in level-0 is over this trigger
## or sstables' filesize in level-0 is over trigger * ldb_write_buffer_size if ldb_l0_limit_write_with_count=0
ldb_l0_slowdown_write_trigger=32
## write will stop(wait until trigger down)
ldb_l0_stop_write_trigger=64
## when write memtable, max level to below maybe
ldb_max_memcompact_level=3
## read verify checksum
ldb_read_verify_checksums=0
## write sync log. (one write will sync log once, expensive)
ldb_write_sync=0
## bits per key when use bloom filter
#ldb_bloomfilter_bits_per_key=10
## filter data base logarithm. filterbasesize=1<<ldb_filter_base_logarithm
#ldb_filter_base_logarithm=12

    4) With that, both Data Servers are configured as well.

 

4. Starting the Tair Server Cluster

  1. In a terminal, pick any one of the server directories and run set_shm.sh (root required); it adjusts the system's memory allocation policy so the processes can get enough shared memory.

    Command: ./set_shm.sh
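
    For reference, this is the kind of shared-memory tuning such a script performs; the settings and values below are illustrative assumptions only, not the actual contents of the bundled set_shm.sh:

# illustrative shared-memory tuning (assumed values, not taken from set_shm.sh)
sysctl -w kernel.shmmax=2147483648   # max size of a single shared memory segment, in bytes
sysctl -w kernel.shmall=524288       # total shared memory pages allowed system-wide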

  

  2. In a terminal, enter each of the 2 Data Server directories and run the tair.sh script to start the server. Note: start the Data Servers first and the Config Servers afterwards; see the wiki for the explanation.

    Because the Config Servers are not running yet, heartbeat errors will appear in the log at this point.

    

[2014-12-19 19:28:42.336703] ERROR handlePacket (heartbeat_thread.cpp:141) [140335215126272] ControlPacket, cmd:3
[2014-12-19 19:28:43.341952] ERROR handlePacket (heartbeat_thread.cpp:141) [140335215126272] ControlPacket, cmd:2
[2014-12-19 19:28:43.341982] ERROR handlePacket (heartbeat_thread.cpp:141) [140335215126272] ControlPacket, cmd:3
[2014-12-19 19:28:44.345308] WARN update_server_table (tair_manager.cpp:1397) [140334767929088] updateServerTable, size: 2046
[2014-12-19 19:28:44.345312] WARN handlePacket (heartbeat_thread.cpp:212) [140335215126272] config server HOST UP: 10.0.2.15:5198
[2014-12-19 19:28:44.345350] WARN handlePacket (heartbeat_thread.cpp:212) [140335215126272] config server HOST UP: 10.0.2.15:5200

    

    Command: ./tair.sh start_ds

    

  3. In a terminal, enter each of the 2 Config Server directories and run the tair.sh script to start the server.

    Command: ./tair.sh start_cs

  4. With that, all 4 servers are up. The process rarely goes completely smoothly, so be patient, be careful, and check the log files. The overall start-up order is sketched below.
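
    The full start-up sequence in one place, using the commands above and run from each server's own copy of the install directory:

cd /root/tair_bin_ds1 && ./tair.sh start_ds   # Data Server A first
cd /root/tair_bin_ds2 && ./tair.sh start_ds   # Data Server B
cd /root/tair_bin_cs1 && ./tair.sh start_cs   # then Config Server (Master)
cd /root/tair_bin_cs2 && ./tair.sh start_cs   # finally Config Server (Slave)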

 

5. Connecting Clients to the Tair Server

  1. There are currently 3 ways to connect to a Tair server: the command line, the C++ TairClient, and the Java TairClient (its dependencies show it uses the Mina communication framework).

    The underlying transport is plain sockets, so in theory any programming language that supports socket operations can implement a Tair client directly.

  2. In a terminal, enter the sbin folder of any of the server directories and run the tairclient command to connect to the Tair server; the -c parameter connects to a configserver and the -s parameter connects to a dataserver.

    Connect as shown in the figure below; "group_1" is the default value from the group.conf configuration file.

    Once connected to the tair configserver, you can add a key-value pair with the put command, e.g. put key1 helloworld, and then read it back with the get command, as shown below:

     

    You can also connect to a tair dataserver to inspect the data held on a single DS. The figure below shows that the Data Server on port 5191 returns the value of key1, while the DS on port 5192 does not hold it.

    The Tair ConfigServer distributes data with a consistent-hashing algorithm for load balancing; see the wiki for more on this. A sketch of such a session follows.
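
    A sketch of such a command-line session; the -c and -s flags come from the text above, while the -g group-name flag is an assumption about tairclient's usage and may differ in your build:

cd /root/tair_bin_cs1/sbin
./tairclient -c 10.0.2.15:5198 -g group_1   # connect through the master configserver
# inside the interactive client, type:
#   put key1 helloworld
#   get key1

# connect directly to one dataserver to check which node actually holds the key
./tairclient -s 10.0.2.15:5191 -g group_1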

     

 

6. More

  With that, the Tair server installation and deployment is complete. More on learning, integrating with, and testing Tair will follow in later posts; anyone interested is welcome to reach out and exchange ideas.

 

7. A salute to open-source contributors and organizations — @tair team, thank you for every contribution to open source.
