Experience Using a Codis Cluster

Cluster architecture diagram

[Figure: Codis cluster architecture diagram]

HA test cases:

Functional tests

Across a range of scenarios, we validated the Codis cluster HA setup mainly through get and set operations against the Redis cache.

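  • When the Codis cluster contains only one group, Group1. In this scenario all data touched by the test program is guaranteed to live in that group's Redis instances. (Most programs that go through codis-proxy do not use hash tags to pin keys to slots, so this scenario checks that the most common operations work correctly.)

    • Take Group1.Master offline. Check: get and set succeed, with 5 seconds of service unavailability.

    • Take Group1.Slave offline. Check: get and set succeed.

    • Take Group1.Master and Group1.Slave offline at the same time, then bring them back online one after the other. Check: get and set are unavailable while both are down, and the service is available again once they are back online.

  • When Group2 and Group3 are added to the Codis cluster. Hash tags are used to make sure Redis operations land on the intended slots.

    • While Group2 and Group3 are being added (tests whether automatic slot rebalancing affects operations). Check: whether get and set raise errors.

    • Take Group1.Master and Group2.Master offline (tests whether codis-ha can promote the slaves to master when the masters of several groups fail). Check: whether operations on the data of these two groups run into problems.

  • When ZooKeeper is unavailable.

    • Check whether get and set still work.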
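The unavailability windows in the checks above (for example the 5 seconds after Group1.Master goes offline) can be measured instead of estimated. A minimal sketch, assuming a probe loop that records a timestamp and a success flag for every get/set attempt; `longest_outage` and the sample log are hypothetical helpers, not part of Codis:

```python
from typing import Iterable, Tuple


def longest_outage(samples: Iterable[Tuple[float, bool]]) -> float:
    """Return the longest run of failed probes, in seconds.

    Each sample is (timestamp, ok). An outage starts at the first failed
    probe and ends at the next successful one; an outage that is still
    open when the samples end is not counted.
    """
    longest = 0.0
    outage_start = None
    for ts, ok in samples:
        if not ok and outage_start is None:
            outage_start = ts              # first failure of a new outage
        elif ok and outage_start is not None:
            longest = max(longest, ts - outage_start)
            outage_start = None            # service is back
    return longest


# Hypothetical probe log: one get/set attempt per second, failing from t=3 to t=7.
probe_log = [(float(t), not (3 <= t <= 7)) for t in range(12)]
print(longest_outage(probe_log))
```

In a real run the log would come from a client loop that issues get and set through codis-proxy while a master is taken offline.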


Performance tests

  • Initial state

    • group1: 192.168.200.120:8001(master), 192.168.200.120:8002(slave)

    • group2: 192.168.200.120:8003(master), 192.168.200.120:8004(slave)

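    • 2 codis-proxy instances, at: 192.168.200.120:19000, 192.168.200.120:19001

    • 2 groups

    • 1 codis-server: 192.168.200.120:8005

    • 1 HAProxy:

  • Performance test of a single Redis instance (version 2.8.21)

    • Purpose: compare codis-server with stock Redis while ruling out differences caused by the machine itself.

    • Load-test address:

    • 200 users,

  • Performance test of a single codis-server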

    • [Screenshot: redisMAMA monitoring codis-server]

    • [Screenshot: load-test client against codis-server, 1]

    • [Screenshot: load-test client against codis-server, 2]

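    • Purpose: bypass the proxy's forwarding and see the raw performance of codis-server.

    • Load-test address: 192.168.200.120:8001

    • 200 users, 15000 qps

    • Load-test results

  • Add codis-proxy in front of a single group, group1 (1 master, 1 slave)

    • Read test

    • Write test

    • Purpose: compare against the standalone codis-server results to determine the overhead of codis-proxy

    • Load-test address: 192.168.200.120:19000

    • 200 users, 15000 qps

    • Load-test results

  • Dynamically add a new group2 (1 master, 1 slave), giving two groups

    • Read test

    • Write test

    • Purpose: see whether throughput through codis-proxy improves after a group is added

    • Load-test address: 192.168.200.120:19000

    • 200 users, xxx qps

    • Load-test results

  • Dynamically remove group2, going back to 1 group

    • Purpose: measure the performance cost of data migration while the cluster is being shrunk

  • keepAlive stress test

    • Load-test address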
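The comparisons above boil down to running the same workload with a fixed number of concurrent users against two endpoints and comparing throughput. A minimal sketch of that idea, assuming a thread-per-user model; `run_load`, `overhead_pct` and the no-op operation are hypothetical stand-ins, not the tool used for the measurements in this post:

```python
import threading
import time
from typing import Callable


def run_load(op: Callable[[], None], users: int, seconds: float) -> float:
    """Call `op` in a loop from `users` threads for `seconds`; return ops/sec."""
    counts = [0] * users
    deadline = time.monotonic() + seconds

    def worker(i: int) -> None:
        while time.monotonic() < deadline:
            op()
            counts[i] += 1      # each thread writes only its own counter

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(counts) / seconds


def overhead_pct(direct_qps: float, proxied_qps: float) -> float:
    """Throughput lost by going through the proxy, as a percentage."""
    return (direct_qps - proxied_qps) * 100.0 / direct_qps


# Stand-in operation. A real run would issue a get or set against
# 192.168.200.120:8001 (direct) or 192.168.200.120:19000 (via codis-proxy).
qps = run_load(lambda: None, users=4, seconds=0.2)
print(f"{qps:.0f} ops/sec with a no-op")
```

Running it once per endpoint and feeding both results to `overhead_pct` gives the proxy overhead as a single number.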

Main commands

Startup order for a Codis cluster, with commonly used commands

  1. Start the Codis web dashboard, which registers the Codis product's nodes in ZooKeeper

    nohup ./bin/codis-config -c config.ini -L ./log/dashboard.log dashboard --addr=:18087 --http-log=./log/requests.log &>/dev/null

     

  2. Initialize the slots

    $CODIS_HOME/bin/codis-config -c $CODIS_HOME/conf/config.ini slot init -f

     

* Show the info for slot 1: ```./codis-config -c /usr/local/codis/conf/config.ini slot info 1```

 

  3. Add a Codis group

    codis-config -c /usr/local/codis/conf/config.ini server add-group 1

     

* Remove a Codis group: ```codis-config -c /usr/local/codis/conf/config.ini server remove-group 1```

 

  4. Add a machine to a Codis group

    codis-config -c /usr/local/codis/conf/config.ini server add 2 192.168.10.170:6380 master

     

* Remove a Redis instance from a Codis group: ```codis-config -c /usr/local/codis/conf/config.ini server remove 1 192.168.10.169:6379```

 

  5. Assign slots to a group

    ./bin/codis-config -c /etc/codis/config_10.ini slot range-set 0 300 1 online

     

  6. Create a proxy: register the proxy name codis_proxy_1 in ZooKeeper and set it to the offline state

    /usr/local/codis/bin/codis-config -c /usr/local/codis/conf/config.ini proxy offline codis_proxy_1

     

  7. Start the proxy, with proxy port 19000 and HTTP port 11000

    nohup /usr/local/codis/bin/codis-proxy --log-level info -c /usr/local/codis/conf/config.ini -L /usr/local/codis/logs/proxy.log  --cpu=8 --addr=0.0.0.0:19000 --http-addr=0.0.0.0:11000 &

     

  8. Set the proxy with id codis_proxy_1 to the online state so that clients can connect

    /usr/local/codis/bin/codis-config -c /usr/local/codis/conf/config.ini proxy online codis_proxy_1

     

  9. Start codis-ha, which automatically promotes a slave to master within a group; --productName names the Codis product it manages

    nohup ./codis-ha --codis-config=127.0.0.1:18087 --productName=testgroup1 &
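Because the order of these commands matters, it can help to generate them from one place. The sketch below only assembles the codis-config invocations listed above into an ordered plan and prints them; it runs nothing. `startup_plan` is a hypothetical helper, and the long-running processes (the dashboard, codis-proxy and codis-ha) are assumed to be started separately:

```python
import shlex

CONFIG = "/usr/local/codis/conf/config.ini"   # path used in the commands above
CODIS_CONFIG = ["/usr/local/codis/bin/codis-config", "-c", CONFIG]


def startup_plan(group_id, master_addr, slot_from, slot_to, proxy_id):
    """Return the codis-config calls for one group, in the order listed above."""
    gid = str(group_id)
    return [
        CODIS_CONFIG + ["slot", "init", "-f"],
        CODIS_CONFIG + ["server", "add-group", gid],
        CODIS_CONFIG + ["server", "add", gid, master_addr, "master"],
        CODIS_CONFIG + ["slot", "range-set", str(slot_from), str(slot_to), gid, "online"],
        CODIS_CONFIG + ["proxy", "offline", proxy_id],
        # codis-proxy itself is started at this point, before going online
        CODIS_CONFIG + ["proxy", "online", proxy_id],
    ]


plan = startup_plan(1, "192.168.10.170:6380", 0, 300, "codis_proxy_1")
for cmd in plan:
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Keeping the plan as argument lists rather than shell strings means each entry could later be handed to `subprocess.run` unchanged.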

     

Steps that currently need manual work (can be automated)

  • When the master in group_1 goes down, the dead master node first has to be removed from group_1 (a script is needed to automate this).

    codis-config -c ../conf/config.ini server remove 1 192.168.10.170:6379

     

  • Add the former master back as a slave node (a script is needed to automate this)

    codis-config -c ../conf/config.ini server add 1 192.168.10.170:6379 slave

     

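  • If group_1 has >= 2 slaves, then after the master goes down the slaves that were not promoted keep replicating from the dead master. Remove those slaves from group_1 one by one, then add them back to group_1 one by one.

  • When a slave in group_1 goes down and later recovers, it has to be re-added to group_1 by hand (a script is needed to automate this).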

    codis-config -c ../conf/config.ini server add 1 192.168.10.170:6379 slave

  • Automatic switchover script

    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-
    """Poll `codis-config server list` and re-add recovered nodes as slaves."""

    import json
    import logging
    import socket
    import subprocess
    import time
    from logging import handlers

    CODIS_CONFIG = ['/usr/local/codis/bin/codis-config',
                    '-c', '/usr/local/codis/conf/config.ini']
    LOGFILE = '/var/log/codisswitch.log'
    POLL_SECONDS = 5  # assumption: the original looped with no delay at all

    logger = logging.getLogger('codis-sms-sender')
    logger.setLevel(logging.DEBUG)


    def setup_logging(logfile=LOGFILE):
        # Attach the daily-rotating file handler once, not on every log call.
        handler = handlers.TimedRotatingFileHandler(logfile, 'D')
        handler.setFormatter(logging.Formatter(
            '%(asctime)s - %(name)s - %(levelname)s - %(message)s'))
        logger.addHandler(handler)


    def is_open(ip, port):
        """Return True if a TCP connection to ip:port succeeds."""
        try:
            with socket.create_connection((ip, int(port)), timeout=3):
                return True
        except OSError:
            return False


    def codis_config(*args):
        # The original parsed this output with eval(); json.loads is the
        # safe equivalent, assuming codis-config prints JSON.
        return json.loads(subprocess.check_output(CODIS_CONFIG + list(args)))


    def codis_switch():
        for group in codis_config('server', 'list'):
            for server in group['servers']:
                addr, group_id = server['addr'], str(server['group_id'])
                ip, port = addr.split(':')
                if server['type'] == 'offline':
                    # The script treats an offline server as a failed master.
                    logger.critical('group:%s master %s is down', group_id, addr)
                    if not is_open(ip, port):
                        continue
                    # The port answers again: remove the node, then re-add it as a slave.
                    if codis_config('server', 'remove', group_id, addr)['msg'] != 'OK':
                        continue
                    logger.critical('%s has been removed from the codis cluster', addr)
                    if codis_config('server', 'add', group_id, addr, 'slave')['msg'] == 'OK':
                        logger.critical('%s has rejoined the codis cluster as a slave', addr)
                elif server['type'] == 'slave':
                    if is_open(ip, port):
                        # As in the original, re-add every reachable slave on each pass.
                        subprocess.call(
                            CODIS_CONFIG + ['server', 'add', group_id, addr, 'slave'],
                            stdout=subprocess.DEVNULL)
                    else:
                        logger.critical('group:%s slave %s is down', group_id, addr)


    if __name__ == '__main__':
        setup_logging()
        while True:
            codis_switch()
            time.sleep(POLL_SECONDS)

     

Other commands

Business isolation

Isolate businesses from one another by namespace

Data structure in ZooKeeper

  • codis

    • fence (proxy addresses)

    • servers

    • slots

    • proxy (list of proxies)

    • migrate_tasks

    • dashboard

    • LOCK

    • actions (1033 child nodes)

    • ActionResponse (1033 child nodes, presumably in one-to-one correspondence with actions)

    • 10-219:19000 (Proxy Addr)

    • ip:port = {"type":"slave","group_id":1,"addr":"192.168.10.168:6379"}

    • group1

    • slot_345 = {"product_name":"testgroup1","id":1,"group_id":1,"state":{"status":"online","migrate_status":{"from":-1,"to":-1},"last_op_ts":"0"}}

    • slot

    • codis_proxy_1 (Proxy Name, ephemeral node)  = {"id":"codis_proxy_1","addr":"10-219:19000","last_event":"","last_event_ts":0,"state":"online","description":"","debug_var_addr":"10-219:11000","pid":20659,"start_at":"2015-10-19 11:58:21.473980358 +0800 CST"}

    • 0000001602  = {"type":"slot_changed","desc":"","target":{"product_name":"testgroup1","id":801,"group_id":-1,"state":{"status":"offline","migrate_status":{"from":-1,"to":-1},"last_op_ts":"0"}},"ts":"1445224886","receivers":null}

    • 0000001602 = {"type":"slot_changed","desc":"","target":{"product_name":"testgroup1","id":462,"group_id":-1,"state":{"status":"offline","migrate_status":{"from":-1,"to":-1},"last_op_ts":"0"}},"ts":"1445224755","receivers":null}

    • product name
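The node values above are plain JSON, so they are easy to inspect from a script once they have been read out of ZooKeeper. A minimal sketch that parses two of the payloads shown above; `slot_is_migrating` is a hypothetical helper, and treating -1 in `migrate_status` as "no group" is an assumption based on the sample values:

```python
import json

# Payloads copied from the node values listed above.
slot_node = ('{"product_name":"testgroup1","id":1,"group_id":1,'
             '"state":{"status":"online","migrate_status":{"from":-1,"to":-1},'
             '"last_op_ts":"0"}}')
server_node = '{"type":"slave","group_id":1,"addr":"192.168.10.168:6379"}'


def slot_is_migrating(raw: str) -> bool:
    """True if the slot records a migration source or target group."""
    migrate = json.loads(raw)["state"]["migrate_status"]
    return migrate["from"] != -1 or migrate["to"] != -1   # assumption: -1 = none


slot = json.loads(slot_node)
server = json.loads(server_node)
print(slot["group_id"], slot["state"]["status"], slot_is_migrating(slot_node))
print(server["addr"], server["type"])
```

Reading the raw values needs a ZooKeeper client, which is left out here to keep the example self-contained.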

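Other issues

Advantages of Codis

  • Scaling in and out dynamically is fully transparent to applications: instances can be added on the fly for a big promotion or Double 11 (Singles' Day), then removed again after the peak.

  • Multi-tenant isolation by product name

  • A mature management UI.

  • Dynamic data migration coordinated through ZooKeeper.

  • Supports MSET, MGET

  • Supports LPUSH, LPOP

  • Supports eval through hash tags

  • Has a complete data migration solution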
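Hash tags matter both for eval and for the HA tests, because they decide which slot (and therefore which group) a key lands on. A minimal sketch of the mapping, assuming the scheme described for Codis 2.x: CRC32 of the key modulo 1024, using only the text between the first `{` and the following `}` when a tag is present. `codis_slot` is a hypothetical helper for illustration, not Codis code:

```python
import zlib

MAX_SLOT_NUM = 1024  # assumption: Codis pre-shards the key space into 1024 slots


def codis_slot(key: bytes) -> int:
    """Map a key to a slot: CRC32 of the key (or of its hash tag) mod 1024."""
    beg = key.find(b"{")
    if beg >= 0:
        end = key.find(b"}", beg + 1)
        if end >= 0:
            key = key[beg + 1:end]     # hash only the text inside {...}
    return (zlib.crc32(key) & 0xFFFFFFFF) % MAX_SLOT_NUM


# Keys sharing a tag land on the same slot, so they end up in the same group.
print(codis_slot(b"{user1}:profile") == codis_slot(b"{user1}:orders"))
```

With this, a test can pick key names that are known to fall inside the slot range assigned to a particular group.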
