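Service discovery - A Consul client can provide a service, such as api or mysql, and other clients can use Consul to discover the providers of a given service. Through DNS or HTTP, applications can easily find the services they depend on.
Health checking - Consul clients can provide any number of health checks, tied either to a given service (for example, whether the web server returns 200 OK) or to the local node (for example, whether memory usage is above 90%). Operators can use this information to monitor cluster health, and the service discovery components use it to keep traffic away from unhealthy hosts.
Key/Value store - Applications can use Consul's hierarchical Key/Value store for whatever they need, such as dynamic configuration, feature flags, coordination, and leader election. A simple HTTP API makes it easy to use.
Multi-datacenter - Consul supports multiple datacenters out of the box, so users do not have to build an extra abstraction layer to expand the business to multiple regions.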
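1. Supports multiple datacenters, with internal and external services listening on different ports. A multi-datacenter cluster avoids the single point of failure of a single datacenter; neither zookeeper nor etcd supports multiple datacenters.
2. Supports health checks. etcd does not provide this feature.
3. Supports both HTTP and DNS interfaces, with a REST API. Integrating zookeeper is more complex, and etcd supports only HTTP.
4. Ships with an official web management UI. etcd has none.
5. Simple to deploy and operations-friendly, with no dependencies: copy the Go binary over and it works. One program does everything, and it can be pushed out with ansible.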
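Comparison of Consul with other configuration tools:
Consul architecture:
Consul roles:
1. A Consul cluster consists of nodes that deploy and run the Consul agent. A cluster has two roles: server and client.
2. The server and client roles have nothing to do with the application services running on the cluster; they are a division of roles at the Consul level.
3. Consul server: maintains the state of the Consul cluster, ensures data consistency, and responds to RPC requests. The official recommendation is to run at least three Consul servers. The servers elect a leader among themselves, and Consul implements this election with the Raft protocol. Consul data is kept strongly consistent across the server nodes. Servers talk to local clients over the LAN and to other datacenters over the WAN.
   Consul client: maintains only its own state and forwards HTTP and DNS interface requests to the servers.
4. Consul supports multiple datacenters. Each datacenter must run its own Consul cluster; datacenters communicate with each other through the gossip protocol, and the Raft algorithm provides consistency.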
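The recommendation of at least three servers follows from Raft's majority rule: a cluster of N voting servers needs N/2 + 1 of them (integer division) to agree, and can lose the rest. A minimal sketch of that arithmetic follows; the function names `raft_quorum` and `raft_fault_tolerance` are illustrative and are not part of Consul.

```shell
#!/bin/bash
# Majority (quorum) size for a Raft cluster of N voting servers.
raft_quorum() {
    local n=$1
    echo $(( n / 2 + 1 ))
}

# Number of servers that can fail while a quorum is still reachable.
raft_fault_tolerance() {
    local n=$1
    echo $(( n - (n / 2 + 1) ))
}

for n in 1 2 3 5; do
    echo "servers=$n quorum=$(raft_quorum "$n") tolerated_failures=$(raft_fault_tolerance "$n")"
done
```

Two servers tolerate no failure at all, which is why three is the practical minimum per datacenter.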
10.64.58.45    Shenzhen   server   S1
10.64.67.43    Foshan     server   S2
10.64.58.46    Shenzhen   server   S3
10.64.50.129   Shenzhen   client
10.64.50.128   Shenzhen   client
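1. Install Consul
Installing Consul is very easy: download it from https://www.consul.io/downloads.html, unpack it, and it is ready to use. It is a single binary and nothing else.
On every machine, create the directories for the configuration files and for the data, plus one for the redis and mysql health check scripts.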
mkdir /etc/consul.d/ -p && mkdir /data/consul/ -p
mkdir /data/consul/shell -p
The server configuration file is as follows:
[root@centos consul.d]# cat server.json
{
    "data_dir": "/data/consul",
    "datacenter": "shenzhen",
    "log_level": "INFO",
    "server": true,
    "bootstrap_expect": 1,
    "bind_addr": "10.64.58.45",
    "client_addr": "0.0.0.0",
    "node": "S1",
    "ports": {
        "dns": 53
    }
}
[root@centos consul.d]# pwd
/etc/consul.d
The client configuration file is as follows:
[root@centos consul.d]# cat client.json
{
    "data_dir": "/data/consul",
    "enable_script_checks": true,
    "bind_addr": "10.64.50.129",
    "retry_join": ["10.64.58.46","10.64.58.45"],
    "retry_interval": "30s",
    "rejoin_after_leave": true,
    "start_join": ["10.64.58.46","10.64.58.45"],
    "node": "Client1",
    "datacenter": "shenzhen"
}
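2. Open the ports in iptables
Commonly used ports:
8600: default DNS port; servers must open it to all clients
8500: default HTTP API; open between servers
8301: serf_lan, communication within the same DC
8302: serf_wan, communication across DCs
8300: RPC service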
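The port list can be turned into firewall rules. The sketch below only prints `iptables` commands so they can be reviewed before being run. The source network `10.64.0.0/16` and the function name `consul_iptables_rules` are assumptions for illustration, and the protocol split (TCP only for 8300 and 8500, TCP and UDP for 8301, 8302 and the DNS port) follows Consul's documented defaults rather than anything stated in this post. A single source network is used for brevity; in practice each port can be restricted to the narrower audience listed above.

```shell
#!/bin/bash
# Print (do not execute) iptables rules for the Consul ports listed above.
# $1: source network allowed to connect (assumed example: 10.64.0.0/16)
# $2: DNS port (8600 by default; 53 if changed in the "ports" config block)
consul_iptables_rules() {
    local src=$1
    local dns_port=${2:-8600}
    local port

    # TCP-only ports: server RPC and HTTP API
    for port in 8300 8500; do
        echo "iptables -A INPUT -s $src -p tcp --dport $port -j ACCEPT"
    done

    # Ports that use both TCP and UDP: Serf LAN, Serf WAN, DNS
    for port in 8301 8302 "$dns_port"; do
        echo "iptables -A INPUT -s $src -p tcp --dport $port -j ACCEPT"
        echo "iptables -A INPUT -s $src -p udp --dport $port -j ACCEPT"
    done
}

# The servers in this post serve DNS on port 53 instead of 8600
consul_iptables_rules "10.64.0.0/16" 53
```

Piping the output through `sh` applies the rules once they look right.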
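3. Start Consul
Start all the servers first with nohup consul agent -config-dir=/etc/consul.d > /data/consul/consul.log & . On first start, a server configured with bootstrap_expect set to 1 finds no existing cluster and becomes leader automatically. The log below is from s3, which reports "No cluster leader" until s1 joins and is elected leader.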
==> Starting Consul agent...
==> Consul agent running!
Version: 'v1.0.7'
Node ID: '19c23544-6a07-705b-82bc-ea9fe433b2f5'
Node name: 's3'
Datacenter: 'shenzhen' (Segment: '<all>')
Server: true (Bootstrap: false)
Client Addr: [0.0.0.0] (HTTP: 8500, HTTPS: -1, DNS: 8600)
Cluster Addr: 10.64.58.46 (LAN: 8301, WAN: 8302)
Encrypt: Gossip: false, TLS-Outgoing: false, TLS-Incoming: false
==> Log data will now stream in as it occurs:
2018/05/04 11:10:05 [INFO] raft: Initial configuration (index=0): []
2018/05/04 11:10:05 [INFO] raft: Node at 10.64.58.46:8300 [Follower] entering Follower state (Leader: "")
2018/05/04 11:10:05 [INFO] serf: EventMemberJoin: s3.shenzhen 10.64.58.46
2018/05/04 11:10:05 [INFO] serf: EventMemberJoin: s3 10.64.58.46
2018/05/04 11:10:05 [INFO] consul: Handled member-join event for server "s3.shenzhen" in area "wan"
2018/05/04 11:10:05 [INFO] consul: Adding LAN server s3 (Addr: tcp/10.64.58.46:8300) (DC: shenzhen)
2018/05/04 11:10:05 [INFO] agent: Started DNS server 0.0.0.0:8600 (udp)
2018/05/04 11:10:05 [INFO] agent: Started DNS server 0.0.0.0:8600 (tcp)
2018/05/04 11:10:05 [INFO] agent: Started HTTP server on [::]:8500 (tcp)
2018/05/04 11:10:05 [INFO] agent: started state syncer
2018/05/04 11:10:10 [WARN] raft: no known peers, aborting election
2018/05/04 11:10:12 [ERR] agent: failed to sync remote state: No cluster leader
2018/05/04 11:10:31 [ERR] agent: Coordinate update error: No cluster leader
2018/05/04 11:10:40 [ERR] agent: failed to sync remote state: No cluster leader
2018/05/04 11:10:55 [ERR] agent: Coordinate update error: No cluster leader
2018/05/04 11:11:13 [ERR] agent: failed to sync remote state: No cluster leader
2018/05/04 11:11:29 [ERR] agent: Coordinate update error: No cluster leader
2018/05/04 11:11:38 [ERR] agent: failed to sync remote state: No cluster leader
2018/05/04 11:12:03 [ERR] agent: Coordinate update error: No cluster leader
2018/05/04 11:12:13 [ERR] agent: failed to sync remote state: No cluster leader
2018/05/04 11:12:29 [ERR] agent: Coordinate update error: No cluster leader
2018/05/04 11:12:41 [ERR] agent: failed to sync remote state: No cluster leader
2018/05/04 11:13:01 [ERR] agent: Coordinate update error: No cluster leader
2018/05/04 11:13:16 [ERR] agent: failed to sync remote state: No cluster leader
2018/05/04 11:13:29 [ERR] agent: Coordinate update error: No cluster leader
2018/05/04 11:13:39 [ERR] agent: failed to sync remote state: No cluster leader
2018/05/04 11:13:53 [ERR] agent: Coordinate update error: No cluster leader
2018/05/04 11:14:09 [ERR] agent: failed to sync remote state: No cluster leader
2018/05/04 11:14:29 [ERR] agent: Coordinate update error: No cluster leader
2018/05/04 11:14:41 [ERR] agent: failed to sync remote state: No cluster leader
2018/05/04 11:14:58 [ERR] agent: Coordinate update error: No cluster leader
2018/05/04 11:15:16 [ERR] agent: failed to sync remote state: No cluster leader
2018/05/04 11:15:20 [ERR] agent: Coordinate update error: No cluster leader
2018/05/04 11:15:40 [INFO] serf: EventMemberJoin: s1 10.64.58.45
2018/05/04 11:15:40 [INFO] consul: Adding LAN server s1 (Addr: tcp/10.64.58.45:8300) (DC: shenzhen)
2018/05/04 11:15:40 [INFO] consul: New leader elected: s1
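The "No cluster leader" errors stop once a leader has been elected. Instead of watching the log, a start-up script can poll the agent's HTTP API. The sketch below uses the `/v1/status/leader` endpoint; the helper names `has_leader` and `wait_for_leader`, the default address `127.0.0.1:8500`, and the 2-second polling interval are assumptions for illustration.

```shell
#!/bin/bash
# Return success if the body of GET /v1/status/leader names a leader.
# The endpoint answers with a JSON string such as "10.64.58.45:8300";
# an empty string ("") is assumed to mean that no leader is elected yet.
has_leader() {
    local body=$1
    # Strip the surrounding double quotes and any whitespace
    body=${body//\"/}
    body=${body//[[:space:]]/}
    [ -n "$body" ]
}

# Poll the local agent until a leader shows up or the attempts run out.
# $1: HTTP API address (assumed default 127.0.0.1:8500)
# $2: maximum number of attempts
wait_for_leader() {
    local addr=${1:-127.0.0.1:8500}
    local attempts=${2:-30}
    local i body
    for (( i = 0; i < attempts; i++ )); do
        # -f makes curl fail (and print nothing) on HTTP errors
        body=$(curl -sf "http://$addr/v1/status/leader") || body=""
        if has_leader "$body"; then
            echo "leader: $body"
            return 0
        fi
        sleep 2
    done
    echo "no leader after $attempts attempts" >&2
    return 1
}
```

Calling `wait_for_leader 127.0.0.1:8500 60` right after starting the agent blocks until the cluster is usable or roughly two minutes have passed.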
4. Form the Consul cluster: the servers of the two DCs join into one cluster
consul join -wan 10.64.58.45 10.64.58.46 10.64.67.43
Verify with:
[root@centos consul]# consul members -wan
Node Address Status Type Build Protocol DC Segment
s1.shenzhen 10.64.58.45:8302 alive server 1.0.7 2 shenzhen <all>
s2.foshan 10.64.67.43:8302 alive server 1.0.7 2 foshan <all>
s3.shenzhen 10.64.58.46:8302 alive server 1.0.7 2 shenzhen <all>
Check the server leader:
[root@centos consul]# consul operator raft list-peers
Node ID Address State Voter RaftProtocol
s3 19c23544-6a07-705b-82bc-ea9fe433b2f5 10.64.58.46:8300 leader true 3
s1 483219dd-19e4-9d59-83a8-a42222c20ee9 10.64.58.45:8300 follower true 3
5. Start the Consul clients
nohup consul agent -config-dir=/etc/consul.d > /data/consul/consul.log &
Verify with:
[root@centos consul.d]# consul members
Node Address Status Type Build Protocol DC Segment
s1 10.64.58.45:8301 alive server 1.0.7 2 shenzhen <all>
s3 10.64.58.46:8301 alive server 1.0.7 2 shenzhen <all>
client1 10.64.50.129:8301 alive client 1.0.7 2 shenzhen <default>
client2 10.64.50.128:8301 alive client 1.0.7 2 shenzhen <default>
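The same check can be scripted for monitoring. The sketch below reads `consul members` output on stdin and reports every node whose Status column is not `alive`; the function name `unhealthy_members` is hypothetical, and the sample input is adapted from the output above with one status changed to `failed` for demonstration.

```shell
#!/bin/bash
# Read `consul members` output on stdin, skip the header line, and print
# "<node> <status>" for every node that is not alive.
# Exit status is non-zero if at least one such node was found.
unhealthy_members() {
    awk 'NR > 1 && $3 != "alive" { print $1, $3; bad = 1 } END { exit bad }'
}

# Demonstration with modified sample data (client2 marked as failed)
unhealthy_members <<'EOF' || echo "cluster has unhealthy members"
Node     Address            Status  Type    Build  Protocol  DC        Segment
s1       10.64.58.45:8301   alive   server  1.0.7  2         shenzhen  <all>
client2  10.64.50.128:8301  failed  client  1.0.7  2         shenzhen  <default>
EOF
```

On a live node the call would be `consul members | unhealthy_members`.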
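The following demonstrates the integration with MySQL
1. Create the service definition files
On every client, create the service definition files: one JSON file for reads and one for writes.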
[root@centos consul.d]# cat r-3333-mysql-test.json
{
    "services": [
        {
            "name": "r-3333-mysql-test",
            "tags": [
                "slave-test-3333"
            ],
            "address": "10.64.50.129",
            "port": 3333,
            "checks": [
                {
                    "script": "/data/consul/shell/check_mysql_slave.sh 3333",
                    "interval": "15s"
                }
            ]
        }
    ]
}
[root@centos consul.d]# cat w-3333-mysql-test.json
{
    "services": [
        {
            "name": "w-3333-mysql-test",
            "tags": [
                "master-test-3333"
            ],
            "address": "10.64.50.129",
            "port": 3333,
            "checks": [
                {
                    "script": "/data/consul/shell/check_mysql_master.sh 3333",
                    "interval": "15s"
                }
            ]
        }
    ]
}
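Notes:
1. The interval can be reduced as appropriate.
2. For multiple instances on one host, create one JSON file per service. MySQL ports should preferably be globally unique.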
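Since the read and write definitions differ only in name, tag and check script, they can be generated per instance, which helps with the multi-instance case in note 2. The sketch below assumes a helper named `mysql_service_json` (hypothetical). It writes the check command as an `args` array instead of the `script` string used above: to our knowledge `args` is the form documented for Consul releases after 1.0, so check which one the installed version expects.

```shell
#!/bin/bash
# Print a Consul service definition for one MySQL instance.
# $1: role prefix ("r" for read/slave, "w" for write/master)
# $2: address the service is advertised on
# $3: MySQL port
mysql_service_json() {
    local role=$1 addr=$2 port=$3
    local tag script
    case $role in
        r) tag="slave-test-$port";  script="/data/consul/shell/check_mysql_slave.sh" ;;
        w) tag="master-test-$port"; script="/data/consul/shell/check_mysql_master.sh" ;;
        *) echo "unknown role: $role" >&2; return 1 ;;
    esac
    cat <<EOF
{
  "services": [
    {
      "name": "$role-$port-mysql-test",
      "tags": ["$tag"],
      "address": "$addr",
      "port": $port,
      "checks": [
        {
          "args": ["$script", "$port"],
          "interval": "15s"
        }
      ]
    }
  ]
}
EOF
}

# Print the read-side definition for the instance used in this post
mysql_service_json r 10.64.50.129 3333
```

Redirecting the output to /etc/consul.d/r-3333-mysql-test.json and running `consul validate /etc/consul.d` catches syntax errors before the agent is reloaded.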
2. Create the health check scripts
[root@centos consul.d]# cat /data/consul/shell/check_mysql_master.sh
#!/bin/bash
port=$1
user="db_monitor"
password=`sudo cat /etc/snmp/yyms_agent_db_scripts/db_$port.conf | grep password | awk -F '=' '{print $2}'`
comm="/usr/local/mysql_${port}/bin/mysql -P$port -udb_monitor -p$password -h0 --default-character-set=utf8mb4"
slave_info=`$comm -e "show slave status" |wc -l`
value=`$comm -Nse "select 1"`
echo $value
read_only_text=`$comm -Ne "show global variables like 'read_only'"`
#echo $read_only_text
read_only=`echo ${read_only_text} | awk '{print $2}'`
echo ${read_only}
# Check whether this instance is a slave
#if [ $slave_info -ne 0 ]
#then
# echo "MySQL Instance is Slave........"
# -e "show slave status\G" | egrep -w "Master_Host|Master_User|Master_Port|Master_Log_File|Read_Master_Log_Pos|Relay_Log_File|Relay_Log_Pos|Relay_Master_Log_File|Slave_IO_Running|Slave_SQL_Running|Exec_Master_Log_Pos|Relay_Log_Space|Seconds_Behind_Master"
# exit 2
#fi
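# Check whether MySQL is alive
if [ -z "$value" ]
then
    exit 2
fi
# Check read_only: a master must have read_only=OFF
if [[ $read_only != "OFF" ]]
then
    echo "MySQL $port Instance is not Master (read_only=${read_only})"
    exit 2
fi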
echo "MySQL $port Instance is Master........"
$comm -e "select * from information_schema.PROCESSLIST where user='slave'"
######################
[root@centos consul.d]# cat /data/consul/shell/check_mysql_slave.sh
#!/bin/bash
port=$1
user="db_monitor"
password=`sudo cat /etc/snmp/yyms_agent_db_scripts/db_$port.conf | grep password | awk -F '=' '{print $2}'`
comm="/usr/local/mysql_${port}/bin/mysql -P$port -udb_monitor -p$password -h0 --default-character-set=utf8mb4"
slave_info=`$comm -e "show slave status" |wc -l`
value=`$comm -Nse "select 1"`
echo $value
read_only_text=`$comm -Ne "show global variables like 'read_only'"`
#echo $read_only_text
read_only=`echo ${read_only_text} | awk '{print $2}'`
echo ${read_only}
# Check whether this instance is a slave
#if [ $slave_info -ne 0 ]
#then
# echo "MySQL Instance is Slave........"
# -e "show slave status\G" | egrep -w "Master_Host|Master_User|Master_Port|Master_Log_File|Read_Master_Log_Pos|Relay_Log_File|Relay_Log_Pos|Relay_Master_Log_File|Slave_IO_Running|Slave_SQL_Running|Exec_Master_Log_Pos|Relay_Log_Space|Seconds_Behind_Master"
# exit 2
#fi
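# Check whether MySQL is alive
if [ -z "$value" ]
then
    exit 2
fi
# Check read_only: a slave must have read_only=ON
if [[ $read_only != "ON" ]]
then
    echo "MySQL $port Instance is not Slave (read_only=${read_only})"
    exit 2
fi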
echo "MySQL $port Instance is Slave........"
#$comm -e "select * from information_schema.PROCESSLIST where user='slave'"
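Notes:
1. These check scripts are for testing only; review them carefully before using them in production.
2. Consider creating a table with one record in the test database and keeping it in sync with your company's internal automation platform. The platform writes the record as the flag that marks whether the instance is the master, which later allows one-click switchover from the platform.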
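Both scripts rely on the way Consul interprets a script check's exit code: 0 is passing, 1 is warning, and anything else is critical. The role decision itself can be isolated into one small function, which makes it easy to test without a running MySQL. This is a sketch only; the name `role_check_status` is hypothetical, and it keeps the same read_only convention as the scripts above.

```shell
#!/bin/bash
# Map a MySQL read_only value to a Consul check exit code.
# Consul treats exit 0 as passing, 1 as warning, anything else as critical.
# $1: expected role ("master" or "slave")
# $2: value of the read_only variable ("ON", "OFF", or empty if the query failed)
role_check_status() {
    local role=$1 read_only=$2
    if [ -z "$read_only" ]; then
        return 2                           # MySQL did not answer: critical
    fi
    case "$role:$read_only" in
        master:OFF|slave:ON) return 0 ;;   # role matches: passing
        *)                   return 2 ;;   # role mismatch: critical
    esac
}

role_check_status master OFF && echo "master check passes"
role_check_status slave OFF  || echo "slave check fails on a writable instance"
```

A check script would call it as `role_check_status master "$read_only"; exit $?` after reading the variable from MySQL.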
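3. Register the services
Run consul reload on every client.
Once every agent has registered, two domain names are available:
w-3333-mysql-test.service.consul (resolves to the single master IP)
r-3333-mysql-test.service.consul (resolves to multiple slave IPs; each client request gets one at random)
4. Verify that the services are registered
Run on a Consul server:
[root@centos consul]# dig @localhost -p 53 r-3333-mysql-test.service.consul    # we changed the Consul DNS port to 53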
; <<>> DiG 9.9.4-RedHat-9.9.4-38.el7_3 <<>> @localhost -p 53 r-3333-mysql-test.service.consul
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 62902
;; flags: qr aa rd; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;r-3333-mysql-test.service.consul. IN A
;; ANSWER SECTION:
r-3333-mysql-test.service.consul. 0 IN A 10.64.50.128
;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Wed May 09 17:04:24 CST 2018
;; MSG SIZE rcvd: 77
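The official documentation offers several ways to let hosts resolve names with the consul suffix:
1. Keep the existing internal DNS server and have it forward queries for the consul suffix to the Consul servers (this is what we use in production).
2. Point all DNS at the Consul DNS servers and use the recursors setting to forward names without the consul suffix to the original DNS servers.
3. Forward with dnsmasq: server=/consul/10.16.X.X#8600 resolves names with the consul suffix.
In our case there is no internal DNS server, so we only need to add the Consul server IPs of the local DC to /etc/resolv.conf as nameserver entries, which are tried in rotation.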
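For the dnsmasq option, one `server=` line per Consul server is enough. The sketch below prints those lines for a list of servers; the function name `dnsmasq_consul_conf` and the target file mentioned afterwards are assumptions for illustration.

```shell
#!/bin/bash
# Print dnsmasq "server=" lines that send *.consul queries to Consul DNS.
# $1: Consul DNS port; remaining arguments: Consul server IPs
dnsmasq_consul_conf() {
    local port=$1
    shift
    local ip
    for ip in "$@"; do
        echo "server=/consul/$ip#$port"
    done
}

# Example with the two Shenzhen servers from this post and DNS on port 53
dnsmasq_consul_conf 53 10.64.58.45 10.64.58.46
```

The output can be written to a file such as /etc/dnsmasq.d/10-consul (an assumed path; it depends on how dnsmasq is configured on the host), followed by a dnsmasq restart.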
References:
http://www.cnblogs.com/gomysql/p/8010552.html