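High Availability Disaster Recovery (HADR) is a database-level data replication mechanism for high availability. An HADR environment requires two database servers: a primary and a standby. When a transaction runs on the primary database, its log records are shipped over TCP/IP to the standby server, which replays them to stay consistent with the primary. If the primary fails, the standby can take over transaction processing. It then acts as the new primary and handles all reads and writes, while client application connections are redirected to it through Automatic Client Reroute. Once the original primary has been repaired, it can rejoin HADR as the new standby. In this way DB2 UDB provides both disaster recovery and high availability, and keeps data loss to a minimum.
The figure below shows how DB2 HADR works:
Under normal conditions, HADR keeps the two databases in sync automatically through the logs:
When the primary fails and the standby has taken over, applications that connect to 65.33 are automatically connected to 65.34:
This deployment uses two virtual servers.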
Host | IP address | Configuration | Notes
cluster1 | 192.31.65.21 | 4C, 16G | HADR cluster
cluster2 | 192.31.65.22 | 4C, 16G |
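Deploy the servers the same way as the original environment; the main addition is the cluster configuration. The details are as follows:
1. Install the system packages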
yum install openssh-clients
yum install redhat-lsb
yum install ksh
yum install libaio
rpm -ivh compat-libstdc++-33-3.2.3-69.el6.x86_64.rpm
yum install compat-libstdc++-33-3.2.3-69.el6.i686.rpm
2. Create the groups and users
groupadd -g 101 dasadm1
groupadd -g 102 db2iadm1
groupadd -g 103 db2fadm1
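#useradd -p expects an already-encrypted password, so hash it first (openssl passwd -1 produces an MD5-crypt hash).
useradd -m -u 500 -d /home/dasusr1 -g dasadm1 -p "$(openssl passwd -1 db2admin)" dasusr1
useradd -m -u 501 -d /home/db2admin -g db2iadm1 -p "$(openssl passwd -1 db2admin)" db2admin
useradd -m -u 502 -d /home/db2fenc1 -g db2fadm1 -p "$(openssl passwd -1 db2admin)" db2fenc1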
3. Configure the network
vi /etc/hosts
192.31.65.21 cluster1
192.31.65.22 cluster2
vi /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=cluster1
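Because the same two host entries have to be added on both nodes, the edit to /etc/hosts can be scripted so that it is safe to run more than once. The following is a minimal sketch, not part of the original procedure: the helper name `add_host_entry` is hypothetical, and the demo writes to a scratch file created with `mktemp` instead of the real /etc/hosts.

```shell
#!/bin/sh
# add_host_entry FILE IP NAME  (hypothetical helper, for illustration only)
# Appends "IP NAME" to FILE unless some line already lists NAME as a host name.
add_host_entry() {
    file=$1
    ip=$2
    name=$3
    # Look for NAME as a whole field after the address column.
    if awk -v n="$name" '
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }' "$file"; then
        return 0    # already present, nothing to do
    fi
    printf '%s %s\n' "$ip" "$name" >> "$file"
}

# Demo against a scratch file rather than the real /etc/hosts.
hosts_file=$(mktemp)
add_host_entry "$hosts_file" 192.31.65.21 cluster1
add_host_entry "$hosts_file" 192.31.65.22 cluster2
add_host_entry "$hosts_file" 192.31.65.21 cluster1    # repeated call is a no-op
cat "$hosts_file"
rm -f "$hosts_file"
```

To apply it for real, run the two `add_host_entry` calls as root with /etc/hosts as the first argument on each node.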
4. Install the database software
./db2_install -b /opt/ibm/db2/v9.5/ -p ESE
5. Create the instance
/opt/ibm/db2/v9.5/instance/db2icrt -s ESE -a server -u db2fenc1 db2admin
6. Configure the environment (registry) variables
db2set DB2_EXTENDED_OPTIMIZATION=ON
db2set DB2_DISABLE_FLUSH_LOG=ON
db2set DB2AUTOSTART=YES
db2set DB2_STRIPED_CONTAINERS=ON
db2set DB2_HASH_JOIN=Y
db2set DB2COMM=tcpip
db2set DB2_PARALLEL_IO=*
db2set DB2CODEPAGE=1386
------------------------The following is the HADR configuration------------------------
1. Modify the environment variables
#Set the HADR buffer size; the HADR buffer is typically 4 times the log buffer.
db2set DB2_HADR_BUF_SIZE=16384
db2set DB2_LOAD_COPY_NO_OVERRIDE=NONRECOVERABLE
db2set DB2_HADR_PEER_WAIT_LIMIT=10
db2set DB2COMM=tcpip
db2set DB2RSHCMD=/usr/bin/ssh
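The 4x rule of thumb for the HADR buffer mentioned above can be worked out from the database's LOGBUFSZ setting. The sketch below assumes that `db2 get db cfg for sample` prints the log buffer on a line containing `(LOGBUFSZ) = <value>`; the function names are hypothetical, and the demo uses a hard-coded sample line so that it runs without a DB2 installation.

```shell
#!/bin/sh
# Read `db2 get db cfg` output on stdin and print the LOGBUFSZ value
# (assumes the value is the last field on the "(LOGBUFSZ)" line).
logbufsz_from_cfg() {
    awk '/\(LOGBUFSZ\)/ { print $NF }'
}

# Apply the "HADR buffer = 4 x log buffer" rule of thumb from the text.
hadr_buf_size() {
    echo $(( $1 * 4 ))
}

# Demo with a sample configuration line instead of a live database.
sample_cfg=' Log buffer size (4KB)                        (LOGBUFSZ) = 4096'
logbufsz=$(printf '%s\n' "$sample_cfg" | logbufsz_from_cfg)
echo "db2set DB2_HADR_BUF_SIZE=$(hadr_buf_size "$logbufsz")"
# prints: db2set DB2_HADR_BUF_SIZE=16384
```

On a real system the sample line would be replaced by piping `db2 get db cfg for sample` into `logbufsz_from_cfg`. A LOGBUFSZ of 4096 pages gives the 16384 used in this step.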
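2. Modify the parameters
#With LOGINDEXBUILD on, index creation, rebuild, and reorganization are fully logged, which may take more time and more log space. The index builds are then shipped to the standby and replayed there. If it is off, the indexes are marked invalid when the standby takes over.
db2 "UPDATE DB CFG FOR sample USING LOGINDEXBUILD ON"
#Rebuild invalid indexes after the takeover completes.
db2 "UPDATE DB CFG FOR sample USING INDEXREC RESTART"
Modify the instance parameter: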
db2 update dbm cfg using SVCENAME 50001
3. Back up the database:
db2 "backup db sample online to /home/db2admin"
4. Transfer the backup image to the standby node.
5. Restore the database
db2 "RESTORE DATABASE sample FROM "/home/db2admin" TAKEN AT 20170511172554 REPLACE HISTORY FILE WITHOUT PROMPTING"
[db2admin@cluster2 ~]$ db2 "RESTORE DATABASE sample FROM "/home/db2admin" TAKEN AT 20170511172554 REPLACE HISTORY FILE WITHOUT PROMPTING"
DB20000I The RESTORE DATABASE command completed successfully.
Or overwrite the existing database directly:
db2 "RESTORE DATABASE sample FROM "/home/db2admin" TAKEN AT 20170511172554 REPLACE EXISTING WITHOUT PROMPTING"
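The TAKEN AT value in the restore commands above has to match the timestamp of the backup image that was copied to the standby. As a sketch, the timestamp can be read from the image file name instead of being typed by hand. This assumes the DB2 9.5 naming scheme on Linux, in which the 14-digit timestamp is one of the dot-separated fields; the file name in the demo is a hypothetical example built around the timestamp used in this article, and `backup_timestamp` is an illustrative helper name.

```shell
#!/bin/sh
# Print the 14-digit timestamp field of a DB2 backup image file name.
backup_timestamp() {
    basename "$1" | awk -F. '{
        for (i = 1; i <= NF; i++)
            if ($i ~ /^[0-9]+$/ && length($i) == 14) { print $i; exit }
    }'
}

# Hypothetical image name; only the timestamp comes from the article.
image=/home/db2admin/SAMPLE.0.db2admin.NODE0000.CATN0000.20170511172554.001
ts=$(backup_timestamp "$image")

# Print the restore command rather than running it.
echo "db2 \"RESTORE DATABASE sample FROM /home/db2admin TAKEN AT $ts REPLACE HISTORY FILE WITHOUT PROMPTING\""
```

Printing the command first makes it easy to check the timestamp before running the restore on the standby.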
6. Configure automatic client reroute.
On the primary database server (cluster1):
db2 "UPDATE ALTERNATE SERVER FOR DATABASE sample USING HOSTNAME 192.31.65.22 PORT 50001" #Set the alternate server.
On the standby database server (cluster2):
db2 "UPDATE ALTERNATE SERVER FOR DATABASE sample USING HOSTNAME 192.31.65.21 PORT 50001" #Set the alternate server.
7. Configure the HADR services and listening ports
Edit the /etc/services file with vi (switch to the root user first) and add the following two lines:
DB2_HADR_1 55001/tcp
DB2_HADR_2 55002/tcp
Note: this step is optional, because the HADR_LOCAL_SVC and HADR_REMOTE_SVC database parameters configured below accept a port number in place of a service name.
8. Modify the HADR parameters
--On the primary database
db2 "UPDATE DB CFG FOR sample USING HADR_LOCAL_HOST cluster1"
db2 "UPDATE DB CFG FOR sample USING HADR_LOCAL_SVC DB2_HADR_1"
db2 "UPDATE DB CFG FOR sample USING HADR_REMOTE_HOST cluster2"
db2 "UPDATE DB CFG FOR sample USING HADR_REMOTE_SVC DB2_HADR_2"
db2 "UPDATE DB CFG FOR sample USING HADR_REMOTE_INST db2admin"
db2 "UPDATE DB CFG FOR sample USING HADR_SYNCMODE NEARSYNC"
db2 "UPDATE DB CFG FOR sample USING HADR_TIMEOUT 120"
db2 "CONNECT TO sample"
db2 "QUIESCE DATABASE IMMEDIATE FORCE CONNECTIONS"
db2 "UNQUIESCE DATABASE sample"
db2 "CONNECT RESET"
#SYNC (synchronous): a log write is considered successful only once the log records have also been written to the log files on the standby database.
#NEARSYNC (near-synchronous): a log write is considered successful once the log records have been written to the log files on the primary database and the primary has received an acknowledgement from the standby that it has received them in memory.
#ASYNC (asynchronous): a log write is considered successful once the log records have been handed to the TCP layer on the primary host, without waiting for a response from the standby.
#SUPERASYNC (super-asynchronous): a log write is considered successful as soon as it has been written to the primary's log files.
--On the standby database
db2 "UPDATE DB CFG FOR sample USING HADR_LOCAL_HOST cluster2"
db2 "UPDATE DB CFG FOR sample USING HADR_LOCAL_SVC DB2_HADR_2"
db2 "UPDATE DB CFG FOR sample USING HADR_REMOTE_HOST cluster1"
db2 "UPDATE DB CFG FOR sample USING HADR_REMOTE_SVC DB2_HADR_1"
db2 "UPDATE DB CFG FOR sample USING HADR_REMOTE_INST db2admin"
db2 "UPDATE DB CFG FOR sample USING HADR_SYNCMODE NEARSYNC"
db2 "UPDATE DB CFG FOR sample USING HADR_TIMEOUT 120"
9. Start HADR.
--First start HADR on the standby database:
db2 "DEACTIVATE DATABASE sample"
db2 "START HADR ON DATABASE sample AS STANDBY"
--Then start HADR on the primary database:
db2 "DEACTIVATE DATABASE sample"
db2 "START HADR ON DATABASE sample AS PRIMARY"
#The firewall must be turned off; otherwise the command fails with: SQL1768N Unable to start HADR. Reason code = "7".
[db2inst1@sg1 ~]$ db2 "start hadr on database sample as primary"
DB20000I The START HADR ON DATABASE command completed successfully.
10. Check the HADR status:
#Confirm that the state is Peer:
[db2inst1@sg1 ~]$ db2pd -d sample -hadr
Database Partition 0 -- Database sample -- Active -- Up 0 days 00:01:43 -- Date 2017-05-02-19.23.09.352187
HADR Information:
Role    State                SyncMode HeartBeatsMissed   LogGapRunAvg (bytes)
Primary Peer                 Nearsync 0                  0
ConnectStatus ConnectTime                           Timeout
Connected     Tue May 2 19:21:27 2017 (1493724087)  120
LocalHost                                LocalService
sg1                                      DB2_HADR_1
RemoteHost                               RemoteService      RemoteInstance
sg2                                      DB2_HADR_2         db2inst1
PrimaryFile  PrimaryPg  PrimaryLSN
S0000005.LOG 0          0x0000000002EE0000
StandByFile  StandByPg  StandByLSN
S0000005.LOG 0          0x0000000002EE0000
---------------------------------On the client, try connecting---------------------------------
#--Catalog the node
db2 catalog tcpip node cluster1 remote 192.31.65.21 server 50001
--Catalog the database
db2 "catalog db sample at node cluster1"
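The Peer check in step 10 can also be scripted, for example as a health check run after START HADR. The sketch below assumes the column-style `db2pd -hadr` output shown in this article, with the values on the line directly under the `Role State ...` header; other DB2 releases may print a different layout. The function names are hypothetical, and the demo feeds in sample text so that it runs without DB2.

```shell
#!/bin/sh
# Read `db2pd -d <db> -hadr` output on stdin and print "<Role> <State>".
hadr_state() {
    awk '$1 == "Role" && $2 == "State" { getline; print $1, $2; exit }'
}

# Succeed only if the State column reads "Peer".
hadr_in_peer() {
    [ "$(hadr_state | awk '{ print $2 }')" = "Peer" ]
}

# Demo with sample text in the format shown in this article.
sample_output='HADR Information:
Role    State                SyncMode HeartBeatsMissed   LogGapRunAvg (bytes)
Primary Peer                 Nearsync 0                  0'

printf '%s\n' "$sample_output" | hadr_state
if printf '%s\n' "$sample_output" | hadr_in_peer; then
    echo "HADR pair is in Peer state"
else
    echo "HADR pair is NOT in Peer state"
fi
```

On a live system the sample text would be replaced by piping `db2pd -d sample -hadr` into `hadr_in_peer`, on either the primary or the standby.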