Implementing Automatic GoldenGate Failover in a RAC Environment with XAG

 

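Overview

When configuring OGG in a RAC environment, two conditions must be met for OGG to fail over automatically to a healthy node when a RAC node fails:

1. The OGG checkpoint, trail, and BR (Bounded Recovery) files are placed on a shared cluster file system that every RAC node can access.

2. Cluster software must monitor the OGG processes and, when a failure occurs, restart OGG automatically on a healthy node (failover).

Oracle Grid Infrastructure Standalone Agents (XAG), combined with an Oracle-supported cluster file system, can provide automatic OGG failover. This article describes the configuration steps.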
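The first condition above can be sanity-checked by confirming that the OGG home sits under the shared mount point. The following is a minimal sketch, not part of the original procedure: the function name is_under_mount is hypothetical, and it only compares path strings.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if path $1 equals or lies under mount point $2.
# This is a string comparison only; it does not verify that the file system
# is actually mounted or shared between nodes.
is_under_mount() {
    local path="${1%/}" mount="${2%/}"
    case "$path/" in
        "$mount"/*) return 0 ;;
        *) return 1 ;;
    esac
}
```

With the paths used later in this article, a call such as is_under_mount /u01/app/grid/acfsmounts/data_vol1/ogg12 /u01/app/grid/acfsmounts/data_vol1 would succeed.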

Component and Version Requirements

To use XAG for automatic failover, the software versions must meet the following requirements:

clip_image002

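As for the cluster file system, Oracle's documentation recommends ACFS, DBFS, and OCFS. I believe other cluster file systems, such as the Veritas cluster file system, should also work.

The example in this article uses ACFS.

Test Environment Software Versions

Source database: 11.2.0.4 RAC (ASM)

Target database: 12.1.0.2 RAC (ASM)

GoldenGate: 12.2.0.1.1

Operating system: Oracle Enterprise Linux 6.5 (64-bit) on both source and target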

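Configuration Steps

Installing GI XAG

XAG must be downloaded separately from the Oracle website and installed. The download location is: http://www.oracle.com/technetwork/database/database-technologies/clusterware/downloads/index.html

The current version is 7, and the file is xagpack_7b.zip.

Unzip the file, then run xagsetup.sh as the GI installation user (usually "grid") to install it: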

 

[grid@rac1 xag]$ ./xagsetup.sh --install --directory /u01/app/grid/xaghome --all_nodes

Installing Oracle Grid Infrastructure Agents on: rac1

Installing Oracle Grid Infrastructure Agents on: rac2

Done.

 

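Install XAG on the target side as well, using the same method as on the source.

Creating ACFS on the Source (11.2)

To use ACFS with 11.2.0.4 on OEL, the PSU level must be 11.2.0.4.4 or later. The patching process is omitted here.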
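The PSU requirement above amounts to a dotted-version comparison. A minimal sketch follows, assuming GNU sort with the -V option is available; the function name version_ge is hypothetical, and how you obtain the installed PSU version string (for example from the OPatch inventory) is left to the reader.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if dotted version $1 is greater than or equal to $2.
# Relies on the version-sort option (-V) of GNU sort.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}
```

For example, version_ge 11.2.0.4.8 11.2.0.4.4 succeeds, while version_ge 11.2.0.4.2 11.2.0.4.4 fails.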

The disk group used for ACFS must have its COMPATIBLE.ASM and COMPATIBLE.ADVM attributes set to 11.2:

clip_image004

Create the ACFS volume with ASMCMD or ASMCA:

clip_image006

clip_image008

Create a general-purpose ACFS file system:

clip_image010

clip_image011

At this point the ACFS is not yet managed by CRS. You can view the ACFS information with the ASMCMD volinfo command or with /sbin/acfsutil registry:

ASMCMD> volinfo -a

Diskgroup Name: DATA

Volume Name: VOLOGG1

Volume Device: /dev/asm/vologg1-426

State: ENABLED

Size (MB): 3072

Resize Unit (MB): 32

Redundancy: UNPROT

Stripe Columns: 4

Stripe Width (K): 128

Usage: ACFS

Mountpath: /u01/app/grid/acfsmounts/data_vol1

[root@rac1 ~]# /sbin/acfsutil registry

Mount Object:

Device: /dev/asm/vologg1-426

Mount Point: /u01/app/grid/acfsmounts/data_vol1

Disk Group: DATA

Volume: VOLOGG1

Options: none

Nodes: all
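The volume device and mount path shown above are needed again when registering the file system with CRS. As a rough sketch, a field can be pulled out of volinfo-style output with awk; the function name volinfo_field is hypothetical, and the sketch assumes each field appears on its own "Name: value" line as in the listing above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "volinfo"-style output on stdin and prints the
# value of the field named in $1 (for example "Volume Device" or "Mountpath").
volinfo_field() {
    awk -v key="$1" '
        {
            line = $0
            sub(/^[ \t]+/, "", line)              # drop leading indentation
            if (index(line, key ":") == 1) {      # line starts with "<key>:"
                sub(/^[^:]*:[ \t]*/, "", line)    # keep only the value
                print line
                exit
            }
        }'
}
```

Assuming asmcmd accepts the command non-interactively in your environment, something like asmcmd volinfo -a | volinfo_field "Volume Device" would print the device path.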

 

Registering ACFS with CRS on the Source (11.2)

First, remove the entry for the ACFS we just created from the general-purpose ACFS registry:

[root@rac1 ~]# /sbin/acfsutil registry -d /u01/app/grid/acfsmounts/data_vol1

acfsutil registry: successfully removed ACFS mount point /u01/app/grid/acfsmounts/data_vol1 from Oracle Registry

Then register it as a CRS resource with the SRVCTL tool:

 

[root@rac1 ~]# /u01/app/11.2.0/grid/bin/srvctl add filesystem -d /dev/asm/vologg1-426 -v VOLOGG1 -g DATA -m /u01/app/grid/acfsmounts/data_vol1 -u grid

[root@rac1 ~]# /u01/app/11.2.0/grid/bin/crsctl status resource -t

--------------------------------------------------------------------------------

NAME TARGET STATE SERVER STATE_DETAILS

--------------------------------------------------------------------------------

Local Resources

--------------------------------------------------------------------------------

ora.DATA.dg

ONLINE ONLINE rac1

ONLINE ONLINE rac2

ora.LISTENER.lsnr

ONLINE ONLINE rac1

ONLINE ONLINE rac2

ora.asm

ONLINE ONLINE rac1 Started

ONLINE ONLINE rac2 Started

ora.data.vologg1.acfs

OFFLINE OFFLINE rac1

OFFLINE OFFLINE rac2

ora.gsd

OFFLINE OFFLINE rac1

OFFLINE OFFLINE rac2

ora.net1.network

ONLINE ONLINE rac1

ONLINE ONLINE rac2

ora.ons

ONLINE ONLINE rac1

ONLINE ONLINE rac2

--------------------------------------------------------------------------------

 

Start the resource manually (this mounts the ACFS):

 

[root@rac1 ~]# /u01/app/11.2.0/grid/bin/srvctl start filesystem -d /dev/asm/vologg1-426

[root@rac1 ~]#

[root@rac1 ~]# /u01/app/11.2.0/grid/bin/crsctl status resource -t

--------------------------------------------------------------------------------

NAME TARGET STATE SERVER STATE_DETAILS

--------------------------------------------------------------------------------

Local Resources

--------------------------------------------------------------------------------

ora.DATA.dg

ONLINE ONLINE rac1

ONLINE ONLINE rac2

ora.LISTENER.lsnr

ONLINE ONLINE rac1

ONLINE ONLINE rac2

ora.asm

ONLINE ONLINE rac1 Started

ONLINE ONLINE rac2 Started

ora.data.vologg1.acfs

ONLINE ONLINE rac1 mounted on /u01/app/grid/acfsmounts/data_vol1

ONLINE ONLINE rac2 mounted on /u01/app/grid/acfsmounts/data_vol1

 

[root@rac1 ~]# df -h

Filesystem Size Used Avail Use% Mounted on

/dev/mapper/vg_rac1-lv_root 45G 32G 12G 74% /

tmpfs 2.0G 437M 1.6G 23% /dev/shm

/dev/sda1 477M 55M 397M 13% /boot

/dev/asm/vologg1-426 3.0G 83M 3.0G 3% /u01/app/grid/acfsmounts/data_vol1

[root@rac2 ~]# df -h

Filesystem Size Used Avail Use% Mounted on

/dev/mapper/vg_rac1-lv_root 45G 25G 19G 58% /

tmpfs 2.0G 440M 1.6G 23% /dev/shm

/dev/sda1 477M 55M 397M 13% /boot

/dev/asm/vologg1-426 3.0G 83M 3.0G 3% /u01/app/grid/acfsmounts/data_vol1
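Instead of reading df output by eye on every node, the mount can be checked in a script. This is a minimal sketch assuming a Linux host where /proc/mounts lists mounted file systems; the function name is_mounted is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if mount point $1 appears in the mounts table.
# $2 names the table to read and defaults to /proc/mounts (Linux).
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}
```

Running is_mounted /u01/app/grid/acfsmounts/data_vol1 on rac1 and rac2 should succeed on both nodes once the resource has been started.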

 

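Creating and Registering ACFS on the Target (12.1)

The main difference between creating ACFS in 12c and in 11g is that the choice between a general-purpose file system and a database-home file system is gone. After creation, a script is generated that registers the file system with CRS.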

clip_image013

clip_image015

Run the generated script to complete registration and mounting:

[root@oel65vm11 scripts]# ./acfs_script.sh

ACFS file system /u01/app/grid/acfsmounts/ogg_vol1 is mounted on nodes oel65vm11,oel65vm12

Check the resource information:

[root@oel65vm11 bin]# ./crsctl status resource -t

--------------------------------------------------------------------------------

Name Target State Server State details

--------------------------------------------------------------------------------

Local Resources

--------------------------------------------------------------------------------

ora.DATA.VOLOGG2.advm

ONLINE ONLINE oel65vm11 STABLE

ONLINE ONLINE oel65vm12 STABLE

ora.DATA.dg

ONLINE ONLINE oel65vm11 STABLE

ONLINE ONLINE oel65vm12 STABLE

ora.LISTENER.lsnr

ONLINE ONLINE oel65vm11 STABLE

ONLINE ONLINE oel65vm12 STABLE

ora.asm

ONLINE ONLINE oel65vm11 Started,STABLE

ONLINE ONLINE oel65vm12 Started,STABLE

ora.data.vologg2.acfs

ONLINE ONLINE oel65vm11 mounted on /u01/app/grid/acfsmounts/ogg_vol1,STABLE

ONLINE ONLINE oel65vm12 mounted on /u01/app/grid/acfsmounts/ogg_vol1,STABLE

ora.net1.network

ONLINE ONLINE oel65vm11 STABLE

ONLINE ONLINE oel65vm12 STABLE

ora.ons

ONLINE ONLINE oel65vm11 STABLE

ONLINE ONLINE oel65vm12 STABLE

Note: SELinux must be turned off on all nodes; otherwise ACFS write-permission errors will occur.
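The SELinux requirement above can be verified per node before installing OGG. The sketch below is an assumption-laden convenience, not part of the original procedure: the function name selinux_is_off is hypothetical, and it treats only the "Disabled" mode reported by getenforce as satisfying the requirement.

```shell
#!/usr/bin/env bash
# Hypothetical helper: takes the mode string printed by getenforce
# (Enforcing, Permissive, or Disabled) and succeeds only for "Disabled",
# which is the state this article asks for.
selinux_is_off() {
    [ "$1" = "Disabled" ]
}
```

A typical call would be selinux_is_off "$(getenforce)" || echo "SELinux is still enabled on $(hostname)".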

 

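Installing Oracle GoldenGate

This version of OGG supports both 11g and 12c databases. In the graphical installer, you select the OGG build that matches your database version.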

clip_image017

Install OGG on the ACFS created earlier:

clip_image019

Source installation location: /u01/app/grid/acfsmounts/data_vol1/ogg12

Target installation location: /u01/app/grid/acfsmounts/ogg_vol1/ogg12

Choose to start the Manager process automatically.

 

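Database Preparation

• Switch the source database to archivelog mode (steps omitted).

• Enable the required logging on the source database and set the related parameter: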

SQL> ALTER DATABASE ADD SUPPLEMENTAL LOG DATA;

Database altered.

SQL> ALTER DATABASE FORCE LOGGING;

Database altered.

SQL> SELECT supplemental_log_data_min, force_logging FROM v$database;

SUPPLEME FORCE_LOGGING

-------- ---------------------------------------

YES YES

SQL> ALTER SYSTEM SWITCH LOGFILE;

System altered.

SQL> alter system set ENABLE_GOLDENGATE_REPLICATION=true;

System altered.

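• Create the OGG database user on both source and target and grant it privileges. In my example the user is GGADM.

The privileges the OGG user needs are described in section 4.1.4.1, Oracle 11.2.0.4 or Later Database Privileges, of the online document Installing and Configuring Oracle GoldenGate for Oracle Database 12c (12.2.0.1). For convenience in this test, the user is granted the DBA role, plus the privileges granted through the dedicated system package: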

SQL> BEGIN
dbms_goldengate_auth.grant_admin_privilege
(
grantee => 'GGADM',
privilege_type => 'CAPTURE',
grant_select_privileges => TRUE
);
END;
/

PL/SQL procedure successfully completed.

Source OGG Configuration

• Log in to the database:

GGSCI (rac1.hthorizontest.com) 1> dblogin userid ggadm password ggadm

Successfully logged into database.

• Register the integrated extract:

GGSCI (rac1.hthorizontest.com as ggadm@tdb1) 2> register extract ext1 database;

2016-04-07 23:44:38 INFO OGG-02003 Extract EXT1 successfully registered with database at SCN 1291634.

• Add the extract process:

GGSCI (rac1.hthorizontest.com as ggadm@tdb1) 3> ADD EXTRACT ext1 INTEGRATED TRANLOG, BEGIN NOW

EXTRACT (Integrated) added.

GGSCI (rac1.hthorizontest.com as ggadm@tdb1) 4> ADD EXTTRAIL /u01/app/grid/acfsmounts/data_vol1/ogg12/dirdat/et, EXTRACT ext1

EXTTRAIL added.

• Add the data pump process:

GGSCI (rac1.hthorizontest.com as ggadm@tdb1) 5> ADD EXTRACT pump1 EXTTRAILSOURCE /u01/app/grid/acfsmounts/data_vol1/ogg12/dirdat/et

EXTRACT added.

GGSCI (rac1.hthorizontest.com as ggadm@tdb1) 6>EDIT PARAMS EXT1

Add the following:

EXTRACT ext1

USERID ggadm, PASSWORD ggadm

TRANLOGOPTIONS INTEGRATED PARAMS (MAX_SGA_SIZE 100)

EXTTRAIL /u01/app/grid/acfsmounts/data_vol1/ogg12/dirdat/et

TABLE test.*;

GGSCI (rac1.hthorizontest.com as ggadm@tdb1) 7>EDIT PARAMS PUMP1

Add the following:

EXTRACT pump1

USERID ggadm, PASSWORD ggadm

RMTHOST 192.168.0.11, MGRPORT 7809

RMTTRAIL /u01/app/grid/acfsmounts/ogg_vol1/ogg12/dirdat/rt

TABLE TEST.*;

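Then start all processes.

On 11.2.0.4, when integrated capture mode is used, starting the extract process reports that patch 17030189 must be installed, mainly because integrated capture requires changes to data dictionary tables.

However, once a PSU has been applied, this patch sometimes conflicts with other patches. In that case you can run prvtlmpg.plb manually to resolve the problem.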

(EXTRACT Abending With OGG-02912 (Doc ID 2091679.1))

Target OGG Configuration

GGSCI (oel65vm11.hthorizon.com) 8> dblogin userid ggadm password ggadm

Successfully logged into database.

GGSCI (oel65vm11.hthorizon.com as ggadm@racdb1) 9>ADD CHECKPOINTTABLE ggadm.checkpointtab

Successfully created checkpoint table ggadm.checkpointtab

GGSCI (oel65vm11.hthorizon.com as ggadm@racdb1) 10> ADD REPLICAT rep1, EXTTRAIL /u01/app/grid/acfsmounts/ogg_vol1/ogg12/dirdat/rt checkpointtable ggadm.checkpointtab

REPLICAT added.

GGSCI (oel65vm11.hthorizon.com as ggadm@racdb1) 11>EDIT PARAMS REP1

Add the following:

REPLICAT rep1

USERID ggadm, PASSWORD ggadm

ASSUMETARGETDEFS

DISCARDFILE /u01/app/grid/acfsmounts/ogg_vol1/ogg12/dirdat/rt, PURGE

MAP TEST.* TARGET TEST.*;

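Then start the process and verify that OGG replication works correctly.

Modifying the OGG MGR Parameters

So that the OGG Manager process can start the replication processes automatically, add the following to the Manager parameter file: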

AUTORESTART ER *, RETRIES 5, WAITMINUTES 1, RESETMINUTES 60

AUTOSTART ER *

Restart the Manager process for the change to take effect.

Make this change on both source and target.
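Because the same Manager change must be made on both sides, a quick check that neither parameter file was missed can help. The sketch below assumes the Manager parameter file is the usual dirprm/mgr.prm under the OGG home; the function name mgr_autostart_configured is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the Manager parameter file $1 contains both
# an AUTOSTART and an AUTORESTART entry for ER groups (case-insensitive).
mgr_autostart_configured() {
    local prm="$1"
    grep -Eiq '^[[:space:]]*AUTOSTART[[:space:]]+ER[[:space:]]' "$prm" &&
    grep -Eiq '^[[:space:]]*AUTORESTART[[:space:]]+ER[[:space:]]' "$prm"
}
```

On the source in this article, the call would be mgr_autostart_configured /u01/app/grid/acfsmounts/data_vol1/ogg12/dirprm/mgr.prm, assuming that file location.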

Configuring XAG on the Source

• Add the application VIP (as root):

[root@rac1 ~]# /u01/app/11.2.0/grid/bin/appvipcfg create -network=1 -ip=192.168.0.36 -vipname=xag.gg_1-vip.vip -user=oracle

• Allow the grid user to start the resource (as root):

[root@rac1 ~]# /u01/app/11.2.0/grid/bin/crsctl setperm resource xag.gg_1-vip.vip -u user:grid:r-x

• Start the VIP (as grid):

[root@rac1 ~]# su - grid

[grid@rac1 ~]$ /u01/app/11.2.0/grid/bin/crsctl start resource xag.gg_1-vip.vip

CRS-2672: Attempting to start 'xag.gg_1-vip.vip' on 'rac1'

CRS-2676: Start of 'xag.gg_1-vip.vip' on 'rac1' succeeded

• Check the status:

[grid@rac1 ~]$ crsctl status resource xag.gg_1-vip.vip

NAME=xag.gg_1-vip.vip

TYPE=app.appvip_net1.type

TARGET=ONLINE

STATE=ONLINE on rac1

• Create the CRS resource for OGG (as root):

[root@rac1 bin]# /u01/app/grid/xaghome/bin/agctl add goldengate gg_1 --gg_home /u01/app/grid/acfsmounts/data_vol1/ogg12 --instance_type source --nodes rac1,rac2 --vip_name xag.gg_1-vip.vip --filesystems ora.data.vologg1.acfs --databases ora.tdb.db --oracle_home /u01/app/oracle/product/11.2.0/dbhome_1 --monitor_extracts ext1,pump1

[root@rac1 ~]# cd /u01/app/grid/xaghome/bin

[root@rac1 bin]# ./agctl status goldengate gg_1

Goldengate instance 'gg_1' is not running

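• Grant grid permission to start the resource:

When the command above completes, a CRS resource for OGG is created automatically. The grid user must be granted permission to manage it:

[root@rac1 bin]# /u01/app/11.2.0/grid/bin/crsctl setperm resource xag.gg_1.goldengate -u user:grid:r-x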
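The NAME=/STATE= output of crsctl status resource, as shown in the VIP status check earlier in this section, is easy to consume in a script. The following is a minimal sketch; the function name crs_online_node is hypothetical, and it is intended for single-instance resources such as the VIP or the GoldenGate resource, where STATE has the form "ONLINE on <node>".

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "crsctl status resource <name>" output on stdin
# and prints the node from a line such as "STATE=ONLINE on rac1".
# Prints nothing and fails if no such ONLINE state line is found.
crs_online_node() {
    awk -F'=' '
        $1 == "STATE" && $2 ~ /^ONLINE on / {
            sub(/^ONLINE on /, "", $2)   # strip the prefix, leaving the node name
            print $2
            found = 1
            exit
        }
        END { exit !found }'
}
```

For example, crsctl status resource xag.gg_1-vip.vip | crs_online_node would print the node currently hosting the VIP.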

Configuring XAG on the Target

The process is similar to that on the source.

• Create the VIP resource:

[root@oel65vm11 ~]# /u01/app/12.1.0/grid/bin/appvipcfg create -network=1 -ip=192.168.0.26 -vipname=xag.gg_1-vip.vip -user=oracle

[root@oel65vm11 ~]# /u01/app/12.1.0/grid/bin/crsctl setperm resource xag.gg_1-vip.vip -u user:grid:r-x

[root@oel65vm11 ~]# /u01/app/12.1.0/grid/bin/crsctl start resource xag.gg_1-vip.vip

CRS-2672: Attempting to start 'xag.gg_1-vip.vip' on 'oel65vm12'

CRS-2676: Start of 'xag.gg_1-vip.vip' on 'oel65vm12' succeeded

[root@oel65vm11 ~]# /u01/app/12.1.0/grid/bin/crsctl relocate resource xag.gg_1-vip.vip -n oel65vm11

CRS-2673: Attempting to stop 'xag.gg_1-vip.vip' on 'oel65vm12'

CRS-2677: Stop of 'xag.gg_1-vip.vip' on 'oel65vm12' succeeded

CRS-2672: Attempting to start 'xag.gg_1-vip.vip' on 'oel65vm11'

CRS-2676: Start of 'xag.gg_1-vip.vip' on 'oel65vm11' succeeded

• Create the CRS resource for OGG:

[root@oel65vm11 bin]# /u01/app/grid/xaghome/bin/agctl add goldengate gg_2 --gg_home /u01/app/grid/acfsmounts/ogg_vol1/ogg12 --instance_type target --nodes oel65vm11,oel65vm12 --vip_name xag.gg_1-vip.vip --filesystems ora.data.vologg2.acfs --databases ora.racdb.db --oracle_home /u01/app/oracle/product/12.1.0/dbhome_1 --monitor_replicats rep1

• Grant permission:

[root@oel65vm11 bin]# /u01/app/12.1.0/grid/bin/crsctl setperm resource xag.gg_2.goldengate -u user:grid:r-x
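The source and target agctl add goldengate commands differ mainly in the instance type and in which processes are monitored. As an illustration only, the sketch below prints (does not run) the command line from its arguments, using just the options that appear in this article; the function name agctl_add_cmd is hypothetical, and it covers only the source and target instance types used here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints an "agctl add goldengate" command line built from
# its arguments. It only echoes the command so it can be reviewed before being
# run as root. Argument order:
#   name gg_home instance_type nodes vip_name filesystem database oracle_home processes
agctl_add_cmd() {
    local name="$1" gg_home="$2" type="$3" nodes="$4" vip="$5"
    local fs="$6" db="$7" ora_home="$8" procs="$9"
    local monitor_flag
    case "$type" in
        source) monitor_flag="--monitor_extracts" ;;    # extract and pump groups
        target) monitor_flag="--monitor_replicats" ;;   # replicat groups
        *) echo "unknown instance type: $type" >&2; return 1 ;;
    esac
    printf 'agctl add goldengate %s --gg_home %s --instance_type %s --nodes %s --vip_name %s --filesystems %s --databases %s --oracle_home %s %s %s\n' \
        "$name" "$gg_home" "$type" "$nodes" "$vip" "$fs" "$db" "$ora_home" "$monitor_flag" "$procs"
}
```

Printing the command first makes it easier to compare the source and target definitions side by side before creating the resources.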

Modifying the PUMP Process

Change the remote host address in the PUMP parameter file to the target-side VIP we just created:

RMTHOST 192.168.0.26, MGRPORT 7809

Restart the PUMP process.
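The RMTHOST change can also be scripted rather than made in an editor. This is a minimal sketch assuming GNU sed and a parameter file whose RMTHOST line has the form used in this article; the function name set_rmthost is hypothetical, and the new host value must not contain a slash.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrites the host on the RMTHOST line of parameter file $1
# to $2, leaving the rest of the line (for example ", MGRPORT 7809") untouched.
# Uses GNU sed in-place editing; back up the file first if in doubt.
set_rmthost() {
    local prm="$1" new_host="$2"
    sed -i -E "s/^([[:space:]]*RMTHOST[[:space:]]+)[^,[:space:]]+/\1${new_host}/" "$prm"
}
```

For the pump in this article, the call would be along the lines of set_rmthost dirprm/pump1.prm 192.168.0.26, assuming the parameter file lives in dirprm under the OGG home.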

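Starting the CRS OGG Resources

Enter the GGSCI command line and stop all processes on both the source and the target.

• Start the target resource: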

[grid@oel65vm11 ~]$ cd $ORACLE_BASE

[grid@oel65vm11 grid]$ cd xaghome/bin

[grid@oel65vm11 bin]$ ./agctl start goldengate gg_2 --node oel65vm11

[grid@oel65vm11 bin]$ crsctl status resource xag.gg_2.goldengate

NAME=xag.gg_2.goldengate

TYPE=xag.goldengate.type

TARGET=ONLINE

STATE=ONLINE on oel65vm11

• Start the source resource:

[grid@rac1 bin]$ cd $ORACLE_BASE

[grid@rac1 grid]$ cd xaghome/bin

[grid@rac1 bin]$ ./agctl start goldengate gg_1 --node rac1

[grid@rac1 bin]$ crsctl status resource xag.gg_1.goldengate

NAME=xag.gg_1.goldengate

TYPE=xag.goldengate.type

TARGET=ONLINE

STATE=ONLINE on rac1

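After startup, enter the GGSCI command line and check the process status. If all processes have started automatically, the configuration is correct.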
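The process status check above can be automated by scanning the output of the GGSCI info all command. The sketch below assumes the usual layout, with the program type in the first column and the status in the second; the function name all_ogg_processes_running is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads GGSCI "info all" output on stdin and succeeds only
# if at least one MANAGER/EXTRACT/REPLICAT line is present and every such line
# reports the status RUNNING.
all_ogg_processes_running() {
    awk '
        $1 ~ /^(MANAGER|EXTRACT|REPLICAT)$/ {
            seen = 1
            if ($2 != "RUNNING") bad = 1
        }
        END { exit ((seen && !bad) ? 0 : 1) }'
}
```

Assuming ggsci reads commands from standard input in your environment, a call might look like echo "info all" | ./ggsci | all_ogg_processes_running.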

Switchover Test

Test a source-side relocation with the following command:

[grid@rac1 bin]$ ./agctl relocate goldengate gg_1 --node rac2

[grid@rac1 bin]$ crsctl status resource -t

......

--------------------------------------------------------------------------------

Cluster Resources

--------------------------------------------------------------------------------

......

xag.gg_1-vip.vip

1 ONLINE ONLINE rac2

xag.gg_1.goldengate

1 ONLINE ONLINE rac2

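Next, a power-loss test: we shut down the target host oel65vm11 by cutting its power.

On host oel65vm12, you can see that the RAC VIP has failed over to this node, and that the OGG VIP and the resource for gg_2 have also failed over to it automatically: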

[grid@oel65vm12 ~]$ crsctl status resource -t

......

--------------------------------------------------------------------------------

Cluster Resources

--------------------------------------------------------------------------------

......

ora.oel65vm11.vip

1 ONLINE INTERMEDIATE oel65vm12 FAILED OVER,STABLE

ora.oel65vm12.vip

1 ONLINE ONLINE oel65vm12 STABLE

ora.racdb.db

1 ONLINE OFFLINE STABLE

2 ONLINE ONLINE oel65vm12 Open,STABLE

ora.scan1.vip

1 ONLINE ONLINE oel65vm12 STABLE

xag.gg_1-vip.vip

1 ONLINE ONLINE oel65vm12 STABLE

xag.gg_2.goldengate

1 ONLINE ONLINE oel65vm12 STABLE
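To confirm a failover without scanning the whole table, the server hosting one resource can be extracted from the tabular output above. This is a rough sketch that assumes the layout shown in this article (a line holding only the resource name, followed by an instance row of the form "1 ONLINE ONLINE <server>"); the function name resource_server is hypothetical, and it inspects only the first instance row.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "crsctl status resource -t" output on stdin and
# prints the server of the first instance row of resource $1, provided that
# both TARGET and STATE are ONLINE. Fails otherwise.
resource_server() {
    awk -v res="$1" '
        NF == 1 && $1 == res { in_res = 1; next }   # line naming the wanted resource
        in_res && NF == 0 { next }                  # skip blank separator lines
        in_res {
            if ($2 == "ONLINE" && $3 == "ONLINE") { print $4; found = 1 }
            exit
        }
        END { exit !found }'
}
```

After the power-off test, crsctl status resource -t | resource_server xag.gg_2.goldengate would be expected to print oel65vm12.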

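The above is only the simplest possible example and does not account for more complex situations, such as a monitoring jagent deployed alongside OGG, or downstream replication. Real production environments are often far more complex than this example.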
