How to Modify Private Network Information in an Oracle Cluster Environment (Doc ID 2103317.1)


 

How to Modify Private Network Information in Oracle Clusterware (Doc ID 283684.1)

In this Document


Goal

Solution
  Case I. Changing private hostname
  Case II. Changing private IP only without changing network interface, subnet and netmask
              or changing private IP MAC address only without changing anything else
  Case III. Changing private network MTU only
  Case IV. Changing private network interface name, subnet or netmask
  A. For pre-11gR2 Oracle Clusterware
  B. For 11gR2 Oracle Clusterware and 12c Cluster without Flex ASM
  C. For 12c Oracle Clusterware with Flex ASM
  Something to note for 11gR2+
  Notes for Windows Systems
  Ramifications of Changing Interface Names Using oifcfg
  Oifcfg Usage
  Case V. Add or remove cluster_interconnect for 11gR2 and above with HAIP

References


APPLIES TO:

Oracle Database - Enterprise Edition - Version 10.1.0.2 to 12.2.0.1 [Release 10.1 to 12.2]  
Information in this document applies to any platform.  

GOAL

The purpose of this note is to describe how to change or update the private network (cluster_interconnect) information in Oracle Clusterware. 

It may be necessary to change or update interface names, or the subnet associated with an interface, if there is a network change affecting the servers or if the information entered during the installation was incorrect. It may also be the case that, for some reason, the Oracle Interface Configuration Assistant ('oifcfg') did not succeed during the installation.

Please refer to Note 276434.1 for modifying public network and VIP associated information
and refer to Note 1386709.1 for basics of IPv4 subnet and Oracle Clusterware.

Note: for Oracle Engineered system (Exadata) and Oracle Database Appliance (ODA), please do not make such changes following this note.  

 

SOLUTION

Network information (interface, subnet and role of each interface) for Oracle Clusterware is managed by 'oifcfg'. The actual IP address of each interface is not managed by 'oifcfg', so 'oifcfg' cannot update IP address information. 'oifcfg getif' shows the interfaces currently configured in OCR:

% $CRS_HOME/bin/oifcfg getif 
eth0 10.2.156.0 global public 
eth1 192.168.0.0 global cluster_interconnect  

On Unix/Linux systems, the interface names are generally assigned by the OS, and standard names vary by platform. For Windows systems, see the additional notes below. The example above shows that interface eth0 is currently used for the public network with subnet 10.2.156.0, and eth1 for cluster_interconnect (private) with subnet 192.168.0.0.
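The getif output can also be read by a script. The following is a minimal sketch, assuming the four-column layout shown above (interface, subnet, scope, roles) and interface names without spaces. `private_interfaces` is a hypothetical helper name, not part of oifcfg.

```shell
#!/bin/sh
# Hypothetical helper: print "<interface> <subnet>" for every entry whose
# role list contains cluster_interconnect.
# Input: the text produced by `oifcfg getif`, on stdin.
private_interfaces() {
    awk '{
        n = split($4, roles, ",")
        for (i = 1; i <= n; i++)
            if (roles[i] == "cluster_interconnect") { print $1, $2; break }
    }'
}

# Demo using the sample output shown above (no cluster required).
private_interfaces <<'EOF'
eth0 10.2.156.0 global public
eth1 192.168.0.0 global cluster_interconnect
EOF
# prints: eth1 192.168.0.0
```

On a live cluster the same helper could be fed with `$CRS_HOME/bin/oifcfg getif | private_interfaces`. Splitting the role column on commas means an entry such as `cluster_interconnect,asm` is also matched.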

The 'public' network is for database client communication (the VIP uses the same network, though it is stored in OCR as a separate entry), whereas the 'cluster_interconnect' network is for RDBMS/ASM cache fusion. Starting with 11gR2, cluster_interconnect is also used for clusterware heartbeats. This is a significant change compared to prior releases, because pre-11gR2 uses the private node name specified at installation time for clusterware heartbeats.

If the subnet or interface name for 'cluster_interconnect' interface is incorrect, it needs to be changed as crs/grid user.

Case I. Changing private hostname

In pre-11.2 Oracle Clusterware, the private hostname is recorded in OCR and cannot be updated. Generally the private hostname does not need to change; its associated IP can be changed. The only way to change the private hostname is to delete and re-add the nodes, or to reinstall Oracle Clusterware.

In 11.2 Grid Infrastructure, private hostname is no longer recorded in OCR and there is no dependency on the private hostname. It can be changed freely in /etc/hosts.

Case II. Changing private IP only without changing network interface, subnet and netmask
              or changing private IP MAC address only without changing anything else

For example, the private IP is changed from 192.168.1.10 to 192.168.1.21 while the network interface name and subnet remain the same, or the MAC address is changed while the private IP address, interface name, subnet and network all remain the same.

Simply shut down the Oracle Clusterware stack on the node where the change is required, make the IP or MAC modification for the private network at the OS layer as required (e.g. /etc/hosts, OS network config), then restart the Oracle Clusterware stack to complete the task.

Case III. Changing private network MTU only

For example, private network MTU is changed from 1500 to 9000 (enable jumbo frame), network interface name and subnet remain the same.

1. Shutdown Oracle Clusterware stack on all nodes
2. Make the required MTU change at the OS network layer. Ensure the private network is available with the desired MTU size and that ping with the desired MTU size works on all cluster nodes
3. Restart Oracle Clusterware stack on all nodes
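For step 2, one way to ping "with the desired MTU size" on Linux is to send an unfragmentable ICMP packet that exactly fills one frame. The sketch below is an illustration under assumptions: the function names are hypothetical, the network is IPv4, and `-M do` and `-s` are Linux iputils ping options that differ on other platforms.

```shell
#!/bin/sh
# ICMP payload that exactly fills one frame:
# MTU minus the 20-byte IPv4 header minus the 8-byte ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

# Hypothetical helper: ping a peer once with fragmentation prohibited, so the
# ping succeeds only if every hop on the path accepts the full frame size.
# Assumes Linux iputils ping.
check_jumbo_path() {
    peer=$1
    mtu=$2
    ping -c 1 -M do -s "$(icmp_payload_for_mtu "$mtu")" "$peer"
}

icmp_payload_for_mtu 9000   # prints 8972
```

A call such as `check_jumbo_path <private hostname> 9000` would then be repeated from each node towards every other node.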

Case IV. Changing private network interface name, subnet or netmask

Note: When the netmask is changed but the subnet ID doesn't change, for example:
The netmask is changed from 255.255.0.0 to 255.255.255.0 with a private IP like 192.168.0.x, so the subnet ID remains 192.168.0.0 and the network interface name is not changed.
Shut down the Oracle Clusterware stack on all cluster nodes where the change is required, make the IP modification for the private network at the OS layer (e.g. OS network config), then restart the Oracle Clusterware stack on all nodes to complete the task. Please note that this change cannot be done in a rolling manner.

When the netmask is changed, the associated subnet ID often changes as well. Oracle stores only the network interface name and subnet ID in OCR, not the netmask. The oifcfg command can be used for such a change; it only needs to be run on one of the cluster nodes, not all.
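Whether a netmask change also changes the subnet ID can be checked by ANDing the address with the mask, octet by octet. The sketch below assumes IPv4 dotted-quad input; `subnet_id` is a hypothetical helper name used only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: subnet ID = IPv4 address AND netmask, per octet.
subnet_id() {
    ip=$1
    mask=$2
    old_ifs=$IFS
    IFS=.
    # Split both dotted quads on "." into eight positional parameters.
    set -- $ip $mask
    IFS=$old_ifs
    echo "$(( $1 & $5 )).$(( $2 & $6 )).$(( $3 & $7 )).$(( $4 & $8 ))"
}

# Netmask changes, subnet ID does not (the case described in the note above).
subnet_id 192.168.0.5 255.255.0.0       # prints 192.168.0.0
subnet_id 192.168.0.5 255.255.255.0     # prints 192.168.0.0

# Netmask changes and the subnet ID changes with it, so oifcfg is needed.
subnet_id 192.168.5.10 255.255.0.0      # prints 192.168.0.0
subnet_id 192.168.5.10 255.255.255.0    # prints 192.168.5.0
```

The result is the value to pass as the subnet in `oifcfg setif`, and it should agree with what `oifcfg iflist` reports for the interface.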

A. For pre-11gR2 Oracle Clusterware

1. Use oifcfg to add the new private network information, delete the old private network information:

% $ORA_CRS_HOME/bin/oifcfg setif -global <if_name>/<subnet>:cluster_interconnect
% $ORA_CRS_HOME/bin/oifcfg delif -global <if_name>[/<subnet>]

For example:
% $ORA_CRS_HOME/bin/oifcfg setif -global eth3/192.168.2.0:cluster_interconnect
% $ORA_CRS_HOME/bin/oifcfg delif -global eth1/192.168.1.0

To verify the change
% $ORA_CRS_HOME/bin/oifcfg getif   
eth0 10.2.166.0 global public 
eth3 192.168.2.0 global cluster_interconnect  

2. Shutdown Oracle Clusterware stack

As root user: # crsctl stop crs

3. Make required network change at OS level, /etc/hosts file should be modified on all nodes to reflect the change.
Ensure the new network is available on all cluster nodes:

% ping <private hostname/IP>
% ifconfig -a  on Unix/Linux 
or 
% ipconfig /all on windows  

4. Restart the Oracle Clusterware stack

As root user: # crsctl start crs

Note: If running OCFS2 on Linux, you may also need to change the private IP address that OCFS2 uses to communicate with other nodes. For more information, please refer to Note 604958.1.

 

B. For 11gR2 Oracle Clusterware and 12c Cluster without Flex ASM

As of 11.2 Grid Infrastructure, the private network configuration is not only stored in OCR but also in the gpnp profile.  If the private network is not available or its definition is incorrect, the CRSD process will not start and any subsequent changes to the OCR will be impossible. Therefore care needs to be taken when making modifications to the configuration of the private network. It is important to perform the changes in the correct order. Please also note that manual modification of gpnp profile is not supported.

Please take a backup of profile.xml on all cluster nodes before proceeding, as grid user:
$ cd $GRID_HOME/gpnp/<hostname>/profiles/peer/
$ cp -p profile.xml profile.xml.bk  


1. Ensure Oracle Clusterware is running on ALL cluster nodes in the cluster

2. As grid user:

Get the existing information. For example:

$ oifcfg getif
eth1 100.17.10.0 global public
eth0 192.168.0.0 global cluster_interconnect  


Add the new cluster_interconnect information:

$ oifcfg setif -global <interface>/<subnet>:cluster_interconnect

For example:
a. add a new interface bond0 with the same subnet
$ oifcfg setif -global bond0/192.168.0.0:cluster_interconnect

b. add a new subnet, either on the existing interface name or on a new interface name
$ oifcfg setif -global eth0/192.65.0.0:cluster_interconnect
or
$ oifcfg setif -global eth3/192.168.1.96:cluster_interconnect  

 

Note:

1. The -global option can be used even if the interface is not available yet. Do not use the -node option when the interface is not available, because it will lead to node eviction.

2. If the interface is available on the server, subnet address can be identified by command:
$ oifcfg iflist

It lists the network interface and its subnet address. This command can be run even if Oracle Clusterware is not running. Please note, subnet address might not be in the format of x.y.z.0, it can be x.y.z.24, x.y.z.64 or x.y.z.128 etc. For example,
$ oifcfg iflist 
lan1 18.1.2.0
lan2 10.2.3.64        << this is the private network subnet address associated with private network IP: 10.2.3.86

3. If you are adding a second private network rather than replacing the existing one, ensure the MTU size of both interfaces is the same; otherwise instance startup will report the error:
ORA-27504: IPC error creating OSD context
ORA-27300: OS system dependent operation:if MTU failed with status: 0
ORA-27301: OS failure message: Error 0
ORA-27302: failure occurred at: skgxpcini2
ORA-27303: additional information: requested interface lan1:801 has a different MTU (1500) than lan3:801 (9000), which is not supported. Check output from ifconfig command

4. For 11gR2 and higher, it is not recommended to set cluster_interconnects in the ASM or Database spfile or pfile. If this parameter is set for any reason, it must be changed to the new private IP in the spfile or pfile prior to clusterware shutdown; otherwise the restart will fail due to the interconnect mismatch.
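The MTU requirement in item 3 can be checked before a second private interface is registered. The sketch below assumes Linux, where each interface exposes its MTU in /sys/class/net/<interface>/mtu. The function names and the optional sysfs-root argument are hypothetical and exist only for this illustration.

```shell
#!/bin/sh
# Read an interface's MTU from Linux sysfs. The optional second argument
# overrides the sysfs root (an assumption made here so the helper can be
# exercised without real interfaces).
iface_mtu() {
    cat "${2:-/sys/class/net}/$1/mtu"
}

# Return 0 when both interfaces report the same MTU, 1 when they differ,
# and 2 when either MTU cannot be read.
same_mtu() {
    a=$(iface_mtu "$1" "$3") || return 2
    b=$(iface_mtu "$2" "$3") || return 2
    [ "$a" = "$b" ]
}
```

For example, `same_mtu eth0 eth3 || echo "MTU mismatch or unreadable"` would be run on every node before the new interface is added with oifcfg setif.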


Verify the change:

$ oifcfg getif  


3. Shutdown Oracle Clusterware on all nodes and disable the Oracle Clusterware as root user:

# crsctl stop crs
# crsctl disable crs  


4. Make the network configuration change at OS level as required, ensure the new interface is available on all nodes after the change.

$ ifconfig -a
$ ping <private hostname>  


5. Enable Oracle Clusterware and restart Oracle Clusterware on all nodes as root user:

# crsctl enable crs
# crsctl start crs  


6. Remove the old interface if required:

$ oifcfg delif -global <if_name>[/<subnet>]
eg:
$ oifcfg delif -global eth0/192.168.0.0  
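Because the order of the steps matters, it can help to write the full sequence down before touching the cluster. The sketch below only prints the commands from the steps above in order and executes nothing. `print_change_plan` is a hypothetical name, and the interface and subnet values are placeholders taken from the examples in this section.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) the section B command sequence for
# replacing one private interface. Arguments: new interface, new subnet,
# old interface, old subnet.
print_change_plan() {
    new_if=$1
    new_subnet=$2
    old_if=$3
    old_subnet=$4
    cat <<EOF
# as grid user, on one node, with Oracle Clusterware up on ALL nodes
oifcfg setif -global $new_if/$new_subnet:cluster_interconnect
oifcfg getif
# as root, on every node
crsctl stop crs
crsctl disable crs
# change the OS network configuration; verify with ifconfig -a and ping
# as root, on every node
crsctl enable crs
crsctl start crs
# as grid user, on one node, once the cluster is running again
oifcfg delif -global $old_if/$old_subnet
EOF
}

print_change_plan eth3 192.168.1.96 eth0 192.168.0.0
```

Printing rather than executing keeps the privileged steps (root versus grid user) under manual control, which matches how the procedure is described above.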

 

C. For 12c Oracle Clusterware with Flex ASM

Please review above section B and pay attention to the Note section, take a backup as follows:

Please take a backup of profile.xml on all cluster nodes before proceeding, as grid user:
$ cd $GRID_HOME/gpnp/<hostname>/profiles/peer/
$ cp -p profile.xml profile.xml.bk  

1. Ensure Oracle Clusterware is running on ALL cluster nodes in the cluster

2. As grid user:

Get the existing information. For example:

$ oifcfg getif
eth1 100.17.10.0 global public
eth0 192.168.0.0 global cluster_interconnect,asm  

Above example shows network eth0 is used for both cluster_interconnect and ASM network.

Add the new cluster_interconnect information:

$ oifcfg setif -global <interface>/<subnet>:cluster_interconnect[,asm]

For example:
a. add a new interface bond0 with the same subnet
$ oifcfg setif -global bond0/192.168.0.0:cluster_interconnect,asm

b. add a new subnet, either on the existing interface name or on a new interface name
$ oifcfg setif -global eth0/192.68.10.0:cluster_interconnect,asm
or
$ oifcfg setif -global eth3/192.168.1.96:cluster_interconnect,asm

If different networks are used for the private network and the ASM network, modify them accordingly.

3. As ASMLISTENER is using the private network, modifying the private network will affect ASMLISTENER. It is required to add a new ASMLISTENER with the new network configuration. Skip this step if the subnet for the ASM network is not changed.

3.1. Add a new ASMLISTENER (for example: ASMNEWLSNR_ASM) with the new subnet, as grid user:

$ srvctl add  listener -asmlistener -l <new ASM LISTENER NAME> -subnet <new subnet>

eg:
$ srvctl add listener -asmlistener -l ASMNEWLSNR_ASM -subnet 192.168.10.0

3.2. Drop the existing ASMLISTENER (ASMLSNR_ASM in this example) and remove the dependency, as grid user:

$ srvctl update listener -listener ASMLSNR_ASM -asm -remove -force
$ lsnrctl stop ASMLSNR_ASM  

 

Note: the -force option is required; otherwise the following error will occur:

$ srvctl update listener -listener ASMLSNR_ASM -asm -remove
PRCR-1025 : Resource ora.ASMLSNR_ASM.lsnr is still running
$ srvctl stop listener -l ASMLSNR_ASM
PRCR-1065 : Failed to stop resource ora.ASMLSNR_ASM.lsnr
CRS-2529: Unable to act on 'ora.ASMLSNR_ASM.lsnr' because that would require stopping or relocating 'ora.asm', but the force option was not specified


3.3 Verify the configuration

$ srvctl config listener -asmlistener
$ srvctl config asm  


4. Shutdown Oracle Clusterware on ALL nodes and disable the Oracle Clusterware as root user:

# crsctl stop crs
# crsctl disable crs  

5. Make the network configuration change at OS level as required, ensure the new interface is available on all nodes after the change.

$ ifconfig -a
$ ping <private hostname>  

6. Enable Oracle Clusterware and restart Oracle Clusterware on all nodes as root user:

# crsctl enable crs
# crsctl start crs  

7. Remove the old interface if required:

$ oifcfg delif -global <if_name>[/<subnet>]
eg:
$ oifcfg delif -global eth0/192.168.0.0  

 

 

Something to note for 11gR2+


1. If the underlying network configuration has been changed but oifcfg has not been run to make the same change, then CRSD will not be able to start when Oracle Clusterware is restarted.

The crsd.log will show:

2010-01-30 09:22:47.234: [ default][2926461424] CRS Daemon Starting
..
2010-01-30 09:22:47.273: [ GPnP][2926461424]clsgpnp_Init: [at clsgpnp0.c:837] GPnP client pid=7153, tl=3, f=0
2010-01-30 09:22:47.282: [ OCRAPI][2926461424]clsu_get_private_ip_addresses: no ip addresses found.
2010-01-30 09:22:47.282: [GIPCXCPT][2926461424] gipcShutdownF: skipping shutdown, count 2, from [ clsinet.c : 1732], ret gipcretSuccess (0)
2010-01-30 09:22:47.283: [GIPCXCPT][2926461424] gipcShutdownF: skipping shutdown, count 1, from [ clsgpnp0.c : 1021], ret gipcretSuccess (0)
[ OCRAPI][2926461424]a_init_clsss: failed to call clsu_get_private_ip_addr (7)
2010-01-30 09:22:47.285: [ OCRAPI][2926461424]a_init:13!: Clusterware init unsuccessful : [44]
2010-01-30 09:22:47.285: [ CRSOCR][2926461424] OCR context init failure. Error: PROC-44: Error in network address and interface operations Network address and interface operations error [7]
2010-01-30 09:22:47.285: [ CRSD][2926461424][PANIC] CRSD exiting: Could not init OCR, code: 44
2010-01-30 09:22:47.285: [ CRSD][2926461424] Done.  

The above errors indicate a mismatch between the OS setting (oifcfg iflist) and the gpnp profile setting (profile.xml).

Workaround: restore the OS network configuration back to the original status, start Oracle Clusterware. Then follow above steps to make the changes again. 

If the underlying network has not been changed but oifcfg setif has been run with a wrong subnet address or interface name, the same issue will occur.
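To confirm quickly that a CRSD startup failure matches this case, the log can be searched for the two messages quoted above. This is a minimal sketch: `crsd_network_mismatch` is a hypothetical name, and the location of crsd.log is left to the caller because it depends on the installation.

```shell
#!/bin/sh
# Hypothetical helper: read a crsd.log on stdin and print the lines that
# carry either of the two messages quoted above. The exit status is 0 when
# at least one line matches and 1 when none does.
crsd_network_mismatch() {
    grep -e 'PROC-44' \
         -e 'clsu_get_private_ip_addresses: no ip addresses found'
}
```

Typical use would be `crsd_network_mismatch < crsd.log && echo "check oifcfg iflist against profile.xml"`.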



2. If any one node is down in the cluster, oifcfg command will fail with error:

$ oifcfg setif -global bond0/192.168.0.0:cluster_interconnect
PRIF-26: Error in update the profiles in the cluster  

Workaround: start Oracle Clusterware on the node where it is not running. Ensure Oracle Clusterware is up on all cluster nodes. If the node is down for any OS reason, please remove the node from the cluster before performing private network change.

3. If a user other than the Grid Infrastructure owner issues the above command, it will fail with the same error:

$ oifcfg setif -global bond0/192.168.0.0:cluster_interconnect
PRIF-26: Error in update the profiles in the cluster  

Workaround: log in as the Grid Infrastructure owner to run the command.

4. From 11.2.0.2 onwards, an attempt to delete the last private interface (cluster_interconnect) without first adding a new one fails with the following error:

PRIF-31: Failed to delete the specified network interface because it is the last private interface  

Workaround: Add new private interface first before deleting the old private interface.

5. If Oracle Clusterware is down on the node, the following error is expected:

$ oifcfg getif
PRIF-10: failed to initialize the cluster registry  

Workaround: Start the Oracle Clusterware on the node

 

Notes for Windows Systems

The syntax for changing the interfaces on Windows/RAC clusters is the same as on Unix/Linux, but the interface names will be slightly different. On Windows systems, the default names assigned to the interfaces are generally similar to:

Local Area Connection
Local Area Connection 1 
Local Area Connection 2

If using an interface name that has space in it, the name must be enclosed in quotes. Also, be aware that it is case sensitive.  For example, on Windows,  to set cluster_interconnect:

C:\oracle\product\10.2.0\crs\bin\oifcfg setif -global "Local Area Connection 1"/192.168.1.0:cluster_interconnect  

However, it is best practice on Windows to rename the interfaces to something more meaningful, such as 'ocwpublic' and 'ocwprivate'. If interfaces are renamed after Oracle Clusterware is installed, you will need to run 'oifcfg' to add the new interface and delete the old one, as described above.

You can view the available interface names on each node by running the command:

oifcfg iflist -p -n  

This command must be run on each node to verify that the interface names are defined identically.

Ramifications of Changing Interface Names Using oifcfg

For the Private interface, the database will use the interface stored in the OCR and defined as a 'cluster_interconnect' for cache fusion traffic.  The cluster_interconnect information is available at startup in the alert log, after the parameter listing - for example:

For pre 11.2.0.2:
Cluster communication is configured to use the following interface(s) for this instance 
192.168.1.1


For 11.2.0.2+: (HAIP address will show in alert log instead of private IP)
Cluster communication is configured to use the following interface(s) for this instance
  169.254.86.97  

If this is incorrect, the instance must be restarted once the OCR entry is corrected. This applies to ASM instances and Database instances alike. On Windows systems, after shutting down the instance, it is also required to stop and restart the OracleService<SID> (or OracleASMService<ASMSID>) before the OCR will be re-read.
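When reading the alert log, it helps to tell a HAIP address apart from a configured private IP. The sketch below assumes that HAIP addresses fall in the IPv4 link-local range 169.254.0.0/16, as the 169.254.86.97 example above does. `is_link_local` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed when the address is in 169.254.0.0/16.
# Assumes a well-formed IPv4 dotted quad.
is_link_local() {
    case $1 in
        169.254.*.*) return 0 ;;
        *) return 1 ;;
    esac
}

# Classify the two sample addresses from the alert log excerpts above.
for addr in 192.168.1.1 169.254.86.97; do
    if is_link_local "$addr"; then
        echo "$addr: HAIP (link-local) address"
    else
        echo "$addr: private IP"
    fi
done
```

A link-local address in the alert log on 11.2.0.2 or later is therefore expected and does not by itself mean that the OCR entry is wrong.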

 

Oifcfg Usage

To see the full options of oifcfg, simply type:

$ <CRS_HOME>/bin/oifcfg

 

Case V. Add or remove cluster_interconnect for 11gR2 and above with HAIP

1. To add another private network into existing cluster using HAIP, as grid user:

$ oifcfg setif -global <interface>/<subnet>:cluster_interconnect

For example:

$ oifcfg setif -global enp0s8/192.168.57.0:cluster_interconnect

Shut down CRS on ALL nodes, then restart CRS on ALL nodes for HAIP to pick up the new interface. Restarting CRS in a rolling manner is not sufficient.


2. To remove a private network from a cluster with HAIP, as grid user:

$ oifcfg delif -global <if_name>

For example:
$ oifcfg delif -global enp0s8

HAIP will fail over to the remaining interface, and the clusterware and database continue to function after the interface removal.

To remove the extra HAIP interface, shut down CRS on ALL nodes, then restart CRS on ALL nodes. Restarting CRS in a rolling manner is not sufficient.

 

 


REFERENCES

NOTE:1054902.1    - How to Validate Network and Name Resolution Setup for the Clusterware and RAC  
NOTE:1386709.1    - The Basics of IPv4 Subnet and Oracle Clusterware  
NOTE:276434.1    - How to Modify Public Network Information including VIP in Oracle Clusterware  
NOTE:604958.1    - OCFS2 Node Fence Caused by Removing the External Network Cable    
