Checking Disk Status and Replacing Disks with MegaCLI

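I previously wrote an article on the procedure for replacing disks in a production server; that time, every disk in the machine was replaced. Recently some of the disks in another machine failed. Its RAID level is 10, and inspection showed that only the failed disks needed replacing, so this document supplements the earlier one.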

Installing MegaCLI

Installation package: download link

Installation steps

# First download the installation package
# Extract it
$ tar -zxf MegaCli8.07.10.tar.gz
$ cd MegaCli8.07.10/Linux/
$ rpm -ivh Lib_Utils-1.00-09.noarch.rpm MegaCli-8.02.21-1.noarch.rpm

# Link the binary into a directory on the PATH
$ ln -s /opt/MegaRAID/MegaCli/MegaCli64 /usr/local/bin/MegaCli 
$ MegaCli -v                               
      MegaCLI SAS RAID Management Tool  Ver 8.02.21 Oct 21, 2011

    (c)Copyright 2011, LSI Corporation, All Rights Reserved.

Exit Code: 0x00
# Installation complete!
  • Handling conflicts:

    $ rpm -ivh Lib_Utils-1.00-09.noarch.rpm MegaCli-8.02.21-1.noarch.rpm 
    Preparing...                          ################################# [100%]
    	file /opt/lsi/3rdpartylibs/x86_64/libsysfs.so.2.0.2 from install of Lib_Utils-1.00-09.noarch conflicts with file from package srvadmin-storelib-sysfs-9.1.0-2757.12163.el7.x86_64
  • Cause: Lib_Utils conflicts with the srvadmin package that ships with Dell servers. Remove the conflicting package, then run the installation again.

    rpm -e srvadmin-storelib-sysfs-9.1.0-2757.12163.el7.x86_64 --nodeps

Usage guide

Basic usage

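# Show RAID levels
$ MegaCli -LDInfo -Lall -aALL

# Show RAID controller information
$ MegaCli -AdpAllInfo -aALL

# Show physical disk information
$ MegaCli -PDList -aALL

# Show battery information
$ MegaCli -AdpBbuCmd -aAll

# Show the RAID controller log
$ MegaCli -FwTermLog -Dsply -aALL

# Show the number of adapters
$ MegaCli -adpCount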

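# Show the adapter time
$ MegaCli -AdpGetTime -aALL

# Show information for all adapters
$ MegaCli -AdpAllInfo -aAll

# Show information for all logical disk groups
$ MegaCli -LDInfo -LALL -aAll

# Show information for all physical disks
$ MegaCli -PDList -aAll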

# Show the charging status
$ MegaCli -AdpBbuCmd -GetBbuStatus -aALL | grep 'Charger Status'

# Show BBU status
$ MegaCli -AdpBbuCmd -GetBbuStatus -aALL

# Show BBU capacity information
$ MegaCli -AdpBbuCmd -GetBbuCapacityInfo -aALL

# Show BBU design parameters
$ MegaCli -AdpBbuCmd -GetBbuDesignInfo -aALL

# Show the current BBU properties
$ MegaCli -AdpBbuCmd -GetBbuProperties -aALL

# Show the RAID controller model, RAID settings, and disk information
$ MegaCli -cfgdsply -aALL
## How the states change from pulling the failed disk to inserting the replacement:
Device         | Normal  | Damaged              | Rebuilding | Normal
Virtual Drive  | Optimal | Degraded             | Degraded   | Optimal
Physical Drive | Online  | Failed, Unconfigured | Rebuild    | Online

# Show the rebuild status of one physical disk:
$ MegaCli -PDRbld -ShowProg -PhysDrv [Enclosure Device ID:Slot Number] -a0
## While a disk is rebuilding, its state shows: "Firmware state: Rebuild"

# Query rebuild progress:
$ MegaCli -pdrbld -showprog -physdrv[E:S] -aALL
## The output looks similar to this:
Rebuild Progress on Device at Enclosure 32, Slot 5 Completed 77% in 101 Minutes.

# Show rebuild progress as a text progress bar:
$ MegaCli -pdrbld -progdsply -physdrv[E:S] -aALL
## The screen shows something like this:
Rebuild progress of physical drives...
Enclosure:Slot               Percent Complete                       Time Elps
      032 :05   #######################87 %################*******  01:59:07 
Press key to quit...

# Show the RAID controller's rebuild settings:
$ MegaCli -AdpAllinfo -aALL | grep -i rebuild
## The output looks similar to this:
Rebuild Rate                     : 30%
Auto Rebuild                     : Enabled
Rebuild Rate                     : Yes
Force Rebuild                    : Yes

# Set the RAID controller's rebuild rate to 60%:
$ MegaCli -AdpSetProp { RebuildRate -60} -aALL
## On success it returns:
Adapter 0: Set rebuild rate to 60% success.

MegaCLI usage reference: http://blog.51cto.com/daixuan/1863567
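The rebuild progress line shown above ("Completed 77% in 101 Minutes") carries enough information for a rough estimate of the time remaining. The sketch below is only an illustration: `rebuild_eta` is a hypothetical helper name, and it assumes the rebuild keeps running at the average speed observed so far, which a busy array may not do.

```shell
# rebuild_eta: read "Rebuild Progress ..." lines on stdin and print the
# estimated minutes remaining for each one.
# Assumption: the rebuild continues at its average rate so far.
rebuild_eta() {
    sed -n 's/.*Completed \([0-9][0-9]*\)% in \([0-9][0-9]*\) Minutes.*/\1 \2/p' |
    while read -r pct elapsed; do
        # Skip 0% to avoid dividing by zero; no estimate is possible yet.
        if [ "$pct" -gt 0 ]; then
            echo $(( elapsed * (100 - pct) / pct ))
        fi
    done
}
```

One possible use is `MegaCli -pdrbld -showprog -physdrv[E:S] -aALL | rebuild_eta`, with `[E:S]` replaced by the real enclosure and slot.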

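Important parameters

Parameter                                   | Meaning
Firmware state                              | Disk state
Firmware state: Online, Spun Up             | Disk is healthy
Firmware state: Unconfigured(good), Spun Up | Disk is installed but not in use
Firmware state: Unconfigured(bad)           | Faulty; corresponds to Non-Critical in hwcheck
Firmware state: Failed                      | Faulty; corresponds to Critical in hwcheck
Firmware state: Rebuild                     | Rebuilding; usually shown after a disk is replaced
Enclosure Device ID: 32                     | Enclosure device ID
Slot Number: 1                              | Slot the disk occupies in the server
Adapter #0                                  | Adapter number; corresponds to the -a parameter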
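For monitoring scripts, the state values in the table above can be folded into a few coarse labels. The following is a minimal sketch, not part of MegaCLI: `classify_state` and its labels (OK, UNUSED, NON-CRITICAL, CRITICAL, REBUILDING, UNKNOWN) are names made up for this example, with the two fault levels following the hwcheck mapping in the table.

```shell
# classify_state: map the value of a "Firmware state" field to a coarse label.
# The labels are hypothetical names chosen for this sketch.
classify_state() {
    case "$1" in
        Online*)                echo "OK" ;;
        "Unconfigured(good)"*)  echo "UNUSED" ;;
        "Unconfigured(bad)"*)   echo "NON-CRITICAL" ;;
        Failed*)                echo "CRITICAL" ;;
        Rebuild*)               echo "REBUILDING" ;;
        *)                      echo "UNKNOWN" ;;
    esac
}
```

For example, `classify_state "Online, Spun Up"` prints `OK`.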

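Hands-on: replacing a disk in a RAID 10 array

Replacing a disk in a RAID 10 array is straightforward: the drives are hot-swappable, so the failed disk can simply be pulled and replaced. The steps are below.

Environment

Server: R720

OS: CentOS 7

RAID type: RAID 10

Viewing disk information

To show the procedure as clearly as possible, the output below has not been trimmed.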

$ MegaCli -PDList -aAll -NoLog
                                     
Adapter #0

Enclosure Device ID: 32
Slot Number: 0
Drive's postion: DiskGroup: 0, Span: 0, Arm: 0
Enclosure position: 0
Device Id: 0
WWN: 5000C50076CD09B4
Sequence Number: 1
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 28
Last Predictive Failure Event Seq Number: 4378
PD Type: SAS
Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.375 GB [0x45cc0000 Sectors]
Firmware state: Unconfigured(good), Spun Up
Device Firmware Level: ES66
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c50076cd09b5
SAS Address(1): 0x0
Connected Port Number: 5(path0) 
Inquiry Data: SEAGATE ST3600057SS     ES666SL8SASQ            
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: Foreign 
Foreign Secure: Drive is not secured by a foreign lock key
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive Temperature :40C (104.00 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : Yes


Enclosure Device ID: 32
Slot Number: 2
Enclosure position: 0
Device Id: 2
WWN: 5000C50076CD05BC
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
Raw Size: 0 KB [0x0 Sectors]
Non Coerced Size: 0 KB [0x0 Sectors]
Coerced Size: 0 KB [0x0 Sectors]
Firmware state: Unconfigured(bad)
Device Firmware Level: ES66
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c50076cd05bd
SAS Address(1): 0x0
Connected Port Number: 1(path0) 
Inquiry Data: SEAGATE ST3600057SS     ES666SL8SAVC            
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: Unknown 
Link Speed: Unknown 
Media Type: Hard Disk Device
Drive:  Not Supported
Drive Temperature :0C (32.00 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: Unknown 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : No


Enclosure Device ID: 32
Slot Number: 1
Drive's postion: DiskGroup: 0, Span: 0, Arm: 1
Enclosure position: 0
Device Id: 1
WWN: 5000C500983873BC
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.375 GB [0x45cc0000 Sectors]
Firmware state: Online, Spun Up
Device Firmware Level: VT31
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c500983873bd
SAS Address(1): 0x0
Connected Port Number: 3(path0) 
Inquiry Data: SEAGATE ST600MP0005     VT31S7M1CSLT            
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: Unknown 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive Temperature :41C (105.80 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : No


Enclosure Device ID: 32
Slot Number: 3
Drive's postion: DiskGroup: 0, Span: 1, Arm: 1
Enclosure position: 0
Device Id: 3
WWN: 5000C50076CE2F30
Sequence Number: 2
Media Error Count: 5
Other Error Count: 71
Predictive Failure Count: 15
Last Predictive Failure Event Seq Number: 4379
PD Type: SAS
Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.375 GB [0x45cc0000 Sectors]
Firmware state: Online, Spun Up
Device Firmware Level: ES66
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c50076ce2f31
SAS Address(1): 0x0
Connected Port Number: 2(path0) 
Inquiry Data: SEAGATE ST3600057SS     ES666SL8SAKA            
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive Temperature :48C (118.40 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : Yes



Enclosure Device ID: 32
Slot Number: 4
Drive's postion: DiskGroup: 1, Span: 0, Arm: 0
Enclosure position: 0
Device Id: 4
WWN: 5000C5007E70F0F8
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.375 GB [0x45cc0000 Sectors]
Firmware state: Online, Spun Up
Device Firmware Level: ES66
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c5007e70f0f9
SAS Address(1): 0x0
Connected Port Number: 0(path0) 
Inquiry Data: SEAGATE ST3600057SS     ES666SL9F1JB            
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive Temperature :46C (114.80 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : No



Enclosure Device ID: 32
Slot Number: 5
Drive's postion: DiskGroup: 1, Span: 0, Arm: 1
Enclosure position: 0
Device Id: 5
WWN: 5000C5007E708E3C
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
Raw Size: 558.911 GB [0x45dd2fb0 Sectors]
Non Coerced Size: 558.411 GB [0x45cd2fb0 Sectors]
Coerced Size: 558.375 GB [0x45cc0000 Sectors]
Firmware state: Online, Spun Up
Device Firmware Level: ES66
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x5000c5007e708e3d
SAS Address(1): 0x0
Connected Port Number: 4(path0) 
Inquiry Data: SEAGATE ST3600057SS     ES666SL9F2RB            
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None 
Device Speed: 6.0Gb/s 
Link Speed: 6.0Gb/s 
Media Type: Hard Disk Device
Drive Temperature :45C (113.00 F)
PI Eligibility:  No 
Drive is formatted for PI information:  No
PI: No PI
Drive's write cache : Disabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s 
Port-1 :
Port status: Active
Port's Linkspeed: Unknown 
Drive has flagged a S.M.A.R.T alert : No

Exit Code: 0x00

The output above shows that this server has 6 disks (one Device Id per disk).
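The full listing is long, and only a handful of fields matter when looking for a bad disk. As a rough sketch, the hypothetical `pd_summary` function below condenses `-PDList` output to one line per disk. It assumes the field names appear exactly as in the listing above and that each disk's block ends with the S.M.A.R.T alert line.

```shell
# pd_summary: read `MegaCli -PDList` output on stdin and print one line per disk.
# Assumption: each disk block ends with the "S.M.A.R.T alert" line, as in the listing above.
pd_summary() {
    awk -F': *' '
        { sub(/[ \t\r]+$/, "") }                       # drop trailing whitespace
        /^Enclosure Device ID/      { enc = $2 }
        /^Slot Number/              { slot = $2 }
        /^Predictive Failure Count/ { pfc = $2 }
        /^Firmware state/           { state = $2 }
        /S\.M\.A\.R\.T alert/ {
            printf "[%s:%s] state=%s predictive_failures=%s smart_alert=%s\n", enc, slot, state, pfc, $2
        }
    '
}
```

It could be used as `MegaCli -PDList -aAll -NoLog | pd_summary`. The `[32:2]` prefix on each line is the same enclosure:slot address that the `-PhysDrv` option takes.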

Taking the failed disks offline

$ MegaCli -PDOffline -PhysDrv[32:2] -a0
$ MegaCli -PDOffline -PhysDrv[32:0] -a0

How 32:2 and -a0 in the commands above map to the PDList output:

Adapter #0
Enclosure Device ID: 32
Slot Number: 2
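The mapping can be written down as a small helper that assembles the command from the three values. This is only a sketch: `offline_cmd` is a hypothetical name, and it prints the command without running it, so the result can be checked before a disk is actually taken offline.

```shell
# offline_cmd: print (without running) the MegaCli command that takes one disk offline.
# Arguments: adapter number, Enclosure Device ID, Slot Number.
offline_cmd() {
    adapter=$1
    enclosure=$2
    slot=$3
    printf 'MegaCli -PDOffline -PhysDrv[%s:%s] -a%s\n' "$enclosure" "$slot" "$adapter"
}
```

With the values above, `offline_cmd 0 32 2` prints `MegaCli -PDOffline -PhysDrv[32:2] -a0`.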

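Replacing the failed disk

At this point the failed disk is OFFLINE. On site, the failed disk's LED blinks amber while the healthy disks show green. Pull the failed disk and insert a good one; the LED then blinks green and the disk spins up, which indicates that it is rebuilding. Check the state as follows: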

$ MegaCli -PDList -aAll -NoLog
...
Enclosure Device ID: 32
Slot Number: 3
...
Firmware state: Rebuild
...

Checking rebuild progress

$ MegaCli -PDRbld -ShowProg -PhysDrv[32:3] -aAll

Rebuild Progress on Device at Enclosure 32, Slot 3 Completed 16% in 94 Minutes.

Disk replacement complete

$ MegaCli -PDList -aAll -NoLog | grep 'Firmware state'
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
Firmware state: Online, Spun Up
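
The same check can be turned into a pass/fail test for a script or a cron job. The sketch below is an illustration under stated assumptions: `all_online` is a hypothetical name, and it treats any state other than Online as a failure, so it would also flag configured hot spares.

```shell
# all_online: read `MegaCli -PDList` output on stdin; succeed only if at least one
# "Firmware state" line is present and every one of them starts with "Online".
all_online() {
    awk -F': *' '
        /^Firmware state/ { total++; if ($2 !~ /^Online/) bad++ }
        END { exit ((total > 0 && bad == 0) ? 0 : 1) }
    '
}
```

A possible use is `MegaCli -PDList -aAll -NoLog | all_online || echo "some disks are not Online"`.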