ASM磁盤組空間不足--ORA-15041:DISGROUP DATA space exhausted (生產庫案例)

原創做品,出自 「深藍的blog」 博客,深藍的blog:http://blog.csdn.net/huangyanlong/article/details/47277715sql

近日,處理了一個關於ASM磁盤組空間不足引發的問題。數據庫

簡單記錄以下:oracle

1、問題的反饋

駐地工程師的反饋:ide

駐地工程師以郵件的形式告知了出現的問題,以及解決該問題的緊急性。ui

大概這樣的描述:告知了巡檢時發現了某照片表空間已滿,對其進行擴容操做,報錯:ORA-15041:DISGROUP "DATA" space exhausted。因爲月初須要對上月數據進行考覈,客戶上傳一些照片,此事比較緊急,需馬上解決。this

附件中,附帶了一些查詢信息,以下:
SQL> select group_number,name,total_mb,free_mb from v$ASM_DISKGROUP;spa

GROUP_NUMBER NAME                             TOTAL_MB    FREE_MB
------------ ------------------------------ ---------- ----------
           1 ARCH                               860159     405817
           2 CRS                                 30717      29791
           3 DATA                              1638394        238.net

SQL> select name,group_number,state,redundancy,total_mb,free_mb,path from v$asm_disk;blog

NAME                           GROUP_NUMBER STATE    REDUNDA   TOTAL_MB
------------------------------ ------------ -------- ------- ----------
   FREE_MB
----------
PATH
--------------------------------------------------------------------------------
ARCH_0000                                 1 NORMAL   UNKNOWN     860159
    405817
/dev/oracleasm/disks/ARCHci

CRS_0002                                  2 NORMAL   UNKNOWN      10239
      9931
/dev/oracleasm/disks/VOTE_CRS3

NAME                           GROUP_NUMBER STATE    REDUNDA   TOTAL_MB
------------------------------ ------------ -------- ------- ----------
   FREE_MB
----------
PATH
--------------------------------------------------------------------------------

CRS_0001                                  2 NORMAL   UNKNOWN      10239
      9930
/dev/oracleasm/disks/VOTE_CRS2

DATA_0001                                 3 NORMAL   UNKNOWN     819197
       112

NAME                           GROUP_NUMBER STATE    REDUNDA   TOTAL_MB
------------------------------ ------------ -------- ------- ----------
   FREE_MB
----------
PATH
--------------------------------------------------------------------------------
/dev/oracleasm/disks/DATA2

DATA_0000                                 3 NORMAL   UNKNOWN     819197
       126
/dev/oracleasm/disks/DATA1

CRS_0000                                  2 NORMAL   UNKNOWN      10239

NAME                           GROUP_NUMBER STATE    REDUNDA   TOTAL_MB
------------------------------ ------------ -------- ------- ----------
   FREE_MB
----------
PATH
--------------------------------------------------------------------------------
      9930
/dev/oracleasm/disks/VOTE_CRS1


6 rows selected.


2、緊急的處理

連入生成庫,查詢確實asm空間嚴重不足了。
ASMCMD> lsdg

State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  1048576    860159   405780                0          405780              0             N  ARCH/
MOUNTED  NORMAL  N         512   4096  1048576     30717    29791            10239            9776              0             Y  CRS/
MOUNTED  EXTERN  N         512   4096  1048576   1638394      238                0             238              0             N  DATA/

    爲快速解決問題,讓應用跑起來,決定先從如何解決沒法擴充表空間的方面進行入手。
想到的是縮減低利用率的表空間。

    因而查看錶空間的使用狀況:
一、發現undo表空間、temp表空間被擴容了很大,能夠對其縮減;
二、發現了一些低利用率的表空間,諸如GB級別的只存了幾M的數據量,能夠考慮縮減;

    因而連續使用諸以下面這樣的命令:

ALTER DATABASE 
  TEMPFILE '+DATA/xcky/xckytmp04.dbf'
 RESIZE 1024M;

用來實現對可縮減表空間的大小進行縮減。

通過一番空間縮減後,再次查詢空間使用率,知足擴容表空間的需求,完成了業務中存儲照片表空間的擴容。應用系統使用恢復正常。


3、階段性回饋

    快速回饋駐地工程師問題解決狀況。
 
    問題緣由是:ASM磁盤組空間不足引發。 

一、臨時採起的方法是縮減了其它表空間的大小,爲/DATA目錄釋放空間(縮減了undo表空間、temp表空間、其它空間利用率較低的表空間的大小)。

而且,已經新建了一個10G,自動擴展,存儲照片的表空間,命名爲photo_info47.dbf。

二、但後續建議:

(1)爲存儲擴容。
按照本環境的ASM規劃策略,目前ASM磁盤組中的/DATA已經使用了約1.4T(總大小約爲1.5T),/DATA下目前可用空間剩餘約50G。

(2)或從新規劃asm存儲,考慮臨時在/ARCH上擴充表空間(目前剩餘400G可用),但該/ARCH是用於存放歸檔文件的,不建議這麼作,後續有若是歸檔劇增,有引起出現hang停數據庫的可能。

 
4、後續解決本質性問題

 

再次鏈接生產庫,查詢是否有進一步解決問題的好方法。
先來查詢目前空間的大體使用狀況。  
SQL> conn sys/oracle as sysdba
Connected.
SQL> show user
USER is "SYS"
SQL> select path,total_mb,free_mb from v$asm_disk_stat;

PATH                                                 TOTAL_MB    FREE_MB
-------------------------------------------------- ---------- ----------
/dev/oracleasm/disks/ARCH                              860159     405780
/dev/oracleasm/disks/VOTE_CRS3                          10239       9931
/dev/oracleasm/disks/VOTE_CRS2                          10239       9930
/dev/oracleasm/disks/DATA2                             819197      25466
/dev/oracleasm/disks/DATA1                             819197      25480
/dev/oracleasm/disks/VOTE_CRS1                          10239       9930

6 rows selected.

ASMCMD> lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  1048576    860159   404777                0          404777              0             N  ARCH/
MOUNTED  NORMAL  N         512   4096  1048576     30717    29791            10239            9776              0             Y  CRS/
MOUNTED  EXTERN  N         512   4096  1048576   1638394    49590                0           49590              0             N  DATA/

查一下磁盤組的狀況
SQL> select name,state from v$asm_diskgroup;

NAME                           STATE
------------------------------ -----------
ARCH                           CONNECTED
CRS                            MOUNTED
DATA                           CONNECTED


查看系統的磁盤使用狀況,發下了一個好信息。
不知道爲何,存儲中,居然有一塊盤沒有利用。那太好了,能夠把它分給ASM了。
因而,下面先把這塊盤查詢出來。

[root@gzxkdb1 ~]# fdisk -l
Disk /dev/emcpoweree: 2147.4 GB, 2147483648000 bytes
255 heads, 63 sectors/track, 261083 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Disk /dev/emcpoweree doesn't contain a valid partition table
經過以上信息,鎖定/dev/emcpoweree設備沒有被劃分使用。 
 
對該設備進行磁盤分區 
[root@gzxkdb1 ~]# fdisk /dev/emcpoweree
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.


The number of cylinders for this disk is set to 261083.
There is nothing wrong with that, but this is larger than 1024,
and could in certain setups cause problems with:
1) software that runs at boot time (e.g., old versions of LILO)
2) booting and partitioning software from other OSs
   (e.g., DOS FDISK, OS/2 FDISK)
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-261083, default 1): 
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-261083, default 261083): +500G

Command (m for help): p

Disk /dev/emcpoweree: 2147.4 GB, 2147483648000 bytes
255 heads, 63 sectors/track, 261083 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

          Device Boot      Start         End      Blocks   Id  System
/dev/emcpoweree1               1       60789   488287611   83  Linux

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 2
First cylinder (60790-261083, default 60790): 
Using default value 60790
Last cylinder or +size or +sizeM or +sizeK (60790-261083, default 261083): +500G

Command (m for help): p

Disk /dev/emcpoweree: 2147.4 GB, 2147483648000 bytes
255 heads, 63 sectors/track, 261083 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

          Device Boot      Start         End      Blocks   Id  System
/dev/emcpoweree1               1       60789   488287611   83  Linux
/dev/emcpoweree2           60790      121578   488287642+  83  Linux

Command (m for help): m
Command action
   a   toggle a bootable flag
   b   edit bsd disklabel
   c   toggle the dos compatibility flag
   d   delete a partition
   l   list known partition types
   m   print this menu
   n   add a new partition
   o   create a new empty DOS partition table
   p   print the partition table
   q   quit without saving changes
   s   create a new empty Sun disklabel
   t   change a partition's system id
   u   change display/entry units
   v   verify the partition table
   w   write table to disk and exit
   x   extra functionality (experts only)

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Partition number (1-4): 3
First cylinder (121579-261083, default 121579): 
Using default value 121579
Last cylinder or +size or +sizeM or +sizeK (121579-261083, default 261083): +500G

Command (m for help): p

Disk /dev/emcpoweree: 2147.4 GB, 2147483648000 bytes
255 heads, 63 sectors/track, 261083 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

          Device Boot      Start         End      Blocks   Id  System
/dev/emcpoweree1               1       60789   488287611   83  Linux
/dev/emcpoweree2           60790      121578   488287642+  83  Linux
/dev/emcpoweree3          121579      182367   488287642+  83  Linux

Command (m for help): n
Command action
   e   extended
   p   primary partition (1-4)
p
Selected partition 4
First cylinder (182368-261083, default 182368): 
Using default value 182368
Last cylinder or +size or +sizeM or +sizeK (182368-261083, default 261083): 
Using default value 261083

Command (m for help): p

Disk /dev/emcpoweree: 2147.4 GB, 2147483648000 bytes
255 heads, 63 sectors/track, 261083 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

          Device Boot      Start         End      Blocks   Id  System
/dev/emcpoweree1               1       60789   488287611   83  Linux
/dev/emcpoweree2           60790      121578   488287642+  83  Linux
/dev/emcpoweree3          121579      182367   488287642+  83  Linux
/dev/emcpoweree4          182368      261083   632286270   83  Linux

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.

 
    以上,完成了對該磁盤分區。分四個主分區,3個500G,剩餘最後部分一個區。

查看asm磁盤列表


[root@gzxkdb1 ~]# service oracleasm listdisks
ARCH
DATA1
DATA2
VOTE_CRS1
VOTE_CRS2
VOTE_CRS3
建立asm磁盤
[root@gzxkdb1 ~]# service oracleasm createdisk DATA3 /dev/emcpoweree1
Marking disk "DATA3" as an ASM disk:                       [  OK  ]
[root@gzxkdb1 ~]# service oracleasm createdisk DATA4 /dev/emcpoweree2
Marking disk "DAT43" as an ASM disk:                       [  OK  ]
[root@gzxkdb1 ~]# service oracleasm createdisk DATA5 /dev/emcpoweree3
Marking disk "DATA5" as an ASM disk:                       [  OK  ]
[root@gzxkdb1 ~]# service oracleasm createdisk DATA6 /dev/emcpoweree4
Marking disk "DATA6" as an ASM disk:                       [  OK  ]

在另一個節點,對新添加的磁盤進行掃描
[root@gzxkdb2 ~]# service oracleasm scandisks   //節點2上完成掃描磁盤
[root@gzxkdb2 ~]# service oracleasm listdisks
ARCH
DATA4
DATA1
DATA2
DATA3
DATA5
DATA6
VOTE_CRS1
VOTE_CRS2
VOTE_CRS3 
 
在節點1,用sysasm用戶進行登陸實例
[grid@gzxkdb1 ~]$ sqlplus '/as sysasm'

SQL*Plus: Release 11.2.0.3.0 Production on Mon Aug 3 17:48:58 2015

Copyright (c) 1982, 2011, Oracle.  All rights reserved.


Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Real Application Clusters and Automatic Storage Management options

查看asm磁盤狀況
SQL> set linesize 200
SQL> set pagesize 200
SQL> col NAME for a30
SQL> col PATH for a50
SQL> r
  1* select name,path,mode_status,state,disk_number,failgroup from v$asm_disk

NAME                           PATH                                               MODE_ST STATE    DISK_NUMBER FAILGROUP
------------------------------ -------------------------------------------------- ------- -------- ----------- ------------------------------
                               /dev/oracleasm/disks/DATA6                         ONLINE  NORMAL             0
                               /dev/oracleasm/disks/DATA5                         ONLINE  NORMAL             1
                               /dev/oracleasm/disks/DATA4                         ONLINE  NORMAL             2
                               /dev/oracleasm/disks/DATA3                         ONLINE  NORMAL             3
ARCH_0000                      /dev/oracleasm/disks/ARCH                          ONLINE  NORMAL             0 ARCH_0000
CRS_0002                       /dev/oracleasm/disks/VOTE_CRS3                     ONLINE  NORMAL             2 CRS_0002
CRS_0001                       /dev/oracleasm/disks/VOTE_CRS2                     ONLINE  NORMAL             1 CRS_0001
DATA_0001                      /dev/oracleasm/disks/DATA2                         ONLINE  NORMAL             1 DATA_0001
DATA_0000                      /dev/oracleasm/disks/DATA1                         ONLINE  NORMAL             0 DATA_0000
CRS_0000                       /dev/oracleasm/disks/VOTE_CRS1                     ONLINE  NORMAL             0 CRS_0000

10 rows selected.


批量擴容ASM磁盤組
SQL> alter diskgroup DATA add disk '/dev/oracleasm/disks/DATA3' rebalance 10;

Diskgroup altered.

SQL> alter diskgroup DATA add disk '/dev/oracleasm/disks/DATA4' rebalance 10;

Diskgroup altered.

SQL> alter diskgroup DATA add disk '/dev/oracleasm/disks/DATA5' rebalance 10;

Diskgroup altered.

SQL> alter diskgroup DATA add disk '/dev/oracleasm/disks/DATA6' rebalance 10;

Diskgroup altered.

 

SQL> select * from v$asm_operation;

GROUP_NUMBER OPERA STAT      POWER     ACTUAL      SOFAR   EST_WORK   EST_RATE EST_MINUTES ERROR_CODE
------------ ----- ---- ---------- ---------- ---------- ---------- ---------- ----------- --------------------------------------------
           3 REBAL RUN          10         10      59949     634963       5143         111

當查詢v$asm_operation沒有數據時,表示IO自動均衡已經完成


SQL> select * from v$asm_operation;
no rows selected


           
再次查看磁盤組的空間           
ASMCMD> lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Voting_files  Name
MOUNTED  EXTERN  N         512   4096  1048576    860159   404170                0          404170              0             N  ARCH/
MOUNTED  NORMAL  N         512   4096  1048576     30717    29791            10239            9776              0             Y  CRS/
MOUNTED  EXTERN  Y         512   4096  1048576   3686390  2097561                0         2097561              0             N  DATA/          
DATA/已經獲得擴容,已經有近2T的剩餘空間,能夠知足一段時期業務的需求了。


5、最後反饋          

留言:           
    昨天貴州的「ASM磁盤組空間不足」問題。後續,發現了「盤陣」有未用空間,約2T,已經爲ASM添加。
能夠知足一段時間的磁盤空間須要了。           
           
    駐地工程師表示了感謝。

 

至此,本次任務記錄完成。   

相關文章
相關標籤/搜索